Sample records for building classification models

  1. Combining Unsupervised and Supervised Classification to Build User Models for Exploratory Learning Environments

    ERIC Educational Resources Information Center

    Amershi, Saleema; Conati, Cristina

    2009-01-01

    In this paper, we present a data-based user modeling framework that uses both unsupervised and supervised classification to build student models for exploratory learning environments. We apply the framework to build student models for two different learning environments and using two different data sources (logged interface and eye-tracking data).…
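
    A minimal sketch of the two-stage pattern this record describes, assuming synthetic interaction features, k-means for the unsupervised step and a decision tree for the supervised step; the paper's actual features and models are not specified in this snippet:

```python
# Sketch of the pattern in record 1: cluster unlabeled interaction logs,
# treat the clusters as behaviour labels, then train a supervised
# classifier for online prediction. Data and model choices are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Hypothetical per-student features, e.g. action frequency and gaze dwell time.
X = rng.normal(size=(200, 2))

# Unsupervised step: discover groups of similar learners.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Supervised step: learn a model that assigns new students to a group online.
clf = DecisionTreeClassifier(max_depth=3).fit(X, clusters)
print(clf.predict(X[:5]))
```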

  2. Automatic 3d Building Model Generations with Airborne LiDAR Data

    NASA Astrophysics Data System (ADS)

    Yastikli, N.; Cetin, Z.

    2017-11-01

    LiDAR systems are becoming more and more popular because of their potential for obtaining point clouds of vegetation and man-made objects on the earth's surface in an accurate and quick way. Nowadays, these airborne systems are frequently used in a wide range of applications such as DEM/DSM generation, topographic mapping, object extraction, vegetation mapping, three-dimensional (3D) modelling and simulation, change detection, engineering works, revision of maps, coastal management and bathymetry. 3D building model generation is one of the most prominent applications of LiDAR systems, and is of major importance for urban planning, illegal construction monitoring, 3D city modelling, environmental simulation, tourism, security, telecommunication, mobile navigation, etc. Manual or semi-automatic 3D building model generation is a costly and very time-consuming process for these applications. Thus, a simple and quick approach to automatic 3D building model generation is needed for the many studies which include building modelling. In this study, automatic generation of 3D building models from airborne LiDAR data is aimed at. An approach is proposed for automatic 3D building model generation that includes automatic point-based classification of the raw LiDAR point cloud. The proposed point-based classification includes hierarchical rules for the automatic production of 3D building models. Detailed analyses of the parameters used in the hierarchical rules were performed to improve the classification results, using different test areas identified in the study area. The proposed approach has been tested in a study area in Zekeriyakoy, Istanbul, which has partly open areas, forest areas and many types of buildings, using the TerraScan module of TerraSolid. The 3D building model was generated automatically using the results of the automatic point-based classification. The results obtained in the study area verified that automatic 3D building models can be generated successfully from raw LiDAR point cloud data.
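
    The abstract does not spell out the hierarchical rules themselves, so the following is a toy sketch of cascaded rule-based point labelling, with assumed height-above-ground and roughness features and invented thresholds:

```python
# Minimal sketch of hierarchical rule-based LiDAR point labelling in the
# spirit of record 2; thresholds and features are illustrative, not the
# paper's actual TerraScan rules.
import numpy as np

def classify_points(height_above_ground, roughness):
    """Label each point as ground / vegetation / building by cascaded rules."""
    labels = np.full(height_above_ground.shape, "ground", dtype=object)
    elevated = height_above_ground > 2.0                  # rule 1: off the ground
    labels[elevated & (roughness > 0.5)] = "vegetation"   # rule 2: rough surfaces
    labels[elevated & (roughness <= 0.5)] = "building"    # rule 3: planar surfaces
    return labels

h = np.array([0.1, 5.2, 7.8, 3.4])
r = np.array([0.05, 0.9, 0.1, 0.2])
print(classify_points(h, r))  # ['ground' 'vegetation' 'building' 'building']
```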

  3. Ontology for Life-Cycle Modeling of Electrical Distribution Systems: Model View Definition

    DTIC Science & Technology

    2013-06-01

    building information models (BIM) at the coordinated design stage of building construction. 1.3 Approach To...standard for exchanging Building Information Modeling (BIM) data, which defines hundreds of classes for common use in software, currently supported by...specifications, Construction Operations Building information exchange (COBie), Building Information Modeling (BIM)

  4. Ontology for Life-Cycle Modeling of Electrical Distribution Systems: Application of Model View Definition Attributes

    DTIC Science & Technology

    2013-06-01

    Building information exchange (COBie), Building Information Modeling (BIM)...to develop a life-cycle building model have resulted in the definition of a “core” building information model that contains general information de...develop an information-exchange Model View Definition (MVD) for building electrical systems. The objective of the current work was to document the

  5. Automated Classification of Heritage Buildings for As-Built Bim Using Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Bassier, M.; Vergauwen, M.; Van Genechten, B.

    2017-08-01

    Semantically rich three-dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the information required by varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings in order to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVM) are proposed for the classification. The algorithm is evaluated using real data from a variety of existing buildings. The results prove that the classifier used recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.
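
    A hedged sketch of the SVM classification step, with invented per-segment features and class labels standing in for the paper's feature set:

```python
# Hedged sketch of the SVM step in record 5: classify point-cloud segments
# into structural classes from hand-crafted features. Feature values and
# class names are made up for illustration.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Hypothetical features per segment: [normal_z, height, planarity]
X = np.vstack([
    rng.normal([1.0, 0.0, 0.9], 0.1, size=(50, 3)),   # floors
    rng.normal([0.0, 1.5, 0.9], 0.1, size=(50, 3)),   # walls
])
y = np.array(["floor"] * 50 + ["wall"] * 50)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
svm.fit(X, y)
print(svm.predict([[0.95, 0.1, 0.85], [0.05, 1.4, 0.92]]))
```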

  6. A fuzzy hill-climbing algorithm for the development of a compact associative classifier

    NASA Astrophysics Data System (ADS)

    Mitra, Soumyaroop; Lam, Sarah S.

    2012-02-01

    Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has mainly been focused on increasing classifier accuracy, little effort has been devoted to building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy against the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lends better interpretability when compared with other rule-based systems.
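
    An illustrative toy version of the hill-climbing rule selection: a candidate rule is kept only if it improves training accuracy, which keeps the rule base compact. The rules, data and default class below are made up:

```python
# Illustrative greedy hill-climbing over candidate rules as in record 6:
# keep a rule only if it raises training accuracy, yielding a compact
# rule base. Rules, data, and the default class are toy assumptions.
def accuracy(rules, data, default="no"):
    correct = 0
    for x, label in data:
        pred = next((r_cls for r_cond, r_cls in rules if r_cond(x)), default)
        correct += pred == label
    return correct / len(data)

data = [({"age": 25}, "no"), ({"age": 40}, "yes"), ({"age": 60}, "yes")]
candidates = [
    (lambda x: x["age"] > 55, "yes"),
    (lambda x: x["age"] > 30, "yes"),
    (lambda x: x["age"] < 20, "yes"),
]

selected = []
for rule in candidates:             # hill-climbing: accept only improving moves
    if accuracy(selected + [rule], data) > accuracy(selected, data):
        selected.append(rule)
print(len(selected), "rules kept, accuracy:", accuracy(selected, data))
```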

  7. Use of Binary Partition Tree and energy minimization for object-based classification of urban land cover

    NASA Astrophysics Data System (ADS)

    Li, Mengmeng; Bijker, Wietske; Stein, Alfred

    2015-04-01

    Two main challenges are faced when classifying urban land cover from very high resolution satellite images: obtaining an optimal image segmentation and distinguishing buildings from other man-made objects. For optimal segmentation, this work proposes a hierarchical representation of an image by means of a Binary Partition Tree (BPT) and an unsupervised evaluation of image segmentations by energy minimization. For building extraction, we apply fuzzy sets to create a fuzzy landscape of shadows, which in turn involves a two-step procedure. The first step is a preliminary image classification at a fine segmentation level to generate vegetation and shadow information. The second step models the directional relationship between building and shadow objects to extract building information at the optimal segmentation level. We conducted the experiments on two datasets of Pléiades images from Wuhan City, China. To demonstrate its performance, the proposed classification is compared at the optimal segmentation level with Maximum Likelihood Classification and Support Vector Machine classification. The results show that the proposed classification produced the highest overall accuracies and kappa coefficients, and the smallest over-classification and under-classification geometric errors. We conclude first that integrating BPT with energy minimization offers an effective means for image segmentation. Second, we conclude that the directional relationship between building and shadow objects represented by a fuzzy landscape is important for building extraction.
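
    A rough sketch of a fuzzy directional landscape of the kind used for the building-shadow relationship, with invented decay parameters; the paper's exact membership function is not given in the abstract:

```python
# Illustrative fuzzy directional landscape as motivated by record 7:
# membership of a pixel to "building casting this shadow" decays with its
# angular deviation from a reference direction and with distance.
# All parameters here are invented for the sketch.
import numpy as np

def fuzzy_landscape(dx, dy, sun_azimuth_rad, max_dist=30.0):
    dist = np.hypot(dx, dy)
    angle = np.abs(np.arctan2(dy, dx) - sun_azimuth_rad)
    angle = np.minimum(angle, 2 * np.pi - angle)          # wrap to [0, pi]
    directional = np.clip(1.0 - angle / (np.pi / 2), 0.0, 1.0)
    proximity = np.clip(1.0 - dist / max_dist, 0.0, 1.0)
    return directional * proximity

# Membership of two candidate pixels relative to a shadow pixel at the origin.
print(fuzzy_landscape(np.array([5.0, -20.0]), np.array([2.0, 0.0]), 0.0))
```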

  8. Model-Based Building Detection from Low-Cost Optical Sensors Onboard Unmanned Aerial Vehicles

    NASA Astrophysics Data System (ADS)

    Karantzalos, K.; Koutsourakis, P.; Kalisperakis, I.; Grammatikopoulos, L.

    2015-08-01

    The automated and cost-effective detection of buildings in ultra-high spatial resolution imagery is of major importance for various engineering and smart city applications. To this end, in this paper, a model-based building detection technique has been developed that is able to extract and reconstruct buildings from UAV aerial imagery and low-cost imaging sensors. In particular, the developed approach computes, through advanced structure from motion, bundle adjustment and dense image matching, a DSM and a true orthomosaic from the numerous GoPro images, which are characterised by significant geometric distortions and fish-eye effects. An unsupervised multi-region graph-cut segmentation and a rule-based classification are responsible for delivering the initial multi-class classification map. The DTM is then calculated based on an inpainting and mathematical morphology process. A data fusion process between the buildings detected from the DSM/DTM and the classification map feeds a grammar-based building reconstruction, and the scene buildings are extracted and reconstructed. Preliminary experimental results appear quite promising, with the quantitative evaluation indicating detection rates at object level of 88% regarding correctness and above 75% regarding detection completeness.

  9. A compressed sensing method with analytical results for lidar feature classification

    NASA Astrophysics Data System (ADS)

    Allen, Josef D.; Yuan, Jiangbo; Liu, Xiuwen; Rahmes, Mark

    2011-04-01

    We present an innovative way to autonomously classify LiDAR points into bare earth, building, vegetation, and other categories. One desirable product of LiDAR data is the automatic classification of the points in the scene. Our algorithm automatically classifies scene points using compressed sensing methods via Orthogonal Matching Pursuit algorithms, utilizing a generalized K-Means clustering algorithm to extract buildings and foliage from a Digital Surface Model (DSM). This technology reduces manual editing while being cost effective for large-scale automated global scene modeling. Quantitative analyses are provided using Receiver Operating Characteristic (ROC) curves to show the probability of detection and false alarm for building vs. vegetation classification. Histograms are shown with sample size metrics. Our inpainting algorithms then fill the voids where buildings and vegetation were removed, utilizing Computational Fluid Dynamics (CFD) techniques and Partial Differential Equations (PDE) to create an accurate Digital Terrain Model (DTM) [6]. Inpainting preserves building height contour consistency and edge sharpness of identified inpainted regions. Qualitative results illustrate other benefits such as terrain inpainting's unique ability to minimize or eliminate undesirable terrain data artifacts.
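
    A simplified sketch of sparse-coding classification with Orthogonal Matching Pursuit: a sample is coded over a dictionary of class exemplars and assigned to the class whose atoms carry the most energy. The dictionary construction and data are assumptions, not the paper's pipeline:

```python
# Rough sketch of sparse-coding classification in the spirit of record 9:
# represent a sample with Orthogonal Matching Pursuit over a dictionary of
# class exemplars and assign the class whose atoms reconstruct it best.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(2)
building = rng.normal(5.0, 0.5, size=(8, 10))   # exemplar height profiles
foliage = rng.normal(2.0, 1.5, size=(8, 10))
D = np.vstack([building, foliage]).T            # dictionary: one atom per column

sample = rng.normal(5.0, 0.5, size=10)
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3).fit(D, sample)
coef = omp.coef_

# The class whose atoms carry the most sparse-code energy wins.
energy = [np.sum(coef[:8] ** 2), np.sum(coef[8:] ** 2)]
print("building" if energy[0] > energy[1] else "foliage")
```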

  10. Object-oriented remote sensing image classification method based on geographic ontology model

    NASA Astrophysics Data System (ADS)

    Chu, Z.; Liu, Z. J.; Gu, H. Y.

    2016-11-01

    Nowadays, with the development of high-resolution remote sensing imagery and the wide application of laser point cloud data, object-oriented remote sensing classification based on the characteristic knowledge of multi-source spatial data has become an important trend in the field of remote sensing image classification, gradually replacing the traditional approach of improving algorithms to optimize image classification results. For this purpose, the paper puts forward a remote sensing image classification method that uses the characteristic knowledge of multi-source spatial data to build a geographic ontology semantic network model, and carries out an object-oriented classification experiment to implement urban feature classification. The experiment uses the Protégé software developed by Stanford University in the United States and the intelligent image analysis software eCognition as the experiment platform, and uses hyperspectral imagery and LiDAR data acquired from flights over DaFeng City, JiangSu, as the main data sources. First, the experiment uses the hyperspectral imagery to obtain feature knowledge of the remote sensing image and related spectral indices; second, it uses the LiDAR data to generate an nDSM (Normalized DSM, Normalized Digital Surface Model) to obtain elevation information; finally, it combines the image feature knowledge, spectral indices and elevation information to build the geographic ontology semantic network model that implements the urban feature classification. The experimental results show that this method achieves significantly higher classification accuracy than traditional classification algorithms, and performs especially well for building classification. The method not only exploits the advantages of multi-source spatial data such as remote sensing imagery and LiDAR data, but also realizes the integration of multi-source spatial data knowledge and its application to the field of remote sensing image classification, which provides an effective way forward for object-oriented remote sensing image classification.

  11. A new method of building footprints detection using airborne laser scanning data and multispectral image

    NASA Astrophysics Data System (ADS)

    Luo, Yiping; Jiang, Ting; Gao, Shengli; Wang, Xin

    2010-10-01

    This paper presents a new approach for detecting building footprints from a combination of registered aerial imagery with multispectral bands and airborne laser scanning data synchronously obtained by a Leica-Geosystems ALS40 and an Applanix DACS-301 on the same platform. A two-step method for building detection is presented, consisting of selecting 'building' candidate points and then classifying those candidate points. A digital surface model (DSM) derived from last-pulse laser scanning data was first filtered, and the laser points were classified into the classes 'ground' and 'building or tree' using a mathematical morphological filter. Then, 'ground' points were resampled into a digital elevation model (DEM), and a normalized DSM (nDSM) was generated from the DEM and DSM. The candidate points were selected from the 'building or tree' points by height value and an area threshold in the nDSM. The candidate points were further classified into building points and tree points using the support vector machine (SVM) classification method. Two classification tests were carried out, using features only from laser scanning data and associated features from the two input data sources. The features included height, height finite difference, RGB band values, and so on. The RGB values of points were acquired by matching laser scanning data and imagery using the collinearity equations. The features of training points were presented as input data for the SVM classification method, and cross-validation was used to select the best classification parameters. The decision function could be constructed from the classification parameters, and the class of candidate points was determined by the decision function. The results showed that associated features from the two input data sources were superior to features only from laser scanning data. An accuracy of more than 90% was achieved for buildings with the first kind of features.
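
    A minimal sketch of the nDSM step: subtract the interpolated terrain (DEM) from the surface model (DSM) and keep cells tall enough to be 'building or tree' candidates. The grids and the threshold are illustrative:

```python
# Simplified sketch of the nDSM step in record 11: normalized heights are
# surface heights minus bare-earth heights; tall cells become candidates.
import numpy as np

dsm = np.array([[10.0, 10.2, 18.5],
                [10.1, 17.9, 18.2],
                [10.0, 10.1, 10.3]])   # surface heights (ground + objects)
dem = np.full_like(dsm, 10.0)          # interpolated bare-earth heights

ndsm = dsm - dem                       # normalized DSM: height above ground
candidates = ndsm > 2.5                # 'building or tree' candidate mask
print(ndsm)
print(candidates)
```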

  12. Automatic 3D Extraction of Buildings, Vegetation and Roads from LIDAR Data

    NASA Astrophysics Data System (ADS)

    Bellakaout, A.; Cherkaoui, M.; Ettarid, M.; Touzani, A.

    2016-06-01

    Aerial topographic surveys using Light Detection and Ranging (LiDAR) technology collect dense and accurate information from the surface or terrain; LiDAR is becoming one of the important tools in the geosciences for studying objects and the earth's surface. Classification of LiDAR data for extracting ground, vegetation, and buildings is a very important step needed in numerous applications such as 3D city modelling, extraction of different derived data for geographical information systems (GIS), mapping, navigation, etc. Regardless of what the scan data will be used for, an automatic process is greatly needed to handle the large amount of data collected, because manual processing is time-consuming and very expensive. This paper presents an approach for the automatic classification of aerial LiDAR data into five groups of items: buildings, trees, roads, linear objects and soil, using single-return LiDAR and processing the point cloud without generating a DEM. Topological relationships and height variation analysis are adopted to preliminarily segment the entire point cloud into upper and lower contours, uniform and non-uniform surfaces, linear objects, and others. This primary classification is used on the one hand to identify the upper and lower parts of each building in an urban scene, needed to model building façades, and on the other hand to extract the point cloud of uniform surfaces, which contains the roofs, roads and ground used in the second phase of classification. A second algorithm is developed to segment the uniform surfaces into building roofs, roads and ground; this second phase of classification is also based on topological relationships and height variation analysis. The proposed approach has been tested using two areas: the first is a housing complex and the second is a primary school. The proposed approach led to successful classification results for the building, vegetation and road classes.

  13. Property Specification Patterns for intelligence building software

    NASA Astrophysics Data System (ADS)

    Chun, Seungsu

    2018-03-01

    In this paper, through research on property specification patterns for the logical aspects of the modal mu (μ) calculus, we present a single framework based on patterns for intelligent building software. The study subdivides Dwyer's property specification pattern classification into state (S) and action (A) patterns, and further into strong (A) and weak (E) variants. Based on this hierarchical pattern classification, the analysis of the logical aspects of the mu (μ) calculus was applied to the classification of the example patterns used in an actual model checker. As a result, not only was a more accurate classification achieved than with existing classification systems, but the specified properties were also easier to create and understand.

  14. The research on medical image classification algorithm based on PLSA-BOW model.

    PubMed

    Cao, C H; Cao, H L

    2016-04-29

    With the rapid development of modern medical imaging technology, medical image classification has become more important for medical diagnosis and treatment. To address the problem of polysemous words and synonyms, this study combines the bag-of-words model with PLSA (Probabilistic Latent Semantic Analysis) and proposes the PLSA-BOW (Probabilistic Latent Semantic Analysis-Bag of Words) model. In this paper we carry the bag-of-words model from the text field over to the image field and build a visual bag-of-words model. The method enables the bag-of-words-based classification method to be further improved in accuracy. The experimental results show that the PLSA-BOW model for medical image classification leads to more accurate classification.
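
    A sketch of a visual bag-of-words pipeline in the spirit of this record; since PLSA itself is not available in scikit-learn, LatentDirichletAllocation stands in for the topic-model step here:

```python
# Sketch of a visual bag-of-words pipeline as in record 14: cluster local
# descriptors into a codebook, histogram each image over the codebook, then
# factorize the image-word matrix into latent topics (LDA as a PLSA stand-in).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(3)
descriptors = [rng.normal(size=(30, 8)) for _ in range(6)]  # per-image patches

codebook = KMeans(n_clusters=5, n_init=10, random_state=0)
codebook.fit(np.vstack(descriptors))

# Represent each image as a histogram of visual-word counts.
bow = np.array([np.bincount(codebook.predict(d), minlength=5)
                for d in descriptors])

topics = LatentDirichletAllocation(n_components=2, random_state=0)
print(topics.fit_transform(bow))       # latent-topic mixture per image
```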

  15. Assessment of Life Cycle Information Exchanges (LCie): Understanding the Value-Added Benefit of a COBie Process

    DTIC Science & Technology

    2013-10-01

    exchange (COBie), Building Information Modeling (BIM), value-added analysis, business processes, project management...equipment. The innovative aspect of Building Information Modeling (BIM) is that it creates a computable building description. The ability to use a...interoperability. In order for the building information to be interoperable, it must also conform to a common data model, or schema, that defines the class

  16. Roof Type Selection Based on Patch-Based Classification Using Deep Learning for High Resolution Satellite Imagery

    NASA Astrophysics Data System (ADS)

    Partovi, T.; Fraundorfer, F.; Azimi, S.; Marmanis, D.; Reinartz, P.

    2017-05-01

    3D building reconstruction from satellite remote sensing image data is still an active research topic and very valuable for 3D city modelling. The roof model is the most important component for reconstructing a building at Level of Detail 2 (LoD2) in 3D modelling. While the general solution for roof modelling relies on detailed cues (such as lines, corners and planes) extracted from a Digital Surface Model (DSM), the correct detection of the roof type and its modelling can fail due to the low quality of DSMs generated by dense stereo matching. To reduce the dependency of roof modelling on DSMs, pansharpened satellite images are used in addition as a rich source of information. In this paper, two strategies are employed for roof type classification. In the first, building roof types are classified in a state-of-the-art supervised pre-trained convolutional neural network (CNN) framework. In the second, deep features are extracted from deep layers of different pre-trained CNN models, and then an SVM with an RBF kernel is employed to classify the building roof type. Based on the roof complexity of the scene, a roof library including seven types of roofs is defined. A new semi-automatic method is proposed to generate training and test patches for each roof type in the library. Using the pre-trained CNN model not only decreases the computation time for training significantly but also increases the classification accuracy.
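
    A hedged sketch of the second strategy (fixed deep features plus an RBF-kernel SVM); the CNN features are faked with random arrays to keep the example self-contained:

```python
# Hedged sketch of record 16's second strategy: take fixed deep features
# from a pre-trained CNN and train an RBF-kernel SVM over them. The deep
# features and roof labels here are synthetic stand-ins.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(4)
deep_features = np.vstack([rng.normal(0.0, 1.0, (40, 128)),
                           rng.normal(1.0, 1.0, (40, 128))])
roof_type = np.array(["gable"] * 40 + ["flat"] * 40)   # toy roof library

X_tr, X_te, y_tr, y_te = train_test_split(
    deep_features, roof_type, test_size=0.25, random_state=0)
svm = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
print("test accuracy:", svm.score(X_te, y_te))
```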

  17. A Framework for Text Mining in Scientometric Study: A Case Study in Biomedicine Publications

    NASA Astrophysics Data System (ADS)

    Silalahi, V. M. M.; Hardiyati, R.; Nadhiroh, I. M.; Handayani, T.; Rahmaida, R.; Amelia, M.

    2018-04-01

    Data on Indonesian research publications in the domain of biomedicine have been collected to be text mined for the purpose of a scientometric study. The goal is to build a predictive model that provides a classification of research publications on their potential for downstreaming. The model is based on the drug development processes adapted from the literature. An effort is described to build the conceptual model and to develop a corpus of research publications in the domain of Indonesian biomedicine. An investigation is then conducted into the problems associated with building a corpus and validating the model. Based on our experience, a framework is proposed to manage a scientometric study based on text mining. Our method shows the effectiveness of conducting a scientometric study based on text mining in order to obtain a valid classification model. This valid model is mainly supported by iterative and close interactions with the domain experts, starting from identifying the issues and building a conceptual model through to labelling, validation and interpretation of results.

  18. Review of Development Survey of Phase Change Material Models in Building Applications

    PubMed Central

    Akeiber, Hussein J.; Wahid, Mazlan A.; Hussen, Hasanen M.; Mohammad, Abdulrahman Th.

    2014-01-01

    The application of phase change materials (PCMs) in green buildings has been increasing rapidly. PCM applications in green buildings include several development models. This paper briefly surveys the recent research and development activities of PCM technology in building applications. Firstly, a basic description of phase change and its principles is provided; the classification and applications of PCMs are also included. Secondly, PCM models in buildings are reviewed and discussed according to the wall, roof, floor, and cooling systems. Finally, conclusions are presented based on the collected data. PMID:25313367

  19. Building Change Detection from Bi-Temporal Dense-Matching Point Clouds and Aerial Images.

    PubMed

    Pang, Shiyan; Hu, Xiangyun; Cai, Zhongliang; Gong, Jinqi; Zhang, Mi

    2018-03-24

    In this work, a novel building change detection method from bi-temporal dense-matching point clouds and aerial images is proposed to address two major problems, namely, the robust acquisition of the changed objects above ground and the automatic classification of changed objects into buildings or non-buildings. For the acquisition of changed objects above ground, the change detection problem is converted into a binary classification, in which the changed area above ground is regarded as the foreground and the other area as the background. For the gridded points of each period, the graph cuts algorithm is adopted to classify the points into foreground and background, followed by the region-growing algorithm to form candidate changed building objects. A novel structural feature that was extracted from aerial images is constructed to classify the candidate changed building objects into buildings and non-buildings. The changed building objects are further classified as "newly built", "taller", "demolished", and "lower" by combining the classification and the digital surface models of two periods. Finally, three typical areas from a large dataset are used to validate the proposed method. Numerous experiments demonstrate the effectiveness of the proposed algorithm.
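
    A toy sketch of the final change-labelling rule, combining the building classification with the two epochs' surface heights; thresholds and inputs are invented:

```python
# Toy sketch of the final change-labelling step in record 19: combine the
# building/non-building classification with per-epoch surface heights.
def label_change(is_building, h_old, h_new, tol=2.0):
    if not is_building:
        return "non-building change"
    if h_old == 0 and h_new > tol:
        return "newly built"
    if h_new == 0 and h_old > tol:
        return "demolished"
    return "taller" if h_new > h_old + tol else \
           "lower" if h_new < h_old - tol else "unchanged"

print(label_change(True, 0.0, 12.0))   # newly built
print(label_change(True, 9.0, 15.0))   # taller
print(label_change(True, 9.0, 3.0))    # lower
```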

  20. Village Building Identification Based on Ensemble Convolutional Neural Networks

    PubMed Central

    Guo, Zhiling; Chen, Qi; Xu, Yongwei; Shibasaki, Ryosuke; Shao, Xiaowei

    2017-01-01

    In this study, we present the Ensemble Convolutional Neural Network (ECNN), an elaborate CNN frame formulated based on ensembling state-of-the-art CNN models, to identify village buildings from open high-resolution remote sensing (HRRS) images. First, to optimize and mine the capability of CNN for village mapping and to ensure compatibility with our classification targets, a few state-of-the-art models were carefully optimized and enhanced based on a series of rigorous analyses and evaluations. Second, rather than directly implementing building identification by using these models, we exploited most of their advantages by ensembling their feature extractor parts into a stronger model called ECNN based on the multiscale feature learning method. Finally, the generated ECNN was applied to a pixel-level classification frame to implement object identification. The proposed method can serve as a viable tool for village building identification with high accuracy and efficiency. The experimental results obtained from the test area in Savannakhet province, Laos, prove that the proposed ECNN model significantly outperforms existing methods, improving overall accuracy from 96.64% to 99.26%, and kappa from 0.57 to 0.86. PMID:29084154

  21. A Temporal Mining Framework for Classifying Un-Evenly Spaced Clinical Data: An Approach for Building Effective Clinical Decision-Making System.

    PubMed

    Jane, Nancy Yesudhas; Nehemiah, Khanna Harichandran; Arputharaj, Kannan

    2016-01-01

    Clinical time-series data acquired from electronic health records (EHR) are liable to temporal complexities such as irregular observations, missing values and time-constrained attributes that make the knowledge discovery process challenging. This paper presents a temporal rough set induced neuro-fuzzy (TRiNF) mining framework that handles these complexities and builds an effective clinical decision-making system. TRiNF provides two functionalities, namely temporal data acquisition (TDA) and temporal classification. In TDA, a time-series forecasting model is constructed by adopting an improved double exponential smoothing method. The forecasting model is used in missing value imputation and temporal pattern extraction. The relevant attributes are selected using a temporal pattern based rough set approach. In temporal classification, a classification model is built with the selected attributes using a temporal pattern induced neuro-fuzzy classifier. For experimentation, this work uses two clinical time-series datasets of hepatitis and thrombosis patients. The experimental results show that with the proposed TRiNF framework there is a significant reduction in the error rate, obtaining an average classification accuracy of 92.59% on the hepatitis dataset and 91.69% on the thrombosis dataset. The obtained classification results prove the efficiency of the proposed framework in terms of its improved classification accuracy.
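
    A minimal double exponential (Holt) smoothing forecast of the kind the TDA step relies on for imputing gaps; the smoothing constants are illustrative, not the paper's tuned values:

```python
# Minimal double exponential (Holt) smoothing in the spirit of record 21's
# TDA step: forecast the next value of a clinical time series so gaps can
# be imputed.
def holt_forecast(series, alpha=0.5, beta=0.3):
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend          # one-step-ahead forecast

readings = [98.1, 98.4, 98.9, 99.3]   # hypothetical lab values over time
print(round(holt_forecast(readings), 2))
```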

  22. Granular support vector machines with association rules mining for protein homology prediction.

    PubMed

    Tang, Yuchun; Jin, Bo; Zhang, Yan-Qing

    2005-01-01

    Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such a complex classification problem is common in medical and biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue in mining knowledge to accurately predict/classify unseen new samples and to effectively support human experts in making correct decisions. A new learning model called granular support vector machines (GSVM) is proposed based on our previous work. GSVM systematically and formally combines the principles of statistical learning theory and granular computing theory and thus provides an interesting new mechanism for addressing complex classification problems. It works by building a sequence of information granules and then building support vector machines (SVM) in some of these information granules on demand. A good granulation method for finding suitable granules is crucial for modeling a GSVM with good performance. In this paper, we also propose an association rules-based granulation method. For the granules induced by association rules with high enough confidence and significant support, we leave them as they are because of their high "purity" and their significant effect on simplifying the classification task. For every other granule, an SVM is modeled to discriminate the corresponding data. In this way, a complex classification problem is divided into multiple smaller problems so that the learning task is simplified. The proposed algorithm, here named GSVM-AR, is compared with SVM on KDDCUP04 protein homology prediction data. The experimental results show that finding the splitting hyperplane is not a trivial task (we should be careful in selecting the association rules to avoid overfitting) and that GSVM-AR does show significant improvement compared to building one single SVM in the whole feature space. Another advantage is that GSVM-AR has very good utility because it is easy to implement. More importantly and more interestingly, GSVM provides a new mechanism for addressing complex classification problems.
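
    A conceptual sketch of the GSVM-AR idea: samples covered by a high-confidence rule are classified by the rule directly, and an SVM is trained only on the remaining mixed granule. The rule and data are toy stand-ins, not the KDDCUP04 setup:

```python
# Conceptual sketch of GSVM-AR from record 22: a confident association rule
# handles its "pure" granule; an SVM models the harder, mixed granule.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(int)

pure = X[:, 0] > 1.0                  # granule induced by a confident rule
svm = SVC().fit(X[~pure], y[~pure])   # model only the mixed granule

def predict(x):
    return 1 if x[0] > 1.0 else int(svm.predict([x])[0])

print(predict([1.5, 0.0, 0.0]), predict([-0.8, 0.2, 0.1]))
```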

  23. Relative significance of heat transfer processes to quantify tradeoffs between complexity and accuracy of energy simulations with a building energy use patterns classification

    NASA Astrophysics Data System (ADS)

    Heidarinejad, Mohammad

    This dissertation develops rapid and accurate building energy simulations based on a building classification that identifies and focuses modeling efforts on the most significant heat transfer processes. The building classification identifies energy use patterns and their contributing parameters for a portfolio of buildings. The dissertation hypothesis is "Building classification can provide minimal required inputs for rapid and accurate energy simulations for a large number of buildings". The critical literature review indicated that there is a lack of studies that (1) consider a synoptic point of view rather than the case study approach, (2) analyze the influence of different granularities of energy use, (3) identify key variables based on the heat transfer processes, and (4) automate the procedure for quantifying model complexity against accuracy. Therefore, three dissertation objectives are designed to test the dissertation hypothesis: (1) develop different classes of buildings based on their energy use patterns, (2) develop different building energy simulation approaches for the identified classes of buildings to quantify tradeoffs between model accuracy and complexity, and (3) demonstrate the building simulation approaches for case studies. Penn State's and Harvard's campus buildings as well as high performance LEED NC office buildings are the test beds for this study for developing the different classes of buildings. The campus buildings include detailed chilled water, electricity, and steam data, enabling buildings to be classified as externally-load, internally-load, or mixed-load dominated. The energy use of internally-load dominated buildings is primarily a function of the internal loads and their schedules. Externally-load dominated buildings tend to have an energy use pattern that is a function of building construction materials and outdoor weather conditions. However, most of the commercial medium-sized office buildings have a mixed-load pattern, meaning the HVAC system and operation schedule dictate the indoor condition regardless of the contribution of internal and external loads. To deploy the methodology to another portfolio of buildings, simulated LEED NC office buildings are selected. The advantage of this approach is that it isolates energy performance due to inherent building characteristics and location, rather than operational and maintenance factors that can contribute to significant variation in building energy use. A framework for detailed building energy databases with annual energy end-uses is developed to select variables and omit outliers. The results show that the high performance office buildings are internally-load dominated, with three different clusters of low-intensity, medium-intensity, and high-intensity energy use patterns among the reviewed office buildings. Low-intensity cluster buildings benefit from small building area, while the medium- and high-intensity clusters have a similar range of floor areas and different energy use intensities. Half of the energy use in the low-intensity buildings is associated with internal loads, such as lighting and plug loads, indicating that there are opportunities to save energy by using lighting or plug load management systems. A comparison between the frameworks developed for the campus buildings and the LEED NC office buildings indicates these two frameworks are complementary to each other.
    The availability of information has led to two different procedures, suggesting that future studies for a portfolio of buildings, such as city benchmarking and disclosure ordinances, should collect and disclose the minimal required inputs suggested by this study with at least monthly energy consumption granularity. This dissertation developed automated methods using the OpenStudio API (Application Programming Interface) to create energy models based on the building class. ASHRAE Guideline 14 defines well-accepted criteria for measuring the accuracy of energy simulations; however, there is no well-accepted methodology for quantifying model complexity without the influence of the energy modeler's judgment about the model complexity. This study developed a novel method using two weighting factors, based on (1) computational time and (2) ease of on-site data collection, to measure the complexity of the energy models. Therefore, this dissertation enables measurement of both model complexity and accuracy as well as assessment of the inherent tradeoffs between energy simulation model complexity and accuracy. The results of this methodology suggest that for most of the internal load contributors, such as operation schedules, on-site data collection adds more complexity to the model than computational time does. Overall, this study provides specific data on tradeoffs between accuracy and model complexity that point to critical inputs for different building classes, rather than an increase in the volume and detail of model inputs as current research and consulting practice indicates. (Abstract shortened by UMI.)

  24. Crowd-sourced data collection to support automatic classification of building footprint data

    NASA Astrophysics Data System (ADS)

    Hecht, Robert; Kalla, Matthias; Krüger, Tobias

    2018-05-01

    Human settlements are mainly formed by buildings with their different characteristics and usage. Despite the importance of buildings for the economy and society, complete regional or even national figures of the entire building stock and its spatial distribution are still hardly available. Available digital topographic data sets created by National Mapping Agencies or mapped voluntarily through a crowd via Volunteered Geographic Information (VGI) platforms (e.g. OpenStreetMap) contain building footprint information but often lack additional information on building type, usage, age or number of floors. For this reason, predictive modeling is becoming increasingly important in this context. The capabilities of machine learning allow for the prediction of building types and other building characteristics and thus, the efficient classification and description of the entire building stock of cities and regions. However, such data-driven approaches always require a sufficient amount of ground truth (reference) information for training and validation. The collection of reference data is usually cost-intensive and time-consuming. Experiences from other disciplines have shown that crowdsourcing offers the possibility to support the process of obtaining ground truth data. Therefore, this paper presents the results of an experimental study aiming at assessing the accuracy of non-expert annotations on street view images collected from an internet crowd. The findings provide the basis for a future integration of a crowdsourcing component into the process of land use mapping, particularly the automatic building classification.
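
    A toy sketch of one way crowd annotations can be aggregated into reference labels, here by simple majority vote; the votes are invented, and the paper itself studies annotation accuracy rather than prescribing this scheme:

```python
# Toy sketch motivated by record 24: majority-vote non-expert labels per
# building to form ground truth for training a classifier.
from collections import Counter

votes = {
    "building_001": ["residential", "residential", "commercial"],
    "building_002": ["commercial", "commercial", "commercial"],
    "building_003": ["residential", "industrial", "residential"],
}
ground_truth = {b: Counter(v).most_common(1)[0][0] for b, v in votes.items()}
print(ground_truth)
```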

  25. Simultaneous Co-Clustering and Classification in Customers Insight

    NASA Astrophysics Data System (ADS)

    Anggistia, M.; Saefuddin, A.; Sartono, B.

    2017-04-01

    Building a predictive model based on a heterogeneous dataset may yield many problems, such as less precise parameter estimates and lower prediction accuracy. Such problems can be solved by segmenting the data into relatively homogeneous groups and then building a predictive model for each cluster. The advantage of this strategy is that it usually results in simpler models that are more interpretable and more actionable, without any loss in accuracy or reliability. This work concerns a marketing dataset which records customer behaviour across products, with some variables describing customer and product attributes. The basic idea of this approach is to combine co-clustering and classification simultaneously. The objective of this research is to analyse customer characteristics across products, so that the marketing strategy can be implemented precisely.

  26. Comprehensive decision tree models in bioinformatics.

    PubMed

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of the reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from the usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics.
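
    A sketch of the core idea: constrain the tree's dimensions (depth, number of leaves) directly instead of tuning on a performance measure. The dataset and bounds are illustrative:

```python
# Sketch of the idea in record 26: bound the tree's dimensions directly
# and compare against a default (unconstrained) tree.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)   # a binary-class bio dataset
compact = DecisionTreeClassifier(max_depth=4, max_leaf_nodes=8, random_state=0)
default = DecisionTreeClassifier(random_state=0)

for name, tree in [("compact", compact), ("default", default)]:
    print(name, cross_val_score(tree, X, y, cv=5).mean().round(3))
```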

  27. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of the reasoning behind the classification model are possible. Methods This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from the usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics. PMID:22479449

  28. Classification of building infrastructure and automatic building footprint delineation using airborne laser swath mapping data

    NASA Astrophysics Data System (ADS)

    Caceres, Jhon

    Three-dimensional (3D) models of urban infrastructure comprise critical data for planners working on problems in wireless communications, environmental monitoring, civil engineering, and urban planning, among other tasks. Photogrammetric methods have been the most common approach to date for extracting building models. However, Airborne Laser Swath Mapping (ALSM) observations offer a competitive alternative because they overcome some of the ambiguities that arise when trying to extract 3D information from 2D images. Regardless of the source data, the building extraction process requires segmentation and classification of the data and building identification. In this work, approaches for classifying ALSM data, separating building and tree points, and delineating building footprints from the classified data are described. Digital aerial photographs are used in some cases to verify results, but the objective of this work is to develop methods that can work on ALSM data alone. A robust approach for separating tree and building points in ALSM data is presented. The method is based on supervised learning of the classes (tree vs. building) in a high-dimensional feature space that yields good class separability. Features used for classification are based on the generation of local mappings, from three-dimensional space to two-dimensional space, known as "spin images", for each ALSM point to be classified. The method discriminates ALSM returns in compact spaces and even where the classes are very close together or overlapping spatially. A modified Hough Transform algorithm is used to orient the spin images, and the spin image parameters are specified such that the mutual information between the spin image pixel values and class labels is maximized. This new approach to ALSM classification allows us to fully exploit the 3D point information in the ALSM data while still achieving good class separability, which has been a difficult trade-off in the past. Supported by the spin image analysis for obtaining an initial classification, an automatic approach for delineating accurate building footprints is presented. The physical fact that laser pulses striking building edges can produce very different 1st and last return elevations has long been recognized. However, in older-generation ALSM systems (<50 kHz pulse rates) such points were too few and far between to delineate building footprints precisely. Furthermore, without robust separation of nearby trees and vegetation from the buildings, simply extracting ALSM shots where the elevation of the first return was much higher than the elevation of the last return was not a reliable means of identifying building footprints. However, with the advent of ALSM systems with pulse rates in excess of 100 kHz, and by using spin-image based segmentation, it is now possible to extract building edges from the point cloud. A refined classification resulting from incorporating "on-edge" information is developed for obtaining quadrangular footprints. The footprint fitting process involves line generalization, least squares-based clustering and dominant point finding for segmenting individual building edges. In addition, an algorithm for fitting complex footprints using the segmented edges and data inside footprints is also proposed.
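
    A rough sketch of a spin image: for a basis point with a surface normal, neighbours are histogrammed by radial distance (alpha) and signed height along the normal (beta). The grid size and data are illustrative:

```python
# Rough sketch of a spin image as used in record 28: map each neighbour of
# a basis point to (alpha, beta) coordinates and histogram the result.
import numpy as np

def spin_image(points, basis, normal, size=8, extent=4.0):
    d = points - basis
    beta = d @ normal                                  # height along the normal
    alpha = np.sqrt(np.maximum(np.sum(d * d, axis=1) - beta**2, 0.0))
    img, _, _ = np.histogram2d(alpha, beta, bins=size,
                               range=[[0, extent], [-extent, extent]])
    return img

rng = np.random.default_rng(9)
cloud = rng.uniform(-2, 2, size=(500, 3))
print(spin_image(cloud, basis=np.zeros(3), normal=np.array([0.0, 0.0, 1.0])).sum())
```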

  29. Integrated Change Detection and Classification in Urban Areas Based on Airborne Laser Scanning Point Clouds.

    PubMed

    Tran, Thi Huong Giang; Ressl, Camillo; Pfeifer, Norbert

    2018-02-03

    This paper suggests a new approach for change detection (CD) in 3D point clouds. It combines classification and CD in one step using machine learning. The point cloud data of both epochs are merged for computing features of four types: features describing the point distribution, a feature relating to relative terrain elevation, features specific for the multi-target capability of laser scanning, and features combining the point clouds of both epochs to identify the change. All these features are merged in the points and then training samples are acquired to create the model for supervised classification, which is then applied to the whole study area. The final results reach an overall accuracy of over 90% for both epochs of eight classes: lost tree, new tree, lost building, new building, changed ground, unchanged building, unchanged tree, and unchanged ground.

  30. Selective classification and quantification model of C&D waste from material resources consumed in residential building construction.

    PubMed

    Mercader-Moyano, Pilar; Ramírez-de-Arellano-Agudo, Antonio

    2013-05-01

    The unfortunate economic situation involving Spain and the European Union is, among other factors, the result of intensive construction activity over recent years. The excessive consumption of natural resources, together with the impact caused by the uncontrolled dumping of untreated C&D waste in illegal landfills, has caused environmental pollution and a deterioration of the landscape. The objective of this research was to generate a selective classification and quantification model of C&D waste based on the material resources consumed in the construction of residential buildings, either new or renovated, namely the Conventional Constructive Model (CCM). A practical example carried out on ten residential buildings in Seville, Spain, enabled the identification and quantification of the C&D waste generated in their construction and the origin of the waste, in terms of the building material from which it originated and its impact per m² constructed. This model enables other researchers to establish comparisons between the various improvements proposed for the minimization of the environmental impact produced by building a CCM, enables new corrective measures to be proposed in future policies that regulate the production and management of C&D waste generated in construction from the design stage to the completion of the construction process, and supports the establishment of sustainable management of C&D waste and the selection of materials for the construction of projected or renovated buildings.

  31. Hybrid Automatic Building Interpretation System

    NASA Astrophysics Data System (ADS)

    Pakzad, K.; Klink, A.; Müterthies, A.; Gröger, G.; Stroh, V.; Plümer, L.

    2011-09-01

    HABIS (Hybrid Automatic Building Interpretation System) is a system for the automatic reconstruction of building roofs used in virtual 3D building models. Unlike most of the commercially available systems, HABIS is able to work largely automatically. The hybrid method uses different sources, intending to exploit the advantages of each particular source. 3D point clouds usually provide good height and surface data, whereas spatially high-resolution aerial images provide important information for edges and detail information for roof objects like dormers or chimneys. The cadastral data provide important basic information about the building ground plans. The approach used in HABIS works with a multi-stage process, which starts with a coarse roof classification based on 3D point clouds. After that it continues with an image-based verification of these predicted roofs. In a further step a final classification and adjustment of the roofs is done. In addition, some roof objects like dormers and chimneys are also extracted based on aerial images and added to the models. In this paper the methods used are described and some results are presented.

  32. Automated detection of radioisotopes from an aircraft platform by pattern recognition analysis of gamma-ray spectra.

    PubMed

    Dess, Brian W; Cardarelli, John; Thomas, Mark J; Stapleton, Jeff; Kroutil, Robert T; Miller, David; Curry, Timothy; Small, Gary W

    2018-03-08

    A generalized methodology was developed for automating the detection of radioisotopes from gamma-ray spectra collected from an aircraft platform using sodium-iodide detectors. Employing data provided by the U.S. Environmental Protection Agency Airborne Spectral Photometric Environmental Collection Technology (ASPECT) program, multivariate classification models based on nonparametric linear discriminant analysis were developed for application to spectra that were preprocessed through a combination of altitude-based scaling and digital filtering. Training sets of spectra for use in building classification models were assembled from a combination of background spectra collected in the field and synthesized spectra obtained by superimposing laboratory-collected spectra of target radioisotopes onto field backgrounds. This approach eliminated the need for field experimentation with radioactive sources for use in building classification models. Through a bi-Gaussian modeling procedure, the discriminant scores that served as the outputs from the classification models were related to associated confidence levels. This provided an easily interpreted result regarding the presence or absence of the signature of a specific radioisotope in each collected spectrum. Through the use of this approach, classifiers were built for cesium-137 (137Cs) and cobalt-60 (60Co), two radioisotopes that are of interest in airborne radiological monitoring applications. The optimized classifiers were tested with field data collected from a set of six geographically diverse sites, three of which contained either 137Cs, 60Co, or both. When the optimized classification models were applied, the overall percentages of correct classifications for spectra collected at these sites were 99.9 and 97.9% for the 60Co and 137Cs classifiers, respectively.
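
    A hedged sketch of the modelling chain (filtering followed by linear discriminant analysis); the Gaussian "spectra" and filter settings are synthetic stand-ins for the ASPECT data:

```python
# Hedged sketch of the chain in record 32: smooth spectra with a digital
# filter, then fit a linear discriminant classifier. All data is synthetic.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(6)
channels = np.arange(128)
background = rng.normal(1.0, 0.1, size=(60, 128))
peak = np.exp(-0.5 * ((channels - 66) / 3.0) ** 2)     # pretend 137Cs line
with_source = background[:30] + 0.8 * peak             # synthetic positives

X = np.vstack([with_source, background[30:]])
y = np.array([1] * 30 + [0] * 30)
X = savgol_filter(X, window_length=9, polyorder=3, axis=1)  # smooth spectra

lda = LinearDiscriminantAnalysis().fit(X, y)
print("training accuracy:", lda.score(X, y))
```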

  33. 29 CFR 779.355 - Classification of lumber and building materials sales.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ...Service Establishments Lumber and Building Materials Dealers § 779.355 Classification of lumber and building materials sales. (a) General. In determining, for purposes of the section 13(a)(2) and (4)...

  34. 29 CFR 779.355 - Classification of lumber and building materials sales.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ...Service Establishments Lumber and Building Materials Dealers § 779.355 Classification of lumber and building materials sales. (a) General. In determining, for purposes of the section 13(a)(2) and (4)...

  35. 29 CFR 779.355 - Classification of lumber and building materials sales.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ...Service Establishments Lumber and Building Materials Dealers § 779.355 Classification of lumber and building materials sales. (a) General. In determining, for purposes of the section 13(a)(2) and (4)...

  36. Compact and Hybrid Feature Description for Building Extraction

    NASA Astrophysics Data System (ADS)

    Li, Z.; Liu, Y.; Hu, Y.; Li, P.; Ding, Y.

    2017-05-01

    Building extraction in aerial orthophotos is crucial for various applications. Currently, deep learning has been shown to be successful in addressing building extraction with high accuracy and high robustness. However, quite a large number of samples is required to train a classifier when using a deep learning model. In order to realize accurate and semi-interactive labelling, the performance of the feature description is crucial, as it has a significant effect on the accuracy of classification. In this paper, we bring forward a compact and hybrid feature description method in order to guarantee desirable classification accuracy for the corners on building roof contours. The proposed descriptor is a hybrid description of an image patch constructed from 4 sets of binary intensity tests. Experiments show that, benefiting from the binary description and making full use of the color channels, this descriptor is not only computationally frugal, but also more accurate than SURF for building extraction.
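
    A toy BRIEF-like sketch of a descriptor built from binary intensity tests; the test locations and patch size are assumptions, not the paper's layout:

```python
# Toy sketch of a binary intensity-test descriptor in the spirit of record
# 36: compare pixel pairs inside a patch and pack the results into bits.
import numpy as np

rng = np.random.default_rng(10)
pairs = rng.integers(0, 16, size=(64, 2, 2))       # 64 tests: (p1, p2) coords

def describe(patch):
    """Return a 64-bit binary descriptor for a 16x16 image patch."""
    p1, p2 = pairs[:, 0], pairs[:, 1]
    bits = patch[p1[:, 0], p1[:, 1]] < patch[p2[:, 0], p2[:, 1]]
    return np.packbits(bits)

patch = rng.integers(0, 256, size=(16, 16))
print(describe(patch))                              # 8 bytes = 64 tests
```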

  37. Jobs in Construction. Job Family Series.

    ERIC Educational Resources Information Center

    Science Research Associates, Inc., Chicago, IL.

    The booklet describes jobs in the construction industry under the classifications of public and private building. Separate chapters discuss the process of building a city hospital, a model home, and a State highway. Chapters outline miscellaneous jobs in the industry such as elevator constructors, lathers, plasterers, roofers, and sheet metal…

  18. [Severity classification of chronic obstructive pulmonary disease based on deep learning].

    PubMed

    Ying, Jun; Yang, Ceyuan; Li, Quanzheng; Xue, Wanguo; Li, Tanshi; Cao, Wenzhe

    2017-12-01

    In this paper, a deep learning method is proposed to build an automatic classification algorithm for the severity of chronic obstructive pulmonary disease. Large-sample clinical data were used as input features and analyzed for their weights in classification. Through feature selection, model training, parameter optimization and model testing, a classification prediction model based on a deep belief network was built to predict the severity classification criteria raised by the Global Initiative for Chronic Obstructive Lung Disease (GOLD). We achieved over 90% prediction accuracy for the two standardized versions of the severity criteria published in 2007 and 2011, respectively. Moreover, we obtained the contribution ranking of the input features by analyzing the model coefficient matrix and confirmed a certain degree of agreement between the most contributive input features and clinical diagnostic knowledge. This result demonstrates the validity of the deep belief network model. This study provides an effective solution for applying deep learning methods to automatic diagnostic decision making.

  19. Active Learning of Classification Models with Likert-Scale Feedback.

    PubMed

    Xue, Yanbing; Hauskrecht, Milos

    2017-01-01

    Annotation of classification data by humans can be a time-consuming and tedious process. Finding ways of reducing the annotation effort is critical for building the classification models in practice and for applying them to a variety of classification tasks. In this paper, we develop a new active learning framework that combines two strategies to reduce the annotation effort. First, it relies on label uncertainty information obtained from the human in terms of the Likert-scale feedback. Second, it uses active learning to annotate examples with the greatest expected change. We propose a Bayesian approach to calculate the expectation and an incremental SVM solver to reduce the time complexity of the solvers. We show the combination of our active learning strategy and the Likert-scale feedback can learn classification models more rapidly and with a smaller number of labeled instances than methods that rely on either Likert-scale labels or active learning alone.

  20. Active Learning of Classification Models with Likert-Scale Feedback

    PubMed Central

    Xue, Yanbing; Hauskrecht, Milos

    2017-01-01

    Annotation of classification data by humans can be a time-consuming and tedious process. Finding ways of reducing the annotation effort is critical for building the classification models in practice and for applying them to a variety of classification tasks. In this paper, we develop a new active learning framework that combines two strategies to reduce the annotation effort. First, it relies on label uncertainty information obtained from the human in terms of the Likert-scale feedback. Second, it uses active learning to annotate examples with the greatest expected change. We propose a Bayesian approach to calculate the expectation and an incremental SVM solver to reduce the time complexity of the solvers. We show the combination of our active learning strategy and the Likert-scale feedback can learn classification models more rapidly and with a smaller number of labeled instances than methods that rely on either Likert-scale labels or active learning alone. PMID:28979827

  1. [A cold/heat property classification strategy based on bio-effects of herbal medicines].

    PubMed

    Jiang, Miao; Lv, Ai-Ping

    2014-06-01

    The property theory of Chinese herbal medicine (CHM) is regarded as the core and basis of Chinese medical theory; however, the mechanism underlying the properties of CHMs remains unclear, which poses a barrier to the modernization of Chinese herbal medicine. The properties of CHM are often categorized into cold and heat according to the theory of Chinese medicine, and these categories are essential guides for the clinical application of CHMs. There is an urgent demand to build a cold/heat property classification model, both to advance the property theory of Chinese herbal medicine and to clarify the controversial properties of some herbs. Based on previous studies on the cold/heat properties of CHM, in this paper we describe a novel strategy for building a cold/heat property classification model based on herbal bio-effects. Interdisciplinary cooperation among systems biology, pharmacological network analysis, and pattern recognition techniques might enlighten the study of cold/heat property theory, provide a scientific model for determining the cold/heat property of herbal medicines, and offer a new strategy for expanding Chinese herbal medicine resources.

  2. Multi-element analysis of wines by ICP-MS and ICP-OES and their classification according to geographical origin in Slovenia.

    PubMed

    Selih, Vid S; Sala, Martin; Drgan, Viktor

    2014-06-15

    Inductively coupled plasma mass spectrometry and optical emission spectrometry were used to determine the multi-element composition of 272 bottled Slovenian wines. To achieve geographical classification of the wines by their elemental composition, principal component analysis (PCA) and counter-propagation artificial neural networks (CPANN) were used. Of the 49 elements measured, 19 were used to build the final classification models. CPANN was used for the final predictions because of its superior results. The best model gave 82% correct predictions for an external set of the white wine samples. Taking into account the small size of the Slovenian wine-growing regions, we consider these classification results to be very good. For the red wines, most of which came from one region, even sub-region classification was possible with great precision. From the level maps of the CPANN model, some of the most important elements for classification were identified. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Building a model for disease classification integration in oncology, an approach based on the national cancer institute thesaurus.

    PubMed

    Jouhet, Vianney; Mougin, Fleur; Bréchat, Bérénice; Thiessard, Frantz

    2017-02-07

    Identifying incident cancer cases within a population remains essential for scientific research in oncology. Data produced within electronic health records can be useful for this purpose. Due to the multiplicity of providers, heterogeneous terminologies such as ICD-10 and ICD-O-3 are used for oncology diagnosis recording purposes. To enable disease identification based on these diagnoses, there is a need for integrating disease classifications in oncology. Our aim was to build a model integrating the concepts involved in two disease classifications, namely ICD-10 (diagnosis) and ICD-O-3 (topography and morphology), despite their structural heterogeneity. Based on the NCIt, a "derivative" model for linking diagnoses and topography-morphology combinations was defined and built. ICD-O-3 and ICD-10 codes were then used to instantiate classes of the "derivative" model. Links between terminologies obtained through the model were then compared to mappings provided by the Surveillance, Epidemiology, and End Results (SEER) program. The model integrated 42% of neoplasm ICD-10 codes (excluding metastasis), 98% of ICD-O-3 morphology codes (excluding metastasis) and 68% of ICD-O-3 topography codes. For every code instantiating at least one class in the "derivative" model, comparison with the SEER mappings revealed that all mappings were available in the model as a link between the corresponding codes. We have proposed a method to automatically build a model for integrating ICD-10 and ICD-O-3 based on the NCIt. The resulting "derivative" model is a machine-understandable resource that enables an integrated view of these heterogeneous terminologies. The NCIt structure and the available relationships can help to bridge disease classifications while taking into account their structural and granular heterogeneities. However, (i) inconsistencies exist within the NCIt, leading to misclassifications in the "derivative" model, and (ii) the "derivative" model integrates only a part of ICD-10 and ICD-O-3. The NCIt alone is not sufficient for integration purposes, and further work based on other termino-ontological resources is needed in order to enrich the model and avoid the identified inconsistencies.

  4. Information support of monitoring of technical condition of buildings in construction risk area

    NASA Astrophysics Data System (ADS)

    Skachkova, M. E.; Lepihina, O. Y.; Ignatova, V. V.

    2018-05-01

    The paper presents the results of research devoted to the development of a model of information support for monitoring the technical condition of buildings located in a construction risk area. As a result of a visual and instrumental survey, as well as an analysis of existing approaches and techniques, attributive and cartographic databases have been created. These databases allow monitoring of the defects and damages of buildings located within a 30-meter risk area around the object under construction. A classification of the structures and defects of the surveyed buildings is presented. The functional capabilities of the developed model and the field of its practical application are determined.

  5. Data Field Modeling and Spectral-Spatial Feature Fusion for Hyperspectral Data Classification.

    PubMed

    Liu, Da; Li, Jianxun

    2016-12-16

    Classification is a significant subject in hyperspectral remote sensing image processing. This study proposes a spectral-spatial feature fusion algorithm for the classification of hyperspectral images (HSI). Unlike existing spectral-spatial classification methods, the influences and interactions of the surroundings on each measured pixel are taken into consideration in this paper. Data field theory was employed as the mathematical realization of the field theory concept in physics, and both the spectral and spatial domains of the HSI were treated as data fields, so that the inherent dependency between interacting pixels was modeled. Using data field modeling, spatial and spectral features were transformed into a unified radiation form and further fused into a new feature by using a linear model. In contrast to current spectral-spatial classification methods, which usually simply stack spectral and spatial features together, the proposed method builds an inner connection between the spectral and spatial features and explores the hidden information that contributes to classification, so that new information is included for classification. The final classification result was obtained using a random forest (RF) classifier. The proposed method was tested on two well-known standard hyperspectral datasets, University of Pavia and Indian Pines. The experimental results demonstrate that the proposed method yields higher classification accuracies than the traditional approaches.
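
    The data-field potential functions are beyond a short example, but the overall shape of the pipeline (a spatial summary per pixel, a linear fusion with the spectral vector, and an RF classifier) can be sketched as follows; the local-mean spatial feature, the fusion weight and all data are stand-ins, not the authors' formulation.

    import numpy as np
    from scipy.ndimage import uniform_filter
    from sklearn.ensemble import RandomForestClassifier

    # Placeholder HSI cube (rows x cols x bands) and ground-truth labels.
    H, W, B = 100, 100, 50
    rng = np.random.default_rng(0)
    cube = rng.random((H, W, B))
    labels = rng.integers(0, 4, size=(H, W))

    # Spatial feature: a local average per band, so each pixel carries
    # information about its surroundings (a stand-in for the data field).
    spatial = uniform_filter(cube, size=(7, 7, 1))

    alpha = 0.6                                   # fusion weight (assumption)
    fused = alpha * cube + (1 - alpha) * spatial  # linear spectral-spatial fusion

    X, y = fused.reshape(-1, B), labels.ravel()
    rf = RandomForestClassifier(n_estimators=100).fit(X[:5000], y[:5000])
    print(rf.score(X[5000:6000], y[5000:6000]))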

  6. Decoding memory features from hippocampal spiking activities using sparse classification models.

    PubMed

    Dong Song; Hampson, Robert E; Robinson, Brian S; Marmarelis, Vasilis Z; Deadwyler, Sam A; Berger, Theodore W

    2016-08-01

    To understand how memory information is encoded in the hippocampus, we build classification models to decode memory features from hippocampal CA3 and CA1 spatio-temporal patterns of spikes recorded from epilepsy patients performing a memory-dependent delayed match-to-sample task. The classification model consists of a set of B-spline basis functions for extracting memory features from the spike patterns, and a sparse logistic regression classifier for generating binary categorical outputs of memory features. Results show that the classification models can extract a significant amount of memory information with respect to the types of memory tasks and the categories of sample images used in the task, despite the high level of variability in prediction accuracy due to the small sample size. These results support the hypothesis that memories are encoded in hippocampal activities and have important implications for the development of hippocampal memory prostheses.
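
    A compact sketch of the described model: a B-spline design matrix extracts smooth temporal features from binned spike counts, and an L1-penalized logistic regression provides the sparse binary classifier. The spike counts, labels, and the number and order of basis functions are placeholders.

    import numpy as np
    from scipy.interpolate import BSpline
    from sklearn.linear_model import LogisticRegression

    # Placeholder data: binned spike counts (trials x time bins) for one unit,
    # with a binary memory-feature label per trial.
    rng = np.random.default_rng(1)
    n_trials, n_bins, n_basis, k = 200, 100, 8, 3
    spikes = rng.poisson(0.3, size=(n_trials, n_bins)).astype(float)
    y = rng.integers(0, 2, n_trials)

    # Clamped knot vector and B-spline design matrix (n_bins x n_basis).
    inner = np.linspace(0.0, n_bins - 1.0, n_basis - k + 1)
    knots = np.r_[[inner[0]] * k, inner, [inner[-1]] * k]
    B = BSpline.design_matrix(np.arange(n_bins, dtype=float), knots, k).toarray()

    # Project each spike train onto the basis, then fit the sparse classifier.
    X = spikes @ B
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
    print(np.count_nonzero(clf.coef_), "active coefficients")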

  7. Classification and Sequential Pattern Analysis for Improving Managerial Efficiency and Providing Better Medical Service in Public Healthcare Centers

    PubMed Central

    Chung, Sukhoon; Rhee, Hyunsill; Suh, Yongmoo

    2010-01-01

    Objectives This study sought to find answers to the following questions: 1) Can we predict whether a patient will revisit a healthcare center? 2) Can we anticipate the diseases of patients who revisit the center? Methods For the first question, we applied 5 classification algorithms (decision tree, artificial neural network, logistic regression, Bayesian networks, and Naïve Bayes) and the stacking-bagging method for building classification models. To answer the second question, we performed sequential pattern analysis. Results We determined: 1) In general, the most influential variables affecting whether a patient of a public healthcare center will revisit it are personal burden, insurance bill, period of prescription, age, systolic pressure, name of disease, and postal code. 2) The best plain classification model depends on the dataset. 3) Based on average classification accuracy, the proposed stacking-bagging method outperformed all traditional classification models, and our sequential pattern analysis revealed 16 sequential patterns. Conclusions Classification models and sequential patterns can help public healthcare centers plan and implement healthcare service programs and businesses that are more appropriate to local residents, encouraging them to revisit public health centers. PMID:21818426

  8. Quantification of urban structure on building block level utilizing multisensoral remote sensing data

    NASA Astrophysics Data System (ADS)

    Wurm, Michael; Taubenböck, Hannes; Dech, Stefan

    2010-10-01

    Dynamics of urban environments are a challenge to sustainable development. Urban areas promise wealth, realization of individual dreams and power. Hence, many cities are characterized by population growth as well as physical development. Traditionally, the visual mapping and updating of urban structure information is a very laborious and cost-intensive task, especially for large urban areas. For this purpose, we developed a workflow for the extraction of the relevant information by means of object-based image classification. In this manner, multisensoral remote sensing data have been analyzed: very high resolution optical satellite imagery was combined with height information from a digital surface model to retrieve a detailed 3D city model with the relevant land-use/land-cover information. This information has been aggregated at the level of the building block to describe the urban structure by physical indicators. A comparison between the indicators derived by the classification and a reference classification of urban structure types has been accomplished to show their correlation. The indicators have then been used in a cluster analysis to group the individual blocks into similar clusters.

  9. A Hierarchical Object-oriented Urban Land Cover Classification Using WorldView-2 Imagery and Airborne LiDAR data

    NASA Astrophysics Data System (ADS)

    Wu, M. F.; Sun, Z. C.; Yang, B.; Yu, S. S.

    2016-11-01

    In order to reduce the “salt and pepper” effect in pixel-based urban land cover classification and to expand the application of the fusion of multi-source data in the field of urban remote sensing, WorldView-2 imagery and airborne Light Detection and Ranging (LiDAR) data were used to improve the classification of urban land cover. An approach of object-oriented hierarchical classification is proposed in our study. The processing of the proposed method consists of two hierarchies. (1) In the first hierarchy, the LiDAR Normalized Digital Surface Model (nDSM) image was segmented into objects, and NDVI, Coastal Blue and nDSM thresholds were set for extracting building objects. (2) In the second hierarchy, after removing the building objects, WorldView-2 fused imagery was obtained by Haze-ratio-based (HR) fusion and segmented, and an SVM classifier was applied to generate road/parking lot, vegetation and bare soil objects. (3) Trees and grasslands were split based on an nDSM threshold (2.4 m). The results showed that, compared with pixel-based and non-hierarchical object-oriented approaches, the proposed method provided a better performance of urban land cover classification, with the overall accuracy (OA) and overall kappa (OK) improving to 92.75% and 0.90. Furthermore, the proposed method reduced the “salt and pepper” effect of pixel-based classification, improved the extraction accuracy of buildings based on LiDAR nDSM image segmentation, and reduced the confusion between trees and grasslands by setting an nDSM threshold.
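
    The hierarchy reduces to a few vectorized rules. In the sketch below, the 2.4 m nDSM threshold is taken from the abstract, while the NDVI cut-off and all attribute values are illustrative assumptions; the SVM step for separating road/parking lot, vegetation and bare soil is left out.

    import numpy as np

    # Placeholder per-object attributes from segmentation (values illustrative).
    ndsm = np.array([12.0, 0.3, 5.1, 1.0, 0.5])  # object heights in metres
    ndvi = np.array([0.05, 0.6, 0.7, 0.4, 0.1])

    labels = np.full(ndsm.size, "unassigned", dtype=object)
    # Hierarchy 1: tall, non-vegetated objects -> buildings.
    is_building = (ndsm > 2.4) & (ndvi < 0.2)    # NDVI cut-off is an assumption
    labels[is_building] = "building"
    # Hierarchy 2: vegetation split by the 2.4 m nDSM threshold.
    veg = ~is_building & (ndvi >= 0.2)
    labels[veg & (ndsm > 2.4)] = "tree"
    labels[veg & (ndsm <= 2.4)] = "grassland"
    # Everything else would go to the SVM stage (road/parking lot, bare soil).
    labels[labels == "unassigned"] = "to SVM stage"
    print(labels)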

  10. An estimation framework for building information modeling (BIM)-based demolition waste by type.

    PubMed

    Kim, Young-Chan; Hong, Won-Hwa; Park, Jae-Woo; Cha, Gi-Wook

    2017-12-01

    Most existing studies on demolition waste (DW) quantification do not follow an official standard for estimating the amount and type of DW; the existing literature is therefore limited in its ability to estimate DW with a consistent classification system. Building information modeling (BIM) is a technology that can generate and manage all the information required during the life cycle of a building, from design to demolition. Nevertheless, there has been a lack of research regarding its application to the demolition stage of a building. For an effective waste management plan, the estimation of the type and volume of DW should begin at the building design stage; however, the lack of tools hinders such early estimation. This study proposes a BIM-based framework that estimates DW in the early design stages, to achieve effective and streamlined planning, processing, and management. Specifically, the construction materials in the Korean construction classification system were matched with those in the BIM library. Based on this matching, estimates of DW by type were calculated by applying weight/unit-volume factors and rates of DW volume change. To verify the framework, its operation was demonstrated by means of actual BIM modeling and by comparing its results with those available in the literature. This study is expected to contribute not only to the estimation of DW at the building level, but also to the automated estimation of DW at the district level.
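
    The estimation step reduces to multiplying each matched material volume by a weight-per-unit-volume factor and a rate of volume change. A toy sketch with entirely hypothetical factors and take-off quantities:

    # Hypothetical factors: weight per unit volume (t/m3) and volume-change
    # rate for each material class in an assumed classification system.
    FACTORS = {
        "concrete": {"t_per_m3": 2.4, "volume_change": 1.4},
        "brick":    {"t_per_m3": 1.9, "volume_change": 1.3},
        "timber":   {"t_per_m3": 0.6, "volume_change": 1.2},
    }

    # Material quantities taken off a BIM model (m3), keyed by matched class.
    bim_takeoff = {"concrete": 310.0, "brick": 95.0, "timber": 22.0}

    for material, volume in bim_takeoff.items():
        f = FACTORS[material]
        mass_t = volume * f["t_per_m3"]         # demolition waste mass
        loose_m3 = volume * f["volume_change"]  # bulked volume after demolition
        print(f"{material}: {mass_t:.1f} t, {loose_m3:.1f} m3 loose")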

  11. Clustering-based classification of road traffic accidents using hierarchical clustering and artificial neural networks.

    PubMed

    Taamneh, Madhar; Taamneh, Salah; Alkheder, Sharaf

    2017-09-01

    Artificial neural networks (ANNs) have been widely used in predicting the severity of road traffic crashes. All available information about previously occurring accidents is typically used for building a single prediction model (i.e., classifier). Too little attention has been paid to the differences between these accidents, which in most cases leads to less accurate predictors. Hierarchical clustering is a well-known clustering method that seeks to group data by creating a hierarchy of clusters. Using hierarchical clustering and ANNs, a clustering-based classification approach for predicting the injury severity of road traffic accidents was proposed. About 6000 road accidents that occurred over a six-year period from 2008 to 2013 in Abu Dhabi were used throughout this study. In order to reduce the amount of variation in the data, hierarchical clustering was applied to the data set to organize it into six different forms, each with a different number of clusters (from 1 to 6). Two ANN models were subsequently built for each cluster of accidents in each generated form. The first model was built and validated using all accidents (training set), whereas only 66% of the accidents were used to build the second model, with the remaining 34% used to test it (percentage split). Finally, the weighted average accuracy was computed for each type of model in each form of the data. The results show that when testing the models using the training set, clustering prior to classification achieves 11%-16% higher accuracy than classification without clustering, while under the percentage split it achieves 2%-5% higher accuracy. The results also suggest that partitioning the accidents into six clusters achieves the best accuracy when both types of models are taken into account.
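
    A simplified rendering of the clustering-then-classification idea: agglomerative clustering partitions the accidents, one MLP is trained per cluster with the 66/34 percentage split, and a weighted average accuracy is computed. The data are synthetic, and assigning new accidents to clusters (which a deployed predictor would need) is not shown.

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split

    # Placeholder accident records: features X and injury-severity labels y.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(600, 10))
    y = rng.integers(0, 3, 600)

    # Step 1: organise the accidents into six clusters, as in the study.
    clusters = AgglomerativeClustering(n_clusters=6).fit_predict(X)

    # Step 2: one ANN per cluster, evaluated with a 66/34 percentage split.
    accs, sizes = [], []
    for c in range(6):
        Xc, yc = X[clusters == c], y[clusters == c]
        Xtr, Xte, ytr, yte = train_test_split(Xc, yc, test_size=0.34,
                                              random_state=0)
        net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500).fit(Xtr, ytr)
        accs.append(net.score(Xte, yte))
        sizes.append(len(yc))

    print("weighted average accuracy:", np.average(accs, weights=sizes))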

  12. The impact of modeling the dependencies among patient findings on classification accuracy and calibration.

    PubMed Central

    Monti, S.; Cooper, G. F.

    1998-01-01

    We present a new Bayesian classifier for computer-aided diagnosis. The new classifier builds upon the naive-Bayes classifier, and models the dependencies among patient findings in an attempt to improve its performance, both in terms of classification accuracy and in terms of calibration of the estimated probabilities. This work finds motivation in the argument that highly calibrated probabilities are necessary for the clinician to be able to rely on the model's recommendations. Experimental results are presented, supporting the conclusion that modeling the dependencies among findings improves calibration. PMID:9929288

  13. Structural Validation of Nursing Terminologies

    PubMed Central

    Hardiker, Nicholas R.; Rector, Alan L.

    2001-01-01

    Objective: The purpose of the study is twofold: 1) to explore the applicability of combinatorial terminologies as the basis for building enumerated classifications, and 2) to investigate the usefulness of formal terminological systems for performing such classification and for assisting in the refinement of both combinatorial terminologies and enumerated classifications. Design: A formal model of the beta version of the International Classification for Nursing Practice (ICNP) was constructed in the compositional terminological language GRAIL (GALEN Representation and Integration Language). Terms drawn from the North American Nursing Diagnosis Association Taxonomy I (NANDA taxonomy) were mapped into the model and classified automatically using GALEN technology. Measurements: The resulting generated hierarchy was compared with the NANDA taxonomy to assess coverage and accuracy of classification. Results: In terms of coverage, in this study ICNP was able to capture 77 percent of NANDA terms using concepts drawn from five of its eight axes. Three axes—Body Site, Topology, and Frequency—were not needed. In terms of accuracy, where hierarchic relationships existed in the generated hierarchy or the NANDA taxonomy, or both, 6 were identical, 19 existed in the generated hierarchy alone (2 of these were considered suitable for incorporation into the NANDA taxonomy and 17 were considered inaccurate), and 23 appeared in the NANDA taxonomy alone (8 of these were considered suitable for incorporation into ICNP, 9 were considered inaccurate, and 6 reflected different, equally valid perspectives). Sixty terms appeared at the top level, with no indenting, in both the generated hierarchy and the NANDA taxonomy. Conclusions: With appropriate refinement, combinatorial terminologies such as ICNP have the potential to provide a useful foundation for representing enumerated classifications such as NANDA. Technologies such as GALEN make possible the process of building automatically enumerated classifications while providing a useful means of validating and refining both combinatorial terminologies and enumerated classifications. PMID:11320066

  14. CAMUR: Knowledge extraction from RNA-seq cancer data through equivalent classification rules.

    PubMed

    Cestarelli, Valerio; Fiscon, Giulia; Felici, Giovanni; Bertolazzi, Paola; Weitschek, Emanuel

    2016-03-01

    Nowadays, knowledge extraction methods for Next Generation Sequencing data are in high demand. In this work, we focus on RNA-seq gene expression analysis, and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State-of-the-art algorithms compute a single classification model that contains few features (genes). Our goal, on the contrary, is to elicit a greater amount of knowledge by computing many classification models and thereby identify most of the genes related to the predicted class. We propose CAMUR, a new method that extracts multiple and equivalent classification models. CAMUR iteratively computes a rule-based classification model, calculates the power set of the genes present in the rules, iteratively eliminates those combinations from the data set, and performs the classification procedure again until a stopping criterion is verified. CAMUR includes an ad hoc knowledge repository (database) and a querying tool. We analyze three different types of RNA-seq data sets (breast, head and neck, and stomach cancer) from The Cancer Genome Atlas (TCGA), and we validate CAMUR and its models also on non-TCGA data. Our experimental results show the efficacy of CAMUR: we obtain several reliable equivalent classification models, from which the most frequent genes, their relationships, and their relation to a particular cancer are deduced. Availability: dmb.iasi.cnr.it/camur.php. Contact: emanuel@iasi.cnr.it. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
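
    A rough sketch of the iterative scheme, with a shallow decision tree standing in for the rule-based learner and removal of the used genes standing in for the full power-set elimination; the stopping threshold and all data are assumptions, and this is not the CAMUR implementation.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    # Placeholder expression matrix (samples x genes) and case/control labels.
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(100, 40)), rng.integers(0, 2, 100)
    active = list(range(X.shape[1]))  # genes still available
    models, MIN_ACC = [], 0.55        # stopping criterion (assumption)

    while active:
        tree = DecisionTreeClassifier(max_depth=3, random_state=0)
        acc = cross_val_score(tree, X[:, active], y, cv=5).mean()
        if acc < MIN_ACC:
            break                     # no further reliable model
        tree.fit(X[:, active], y)
        used = sorted({active[i] for i in tree.tree_.feature if i >= 0})
        models.append((used, acc))
        # Drop the used genes so the next pass finds an alternative,
        # equivalent model built from different genes.
        active = [g for g in active if g not in used]

    print(len(models), "equivalent models found")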

  15. Estimating probabilities of infestation and extent of damage by the roundheaded pine beetle in ponderosa pine in the Sacramento Mountains, New Mexico

    Treesearch

    Jose Negron

    1997-01-01

    Classification trees and linear regression analysis were used to build models to predict probabilities of infestation and amount of tree mortality in terms of basal area resulting from roundheaded pine beetle, Dendroctonus adjunctus Blandford, activity in ponderosa pine, Pinus ponderosa Laws., in the Sacramento Mountains, New Mexico. Classification trees were built for...

  16. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    NASA Astrophysics Data System (ADS)

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  17. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  18. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    PubMed Central

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-01-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520
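
    Setting the MapReduce parallelization aside, the Adaboost assembly of BP networks described in the three records above can be sketched as follows. Because scikit-learn's MLPClassifier does not accept sample weights, each weak network is trained on a weight-proportional bootstrap resample, a common workaround rather than the authors' implementation; all data are placeholders.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Placeholder image features and binary labels in {-1, +1}.
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(500, 32)), rng.choice([-1, 1], 500)

    n_learners, learners, alphas = 15, [], []
    w = np.full(len(y), 1.0 / len(y))  # Adaboost sample weights

    for m in range(n_learners):
        # Weight-proportional resampling emulates weighted training.
        idx = rng.choice(len(y), size=len(y), p=w)
        net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300,
                            random_state=m).fit(X[idx], y[idx])
        pred = net.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # classic Adaboost weight
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        learners.append(net)
        alphas.append(alpha)

    def strong_classify(Xnew):
        """Weighted vote of the 15 weak BP networks."""
        votes = sum(a * net.predict(Xnew) for a, net in zip(alphas, learners))
        return np.sign(votes)

    print("training accuracy:", (strong_classify(X) == y).mean())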

  19. Spectral-spatial classification of hyperspectral imagery with cooperative game

    NASA Astrophysics Data System (ADS)

    Zhao, Ji; Zhong, Yanfei; Jia, Tianyi; Wang, Xinyu; Xu, Yao; Shu, Hong; Zhang, Liangpei

    2018-01-01

    Spectral-spatial classification is known to be an effective way to improve classification performance by integrating spectral information and spatial cues for hyperspectral imagery. In this paper, a game-theoretic spectral-spatial classification algorithm (GTA) using a conditional random field (CRF) model is presented, in which CRF is used to model the image considering the spatial contextual information, and a cooperative game is designed to obtain the labels. The algorithm establishes a one-to-one correspondence between image classification and game theory. The pixels of the image are considered as the players, and the labels are considered as the strategies in a game. Similar to the idea of soft classification, the uncertainty is considered to build the expected energy model in the first step. The local expected energy can be quickly calculated, based on a mixed strategy for the pixels, to establish the foundation for a cooperative game. Coalitions can then be formed by the designed merge rule based on the local expected energy, so that a majority game can be performed to make a coalition decision to obtain the label of each pixel. The experimental results on three hyperspectral data sets demonstrate the effectiveness of the proposed classification algorithm.

  20. Land Covers Classification Based on Random Forest Method Using Features from Full-Waveform LIDAR Data

    NASA Astrophysics Data System (ADS)

    Ma, L.; Zhou, M.; Li, C.

    2017-09-01

    In this study, a Random Forest (RF) based land cover classification method is presented to predict the types of land cover in the Miyun area. The full waveforms returned by a LiteMapper 5600 airborne LiDAR system were processed, including waveform filtering, waveform decomposition and feature extraction. The commonly used features of distance, intensity, Full Width at Half Maximum (FWHM), skewness and kurtosis were extracted. These waveform features were used as attributes of the training data for generating the RF prediction model. The RF prediction model was applied to predict the types of land cover in the Miyun area as trees, buildings, farmland and ground. The classification results for these four types of land cover were evaluated against ground truth information acquired from CCD image data of the same region. The RF classification results were compared with those of an SVM method and showed better performance. The RF classification accuracy reached 89.73% and the classification kappa was 0.8631.
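
    A sketch of the feature-extraction and RF-training steps under the stated feature list (distance, intensity, FWHM, skewness, kurtosis). The Gaussian-echo FWHM estimate and all waveforms and labels are placeholders.

    import numpy as np
    from scipy.stats import skew, kurtosis
    from sklearn.ensemble import RandomForestClassifier

    def waveform_features(samples, t0=0.0, dt=1.0):
        """Features of one decomposed return: distance (peak time), intensity,
        FWHM (assuming a Gaussian echo), skewness and kurtosis."""
        amp = samples.max()
        peak_t = t0 + dt * samples.argmax()
        idx = np.arange(len(samples))
        mean = (idx * samples).sum() / samples.sum()
        sigma = dt * np.sqrt(((idx - mean) ** 2 * samples).sum() / samples.sum())
        fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma  # Gaussian FWHM
        return [peak_t, amp, fwhm, skew(samples), kurtosis(samples)]

    # Placeholder training set: decomposed waveforms with land-cover labels.
    rng = np.random.default_rng(0)
    waves = rng.gamma(2.0, 1.0, size=(300, 64))
    X = np.array([waveform_features(wv) for wv in waves])
    y = rng.choice(["tree", "building", "farmland", "ground"], 300)

    rf = RandomForestClassifier(n_estimators=200).fit(X, y)
    print(rf.predict(X[:3]))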

  1. Impact of input data uncertainty on environmental exposure assessment models: A case study for electromagnetic field modelling from mobile phone base stations.

    PubMed

    Beekhuizen, Johan; Heuvelink, Gerard B M; Huss, Anke; Bürgi, Alfred; Kromhout, Hans; Vermeulen, Roel

    2014-11-01

    With the increased availability of spatial data and computing power, spatial prediction approaches have become a standard tool for exposure assessment in environmental epidemiology. However, such models are largely dependent on accurate input data. Uncertainties in the input data can therefore have a large effect on model predictions, but are rarely quantified. With Monte Carlo simulation we assessed the effect of input uncertainty on the prediction of radio-frequency electromagnetic fields (RF-EMF) from mobile phone base stations at 252 receptor sites in Amsterdam, The Netherlands. The impact on ranking and classification was determined by computing the Spearman correlations and weighted Cohen's kappas (based on tertiles of the RF-EMF exposure distribution) between modelled values and RF-EMF measurements performed at the receptor sites. The uncertainty in modelled RF-EMF levels was large, with a median coefficient of variation of 1.5. Uncertainty in receptor site height, building damping and building height contributed most to model output uncertainty. For exposure ranking and classification, the heights of buildings and receptor sites were the most important sources of uncertainty, followed by building damping and antenna and site location. Uncertainty in antenna power, tilt, height and direction had a smaller impact on model performance. We quantified the effect of input data uncertainty on the prediction accuracy of an RF-EMF environmental exposure model, thereby identifying the most important sources of uncertainty and estimating the total uncertainty stemming from potential errors in the input data. This approach can be used to optimize the model and better interpret model output. Copyright © 2014 Elsevier Inc. All rights reserved.
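
    The Monte Carlo logic condenses to a few lines: draw each uncertain input from an assumed error distribution, rerun the propagation model, and summarize the spread and the rank agreement with measurements. The model and distributions below are toy stand-ins, not the RF-EMF model used in the study.

    import numpy as np
    from scipy.stats import spearmanr

    # Toy propagation model: field falls with distance and is attenuated
    # by building damping (all quantities in arbitrary units).
    def model(power, distance, damping):
        return power / distance ** 2 * np.exp(-damping)

    rng = np.random.default_rng(0)
    n_sites, n_runs = 252, 1000
    measured = rng.lognormal(0.0, 0.5, n_sites)  # placeholder measurements
    dist_true = rng.uniform(50, 500, n_sites)

    preds = np.empty((n_runs, n_sites))
    for r in range(n_runs):
        # Each uncertain input is perturbed by an assumed error distribution.
        power = rng.normal(100, 10, n_sites)
        dist = dist_true * rng.normal(1.0, 0.05, n_sites)
        damping = rng.normal(1.0, 0.3, n_sites)
        preds[r] = model(power, dist, damping)

    # Spread across runs quantifies output uncertainty; rank correlation with
    # the measurements shows the effect on exposure ranking.
    cv = preds.std(axis=0) / preds.mean(axis=0)
    print("median CV:", np.median(cv))
    print("Spearman:", spearmanr(preds.mean(axis=0), measured)[0])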

  2. Object-based analysis of multispectral airborne laser scanner data for land cover classification and map updating

    NASA Astrophysics Data System (ADS)

    Matikainen, Leena; Karila, Kirsi; Hyyppä, Juha; Litkey, Paula; Puttonen, Eetu; Ahokas, Eero

    2017-06-01

    During the last 20 years, airborne laser scanning (ALS), often combined with passive multispectral information from aerial images, has shown its high feasibility for automated mapping processes. The main benefits have been achieved in the mapping of elevated objects such as buildings and trees. Recently, the first multispectral airborne laser scanners have been launched, and active multispectral information is for the first time available for 3D ALS point clouds from a single sensor. This article discusses the potential of this new technology in map updating, especially in automated object-based land cover classification and change detection in a suburban area. For our study, Optech Titan multispectral ALS data over a suburban area in Finland were acquired. Results from an object-based random forests analysis suggest that the multispectral ALS data are very useful for land cover classification, considering both elevated classes and ground-level classes. The overall accuracy of the land cover classification results with six classes was 96% compared with validation points. The classes under study included building, tree, asphalt, gravel, rocky area and low vegetation. Compared to classification of single-channel data, the main improvements were achieved for ground-level classes. According to feature importance analyses, multispectral intensity features based on several channels were more useful than those based on one channel. Automatic change detection for buildings and roads was also demonstrated by utilising the new multispectral ALS data in combination with old map vectors. In change detection of buildings, an old digital surface model (DSM) based on single-channel ALS data was also used. Overall, our analyses suggest that the new data have high potential for further increasing the automation level in mapping. Unlike passive aerial imaging commonly used in mapping, the multispectral ALS technology is independent of external illumination conditions, and there are no shadows on intensity images produced from the data. These are significant advantages in developing automated classification and change detection procedures.

  3. Spectral-spatial classification of hyperspectral image using three-dimensional convolution network

    NASA Astrophysics Data System (ADS)

    Liu, Bing; Yu, Xuchu; Zhang, Pengqiang; Tan, Xiong; Wang, Ruirui; Zhi, Lu

    2018-01-01

    Recently, hyperspectral image (HSI) classification has become a focus of research. However, the complex structure of an HSI makes feature extraction difficult to achieve. Most current methods build classifiers based on complex handcrafted features computed from the raw inputs. The design of an improved 3-D convolutional neural network (3D-CNN) model for HSI classification is described. This model extracts features from both the spectral and spatial dimensions through the application of 3-D convolutions, thereby capturing the important discrimination information encoded in multiple adjacent bands. The designed model views the HSI cube data altogether without relying on any pre- or postprocessing. In addition, the model is trained in an end-to-end fashion without any handcrafted features. The designed model was applied to three widely used HSI datasets. The experimental results demonstrate that the 3D-CNN-based method outperforms conventional methods even with limited labeled training samples.
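
    A minimal 3D-CNN of the kind described, sketched with PyTorch; the layer sizes, band count, patch size and class count are assumptions, not the paper's architecture. The 3-D kernels convolve jointly over the spectral dimension and both spatial dimensions.

    import torch
    import torch.nn as nn

    class HSI3DCNN(nn.Module):
        """Tiny 3D-CNN: input is a 1-channel cube (bands x height x width)."""
        def __init__(self, n_classes=9):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(1, 8, kernel_size=(7, 3, 3)), nn.ReLU(),
                nn.Conv3d(8, 16, kernel_size=(5, 3, 3)), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),  # pool to a single feature vector
            )
            self.classifier = nn.Linear(16, n_classes)

        def forward(self, x):  # x: (N, 1, bands, H, W)
            return self.classifier(self.features(x).flatten(1))

    # Four 9x9 spatial patches with 103 bands (sizes are assumptions).
    patch = torch.randn(4, 1, 103, 9, 9)
    print(HSI3DCNN()(patch).shape)  # -> torch.Size([4, 9])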

  4. Analysis of Traffic Signals on a Software-Defined Network for Detection and Classification of a Man-in-the-Middle Attack

    DTIC Science & Technology

    2017-09-01

    ... unique characteristics of reported anomalies in the collected traffic signals to build a classification framework. Other cyber events, such as a... [2]. The applications build flow rules using network topology information provided by the control plane [1]. Since the control plane is able to...

  5. Interactive Classification of Construction Materials: Feedback Driven Framework for Annotation and Analysis of 3d Point Clouds

    NASA Astrophysics Data System (ADS)

    Hess, M. R.; Petrovic, V.; Kuester, F.

    2017-08-01

    Digital documentation of cultural heritage structures is increasingly common through the application of different imaging techniques. Many works have focused on the application of laser scanning and photogrammetry techniques for the acquisition of three-dimensional (3D) geometry detailing cultural heritage sites and structures. With an abundance of these 3D data assets, there must be a digital environment where the data can be visualized and analyzed. Presented here is a feedback-driven visualization framework that seamlessly enables interactive exploration and manipulation of massive point cloud data. The focus of this work is on the classification of different building materials with the goal of building more accurate as-built information models of historical structures. User-defined functions have been tested within the interactive point cloud visualization framework to evaluate automated and semi-automated classification of 3D point data. These functions include decisions based on observed color, laser intensity, normal vector or local surface geometry. Multiple case studies are presented here to demonstrate the flexibility and utility of the presented point cloud visualization framework in achieving classification objectives.

  6. Metabolomics for organic food authentication: Results from a long-term field study in carrots.

    PubMed

    Cubero-Leon, Elena; De Rudder, Olivier; Maquet, Alain

    2018-01-15

    Increasing demand for organic products and their premium prices make them an attractive target for fraudulent malpractice. In this study, a large-scale comparative metabolomics approach was applied to investigate the effect of the agronomic production system on the metabolite composition of carrots and to build statistical models for prediction purposes. Orthogonal projections to latent structures-discriminant analysis (OPLS-DA) was applied successfully to predict the origin of the agricultural system of the harvested carrots on the basis of features determined by liquid chromatography-mass spectrometry. When the training set used to build the OPLS-DA models contained samples representative of each harvest year, the models were able to classify unknown samples correctly (100% correct classification). If a harvest year was left out of the training sets and used for predictions, the correct classification rates achieved ranged from 76% to 100%. The results therefore highlight the potential of metabolomic fingerprinting for organic food authentication purposes. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  7. A web-based land cover classification system based on ontology model of different classification systems

    NASA Astrophysics Data System (ADS)

    Lin, Y.; Chen, X.

    2016-12-01

    Land cover classification systems used in remote sensing image data have been developed to meet the need to depict land covers in scientific investigations and policy decisions. However, accuracy assessments of a spate of data sets demonstrate that, compared with the real physiognomy, each thematic map based on a specific land cover classification system contains some unavoidable flaws and unintended deviations. This work proposes a web-based land cover classification system, an integrated prototype, based on an ontology model of various classification systems, each of which is assigned the same weight in the final determination of land cover type. Ontology, a formal explication of specific concepts and relations, is employed in this prototype to build up the connections among the different systems and resolve naming conflicts. The process is initialized by measuring the semantic similarity between the terminologies in the systems and the search key to produce a set of satisfying classifications, and carries on by searching the predefined relations among the concepts of all classification systems to generate classification maps with the user-specified land cover type highlighted, based on probabilities calculated from the votes of data sets with different classification systems adopted. The present system is verified and validated by comparing its classification results with those of the most common systems. Owing to the full consideration and meaningful expression of each classification system using ontology, and the convenience of the web, this system, as a preliminary model, proposes a flexible and extensible architecture for classification system integration and data fusion, thereby providing a strong foundation for future work.

  8. The Iterated Classification Game: A New Model of the Cultural Transmission of Language

    PubMed Central

    Swarup, Samarth; Gasser, Les

    2010-01-01

    The Iterated Classification Game (ICG) combines the Classification Game with the Iterated Learning Model (ILM) to create a more realistic model of the cultural transmission of language through generations. It includes both learning from parents and learning from peers. Further, it eliminates some of the chief criticisms of the ILM: that it does not study grounded languages, that it does not include peer learning, and that it builds in a bias for compositional languages. We show that, over the span of a few generations, a stable linguistic system emerges that can be acquired very quickly by each generation, is compositional, and helps the agents to solve the classification problem with which they are faced. The ICG also leads to a different interpretation of the language acquisition process. It suggests that the role of parents is to initialize the linguistic system of the child in such a way that subsequent interaction with peers results in rapid convergence to the correct language. PMID:20190877

  9. One input-class and two input-class classifications for differentiating olive oil from other edible vegetable oils by use of the normal-phase liquid chromatography fingerprint of the methyl-transesterified fraction.

    PubMed

    Jiménez-Carvelo, Ana M; Pérez-Castaño, Estefanía; González-Casado, Antonio; Cuadros-Rodríguez, Luis

    2017-04-15

    A new method for the differentiation of olive oil (independently of the quality category) from other vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, palm, linseed, sesame and soybean) has been developed. The analytical procedure for chromatographic fingerprinting of the methyl-transesterified fraction of each vegetable oil, using normal-phase liquid chromatography, is described, and the chemometric strategies applied are discussed. Several chemometric methods, such as k-nearest neighbours (kNN), partial least squares-discriminant analysis (PLS-DA), support vector machine classification analysis (SVM-C), and soft independent modelling of class analogies (SIMCA), were applied to build classification models. The performance of the classification was evaluated and ranked using several classification quality metrics. The discriminant analysis based on the use of one input-class (plus a dummy class) was applied for the first time in this study. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Application of partial least squares near-infrared spectral classification in diabetic identification

    NASA Astrophysics Data System (ADS)

    Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang

    2014-11-01

    In order to identify diabetic patients from tongue near-infrared (NIR) spectra, a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least squares (PLS) method. Thirty-nine samples of tongue-tip NIR spectra were harvested from healthy people and diabetic patients, respectively. After pretreatment of the reflectivity, the spectral data were set as the independent variable matrix and the classification information as the dependent variable matrix. The samples were divided into two groups, 53 samples as the calibration set and 25 as the prediction set, and PLS was used to build the classification model. The model constructed from the 53 calibration samples has a correlation of 0.9614 and a root mean square error of cross-validation (RMSECV) of 0.1387. The predictions for the 25 samples have a correlation of 0.9146 and an RMSECV of 0.2122. The experimental results show that the PLS method can achieve good classification of the features of healthy people and diabetic patients.
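
    A PLS-DA sketch matching the described setup, with 53 calibration and 25 prediction samples and class membership regressed as 0/1; the spectra, labels and number of latent variables are placeholders.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    # Placeholder NIR reflectance spectra (samples x wavelengths), with class
    # coding 0 = healthy and 1 = diabetic.
    rng = np.random.default_rng(0)
    X_cal, y_cal = rng.normal(size=(53, 400)), rng.integers(0, 2, 53).astype(float)
    X_val, y_val = rng.normal(size=(25, 400)), rng.integers(0, 2, 25).astype(float)

    pls = PLSRegression(n_components=8).fit(X_cal, y_cal)  # 8 LVs (assumption)
    y_pred = pls.predict(X_val).ravel()

    rmsep = np.sqrt(np.mean((y_pred - y_val) ** 2))  # prediction error
    classes = (y_pred > 0.5).astype(int)             # threshold at 0.5
    print(rmsep, (classes == y_val).mean())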

  11. The road map towards providing a robust Raman spectroscopy-based cancer diagnostic platform and integration into clinic

    NASA Astrophysics Data System (ADS)

    Lau, Katherine; Isabelle, Martin; Lloyd, Gavin R.; Old, Oliver; Shepherd, Neil; Bell, Ian M.; Dorney, Jennifer; Lewis, Aaran; Gaifulina, Riana; Rodriguez-Justo, Manuel; Kendall, Catherine; Stone, Nicolas; Thomas, Geraint; Reece, David

    2016-03-01

    Despite its demonstrated potential as an accurate cancer diagnostic tool, Raman spectroscopy (RS) has yet to be adopted by the clinic for histopathology reviews. The Stratified Medicine through Advanced Raman Technologies (SMART) consortium has begun to address some of the hurdles in its adoption for cancer diagnosis. These hurdles include awareness and acceptance of the technology, practicality of integration into the histopathology workflow, data reproducibility and availability of transferable models. We have formed a consortium to jointly develop optimised protocols for tissue sample preparation, data collection and analysis. These protocols will be supported by the provision of suitable hardware and software tools to allow statistically sound classification models to be built and transferred for use on different systems. In addition, we are building a validated gastrointestinal (GI) cancers model, which can be trialled as part of the histopathology workflow at hospitals, and a classification tool. At the end of the project, we aim to deliver a robust Raman-based diagnostic platform to enable clinical researchers to stage cancer, define tumour margins, build cancer diagnostic models and discover novel disease biomarkers.

  12. Interactive classification and content-based retrieval of tissue images

    NASA Astrophysics Data System (ADS)

    Aksoy, Selim; Marchisio, Giovanni B.; Tusk, Carsten; Koperski, Krzysztof

    2002-11-01

    We describe a system for interactive classification and retrieval of microscopic tissue images. Our system models tissues in pixel, region and image levels. Pixel level features are generated using unsupervised clustering of color and texture values. Region level features include shape information and statistics of pixel level feature values. Image level features include statistics and spatial relationships of regions. To reduce the gap between low-level features and high-level expert knowledge, we define the concept of prototype regions. The system learns the prototype regions in an image collection using model-based clustering and density estimation. Different tissue types are modeled using spatial relationships of these regions. Spatial relationships are represented by fuzzy membership functions. The system automatically selects significant relationships from training data and builds models which can also be updated using user relevance feedback. A Bayesian framework is used to classify tissues based on these models. Preliminary experiments show that the spatial relationship models we developed provide a flexible and powerful framework for classification and retrieval of tissue images.

  13. 29 CFR 779.355 - Classification of lumber and building materials sales.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 29 Labor 3 2011-07-01 2011-07-01 false Classification of lumber and building materials sales. 779... building materials sales. (a) General. In determining, for purposes of the section 13(a)(2) and (4) exemptions, whether 75 percent of the annual dollar volume of the establishment's sales which are not for...

  14. 29 CFR 779.355 - Classification of lumber and building materials sales.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 29 Labor 3 2010-07-01 2010-07-01 false Classification of lumber and building materials sales. 779... building materials sales. (a) General. In determining, for purposes of the section 13(a)(2) and (4) exemptions, whether 75 percent of the annual dollar volume of the establishment's sales which are not for...

  15. Building and Solving Odd-One-Out Classification Problems: A Systematic Approach

    ERIC Educational Resources Information Center

    Ruiz, Philippe E.

    2011-01-01

    Classification problems ("find the odd-one-out") are frequently used as tests of inductive reasoning to evaluate human or animal intelligence. This paper introduces a systematic method for building the set of all possible classification problems, followed by a simple algorithm for solving the problems of the R-ASCM, a psychometric test derived…

  16. Land use and land cover classification for rural residential areas in China using soft-probability cascading of multifeatures

    NASA Astrophysics Data System (ADS)

    Zhang, Bin; Liu, Yueyan; Zhang, Zuyu; Shen, Yonglin

    2017-10-01

    A multifeature soft-probability cascading scheme is proposed to solve the problem of land use and land cover (LULC) classification from high-spatial-resolution images for mapping rural residential areas in China. The proposed method is used to build midlevel LULC features. Local features are frequently used as the low-level feature descriptors in midlevel feature learning methods, while spectral and textural features, which are very effective low-level features, are neglected. Moreover, the dictionary in sparse coding is acquired in an unsupervised way, which reduces the discriminative power of the midlevel features. Thus, we propose to learn supervised features based on sparse coding, a support vector machine (SVM) classifier, and a conditional random field (CRF) model in order to utilize the different effective low-level features and improve the discriminability of the midlevel feature descriptors. First, three kinds of typical low-level features, namely dense scale-invariant feature transform, gray-level co-occurrence matrix, and spectral features, are extracted separately. Second, combined with sparse coding and the SVM classifier, the probabilities of the different LULC classes are inferred to build supervised feature descriptors. Finally, the CRF model, which consists of a unary potential and a pairwise potential, is employed to construct the LULC classification map. Experimental results show that the proposed classification scheme can achieve impressive performance, with a total accuracy of about 87%.

  17. Culto: AN Ontology-Based Annotation Tool for Data Curation in Cultural Heritage

    NASA Astrophysics Data System (ADS)

    Garozzo, R.; Murabito, F.; Santagati, C.; Pino, C.; Spampinato, C.

    2017-08-01

    This paper proposes CulTO, a software tool relying on a computational ontology for Cultural Heritage domain modelling, with a specific focus on religious historical buildings, to support cultural heritage experts in their investigations. It is specifically designed to support the annotation, automatic indexing, classification and curation of photographic data and text documents of historical buildings. CulTO also serves as a useful tool for Historical Building Information Modeling (H-BIM) by enabling semantic 3D data modeling and further enrichment with non-geometrical information of historical buildings, through the inclusion of new concepts about historical documents, images, decay or deformation evidence, as well as decorative elements, into BIM platforms. CulTO is the result of a joint research effort between the Laboratory of Surveying and Architectural Photogrammetry "Luigi Andreozzi" and the PeRCeiVe Lab (Pattern Recognition and Computer Vision Lab) of the University of Catania.

  18. Reduction in training time of a deep learning model in detection of lesions in CT

    NASA Astrophysics Data System (ADS)

    Makkinejad, Nazanin; Tajbakhsh, Nima; Zarshenas, Amin; Khokhar, Ashfaq; Suzuki, Kenji

    2018-02-01

    Deep learning (DL) emerged as a powerful tool for object detection and classification in medical images. Building a well-performing DL model, however, requires a huge number of images for training, and it takes days to train a DL model even on a cutting edge high-performance computing platform. This study is aimed at developing a method for selecting a "small" number of representative samples from a large collection of training samples to train a DL model for the could be used to detect polyps in CT colonography (CTC), without compromising the classification performance. Our proposed method for representative sample selection (RSS) consists of a K-means clustering algorithm. For the performance evaluation, we applied the proposed method to select samples for the training of a massive training artificial neural network based DL model, to be used for the classification of polyps and non-polyps in CTC. Our results show that the proposed method reduce the training time by a factor of 15, while maintaining the classification performance equivalent to the model trained using the full training set. We compare the performance using area under the receiveroperating- characteristic curve (AUC).

  19. Neural Networks for the Classification of Building Use from Street-View Imagery

    NASA Astrophysics Data System (ADS)

    Laupheimer, D.; Tutzauer, P.; Haala, N.; Spicker, M.

    2018-05-01

    Within this paper we propose an end-to-end approach for classifying terrestrial images of building facades into five different utility classes (commercial, hybrid, residential, specialUse, underConstruction) by using Convolutional Neural Networks (CNNs). For our examples we use images provided by Google Street View. These images are automatically linked to a coarse city model, including the outlines of the buildings as well as their respective use classes. By these means an extensive dataset is available for training and evaluation of our Deep Learning pipeline. The paper describes the implemented end-to-end approach for classifying street-level images of building facades and discusses our experiments with various CNNs. In addition to the classification results, so-called Class Activation Maps (CAMs) are evaluated. These maps give further insights into decisive facade parts that are learned as features during the training process. Furthermore, they can be used for the generation of abstract presentations which facilitate the comprehension of semantic image content. The abstract representations are a result of the stippling method, an importance-based image rendering.

  20. Satellite Image Classification of Building Damages Using Airborne and Satellite Image Samples in a Deep Learning Approach

    NASA Astrophysics Data System (ADS)

    Duarte, D.; Nex, F.; Kerle, N.; Vosselman, G.

    2018-05-01

    The localization and detailed assessment of damaged buildings after a disastrous event is of utmost importance to guide response operations, recovery tasks or for insurance purposes. Several remote sensing platforms and sensors are currently used for the manual detection of building damages. However, there is an overall interest in the use of automated methods to perform this task, regardless of the used platform. Owing to its synoptic coverage and predictable availability, satellite imagery is currently used as input for the identification of building damages by the International Charter, as well as the Copernicus Emergency Management Service for the production of damage grading and reference maps. Recently proposed methods to perform image classification of building damages rely on convolutional neural networks (CNN). These are usually trained with only satellite image samples in a binary classification problem, however the number of samples derived from these images is often limited, affecting the quality of the classification results. The use of up/down-sampling image samples during the training of a CNN, has demonstrated to improve several image recognition tasks in remote sensing. However, it is currently unclear if this multi resolution information can also be captured from images with different spatial resolutions like satellite and airborne imagery (from both manned and unmanned platforms). In this paper, a CNN framework using residual connections and dilated convolutions is used considering both manned and unmanned aerial image samples to perform the satellite image classification of building damages. Three network configurations, trained with multi-resolution image samples are compared against two benchmark networks where only satellite image samples are used. Combining feature maps generated from airborne and satellite image samples, and refining these using only the satellite image samples, improved nearly 4 % the overall satellite image classification of building damages.

  1. Caracterisation des occupations du sol en milieu urbain par imagerie radar

    NASA Astrophysics Data System (ADS)

    Codjia, Claude

    This study aims to test the relevance of medium and high-resolution SAR images on the characterization of the types of land use in urban areas. To this end, we have relied on textural approaches based on second-order statistics. Specifically, we look for texture parameters most relevant for discriminating urban objects. We have used in this regard Radarsat-1 in fine polarization mode and Radarsat-2 HH fine mode in dual and quad polarization and ultrafine mode HH polarization. The land uses sought were dense building, medium density building, low density building, industrial and institutional buildings, low density vegetation, dense vegetation and water. We have identified nine texture parameters for analysis, grouped into families according to their mathematical definitions in a first step. The parameters of similarity / dissimilarity include Homogeneity, Contrast, the Differential Inverse Moment and Dissimilarity. The parameters of disorder are Entropy and the Second Angular Momentum. The Standard Deviation and Correlation are the dispersion parameters and the Average is a separate family. It is clear from experience that certain combinations of texture parameters from different family used in classifications yield good results while others produce kappa of very little interest. Furthermore, we realize that if the use of several texture parameters improves classifications, its performance ceils from three parameters. The calculation of correlations between the textures and their principal axes confirm the results. Despite the good performance of this approach based on the complementarity of texture parameters, systematic errors due to the cardinal effects remain on classifications. To overcome this problem, a radiometric compensation model was developed based on the radar cross section (SER). A radar simulation from the digital surface model of the environment allowed us to extract the building backscatter zones and to analyze the related backscatter. Thus, we were able to devise a strategy of compensation of cardinal effects solely based on the responses of the objects according to their orientation from the plane of illumination through the radar's beam. It appeared that a compensation algorithm based on the radar cross section was appropriate. Some examples of the application of this algorithm on HH polarized RADARSAT-2 images are presented as well. Application of this algorithm will allow considerable gains with regard to certain forms of automation (classification and segmentation) at the level of radar imagery thus generating a higher level of quality in regard to visual interpretation. Application of this algorithm on RADARSAT-1 and RADARSAT-2 images with HH, HV, VH, and VV polarisations helped make considerable gains and eliminate most of the classification errors due to the cardinal effects.

  2. Surface characteristics modeling and performance evaluation of urban building materials using LiDAR data.

    PubMed

    Li, Xiaolu; Liang, Yu

    2015-05-20

    Analysis of light detection and ranging (LiDAR) intensity data to extract surface features is of great interest in remote sensing research. One potential application of LiDAR intensity data is target classification. A new bidirectional reflectance distribution function (BRDF) model is derived for target characterization of rough and smooth surfaces. Based on the geometry of our coaxial full-waveform LiDAR system, the integration method is improved through coordinate transformation to establish the relationship between the BRDF model and intensity data of LiDAR. A series of experiments using typical urban building materials are implemented to validate the proposed BRDF model and integration method. The fitting results show that three parameters extracted from the proposed BRDF model can distinguish the urban building materials from perspectives of roughness, specular reflectance, and diffuse reflectance. A comprehensive analysis of these parameters will help characterize surface features in a physically rigorous manner.

  3. Building a Shared Definitional Model of Long Duration Human Spaceflight

    NASA Technical Reports Server (NTRS)

    Arias, Diana; Orr, Martin; Whitmire, Alexandra; Leveton, Lauren; Sandoval, Luis

    2012-01-01

    Objective: To establish the need for a shared definitional model of long duration human spaceflight, that would provide a framework and vision to facilitate communication, research and practice In 1956, on the eve of human space travel, Hubertus Strughold first proposed a "simple classification of the present and future stages of manned flight" that identified key factors, risks and developmental stages for the evolutionary journey ahead. As we look to new destinations, we need a current shared working definitional model of long duration human space flight to help guide our path. Here we describe our preliminary findings and outline potential approaches for the future development of a definition and broader classification system

  4. Enhancement of the Logistics Battle Command Model: Architecture Upgrades and Attrition Module Development

    DTIC Science & Technology

    2017-01-05

    module. 15. SUBJECT TERMS Logistics, attrition, discrete event simulation, Simkit, LBC 16. SECURITY CLASSIFICATION OF: Unclassified 17. LIMITATION...stochastics, and discrete event model programmed in Java building largely on the Simkit library. The primary purpose of the LBC model is to support...equations makes them incompatible with the discrete event construct of LBC. Bullard further advances this methodology by developing a stochastic

  5. Creating a three level building classification using topographic and address-based data for Manchester

    NASA Astrophysics Data System (ADS)

    Hussain, M.; Chen, D.

    2014-11-01

    Buildings, the basic unit of an urban landscape, host most of its socio-economic activities and play an important role in the creation of urban land-use patterns. The spatial arrangement of different building types creates varied urban land-use clusters which can provide an insight to understand the relationships between social, economic, and living spaces. The classification of such urban clusters can help in policy-making and resource management. In many countries including the UK no national-level cadastral database containing information on individual building types exists in public domain. In this paper, we present a framework for inferring functional types of buildings based on the analysis of their form (e.g. geometrical properties, such as area and perimeter, layout) and spatial relationship from large topographic and address-based GIS database. Machine learning algorithms along with exploratory spatial analysis techniques are used to create the classification rules. The classification is extended to two further levels based on the functions (use) of buildings derived from address-based data. The developed methodology was applied to the Manchester metropolitan area using the Ordnance Survey's MasterMap®, a large-scale topographic and address-based data available for the UK.

  6. Modeling Poroelastic Wave Propagation in a Real 2-D Complex Geological Structure Obtained via Self-Organizing Maps

    NASA Astrophysics Data System (ADS)

    Itzá Balam, Reymundo; Iturrarán-Viveros, Ursula; Parra, Jorge O.

    2018-03-01

    Two main stages of seismic modeling are geological model building and numerical computation of seismic response for the model. The quality of the computed seismic response is partly related to the type of model that is built. Therefore, the model building approaches become as important as seismic forward numerical methods. For this purpose, three petrophysical facies (sands, shales and limestones) are extracted from reflection seismic data and some seismic attributes via the clustering method called Self-Organizing Maps (SOM), which, in this context, serves as a geological model building tool. This model with all its properties is the input to the Optimal Implicit Staggered Finite Difference (OISFD) algorithm to create synthetic seismograms for poroelastic, poroacoustic and elastic media. The results show a good agreement between observed and 2-D synthetic seismograms. This demonstrates that the SOM classification method enables us to extract facies from seismic data and allows us to integrate the lithology at the borehole scale with the 2-D seismic data.

  7. Automatic Building Detection based on Supervised Classification using High Resolution Google Earth Images

    NASA Astrophysics Data System (ADS)

    Ghaffarian, S.; Ghaffarian, S.

    2014-08-01

    This paper presents a novel approach to detect the buildings by automization of the training area collecting stage for supervised classification. The method based on the fact that a 3d building structure should cast a shadow under suitable imaging conditions. Therefore, the methodology begins with the detection and masking out the shadow areas using luminance component of the LAB color space, which indicates the lightness of the image, and a novel double thresholding technique. Further, the training areas for supervised classification are selected by automatically determining a buffer zone on each building whose shadow is detected by using the shadow shape and the sun illumination direction. Thereafter, by calculating the statistic values of each buffer zone which is collected from the building areas the Improved Parallelepiped Supervised Classification is executed to detect the buildings. Standard deviation thresholding applied to the Parallelepiped classification method to improve its accuracy. Finally, simple morphological operations conducted for releasing the noises and increasing the accuracy of the results. The experiments were performed on set of high resolution Google Earth images. The performance of the proposed approach was assessed by comparing the results of the proposed approach with the reference data by using well-known quality measurements (Precision, Recall and F1-score) to evaluate the pixel-based and object-based performances of the proposed approach. Evaluation of the results illustrates that buildings detected from dense and suburban districts with divers characteristics and color combinations using our proposed method have 88.4 % and 853 % overall pixel-based and object-based precision performances, respectively.

  8. Designing an activity-based costing model for a non-admitted prisoner healthcare setting.

    PubMed

    Cai, Xiao; Moore, Elizabeth; McNamara, Martin

    2013-09-01

    To design and deliver an activity-based costing model within a non-admitted prisoner healthcare setting. Key phases from the NSW Health clinical redesign methodology were utilised: diagnostic, solution design and implementation. The diagnostic phase utilised a range of strategies to identify issues requiring attention in the development of the costing model. The solution design phase conceptualised distinct 'building blocks' of activity and cost based on the speciality of clinicians providing care. These building blocks enabled the classification of activity and comparisons of costs between similar facilities. The implementation phase validated the model. The project generated an activity-based costing model based on actual activity performed, gained acceptability among clinicians and managers, and provided the basis for ongoing efficiency and benchmarking efforts.

  9. exprso: an R-package for the rapid implementation of machine learning algorithms.

    PubMed

    Quinn, Thomas; Tylee, Daniel; Glatt, Stephen

    2016-01-01

    Machine learning plays a major role in many scientific investigations. However, non-expert programmers may struggle to implement the elaborate pipelines necessary to build highly accurate and generalizable models. We introduce exprso , a new R package that is an intuitive machine learning suite designed specifically for non-expert programmers. Built initially for the classification of high-dimensional data, exprso uses an object-oriented framework to encapsulate a number of common analytical methods into a series of interchangeable modules. This includes modules for feature selection, classification, high-throughput parameter grid-searching, elaborate cross-validation schemes (e.g., Monte Carlo and nested cross-validation), ensemble classification, and prediction. In addition, exprso also supports multi-class classification (through the 1-vs-all generalization of binary classifiers) and the prediction of continuous outcomes.

  10. Damage estimation of subterranean building constructions due to groundwater inundation - the GIS-based model approach GRUWAD

    NASA Astrophysics Data System (ADS)

    Schinke, R.; Neubert, M.; Hennersdorf, J.; Stodolny, U.; Sommer, T.; Naumann, T.

    2012-09-01

    The analysis and management of flood risk commonly focuses on surface water floods, because these types are often associated with high economic losses due to damage to buildings and settlements. The rising groundwater as a secondary effect of these floods induces additional damage, particularly in the basements of buildings. Mostly, these losses remain underestimated, because they are difficult to assess, especially for the entire building stock of flood-prone urban areas. For this purpose an appropriate methodology has been developed and lead to a groundwater damage simulation model named GRUWAD. The overall methodology combines various engineering and geoinformatic methods to calculate major damage processes by high groundwater levels. It considers a classification of buildings by building types, synthetic depth-damage functions for groundwater inundation as well as the results of a groundwater-flow model. The modular structure of this procedure can be adapted in the level of detail. Hence, the model allows damage calculations from the local to the regional scale. Among others it can be used to prepare risk maps, for ex-ante analysis of future risks, and to simulate the effects of mitigation measures. Therefore, the model is a multifarious tool for determining urban resilience with respect to high groundwater levels.

  11. Classification of Birds and Bats Using Flight Tracks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cullinan, Valerie I.; Matzner, Shari; Duberstein, Corey A.

    Classification of birds and bats that use areas targeted for offshore wind farm development and the inference of their behavior is essential to evaluating the potential effects of development. The current approach to assessing the number and distribution of birds at sea involves transect surveys using trained individuals in boats or airplanes or using high-resolution imagery. These approaches are costly and have safety concerns. Based on a limited annotated library extracted from a single-camera thermal video, we provide a framework for building models that classify birds and bats and their associated behaviors. As an example, we developed a discriminant modelmore » for theoretical flight paths and applied it to data (N = 64 tracks) extracted from 5-min video clips. The agreement between model- and observer-classified path types was initially only 41%, but it increased to 73% when small-scale jitter was censored and path types were combined. Classification of 46 tracks of bats, swallows, gulls, and terns on average was 82% accurate, based on a jackknife cross-validation. Model classification of bats and terns (N = 4 and 2, respectively) was 94% and 91% correct, respectively; however, the variance associated with the tracks from these targets is poorly estimated. Model classification of gulls and swallows (N ≥ 18) was on average 73% and 85% correct, respectively. The models developed here should be considered preliminary because they are based on a small data set both in terms of the numbers of species and the identified flight tracks. Future classification models would be greatly improved by including a measure of distance between the camera and the target.« less

  12. Material classification and automatic content enrichment of images using supervised learning and knowledge bases

    NASA Astrophysics Data System (ADS)

    Mallepudi, Sri Abhishikth; Calix, Ricardo A.; Knapp, Gerald M.

    2011-02-01

    In recent years there has been a rapid increase in the size of video and image databases. Effective searching and retrieving of images from these databases is a significant current research area. In particular, there is a growing interest in query capabilities based on semantic image features such as objects, locations, and materials, known as content-based image retrieval. This study investigated mechanisms for identifying materials present in an image. These capabilities provide additional information impacting conditional probabilities about images (e.g. objects made of steel are more likely to be buildings). These capabilities are useful in Building Information Modeling (BIM) and in automatic enrichment of images. I2T methodologies are a way to enrich an image by generating text descriptions based on image analysis. In this work, a learning model is trained to detect certain materials in images. To train the model, an image dataset was constructed containing single material images of bricks, cloth, grass, sand, stones, and wood. For generalization purposes, an additional set of 50 images containing multiple materials (some not used in training) was constructed. Two different supervised learning classification models were investigated: a single multi-class SVM classifier, and multiple binary SVM classifiers (one per material). Image features included Gabor filter parameters for texture, and color histogram data for RGB components. All classification accuracy scores using the SVM-based method were above 85%. The second model helped in gathering more information from the images since it assigned multiple classes to the images. A framework for the I2T methodology is presented.

  13. The Application of Typology Method in Historical Building Information Modelling (hbim) Taking the Information Surveying and Mapping of Jiayuguan Fortress Town as AN Example

    NASA Astrophysics Data System (ADS)

    Li, D. Y.; Li, K.; Wu, C.

    2017-08-01

    With the promotion of fine degree of the heritage building surveying and mapping, building information modelling technology(BIM) begins to be used in surveying and mapping, renovation, recording and research of heritage building, called historical building information modelling(HBIM). The hierarchical frameworks of parametric component library of BIM, belonging to the same type with the same parameters, has the same internal logic with archaeological typology which is more and more popular in the age identification of ancient buildings. Compared with the common materials, 2D drawings and photos, typology with HBIM has two advantages — (1) comprehensive building information both in collection and representation and (2) uniform and reasonable classification criteria This paper will take the information surveying and mapping of Jiayuguan Fortress Town as an example to introduce the field work method of information surveying and mapping based on HBIM technology and the construction of Revit family library.And then in order to prove the feasibility and advantage of HBIM technology used in typology method, this paper will identify the age of Guanghua gate tower, Rouyuan gate tower, Wenchang pavilion and the theater building of Jiayuguan Fortress Town with HBIM technology and typology method.

  14. Lightweight Biometric Sensing for Walker Classification Using Narrowband RF Links

    PubMed Central

    Liang, Zhuo-qian

    2017-01-01

    This article proposes a lightweight biometric sensing system using ubiquitous narrowband radio frequency (RF) links for path-dependent walker classification. The fluctuated received signal strength (RSS) sequence generated by human motion is used for feature representation. To capture the most discriminative characteristics of individuals, a three-layer RF sensing network is organized for building multiple sampling links at the most common heights of upper limbs, thighs, and lower legs. The optimal parameters of sensing configuration, such as the height of link location and number of fused links, are investigated to improve sensory data distinctions among subjects, and the experimental results suggest that the synergistic sensing by using multiple links can contribute a better performance. This is the new consideration of using RF links in building a biometric sensing system. In addition, two types of classification methods involving vector quantization (VQ) and hidden Markov models (HMMs) are developed and compared for closed-set walker recognition and verification. Experimental studies in indoor line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios are conducted to validate the proposed method. PMID:29206188

  15. Lightweight Biometric Sensing for Walker Classification Using Narrowband RF Links.

    PubMed

    Liu, Tong; Liang, Zhuo-Qian

    2017-12-05

    This article proposes a lightweight biometric sensing system using ubiquitous narrowband radio frequency (RF) links for path-dependent walker classification. The fluctuated received signal strength (RSS) sequence generated by human motion is used for feature representation. To capture the most discriminative characteristics of individuals, a three-layer RF sensing network is organized for building multiple sampling links at the most common heights of upper limbs, thighs, and lower legs. The optimal parameters of sensing configuration, such as the height of link location and number of fused links, are investigated to improve sensory data distinctions among subjects, and the experimental results suggest that the synergistic sensing by using multiple links can contribute a better performance. This is the new consideration of using RF links in building a biometric sensing system. In addition, two types of classification methods involving vector quantization (VQ) and hidden Markov models (HMMs) are developed and compared for closed-set walker recognition and verification. Experimental studies in indoor line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios are conducted to validate the proposed method.

  16. lazar: a modular predictive toxicology framework

    PubMed Central

    Maunz, Andreas; Gütlein, Martin; Rautenberg, Micha; Vorgrimmler, David; Gebele, Denis; Helma, Christoph

    2013-01-01

    lazar (lazy structure–activity relationships) is a modular framework for predictive toxicology. Similar to the read across procedure in toxicological risk assessment, lazar creates local QSAR (quantitative structure–activity relationship) models for each compound to be predicted. Model developers can choose between a large variety of algorithms for descriptor calculation and selection, chemical similarity indices, and model building. This paper presents a high level description of the lazar framework and discusses the performance of example classification and regression models. PMID:23761761

  17. Progress toward the determination of correct classification rates in fire debris analysis.

    PubMed

    Waddell, Erin E; Song, Emma T; Rinke, Caitlin N; Williams, Mary R; Sigman, Michael E

    2013-07-01

    Principal components analysis (PCA), linear discriminant analysis (LDA), and quadratic discriminant analysis (QDA) were used to develop a multistep classification procedure for determining the presence of ignitable liquid residue in fire debris and assigning any ignitable liquid residue present into the classes defined under the American Society for Testing and Materials (ASTM) E 1618-10 standard method. A multistep classification procedure was tested by cross-validation based on model data sets comprised of the time-averaged mass spectra (also referred to as total ion spectra) of commercial ignitable liquids and pyrolysis products from common building materials and household furnishings (referred to simply as substrates). Fire debris samples from laboratory-scale and field test burns were also used to test the model. The optimal model's true-positive rate was 81.3% for cross-validation samples and 70.9% for fire debris samples. The false-positive rate was 9.9% for cross-validation samples and 8.9% for fire debris samples. © 2013 American Academy of Forensic Sciences.

  18. Fusion with Language Models Improves Spelling Accuracy for ERP-based Brain Computer Interface Spellers

    PubMed Central

    Orhan, Umut; Erdogmus, Deniz; Roark, Brian; Purwar, Shalini; Hild, Kenneth E.; Oken, Barry; Nezamfar, Hooman; Fried-Oken, Melanie

    2013-01-01

    Event related potentials (ERP) corresponding to a stimulus in electroencephalography (EEG) can be used to detect the intent of a person for brain computer interfaces (BCI). This paradigm is widely utilized to build letter-by-letter text input systems using BCI. Nevertheless using a BCI-typewriter depending only on EEG responses will not be sufficiently accurate for single-trial operation in general, and existing systems utilize many-trial schemes to achieve accuracy at the cost of speed. Hence incorporation of a language model based prior or additional evidence is vital to improve accuracy and speed. In this paper, we study the effects of Bayesian fusion of an n-gram language model with a regularized discriminant analysis ERP detector for EEG-based BCIs. The letter classification accuracies are rigorously evaluated for varying language model orders as well as number of ERP-inducing trials. The results demonstrate that the language models contribute significantly to letter classification accuracy. Specifically, we find that a BCI-speller supported by a 4-gram language model may achieve the same performance using 3-trial ERP classification for the initial letters of the words and using single trial ERP classification for the subsequent ones. Overall, fusion of evidence from EEG and language models yields a significant opportunity to increase the word rate of a BCI based typing system. PMID:22255652

  19. The Survey of Cultural Heritage after AN Earthquake: the Case of Emilia-Lombardia in 2012

    NASA Astrophysics Data System (ADS)

    Adami, A.; Chiarini, S.; Cremonesi, S.; Fregonese, L.; Taffurelli, L.; Valente, M. V.

    2016-06-01

    In recent years many earthquakes hit Italy and its Cultural Heritage. The topic of survey of buildings damaged by seismic events and their interpretation has become very relevant and involved many research groups and Italian Civil Protection. The damage survey has different roles: in the first stage, immediately after the emergency, the documentation is necessary for the shoring and protection of damaged structures (AEDES forms of Civil Protection). The aim of the second stage is the study and the documentation for the restoration, reconstruction and retrofitting of buildings. In this context, this study presents methods and instruments used in the survey of 24 churches in the province of Mantua, Lombardy, after the 2012 earthquake sequence. The paper examines the difficulties in surveying damaged buildings and presents the classification used to define, time by time, the most suitable survey approach in the field of Geomatics. In this classification, many aspects are taken into account, such as logistical and practical problems, safety conditions, time preserving methods, economic decisions, complexity of building and required results. The accurate documentation obtained as a three-dimensional architectural database allows for the observation and analysis of the damage, the definition of interpretative models and the development of intervention projects. Different results are obtained from the point cloud database: traditional 2D representations for architectural projects as well as 3D models for structural analysis or for the development of BIM.

  20. Buildings classification from airborne LiDAR point clouds through OBIA and ontology driven approach

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Belgiu, Mariana; Lampoltshammer, Thomas J.

    2013-04-01

    In the last years, airborne Light Detection and Ranging (LiDAR) data proved to be a valuable information resource for a vast number of applications ranging from land cover mapping to individual surface feature extraction from complex urban environments. To extract information from LiDAR data, users apply prior knowledge. Unfortunately, there is no consistent initiative for structuring this knowledge into data models that can be shared and reused across different applications and domains. The absence of such models poses great challenges to data interpretation, data fusion and integration as well as information transferability. The intention of this work is to describe the design, development and deployment of an ontology-based system to classify buildings from airborne LiDAR data. The novelty of this approach consists of the development of a domain ontology that specifies explicitly the knowledge used to extract features from airborne LiDAR data. The overall goal of this approach is to investigate the possibility for classification of features of interest from LiDAR data by means of domain ontology. The proposed workflow is applied to the building extraction process for the region of "Biberach an der Riss" in South Germany. Strip-adjusted and georeferenced airborne LiDAR data is processed based on geometrical and radiometric signatures stored within the point cloud. Region-growing segmentation algorithms are applied and segmented regions are exported to the GeoJSON format. Subsequently, the data is imported into the ontology-based reasoning process used to automatically classify exported features of interest. Based on the ontology it becomes possible to define domain concepts, associated properties and relations. As a consequence, the resulting specific body of knowledge restricts possible interpretation variants. Moreover, ontologies are machinable and thus it is possible to run reasoning on top of them. Available reasoners (FACT++, JESS, Pellet) are used to check the consistency of the developed ontologies, and logical reasoning is performed to infer implicit relations between defined concepts. The ontology for the definition of building is specified using the Ontology Web Language (OWL). It is the most widely used ontology language that is based on Description Logics (DL). DL allows the description of internal properties of modelled concepts (roof typology, shape, area, height etc.) and relationships between objects (IS_A, MEMBER_OF/INSTANCE_OF). It captures terminological knowledge (TBox) as well as assertional knowledge (ABox) - that represents facts about concept instances, i.e. the buildings in airborne LiDAR data. To assess the classification accuracy, ground truth data generated by visual interpretation and calculated classification results in terms of precision and recall are used. The advantages of this approach are: (i) flexibility, (ii) transferability, and (iii) extendibility - i.e. ontology can be extended with further concepts, data properties and object properties.

  1. Processing of Crawled Urban Imagery for Building Use Classification

    NASA Astrophysics Data System (ADS)

    Tutzauer, P.; Haala, N.

    2017-05-01

    Recent years have shown a shift from pure geometric 3D city models to data with semantics. This is induced by new applications (e.g. Virtual/Augmented Reality) and also a requirement for concepts like Smart Cities. However, essential urban semantic data like building use categories is often not available. We present a first step in bridging this gap by proposing a pipeline to use crawled urban imagery and link it with ground truth cadastral data as an input for automatic building use classification. We aim to extract this city-relevant semantic information automatically from Street View (SV) imagery. Convolutional Neural Networks (CNNs) proved to be extremely successful for image interpretation, however, require a huge amount of training data. Main contribution of the paper is the automatic provision of such training datasets by linking semantic information as already available from databases provided from national mapping agencies or city administrations to the corresponding façade images extracted from SV. Finally, we present first investigations with a CNN and an alternative classifier as a proof of concept.

  2. 3D texture analysis for classification of second harmonic generation images of human ovarian cancer

    NASA Astrophysics Data System (ADS)

    Wen, Bruce; Campbell, Kirby R.; Tilbury, Karissa; Nadiarnykh, Oleg; Brewer, Molly A.; Patankar, Manish; Singh, Vikas; Eliceiri, Kevin. W.; Campagnola, Paul J.

    2016-10-01

    Remodeling of the collagen architecture in the extracellular matrix (ECM) has been implicated in ovarian cancer. To quantify these alterations we implemented a form of 3D texture analysis to delineate the fibrillar morphology observed in 3D Second Harmonic Generation (SHG) microscopy image data of normal (1) and high risk (2) ovarian stroma, benign ovarian tumors (3), low grade (4) and high grade (5) serous tumors, and endometrioid tumors (6). We developed a tailored set of 3D filters which extract textural features in the 3D image sets to build (or learn) statistical models of each tissue class. By applying k-nearest neighbor classification using these learned models, we achieved 83-91% accuracies for the six classes. The 3D method outperformed the analogous 2D classification on the same tissues, where we suggest this is due the increased information content. This classification based on ECM structural changes will complement conventional classification based on genetic profiles and can serve as an additional biomarker. Moreover, the texture analysis algorithm is quite general, as it does not rely on single morphological metrics such as fiber alignment, length, and width but their combined convolution with a customizable basis set.

  3. Basis of Criminalistic Classification of a Person in Republic Kazakhstan and Republic Mongolia

    ERIC Educational Resources Information Center

    Abdilov, Kanat S.; Zusbaev, Baurzan T.; Naurysbaev, Erlan A.; Nukiev, Berik A.; Nurkina, Zanar B.; Myrzahanov, Erlan N.; Urazalin, Galym T.

    2016-01-01

    In this article reviewed problems of the criminalistic classification building of a person. In the work were used legal formal, logical, comparative legal methods. The author describes classification kinds. Reveal the meaning of classification in criminalistic systematics. Shows types of grounds of criminalistic classification of a person.…

  4. Breast density characterization using texton distributions.

    PubMed

    Petroudi, Styliani; Brady, Michael

    2011-01-01

    Breast density has been shown to be one of the most significant risks for developing breast cancer, with women with dense breasts at four to six times higher risk. The Breast Imaging Reporting and Data System (BI-RADS) has a four class classification scheme that describes the different breast densities. However, there is great inter and intra observer variability among clinicians in reporting a mammogram's density class. This work presents a novel texture classification method and its application for the development of a completely automated breast density classification system. The new method represents the mammogram using textons, which can be thought of as the building blocks of texture under the operational definition of Leung and Malik as clustered filter responses. The new proposed method characterizes the mammographic appearance of the different density patterns by evaluating the texton spatial dependence matrix (TDSM) in the breast region's corresponding texton map. The TSDM is a texture model that captures both statistical and structural texture characteristics. The normalized TSDM matrices are evaluated for mammograms from the different density classes and corresponding texture models are established. Classification is achieved using a chi-square distance measure. The fully automated TSDM breast density classification method is quantitatively evaluated on mammograms from all density classes from the Oxford Mammogram Database. The incorporation of texton spatial dependencies allows for classification accuracy reaching over 82%. The breast density classification accuracy is better using texton TSDM compared to simple texton histograms.

  5. Rotation-invariant convolutional neural networks for galaxy morphology prediction

    NASA Astrophysics Data System (ADS)

    Dieleman, Sander; Willett, Kyle W.; Dambre, Joni

    2015-06-01

    Measuring the morphological parameters of galaxies is a key requirement for studying their formation and evolution. Surveys such as the Sloan Digital Sky Survey have resulted in the availability of very large collections of images, which have permitted population-wide analyses of galaxy morphology. Morphological analysis has traditionally been carried out mostly via visual inspection by trained experts, which is time consuming and does not scale to large (≳104) numbers of images. Although attempts have been made to build automated classification systems, these have not been able to achieve the desired level of accuracy. The Galaxy Zoo project successfully applied a crowdsourcing strategy, inviting online users to classify images by answering a series of questions. Unfortunately, even this approach does not scale well enough to keep up with the increasing availability of galaxy images. We present a deep neural network model for galaxy morphology classification which exploits translational and rotational symmetry. It was developed in the context of the Galaxy Challenge, an international competition to build the best model for morphology classification based on annotated images from the Galaxy Zoo project. For images with high agreement among the Galaxy Zoo participants, our model is able to reproduce their consensus with near-perfect accuracy (>99 per cent) for most questions. Confident model predictions are highly accurate, which makes the model suitable for filtering large collections of images and forwarding challenging images to experts for manual annotation. This approach greatly reduces the experts' workload without affecting accuracy. The application of these algorithms to larger sets of training data will be critical for analysing results from future surveys such as the Large Synoptic Survey Telescope.

  6. The biopsychosocial domains and the functions of the medical interview in primary care: construct validity of the Verona Medical Interview Classification System.

    PubMed

    Del Piccolo, Lidia; Putnam, Samuel M; Mazzi, Maria Angela; Zimmermann, Christa

    2004-04-01

    Factor analysis (FA) is a powerful method of testing the construct validity of coding systems of the medical interview. The study uses FA to test the underlying assumptions of the Verona Medical Interview Classification System (VR-MICS). The relationship between factor scores and patient characteristics was also examined. The VR-MICS coding categories consider the three domains of the biopsychosocial model and the main functions of the medical interview-data gathering, relationship building and patient education. FA was performed on the frequencies of the VR-MICS categories based on 238 medical interviews. Seven factors (62.5% of variance explained) distinguished different strategies patients and physicians use to exchange information, build a relationship and negotiate treatment within the domains of the biopsychosocial model. Three factors, Psychological, Social Inquiry and Management of Patient Agenda, were related to patient data: sociodemographic (female gender, age and employment), social (stressful events), clinical (GHQ-12 score), personality (chance external health locus of control) and clinical characteristics (psychiatric history, chronic illness, attributed presence of emotional distress).

  7. CARSVM: a class association rule-based classification framework and its application to gene expression data.

    PubMed

    Kianmehr, Keivan; Alhajj, Reda

    2008-09-01

    In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and support vector machine (SVM). The goal is to benefit from advantages of both, the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that improves the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework: instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, are presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-qualified source of discrimination knowledge that can impact substantially the prediction power of SVM and associative classification techniques. They provide users with more conveniences in terms of understandability and interpretability as well. We have used four datasets from UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as real world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM algorithm. In the context of applicability, according to the results obtained from gene expression analysis, we can conclude that the CARSVM system can be utilized in a variety of real world applications with some adjustments.

  8. Evaluation of gene expression classification studies: factors associated with classification performance.

    PubMed

    Novianti, Putri W; Roes, Kit C B; Eijkemans, Marinus J C

    2014-01-01

    Classification methods used in microarray studies for gene expression are diverse in the way they deal with the underlying complexity of the data, as well as in the technique used to build the classification model. The MAQC II study on cancer classification problems has found that performance was affected by factors such as the classification algorithm, cross validation method, number of genes, and gene selection method. In this paper, we study the hypothesis that the disease under study significantly determines which method is optimal, and that additionally sample size, class imbalance, type of medical question (diagnostic, prognostic or treatment response), and microarray platform are potentially influential. A systematic literature review was used to extract the information from 48 published articles on non-cancer microarray classification studies. The impact of the various factors on the reported classification accuracy was analyzed through random-intercept logistic regression. The type of medical question and method of cross validation dominated the explained variation in accuracy among studies, followed by disease category and microarray platform. In total, 42% of the between study variation was explained by all the study specific and problem specific factors that we studied together.

  9. The comparison of landslide ratio-based and general logistic regression landslide susceptibility models in the Chishan watershed after 2009 Typhoon Morakot

    NASA Astrophysics Data System (ADS)

    WU, Chunhung

    2015-04-01

    The research built the original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and landslide ratio-based ogistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared the performance and explained the error source of two models. The research assumes that the performance of the logistic regression model can be better if the distribution of landslide ratio and weighted value of each variable is similar. Landslide ratio is the ratio of landslide area to total area in the specific area and an useful index to evaluate the seriousness of landslide disaster in Taiwan. The research adopted the landside inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event in the last decade, in Taiwan. The research adopted the 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. The six variables were divided as continuous variables, including elevation, slope, and accumulated rainfall, and categorical variables, including aspect, geological formation and bank erosion in building the or-LRLSM, while all variables, which were classified based on landslide ratio, were categorical variables in building the lr-LRLSM. Because the count of whole basic unit in the Chishan watershed was too much to calculate by using commercial software, the research took random sampling instead of the whole basic units. The research adopted equal proportions of landslide unit and not landslide unit in logistic regression analysis. The research took 10 times random sampling and selected the group with the best Cox & Snell R2 value and Nagelkerker R2 value as the database for the following analysis. Based on the best result from 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2 = 0.253 (0.260). The unit with the landslide susceptibility value > 0.5 (≦ 0.5) will be classified as a predicted landslide unit (not landslide unit). The AUC, i.e. the area under the relative operating characteristic curve, of or-LRLSM in the Chishan watershed is 0.72, while that of lr-LRLSM is 0.77. Furthermore, the average correct ratio of lr-LRLSM (73.3%) is better than that of or-LRLSM (68.3%). The research analyzed in detail the error sources from the two models. In continuous variables, using the landslide ratio-based classification in building the lr-LRLSM can let the distribution of weighted value more similar to distribution of landslide ratio in the range of continuous variable than that in building the or-LRLSM. In categorical variables, the meaning of using the landslide ratio-based classification in building the lr-LRLSM is to gather the parameters with approximate landslide ratio together. The mean correct ratio in continuous variables (categorical variables) by using the lr-LRLSM is better than that in or-LRLSM by 0.6 ~ 2.6% (1.7% ~ 6.0%). Building the landslide susceptibility model by using landslide ratio-based classification is practical and of better performance than that by using the original logistic regression.

  10. Detecting blind building façades from highly overlapping wide angle aerial imagery

    NASA Astrophysics Data System (ADS)

    Burochin, Jean-Pascal; Vallet, Bruno; Brédif, Mathieu; Mallet, Clément; Brosset, Thomas; Paparoditis, Nicolas

    2014-10-01

    This paper deals with the identification of blind building façades, i.e. façades which have no openings, in wide angle aerial images with a decimeter pixel size, acquired by nadir looking cameras. This blindness characterization is in general crucial for real estate estimation and has, at least in France, a particular importance on the evaluation of legal permission of constructing on a parcel due to local urban planning schemes. We assume that we have at our disposal an aerial survey with a relatively high stereo overlap along-track and across-track and a 3D city model of LoD 1, that can have been generated with the input images. The 3D model is textured with the aerial imagery by taking into account the 3D occlusions and by selecting for each façade the best available resolution texture seeing the whole façade. We then parse all 3D façades textures by looking for evidence of openings (windows or doors). This evidence is characterized by a comprehensive set of basic radiometric and geometrical features. The blindness prognostic is then elaborated through an (SVM) supervised classification. Despite the relatively low resolution of the images, we reach a classification accuracy of around 85% on decimeter resolution imagery with 60 × 40 % stereo overlap. On the one hand, we show that the results are very sensitive to the texturing resampling process and to vegetation presence on façade textures. On the other hand, the most relevant features for our classification framework are related to texture uniformity and horizontal aspect and to the maximal contrast of the opening detections. We conclude that standard aerial imagery used to build 3D city models can also be exploited to some extent and at no additional cost for facade blindness characterisation.

  11. Classification of Informal Settlements Through the Integration of 2d and 3d Features Extracted from Uav Data

    NASA Astrophysics Data System (ADS)

    Gevaert, C. M.; Persello, C.; Sliuzas, R.; Vosselman, G.

    2016-06-01

    Unmanned Aerial Vehicles (UAVs) are capable of providing very high resolution and up-to-date information to support informal settlement upgrading projects. In order to provide accurate basemaps, urban scene understanding through the identification and classification of buildings and terrain is imperative. However, common characteristics of informal settlements such as small, irregular buildings with heterogeneous roof material and large presence of clutter challenge state-of-the-art algorithms. Especially the dense buildings and steeply sloped terrain cause difficulties in identifying elevated objects. This work investigates how 2D radiometric and textural features, 2.5D topographic features, and 3D geometric features obtained from UAV imagery can be integrated to obtain a high classification accuracy in challenging classification problems for the analysis of informal settlements. It compares the utility of pixel-based and segment-based features obtained from an orthomosaic and DSM with point-based and segment-based features extracted from the point cloud to classify an unplanned settlement in Kigali, Rwanda. Findings show that the integration of 2D and 3D features leads to higher classification accuracies.

  12. Deep Convolutional Neural Networks for Classifying Body Constitution Based on Face Image.

    PubMed

    Huan, Er-Yang; Wen, Gui-Hua; Zhang, Shi-Jun; Li, Dan-Yang; Hu, Yang; Chang, Tian-Yuan; Wang, Qing; Huang, Bing-Lin

    2017-01-01

    Body constitution classification is the basis and core content of traditional Chinese medicine constitution research. It is to extract the relevant laws from the complex constitution phenomenon and finally build the constitution classification system. Traditional identification methods have the disadvantages of inefficiency and low accuracy, for instance, questionnaires. This paper proposed a body constitution recognition algorithm based on deep convolutional neural network, which can classify individual constitution types according to face images. The proposed model first uses the convolutional neural network to extract the features of face image and then combines the extracted features with the color features. Finally, the fusion features are input to the Softmax classifier to get the classification result. Different comparison experiments show that the algorithm proposed in this paper can achieve the accuracy of 65.29% about the constitution classification. And its performance was accepted by Chinese medicine practitioners.

  13. Urban Density Indices Using Mean Shift-Based Upsampled Elevation Data

    NASA Astrophysics Data System (ADS)

    Charou, E.; Gyftakis, S.; Bratsolis, E.; Tsenoglou, T.; Papadopoulou, Th. D.; Vassilas, N.

    2015-04-01

    Urban density is an important factor in several fields, e.g., urban design, planning and land management. Modern remote sensors deliver ample information for the estimation of specific urban land classification classes (2D indicators) and the height of urban land classification objects (3D indicators) within an Area of Interest (AOI). In this research, two of these indicators, the Building Coverage Ratio (BCR) and the Floor Area Ratio (FAR), are derived numerically and automatically from high-resolution airborne RGB orthophotos and LiDAR data. In the pre-processing step, the low-resolution elevation data are fused with the high-resolution optical data through a mean-shift based discontinuity-preserving smoothing algorithm. The outcome is an improved normalized digital surface model (nDSM): upsampled elevation data with considerable improvement regarding region filling and the "straightness" of elevation discontinuities. In the following step, a Multilayer Feedforward Neural Network (MFNN) is used to classify all pixels of the AOI into building or non-building categories. The total surface of the block and of the buildings is computed from their pixel counts and the area of the unit pixel. Comparisons of the automatically derived BCR and FAR indicators with manually derived ones show the applicability and effectiveness of the proposed methodology.
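
    The indicator arithmetic itself is simple; a sketch of BCR and FAR from classified pixels and the nDSM (the fixed storey height used to estimate floor counts is an assumption):

        # Sketch: Building Coverage Ratio (BCR) and Floor Area Ratio (FAR)
        # from a building mask and an nDSM over one block.
        import numpy as np

        def bcr_far(building_mask, ndsm, pixel_area=0.25, storey_height=3.0):
            """building_mask: boolean array; ndsm: heights in metres."""
            block_area = building_mask.size * pixel_area
            footprint_area = building_mask.sum() * pixel_area
            floors = np.maximum(1, np.round(ndsm[building_mask] / storey_height))
            floor_area = floors.sum() * pixel_area
            return footprint_area / block_area, floor_area / block_area

        mask = np.zeros((100, 100), bool)
        mask[20:60, 30:70] = True                        # one 40 x 40 pixel building
        print(bcr_far(mask, np.where(mask, 9.0, 0.0)))   # BCR 0.16, FAR 0.48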

  14. 14 CFR Section 3 - Chart of Balance Sheet Accounts

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... buildings and equipment 1639 1739 General classification Buildings 1640 1740 Maintenance buildings and... 1654 1754 Furniture, fixtures, and office equipment 1656 1756 Buildings 1660 1760 Maintenance buildings... 1510.3 Other investments and receivables 1530 Special funds 1550 Property and equipment 1600-1700...

  15. 14 CFR Section 3 - Chart of Balance Sheet Accounts

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... buildings and equipment 1639 1739 General classification Buildings 1640 1740 Maintenance buildings and... 1654 1754 Furniture, fixtures, and office equipment 1656 1756 Buildings 1660 1760 Maintenance buildings... 1510.3 Other investments and receivables 1530 Special funds 1550 Property and equipment 1600-1700...

  16. 14 CFR 3 - Chart of Balance Sheet Accounts

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... buildings and equipment 1639 1739 General classification Buildings 1640 1740 Maintenance buildings and... 1654 1754 Furniture, fixtures, and office equipment 1656 1756 Buildings 1660 1760 Maintenance buildings... 1510.3 Other investments and receivables 1530 Special funds 1550 Property and equipment 1600-1700...

  17. Comparison of Object-Based Image Analysis Approaches to Mapping New Buildings in Accra, Ghana Using Multi-Temporal QuickBird Satellite Imagery

    PubMed Central

    Tsai, Yu Hsin; Stow, Douglas; Weeks, John

    2013-01-01

    The goal of this study was to map and quantify the number of newly constructed buildings in Accra, Ghana between 2002 and 2010 based on high spatial resolution satellite image data. Two semi-automated feature detection approaches for detecting and mapping newly constructed buildings from QuickBird very high spatial resolution satellite imagery were analyzed: (1) post-classification comparison; and (2) bi-temporal layerstack classification. Two software packages were evaluated: Feature Analyst, which is based on a spatial contextual classifier, and ENVI Feature Extraction, which uses a true object-based image analysis approach of image segmentation and segment classification. Final map products representing new building objects were compared and assessed for accuracy using two object-based accuracy measures, completeness and correctness. The bi-temporal layerstack method generated more accurate results than the post-classification comparison method due to less confusion with background objects. The spectral/spatial contextual approach (Feature Analyst) outperformed the true object-based feature delineation approach (ENVI Feature Extraction) due to its ability to more reliably delineate individual buildings of various sizes. Semi-automated, object-based detection followed by manual editing appears to be a reliable and efficient approach for detecting and enumerating new building objects. A bivariate regression analysis of neighborhood-level estimates of new building density on a census-derived measure of socio-economic status yielded an inverse relationship with R2 = 0.31 (n = 27; p = 0.00). The primary utility of the new building delineation results is to support spatial analyses of land cover, land use and demographic change. PMID:24415810
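
    The two object-based accuracy measures reduce to ratios over matched building objects; as a brief sketch:

        # Sketch: object-based accuracy measures for new-building maps.
        # tp: detections matched to reference buildings; fp: spurious detections;
        # fn: reference buildings that were missed.
        def completeness(tp, fn):
            return tp / (tp + fn)    # share of real new buildings that were found

        def correctness(tp, fp):
            return tp / (tp + fp)    # share of detections that are real buildings

        print(completeness(tp=85, fn=15), correctness(tp=85, fp=10))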

  18. Joint Feature Selection and Classification for Multilabel Learning.

    PubMed

    Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong

    2018-03-01

    Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of which are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features, shared by all the class labels. Since each class label might be decided by some specific features of its own, and since classification and feature selection are often addressed independently, we propose in this paper a novel method, named JFSC, which performs joint feature selection and classification for multilabel learning. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and simultaneously builds the multilabel classifier on the learned low-dimensional data representations. A comparative study with state-of-the-art approaches demonstrates the competitive performance of the proposed method in both classification and feature selection for multilabel learning.

  19. First Steps to Automated Interior Reconstruction from Semantically Enriched Point Clouds and Imagery

    NASA Astrophysics Data System (ADS)

    Obrock, L. S.; Gülch, E.

    2018-05-01

    The automated generation of a BIM model from sensor data is a huge challenge in the modeling of existing buildings. Currently, the measurements and analyses are time-consuming, allow little automation and require expensive equipment. What is still missing is an automated acquisition of semantic information about the objects in a building. We present first results of our approach, based on imagery and derived products, aiming at more automated modeling of interiors for a BIM building model. We examine the building parts and objects visible in the collected images using deep learning methods based on convolutional neural networks. For localization and classification of building parts we apply the FCN8s model for pixel-wise semantic segmentation, so far reaching a pixel accuracy of 77.2% and a mean intersection over union of 44.2%. We then use the network for further reasoning on the images of the interior room. We combine the segmented images with the original images and use photogrammetric methods to produce a three-dimensional point cloud, coding the extracted object types as colours of the 3D points. We are thus able to uniquely classify the points in three-dimensional space. As a preliminary step, we also investigate a simple extraction method for the colour and material of building parts. It is shown that the combined images are very well suited to extracting further semantic information for the BIM model. With the presented methods we see a sound basis for further automation of the acquisition and modeling of semantic and geometric information of interior rooms for a BIM model.
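
    The two reported segmentation metrics can be computed from a class confusion matrix; a minimal sketch:

        # Sketch: pixel accuracy and mean Intersection over Union (mIoU) from a
        # confusion matrix C, where C[i, j] counts pixels of true class i
        # predicted as class j.
        import numpy as np

        def pixel_accuracy(C):
            return np.trace(C) / C.sum()

        def mean_iou(C):
            tp = np.diag(C)
            union = C.sum(axis=0) + C.sum(axis=1) - tp
            return np.mean(tp / union)

        C = np.array([[50, 5, 0], [4, 30, 6], [1, 2, 20]])
        print(pixel_accuracy(C), mean_iou(C))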

  20. Preliminary Results of Earthquake-Induced Building Damage Detection with Object-Based Image Classification

    NASA Astrophysics Data System (ADS)

    Sabuncu, A.; Uca Avci, Z. D.; Sunar, F.

    2016-06-01

    Earthquakes are the most destructive natural disasters, resulting in massive loss of life, infrastructure damage and financial losses. Earthquake-induced building damage detection is a very important step after an earthquake, since building damage is one of the most critical threats to cities and countries in terms of the extent of damage, the rate of collapsed buildings, the damage grades near the epicenters, and the building damage types across all constructions. The Van-Ercis (Turkey) earthquake (Mw = 7.1) occurred on October 23rd, 2011, at 10:41 UTC (13:41 local time), centered at 38.75 N 43.36 E, which places the epicenter about 30 kilometers north of the city of Van. It is recorded that 604 people died and approximately 4000 buildings collapsed or were seriously damaged by the earthquake. In this study, high-resolution satellite images of Van-Ercis, acquired by Quickbird-2 (Digital Globe Inc.) after the earthquake, were used to detect the debris areas using an object-based image classification. Two different land surfaces, having homogeneous and heterogeneous land covers, were selected as case study areas. As a first step of the object-based image processing, segmentation was applied with a convenient scale parameter and homogeneity criterion parameters. Next, condition-based classification was used. In the final step of this preliminary study, the outputs were compared with street-view images and orthophotos to verify and evaluate the classification accuracy.

  1. Structural classification of CDR-H3 revisited: a lesson in antibody modeling.

    PubMed

    Kuroda, Daisuke; Shirai, Hiroki; Kobori, Masato; Nakamura, Haruki

    2008-11-15

    Among the six complementarity-determining regions (CDRs) in the variable domains of an antibody, the third CDR of the heavy chain (CDR-H3), which lies in the center of the antigen-binding site, plays a particularly important role in antigen recognition. CDR-H3 shows significant variability in its length, sequence, and structure. Although difficult, model building of this segment is the most critical step in antibody modeling. Since our first proposal of the "H3-rules," which classify CDR-H3 structure based on amino acid sequence, the number of experimentally determined antibody structures has increased. Here, we revise these H3-rules and propose an improved classification scheme for CDR-H3 structure modeling. In addition, we determine the common features of CDR-H3 in antibody drugs as well as discuss the concept of "antibody druggability," which can be applied as an indicator of antibody evaluation during drug discovery.

  2. Analysis of spreadable cheese by Raman spectroscopy and chemometric tools.

    PubMed

    Oliveira, Kamila de Sá; Callegaro, Layce de Souza; Stephani, Rodrigo; Almeida, Mariana Ramos; de Oliveira, Luiz Fernando Cappa

    2016-03-01

    In this work, FT-Raman spectroscopy was explored to evaluate spreadable cheese samples. Partial least squares discriminant analysis (PLS-DA), a supervised classification method, was employed to identify spreadable cheese samples containing starch and to classify samples as adulterated or starch-free. To build the models, two types of samples were used: commercial samples and samples manufactured in local industries. Multivariate regression was performed using the partial least squares method to quantify the starch in the spreadable cheese. The limit of detection obtained for the model was 0.34% (w/w) and the limit of quantification was 1.14% (w/w). The reliability of the models was evaluated by determining the confidence interval, calculated using the bootstrap re-sampling technique. The results show that the classification models can be used to complement classical analysis and as screening methods. Copyright © 2015 Elsevier Ltd. All rights reserved.
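
    PLS-DA is commonly implemented as PLS regression onto one-hot class labels followed by an argmax over the predicted scores; a sketch assuming scikit-learn and synthetic spectra:

        # Sketch: PLS-DA as PLS regression on one-hot labels.
        import numpy as np
        from sklearn.cross_decomposition import PLSRegression

        rng = np.random.default_rng(2)
        X = rng.random((60, 200))               # rows = Raman spectra
        y = rng.integers(0, 2, 60)              # 1 = contains starch
        Y = np.eye(2)[y]                        # one-hot encoding of classes

        plsda = PLSRegression(n_components=5).fit(X, Y)
        pred = plsda.predict(X).argmax(axis=1)  # class with the largest score
        print("training accuracy:", (pred == y).mean())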

  3. The evaluative imaging of mental models - Visual representations of complexity

    NASA Technical Reports Server (NTRS)

    Dede, Christopher

    1989-01-01

    The paper deals with some design issues involved in building a system that could visually represent the semantic structures of training materials and their underlying mental models. In particular, hypermedia-based semantic networks that instantiate classification problem solving strategies are thought to be a useful formalism for such representations; the complexity of these web structures can be best managed through visual depictions. It is also noted that a useful approach to implement in these hypermedia models would be some metrics of conceptual distance.

  4. Building rooftop classification using random forests for large-scale PV deployment

    NASA Astrophysics Data System (ADS)

    Assouline, Dan; Mohajeri, Nahid; Scartezzini, Jean-Louis

    2017-10-01

    Large-scale solar photovoltaic (PV) deployment on existing building rooftops has proven to be one of the most efficient and viable sources of renewable energy in urban areas. As it usually requires a potential analysis over the area of interest, a crucial step is to estimate the geometric characteristics of the building rooftops. In this paper, we introduce a multi-layer machine learning methodology to classify 6 roof types, 9 aspect (azimuth) classes and 5 slope (tilt) classes for all building rooftops in Switzerland, using GIS processing. We train Random Forests (RF), an ensemble learning algorithm, to build the classifiers. We use 2 × 2 m² LiDAR data (covering buildings and vegetation) to extract several rooftop features, and generalised building footprint polygons to localize buildings. The roof-type classifier is trained and tested with 1252 labeled roofs from three different urban areas, namely Baden, Luzern, and Winterthur; the results show an average accuracy of 67%. The aspect and slope classifiers are trained and tested with 11449 labeled roofs in the Zurich periphery area. The results for aspect and slope classification show different accuracies depending on the class: while some classes are well identified, other under-represented classes remain challenging to detect.
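
    One layer of such a classifier might be sketched as follows (scikit-learn assumed; the rooftop features are illustrative):

        # Sketch: Random Forest classification of roof types from
        # LiDAR-derived rooftop features.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(3)
        # e.g. [height variance, dominant slope, number of planes, footprint area]
        X = rng.random((1252, 8))
        y = rng.integers(0, 6, 1252)            # six roof-type classes

        Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
        rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(Xtr, ytr)
        print("test accuracy:", rf.score(Xte, yte))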

  5. The Classification of Romanian High-Schools

    ERIC Educational Resources Information Center

    Ivan, Ion; Milodin, Daniel; Naie, Lucian

    2006-01-01

    The article tries to tackle the issue of high-schools classification from one city, district or from Romania. The classification criteria are presented. The National Database of Education is also presented and the application of criteria is illustrated. An algorithm for high-school multi-rang classification is proposed in order to build classes of…

  6. Building a Shared Definitional Model of Long Duration Human Spaceflight

    NASA Technical Reports Server (NTRS)

    Orr, M.; Whitmire, A.; Sandoval, L.; Leveton, L.; Arias, D.

    2011-01-01

    In 1956, on the eve of human space travel, Strughold first proposed a simple classification of the present and future stages of manned flight that identified key factors, risks and developmental stages for the evolutionary journey ahead. As we look to optimize the potential of the ISS as a gateway to new destinations, we need a current shared working definitional model of long duration human spaceflight to help guide our path. An initial search of formal and grey literature was augmented by liaison with subject matter experts. The search strategy focused on the use of both the terms long duration mission and long duration spaceflight, as well as broader related current and historical definitions and classification models of spaceflight. The related sea and air travel literature was also subsequently explored with a view to identifying analogous models or classification systems. There are multiple definitions and classification systems for spaceflight, covering phase and type of mission, craft and payload, and related risk-management models. However, the frequently used concepts of long duration mission and long duration spaceflight are seldom operationally defined by authors, and no commonly referenced classical or gold-standard definition or model of these terms emerged from the search. The categorization (Cat) system for sailing was found to be of potential analogous utility, with its focus on understanding the need for crew and craft autonomy at various levels of potential adversity and inability to gain outside support or return to a safe location, due to factors of time, distance and location.

  7. Transfer Kernel Common Spatial Patterns for Motor Imagery Brain-Computer Interface Classification.

    PubMed

    Dai, Mengxi; Zheng, Dezhi; Liu, Shucong; Zhang, Pengju

    2018-01-01

    Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as a preprocessing step before classification. The CSP method is a supervised algorithm; therefore, a large amount of training data, which is time-consuming to collect, is needed to build the model. To address this issue, one promising approach is transfer learning, in which a learning model extracts discriminative information from other subjects for the target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach that learns a domain-invariant kernel by directly matching the distributions of source subjects and target subjects. Dataset IVa of BCI Competition III is used to demonstrate the validity of the proposed method. In the experiment, we compare the classification performance of TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularizing CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that TKCSP achieves the best mean classification accuracy, 81.14%, especially in the case of source subjects with fewer training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods.
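
    At its core, classical CSP finds spatial filters through a generalized eigendecomposition of the two class covariance matrices; a sketch with NumPy/SciPy (the kernel and transfer components of TKCSP are beyond this fragment):

        # Sketch: classical CSP spatial filters for two-class EEG trials.
        # trials_a, trials_b: arrays of shape (n_trials, n_channels, n_samples).
        import numpy as np
        from scipy.linalg import eigh

        def csp_filters(trials_a, trials_b, n_pairs=3):
            def mean_cov(trials):
                return np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0)
            Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
            # Solve Ca w = lambda (Ca + Cb) w; eigenvectors are spatial filters.
            vals, vecs = eigh(Ca, Ca + Cb)
            order = np.argsort(vals)
            keep = np.r_[order[:n_pairs], order[-n_pairs:]]  # both extremes
            return vecs[:, keep].T

        rng = np.random.default_rng(4)
        W = csp_filters(rng.standard_normal((30, 22, 250)),
                        rng.standard_normal((30, 22, 250)))
        print(W.shape)   # (6, 22): six filters over 22 channels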

  8. Transfer Kernel Common Spatial Patterns for Motor Imagery Brain-Computer Interface Classification

    PubMed Central

    Dai, Mengxi; Liu, Shucong; Zhang, Pengju

    2018-01-01

    Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as a preprocessing step before classification. The CSP method is a supervised algorithm; therefore, a large amount of training data, which is time-consuming to collect, is needed to build the model. To address this issue, one promising approach is transfer learning, in which a learning model extracts discriminative information from other subjects for the target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach that learns a domain-invariant kernel by directly matching the distributions of source subjects and target subjects. Dataset IVa of BCI Competition III is used to demonstrate the validity of the proposed method. In the experiment, we compare the classification performance of TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularizing CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that TKCSP achieves the best mean classification accuracy, 81.14%, especially in the case of source subjects with fewer training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods. PMID:29743934

  9. A classification of marked hijaiyah letters' pronunciation using hidden Markov model

    NASA Astrophysics Data System (ADS)

    Wisesty, Untari N.; Mubarok, M. Syahrul; Adiwijaya

    2017-08-01

    Hijaiyah letters are the letters that make up the words in the Qur'an, consisting of 28 letters. They symbolize the consonant sounds, while the vowel sounds are symbolized by harakat (marks). A speech recognition system processes sound signals into data that can be recognized by a computer. To build such a system, several stages are needed, i.e., feature extraction and classification. In this research, LPC and MFCC feature extraction, K-means vector quantization and Hidden Markov Model classification are used. The data comprise the 28 letters and 6 harakat, for a total of 168 classes. After several tests, it can be concluded that the system recognizes the pronunciation pattern of marked hijaiyah letters very well on the training data, with a highest accuracy of 96.1% using LPC features and 94% using MFCC features. On the test data, however, the accuracy drops to 41%.
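
    The classification stage described, one HMM per class scored against MFCC frames, can be sketched as below (assuming the librosa and hmmlearn packages; all parameters are illustrative):

        # Sketch: one Gaussian HMM per letter class, scored on MFCC features.
        import numpy as np
        import librosa
        from hmmlearn.hmm import GaussianHMM

        def mfcc_frames(wav_path):
            y, sr = librosa.load(wav_path, sr=None)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (T, 13)

        def train_class_hmm(wav_paths, n_states=5):
            feats = [mfcc_frames(p) for p in wav_paths]
            return GaussianHMM(n_components=n_states, n_iter=100).fit(
                np.vstack(feats), lengths=[len(f) for f in feats])

        def classify(wav_path, hmms):          # hmms: {class_label: trained HMM}
            x = mfcc_frames(wav_path)
            # pick the class whose HMM gives the highest log-likelihood
            return max(hmms, key=lambda label: hmms[label].score(x))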

  10. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition.

    PubMed

    Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2017-02-27

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called 'shadow features' are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, and thereby model the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one using a wearable sensor and the other a Kinect-based remote sensor. Our experiments demonstrate the advantages of the new method, which will have an impact on human activity detection research.
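
    One plausible reading of the shadow-feature construction, dynamics derived from the position time series by finite differences, looks like this (an interpretation, not necessarily the authors' exact formula):

        # Sketch: augmenting skeletal position time series with "shadow"
        # (dynamics) features via finite differences.
        import numpy as np

        def add_shadow_features(positions):
            """positions: array (T, n_joints * 3) of x, y, z coordinates."""
            velocity = np.diff(positions, axis=0, prepend=positions[:1])
            return np.hstack([positions, velocity])  # spatial + dynamic features

        rng = np.random.default_rng(5)
        frames = np.cumsum(rng.random((100, 60)), axis=0)   # fake joint tracks
        print(add_shadow_features(frames).shape)            # (100, 120)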

  11. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition

    PubMed Central

    Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K. L.

    2017-01-01

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called ‘shadow features’ are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, and thereby model the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one using a wearable sensor and the other a Kinect-based remote sensor. Our experiments demonstrate the advantages of the new method, which will have an impact on human activity detection research. PMID:28264470

  12. A new unified framework for the early detection of the progression to diabetic retinopathy from fundus images.

    PubMed

    Leontidis, Georgios

    2017-11-01

    The human retina is a diverse and important tissue, widely studied in the context of various retinal and other diseases. Diabetic retinopathy (DR), a leading cause of blindness, is one of them. This work proposes a novel and complete framework for the accurate and robust extraction and analysis of a series of retinal vascular geometric features. It focuses on studying the registered bifurcations in successive years of progression from diabetes (no DR) to DR, in order to identify the vascular alterations. Retinal fundus images are utilised, and multiple experimental designs are employed. The framework includes various steps, such as image registration and segmentation, extraction of features, statistical analysis and classification models. Linear mixed models are utilised for making the statistical inferences, alongside elastic-net logistic regression, the Boruta algorithm, and regularised random forests for the feature selection and classification phases, in order to evaluate the discriminative potential of the investigated features and also build classification models. A number of geometric features, such as the central retinal artery and vein equivalents, are found to differ significantly across the experiments and also have good discriminative potential. The classification systems yield promising results, with area under the curve values ranging from 0.821 to 0.968 across the four investigated combinations. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Single-trial EEG RSVP classification using convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Shamwell, Jared; Lee, Hyungtae; Kwon, Heesung; Marathe, Amar R.; Lawhern, Vernon; Nothwang, William

    2016-05-01

    Traditionally, Brain-Computer Interfaces (BCI) have been explored as a means to return function to paralyzed or otherwise debilitated individuals. An emerging use for BCIs is in human-autonomy sensor fusion where physiological data from healthy subjects is combined with machine-generated information to enhance the capabilities of artificial systems. While human-autonomy fusion of physiological data and computer vision have been shown to improve classification during visual search tasks, to date these approaches have relied on separately trained classification models for each modality. We aim to improve human-autonomy classification performance by developing a single framework that builds codependent models of human electroencephalograph (EEG) and image data to generate fused target estimates. As a first step, we developed a novel convolutional neural network (CNN) architecture and applied it to EEG recordings of subjects classifying target and non-target image presentations during a rapid serial visual presentation (RSVP) image triage task. The low signal-to-noise ratio (SNR) of EEG inherently limits the accuracy of single-trial classification and when combined with the high dimensionality of EEG recordings, extremely large training sets are needed to prevent overfitting and achieve accurate classification from raw EEG data. This paper explores a new deep CNN architecture for generalized multi-class, single-trial EEG classification across subjects. We compare classification performance from the generalized CNN architecture trained across all subjects to the individualized XDAWN, HDCA, and CSP neural classifiers which are trained and tested on single subjects. Preliminary results show that our CNN meets and slightly exceeds the performance of the other classifiers despite being trained across subjects.

  14. Authentication of Organically and Conventionally Grown Basils by Gas Chromatography/Mass Spectrometry Chemical Profiles

    PubMed Central

    Wang, Zhengfang; Chen, Pei; Yu, Liangli; Harrington, Peter de B.

    2013-01-01

    Basil plants cultivated by organic and conventional farming practices were accurately classified by pattern recognition of gas chromatography/mass spectrometry (GC/MS) data. A novel extraction procedure was devised to extract characteristic compounds from ground basil powders. Two in-house fuzzy classifiers, the fuzzy rule-building expert system (FuRES) and, for the first time, the fuzzy optimal associative memory (FOAM), were used to build classification models. Two crisp classifiers, soft independent modeling by class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA), were used as control methods. Prior to data processing, baseline correction and retention time alignment were performed. Classifiers were built with the two-way data sets, the total ion chromatogram representation of the data sets, and the total mass spectrum representation of the data sets, separately. Bootstrapped Latin partition (BLP) was used as an unbiased evaluation of the classifiers. Using the two-way data sets, average classification rates with FuRES, FOAM, SIMCA, and PLS-DA were 100 ± 0%, 94.4 ± 0.4%, 93.3 ± 0.4%, and 100 ± 0%, respectively, for 100 independent evaluations. The established classifiers were used to classify a new validation set collected 2.5 months later with no parametric changes, except that the training set and validation set were individually mean-centered. For the new two-way validation set, classification rates with FuRES, FOAM, SIMCA, and PLS-DA were 100%, 83%, 97%, and 100%, respectively. GC/MS analysis was thereby demonstrated to be a viable approach for organic basil authentication. This is the first time that a FOAM has been applied to classification, and a novel baseline correction method was also used for the first time. FuRES and FOAM are demonstrated to be powerful tools for modeling and classifying GC/MS data of complex samples, and the data pretreatments are shown to improve the performance of the classifiers. PMID:23398171

  15. Data preprocessing methods of FT-NIR spectral data for the classification of cooking oil

    NASA Astrophysics Data System (ADS)

    Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli

    2014-12-01

    This work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oil and its quality parameters using chemometric methods. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigating the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling and single scaling with Standard Normal Variate (SNV). The combinations of these scaling methods have an impact on exploratory analysis and classification via Principal Component Analysis (PCA) plots. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectra in absorbance mode over the range 4000 cm-1 to 14000 cm-1. A Savitzky-Golay derivative was applied before developing the classification model. The data were then separated into a training set and a test set using the Duplex method, with the size of each class kept equal to 2/3 of the class with the minimum number of samples. The t-statistic was then employed as a variable selection method, to select the variables significant for the classification models. The data pre-processing was evaluated using the modified silhouette width (mSW), PCA plots and the Percentage Correctly Classified (%CC). The results show that the different pre-processing strategies, i.e., row scaling, column standardisation and single scaling with SNV, lead to substantially different model performance, as indicated by mSW and %CC. With a two-PC model, all five classifiers except Quadratic Distance Analysis gave high %CC.
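
    The two named pre-processing steps are standard; a sketch of SNV scaling followed by a Savitzky-Golay first derivative (SciPy assumed; window and order are illustrative):

        # Sketch: Standard Normal Variate scaling, then a Savitzky-Golay
        # first derivative, applied row-wise to NIR spectra.
        import numpy as np
        from scipy.signal import savgol_filter

        def snv(spectra):
            """Centre and scale each spectrum (row) individually."""
            mu = spectra.mean(axis=1, keepdims=True)
            sd = spectra.std(axis=1, keepdims=True)
            return (spectra - mu) / sd

        spectra = np.random.default_rng(6).random((40, 1500))   # 40 spectra
        pre = savgol_filter(snv(spectra), window_length=11, polyorder=2,
                            deriv=1, axis=1)
        print(pre.shape)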

  16. Probabilistic topic modeling for the analysis and classification of genomic sequences

    PubMed Central

    2015-01-01

    Background Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper, a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representations and text mining techniques. Methods The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra-short sequences, and it exhibits a smooth decrease of performance, at every taxonomic level, as the sequence length is decreased. PMID:25916734
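
    The pipeline, k-mer counts feeding an LDA topic model whose per-sequence topic mixtures then serve as classification features, can be sketched with scikit-learn (toy sequences only; the paper's RDP data and parameters are not reproduced):

        # Sketch: LDA topic modeling over k-mer "words" of DNA sequences,
        # with topic mixtures used as features for a classifier.
        from sklearn.decomposition import LatentDirichletAllocation
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression

        seqs = ["ACGTACGTGGCA", "TTGACCGTAAGC", "ACGTTTGACGGA", "GGCATTGACCGT"]
        labels = [0, 1, 0, 1]                       # e.g. two taxa

        kmers = CountVectorizer(analyzer="char", ngram_range=(4, 4))   # 4-mers
        X = kmers.fit_transform(seqs)
        topics = LatentDirichletAllocation(n_components=2,
                                           random_state=0).fit_transform(X)
        clf = LogisticRegression().fit(topics, labels)
        print(clf.predict(topics))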

  17. ANN modeling of DNA sequences: new strategies using DNA shape code.

    PubMed

    Parbhane, R V; Tambe, S S; Kulkarni, B D

    2000-09-01

    Two new encoding strategies, namely, wedge and twist codes, which are based on the DNA helical parameters, are introduced to represent DNA sequences in artificial neural network (ANN)-based modeling of biological systems. The performance of the new coding strategies has been evaluated by conducting three case studies involving mapping (modeling) and classification applications of ANNs. The proposed coding schemes have been compared rigorously and shown to outperform the existing coding strategies especially in situations wherein limited data are available for building the ANN models.

  18. Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity.

    PubMed

    Schneider, Nadine; Lowe, Daniel M; Sayle, Roger A; Landrum, Gregory A

    2015-01-26

    Fingerprint methods applied to molecules have proven to be useful for similarity determination and as inputs to machine-learning models. Here, we present the development of a new fingerprint for chemical reactions and validate its usefulness in building machine-learning models and in similarity assessment. Our final fingerprint is constructed as the difference of the atom-pair fingerprints of products and reactants and includes agents via calculated physicochemical properties. We validated the fingerprints on a large data set of reactions text-mined from granted United States patents from the last 40 years that have been classified using a substructure-based expert system. We applied machine learning to build a 50-class predictive model for reaction-type classification that correctly predicts 97% of the reactions in an external test set. Impressive accuracies were also observed when applying the classifier to reactions from an in-house electronic laboratory notebook. The performance of the novel fingerprint for assessing reaction similarity was evaluated by a cluster analysis that recovered 48 out of 50 of the reaction classes with a median F-score of 0.63 for the clusters. The data sets used for training and primary validation as well as all python scripts required to reproduce the analysis are provided in the Supporting Information.
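
    The basic construction, product fingerprint minus reactant fingerprint, can be sketched with RDKit (hashed atom-pair bit vectors here; the agents' calculated physicochemical properties described in the paper are omitted):

        # Sketch: a reaction difference fingerprint, products minus reactants,
        # from hashed atom-pair fingerprints (RDKit assumed).
        import numpy as np
        from rdkit import Chem
        from rdkit.Chem import rdMolDescriptors

        def ap_fp(smiles, n_bits=2048):
            mol = Chem.MolFromSmiles(smiles)
            fp = rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(
                mol, nBits=n_bits)
            return np.array(fp, dtype=int)

        def reaction_fp(reactants, products):
            return (sum(ap_fp(s) for s in products)
                    - sum(ap_fp(s) for s in reactants))

        # Example: Fischer esterification of acetic acid with ethanol.
        print(reaction_fp(["CC(=O)O", "OCC"], ["CC(=O)OCC", "O"]).shape)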

  19. Contextually guided very-high-resolution imagery classification with semantic segments

    NASA Astrophysics Data System (ADS)

    Zhao, Wenzhi; Du, Shihong; Wang, Qiao; Emery, William J.

    2017-10-01

    Contextual information, revealing relationships and dependencies between image objects, is among the most important sources of information for the successful interpretation of very-high-resolution (VHR) remote sensing imagery. Over the last decade, the geographic object-based image analysis (GEOBIA) technique has been widely used to first divide images into homogeneous parts and then assign semantic labels according to the properties of image segments. However, due to the complexity and heterogeneity of VHR images, segments without semantic labels (i.e., semantic-free segments) generated from low-level features often fail to represent geographic entities (for example, building roofs are often partitioned into chimney, antenna and shadow parts). As a result, it is hard to capture contextual information across geographic entities when using semantic-free segments. In contrast to low-level features, "deep" features can be used to build robust segments with accurate labels (i.e., semantic segments) that represent geographic entities at higher levels. Based on these semantic segments, semantic graphs can be constructed to capture contextual information in VHR images. In this paper, semantic segments were first explored with convolutional neural networks (CNN), and a conditional random field (CRF) model was then applied to model the contextual information between semantic segments. Experimental results on two challenging VHR datasets (the Vaihingen and Beijing scenes) indicate that the proposed method improves on existing image classification techniques, with overall accuracy ranging from 82% to 96%.

  20. Application of Convolution Neural Network to the forecasts of flare classification and occurrence using SOHO MDI data

    NASA Astrophysics Data System (ADS)

    Park, Eunsu; Moon, Yong-Jae

    2017-08-01

    A Convolutional Neural Network (CNN) is one of the best-known deep-learning methods in the image processing and computer vision areas. In this study, we apply CNNs to two kinds of flare forecasting models: flare classification and flare occurrence. For this, we consider several pre-trained models (e.g., AlexNet, GoogLeNet, and ResNet) and customize them by changing several options such as the number of layers, the activation function, and the optimizer. Our inputs are the same number of SOHO/MDI images for each flare class (None, C, M and X) at 00:00 UT from Jan 1996 to Dec 2010 (1600 images in total). Outputs are the results of daily flare forecasting for flare class and occurrence. We build, train, and test the models on TensorFlow, a well-known machine learning library developed by Google. Our major results from this study are as follows. First, most of the models have accuracies above 0.7. Second, ResNet, developed by Microsoft, has the best accuracies: 0.86 for flare classification and 0.84 for flare occurrence. Third, the accuracies of these models vary greatly with changing parameters. We discuss several possibilities for improving the models.

  1. Automatic classification of animal vocalizations

    NASA Astrophysics Data System (ADS)

    Clemins, Patrick J.

    2005-11-01

    Bioacoustics, the study of animal vocalizations, has begun to use increasingly sophisticated analysis techniques in recent years. Some common tasks in bioacoustics are repertoire determination, call detection, individual identification, stress detection, and behavior correlation. Each research study, however, uses a wide variety of different measured variables, called features, and classification systems to accomplish these tasks. The well-established field of human speech processing has developed a number of different techniques to perform many of the aforementioned bioacoustics tasks. Mel-frequency cepstral coefficients (MFCCs) and perceptual linear prediction (PLP) coefficients are two popular feature sets. The hidden Markov model (HMM), a statistical model similar to a finite automaton, is the most commonly used supervised classification model and is capable of modeling both temporal and spectral variations. This research designs a framework that applies models from human speech processing to bioacoustic analysis tasks. The development of the generalized perceptual linear prediction (gPLP) feature extraction model is one of the more important novel contributions of the framework. Perceptual information from the species under study can be incorporated into the gPLP feature extraction model to represent the vocalizations as the animals might perceive them. By including this perceptual information and modifying parameters of the HMM classification system, this framework can be applied to a wide range of species. The effectiveness of the framework is shown by analyzing African elephant and beluga whale vocalizations. The features extracted from the African elephant data are used as input to a supervised classification system and compared to results from traditional statistical tests. The gPLP features extracted from the beluga whale data are used in an unsupervised classification system and the results are compared to labels assigned by experts. The development of a framework from which to build animal vocalization classifiers will provide bioacoustics researchers with a consistent platform to analyze and classify vocalizations. A common framework will also allow studies to compare results across species and institutions. In addition, the use of automated classification techniques can speed analysis and uncover behavioral correlations not readily apparent using traditional techniques.

  2. Determining urban land uses through building-associated element attributes derived from lidar and aerial photographs

    NASA Astrophysics Data System (ADS)

    Meng, Xuelian

    Urban land-use research is a key component in analyzing the interactions between human activities and environmental change. Researchers have conducted many experiments to classify urban or built-up land, forest, water, agriculture, and other land-use and land-cover types. Separating residential land uses from other land uses within urban areas, however, has proven to be surprisingly troublesome. Although high-resolution images have recently become more available for land-use classification, an increase in spatial resolution does not guarantee improved classification accuracy with traditional classifiers, due to the increased class complexity. This research presents an approach to detect and separate residential land uses at the building scale directly from remotely sensed imagery to enhance urban land-use analysis. Specifically, the proposed methodology applies a multi-directional ground filter to generate a bare ground surface from lidar data, then utilizes a morphology-based building detection algorithm to identify buildings from lidar and aerial photographs, and finally separates residential buildings using a supervised C4.5 decision tree analysis based on seven selected building land-use indicators. Successful execution of this study produces three independent methods, each corresponding to a step of the methodology: lidar ground filtering, building detection, and building-based object-oriented land-use classification. Furthermore, this research provides a prototype as one of the few early explorations of building-based land-use analysis, successfully separating more than 85% of residential buildings in an experiment on an 8.25 km² study site located in Austin, Texas.

  3. Micro-bias and macro-performance.

    PubMed

    Seaver, S M D; Moreira, A A; Sales-Pardo, M; Malmgren, R D; Diermeier, D; Amaral, L A N

    2009-02-01

    We use agent-based modeling to investigate the effect of conservatism and partisanship on the efficiency with which large populations solve the density classification task - a paradigmatic problem for information aggregation and consensus building. We find that conservative agents enhance the populations' ability to efficiently solve the density classification task despite large levels of noise in the system. In contrast, we find that the presence of even a small fraction of partisans holding the minority position will result in deadlock or a consensus on an incorrect answer. Our results provide a possible explanation for the emergence of conservatism and suggest that even low levels of partisanship can lead to significant social costs.

  4. Comprehensive assessment of the efficiency of high-rise construction projects in the form of urban blocks

    NASA Astrophysics Data System (ADS)

    Orlov, Alexandr; Chubarkina, Irina

    2018-03-01

    The paper surveys the main modern trends in high-rise construction. A classification of buildings and structures by height is given, and the functional distribution by building height is presented, along with a review of the positive and negative aspects of high-rise construction from the economic point of view. On the basis of the data obtained, it is proposed to build up residential microdistricts in the form of urban blocks. A plan of microdistrict development is presented that takes urban blocks into account and includes their main characteristics. An economic and mathematical model was developed to carry out a comprehensive assessment of the effectiveness of high-rise construction projects.

  5. Recommendations for Safe Separation Distances from the Kennedy Space Center (KSC) Vehicle Assembly Building (VAB) Using a Heat-Flux-Based Analytical Approach (Abridged)

    NASA Technical Reports Server (NTRS)

    Cragg, Clinton H.; Bowman, Howard; Wilson, John E.

    2011-01-01

    The NASA Engineering and Safety Center (NESC) was requested to provide computational modeling to support the establishment of a safe separation distance surrounding the Kennedy Space Center (KSC) Vehicle Assembly Building (VAB). The two major objectives of the study were 1) establish a methodology based on thermal flux to determine safe separation distances from the Kennedy Space Center's (KSC's) Vehicle Assembly Building (VAB) with large numbers of solid propellant boosters containing hazard division 1.3 classification propellants, in case of inadvertent ignition; and 2) apply this methodology to the consideration of housing eight 5-segment solid propellant boosters in the VAB. The results of the study are contained in this report.

  6. Variable Selection for Road Segmentation in Aerial Images

    NASA Astrophysics Data System (ADS)

    Warnke, S.; Bulatov, D.

    2017-05-01

    For the extraction of road pixels from combined image and elevation data, Wegner et al. (2015) proposed classifying superpixels into road and non-road, after which the classification results were refined using minimum-cost paths and non-local optimization methods. We believed that the variable set used for classification was to a certain extent suboptimal, because many variables were redundant while several features known to be useful in Photogrammetry and Remote Sensing were missing. This motivated us to implement a variable selection approach which builds a model for classification using portions of the training data and subsets of features, evaluates this model, updates the feature set, and terminates when a stopping criterion is satisfied. The choice of classifier is flexible; however, we tested the approach with Logistic Regression and Random Forests, and tailored the evaluation module to the chosen classifier. To guarantee a fair comparison, we kept the segment-based approach and most of the variables from the related work, but extended them with additional, mostly higher-level features. Applying these superior features, removing the redundant ones, and using more accurately acquired 3D data allowed us to keep stable, or even reduce, the misclassification error on a challenging dataset.
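
    One concrete realization of such a selection loop is greedy forward selection with cross-validated scoring, for example (scikit-learn >= 0.24 assumed; the data shapes are illustrative):

        # Sketch: greedy forward variable selection with cross-validation.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SequentialFeatureSelector

        rng = np.random.default_rng(7)
        X = rng.random((300, 25))          # per-superpixel feature vectors
        y = rng.integers(0, 2, 300)        # road / non-road labels

        selector = SequentialFeatureSelector(
            RandomForestClassifier(n_estimators=100, random_state=0),
            n_features_to_select=10, direction="forward", cv=5)
        selector.fit(X, y)
        print(np.flatnonzero(selector.get_support()))   # indices of kept variables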

  7. Ensemble Sparse Classification of Alzheimer’s Disease

    PubMed Central

    Liu, Manhua; Zhang, Daoqiang; Shen, Dinggang

    2012-01-01

    The high-dimensional pattern classification methods, e.g., support vector machines (SVM), have been widely investigated for analysis of structural and functional brain images (such as magnetic resonance imaging (MRI)) to assist the diagnosis of Alzheimer’s disease (AD), including its prodromal stage, i.e., mild cognitive impairment (MCI). Most existing classification methods extract features from neuroimaging data and then construct a single classifier to perform classification. However, due to noise and the small sample size of neuroimaging data, it is challenging to train a single global classifier that is robust enough to achieve good classification performance. In this paper, instead of building a single global classifier, we propose a local patch-based subspace ensemble method which builds multiple individual classifiers based on different subsets of local patches and then combines them for more accurate and robust classification. Specifically, to capture the local spatial consistency, each brain image is partitioned into a number of local patches and a subset of patches is randomly selected from the patch pool to build a weak classifier. Here, the sparse representation-based classification (SRC) method, which has been shown to be effective for the classification of image data (e.g., faces), is used to construct each weak classifier. Then, multiple weak classifiers are combined to make the final decision. We evaluate our method on 652 subjects (including 198 AD patients, 225 MCI and 229 normal controls) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using MR images. The experimental results show that our method achieves an accuracy of 90.8% and an area under the ROC curve (AUC) of 94.86% for AD classification, and an accuracy of 87.85% and an AUC of 92.90% for MCI classification, demonstrating a very promising performance compared with the state-of-the-art methods for AD/MCI classification using MR images. PMID:22270352

  8. Building Diversified Multiple Trees for classification in high dimensional noisy biomedical data.

    PubMed

    Li, Jiuyong; Liu, Lin; Liu, Jixue; Green, Ryan

    2017-12-01

    It is common that a trained classification model is applied to operating data that deviates from the training data because of noise. This paper tests an ensemble method, Diversified Multiple Trees (DMT), on its capability to classify instances from a new laboratory using a classifier built on instances from another laboratory. DMT is tested on three real-world biomedical data sets from different laboratories, in comparison with four benchmark ensemble methods: AdaBoost, Bagging, Random Forests, and Random Trees. Experiments have also been conducted to study the limitations of DMT and its possible variations. Experimental results show that DMT is significantly more accurate than the other benchmark ensemble classifiers in classifying new instances from a laboratory different from the one whose instances were used to build the classifier. This paper demonstrates that the ensemble classifier DMT is more robust in classifying noisy data than other widely used ensemble methods. DMT works on data sets that support multiple simple trees.

  9. Typological diversity of tall buildings and complexes in relation to their functional structure

    NASA Astrophysics Data System (ADS)

    Generalov, Viktor P.; Generalova, Elena M.; Kalinkina, Nadezhda A.; Zhdanova, Irina V.

    2018-03-01

    The paper focuses on the peculiarities of tall buildings and complexes, their typology and its formation in relation to their functional structure. The research is based on the analysis of tall buildings and complexes and identifies the following main functional elements of their formation: residential, administrative (office) and hotel elements. The paper also considers services «disseminated» in the space-planning structure (shops, medicine, entertainment, kids and sports facilities, etc.), their location in the structure of the total bulk of the building, and their impact on typological diversity. The research results include suggestions to add the concepts of «single-function tall buildings» and «mixed-use tall buildings and complexes» to the classification of tall buildings. In addition, if a single-function building or complex also performs serving functions, it is proposed to add the concepts «a residential tall building (complex) with provision of services» and «an administrative (public) tall building (complex) with provision of services» to the classification. For mixed-use buildings and complexes the following terms are suggested: «a mixed-use tall building with provision of services» and «a mixed-use tall complex with provision of services».

  10. Simulation of seagrass bed mapping by satellite images based on the radiative transfer model

    NASA Astrophysics Data System (ADS)

    Sagawa, Tatsuyuki; Komatsu, Teruhisa

    2015-06-01

    Seagrass and seaweed beds play important roles in coastal marine ecosystems. They are food sources and habitats for many marine organisms, and influence the physical, chemical, and biological environment. They are sensitive to human impacts such as reclamation and pollution. Therefore, their management and preservation are necessary for a healthy coastal environment. Satellite remote sensing is a useful tool for mapping and monitoring seagrass beds. The efficiency of seagrass mapping, and of seagrass bed classification in particular, has been evaluated by mapping accuracy using an error matrix. However, mapping accuracies are influenced by coastal environments such as seawater transparency, bathymetry, and substrate type. Coastal management requires sufficient accuracy and an understanding of mapping limitations for monitoring coastal habitats, including seagrass beds. Previous studies are mainly based on case studies in specific regions and seasons, and extensive data are required to generalise assessments of classification accuracy from such case studies, which has proven difficult. This study aims to build a simulator based on a radiative transfer model to produce modelled satellite images and assess the visual detectability of seagrass beds under different transparencies and seagrass coverages, as well as to examine mapping limitations and classification accuracy. Our simulations led to a model relating water transparency to mapping depth limits, and indicated the possibility of seagrass density mapping under certain ideal conditions. The results show that modelled satellite images are useful for evaluating classification accuracy, establishing the simulation as a reliable tool for seagrass bed monitoring by remote sensing.

  11. A study of earthquake-induced building detection by object oriented classification approach

    NASA Astrophysics Data System (ADS)

    Sabuncu, Asli; Damla Uca Avci, Zehra; Sunar, Filiz

    2017-04-01

    Among natural hazards, earthquakes are the most destructive disasters, causing huge loss of life, heavy infrastructure damage and great financial losses every year all around the world. According to earthquake statistics, more than a million earthquakes occur worldwide each year, equal to roughly two earthquakes per minute. Natural disasters have caused more than 780,000 deaths since 2001, and approximately 60% of this mortality is due to earthquakes. A great earthquake took place at 38.75 N 43.36 E in the eastern part of Turkey, in Van Province, on October 23rd, 2011. 604 people died and about 4000 buildings were seriously damaged or collapsed after this earthquake. In recent years, the use of the object-oriented classification approach based on different object features, such as spectral, textural, shape and spatial information, has gained importance and become widespread for the classification of high-resolution satellite images and orthophotos. The motivation of this study is to detect the collapsed buildings and debris areas after the earthquake using very high-resolution satellite images and orthophotos with object-oriented classification, and to assess how well remote sensing technology performs in determining collapsed buildings. In this study, two different land surfaces were selected as homogeneous and heterogeneous case study areas. In the first step of the application, multi-resolution segmentation was applied and optimum parameters were selected to obtain the objects in each area after testing different color/shape and compactness/smoothness values. In the next step, two different classification approaches, namely "supervised" and "unsupervised", were applied and their classification performances were compared. Object-Based Image Analysis (OBIA) was performed using the eCognition software.

  12. Feasibility of Active Machine Learning for Multiclass Compound Classification.

    PubMed

    Lang, Tobias; Flachsenberg, Florian; von Luxburg, Ulrike; Rarey, Matthias

    2016-01-25

    A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
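
    A minimal sketch of the generic idea, pool-based uncertainty sampling, is given below. It assumes compounds are already featurized into a matrix X with labels y, and uses scikit-learn's logistic regression as a stand-in classifier, so the selection rule is illustrative rather than the paper's exact strategy.

    ```python
    # Pool-based active learning via least-confidence sampling (illustrative
    # stand-in for the paper's selection strategy; X, y are assumed inputs).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def active_learning(X, y, init_idx, n_queries=50):
        labeled = list(init_idx)            # should cover at least two classes
        unlabeled = [i for i in range(len(X)) if i not in labeled]
        model = LogisticRegression(max_iter=1000)
        for _ in range(n_queries):
            model.fit(X[labeled], y[labeled])
            probs = model.predict_proba(X[unlabeled])
            # Query the compound whose top-class probability is lowest,
            # i.e. the prediction the current model is least confident about.
            query = unlabeled[int(np.argmin(probs.max(axis=1)))]
            labeled.append(query)           # in practice, an expert labels it
            unlabeled.remove(query)
        return model, labeled
    ```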

  13. Partonomies for interactive explorable 3D-models of anatomy.

    PubMed

    Schubert, R; Höhne, K H

    1998-01-01

    We introduce a concept to model subtle part-whole semantics for use with interactive 3D models of human anatomy. In line with experience in modeling partonomies for physical artifacts like machines or buildings, we found a single part-whole relation insufficient to represent anatomical reality. This claim is illustrated with anatomical examples. According to the requirements these examples impose, a semantic classification of part-whole relations is introduced. Initial results in modeling anatomical partonomies for a 3D visualization environment proved this approach to be a promising way to represent anatomy and to enable powerful complex inferences.

  14. Rapid identification and classification of bacteria by 16S rDNA restriction fragment melting curve analyses (RFMCA).

    PubMed

    Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T

    2007-08-01

    The aim of this work was to evaluate restriction fragment melting curve analysis (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products and from raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. The strains satisfying the RFMCA quality filtering criteria (73%, n=57) for which both 16S rDNA sequence information and RFMCA data were available (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid, accurate, and database-compatible classification of bacteria. Potential applications of RFMCA in the food or pharmaceutical industry include developing classification models for the bacteria expected in a given product and then building an RFMCA database as part of the product quality control.

  15. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is subsequently refined using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out with the open source software "R"; the dense and accurate digital surface model is generated by the "Match-T DSM" program of the Trimble Company. A practical example of 2D land cover map generation is carried out using images of a multispectral medium-format aerial camera covering an urban area in Switzerland. The assessment of the produced land cover map is based on class-wise stratified sampling, where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes, based on 91 points per class, reveals a high thematic accuracy for the classes "building" (99%, 95% CI: 95%-100%) and "road and parking lot" (90%, 95% CI: 83%-95%). Other accuracy measures (overall accuracy, kappa value) and their 95% confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.
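
    The per-class accuracy figures above come with binomial confidence intervals; a small sketch of one standard construction (the Wilson interval) is shown below for the "building" class, assuming 90 of the 91 reference points were classified correctly. The paper's exact interval construction may differ.

    ```python
    # Wilson 95% confidence interval for a class-wise thematic accuracy
    # estimated from n reference samples (counts here are assumptions).
    from math import sqrt

    def wilson_ci(correct, n, z=1.96):
        p = correct / n
        denom = 1 + z**2 / n
        centre = (p + z**2 / (2 * n)) / denom
        half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
        return centre - half, centre + half

    print(wilson_ci(90, 91))  # ~(0.94, 1.00), close to the reported 95%-100% band
    ```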

  16. Music-Elicited Emotion Identification Using Optical Flow Analysis of Human Face

    NASA Astrophysics Data System (ADS)

    Kniaz, V. V.; Smirnova, Z. N.

    2015-05-01

    Human emotion identification from image sequences is in high demand nowadays. The range of possible applications varies from the automatic smile shutter function of consumer-grade digital cameras to Biofied Building technologies, which enable communication between a building space and its residents. The highly perceptual nature of human emotions makes their classification and identification complex. The main question arises from the subjective quality of the emotional classification of events that elicit human emotions. A variety of methods for the formal classification of emotions have been developed in musical psychology. This work focuses on identifying human emotions evoked by musical pieces using human face tracking and optical flow analysis. A facial feature tracking algorithm used for facial feature speed and position estimation is presented. Facial features were extracted from each image sequence using human face tracking with local binary pattern (LBP) features. Accurate relative speeds of facial features were estimated using optical flow analysis. The obtained relative positions and speeds were used as the output facial emotion vector. The algorithm was tested using original software and recorded image sequences. The proposed technique proved to give robust identification of human emotions elicited by musical pieces. The estimated models could be used for human emotion identification from image sequences in such fields as emotion-based musical backgrounds or mood-dependent radio.

  17. Multispectral LiDAR Data for Land Cover Classification of Urban Areas

    PubMed Central

    Morsy, Salem; Shaker, Ahmed; El-Rabbany, Ahmed

    2017-01-01

    Airborne Light Detection And Ranging (LiDAR) systems usually operate at a monochromatic wavelength measuring the range and the strength of the reflected energy (intensity) from objects. Recently, multispectral LiDAR sensors, which acquire data at different wavelengths, have emerged. This allows for recording of a diversity of spectral reflectance from objects. In this context, we aim to investigate the use of multispectral LiDAR data in land cover classification using two different techniques. The first is image-based classification, where intensity and height images are created from LiDAR points and then a maximum likelihood classifier is applied. The second is point-based classification, where ground filtering and Normalized Difference Vegetation Indices (NDVIs) computation are conducted. A dataset of an urban area located in Oshawa, Ontario, Canada, is classified into four classes: buildings, trees, roads and grass. An overall accuracy of up to 89.9% and 92.7% is achieved from image classification and 3D point classification, respectively. A radiometric correction model is also applied to the intensity data in order to remove the attenuation due to the system distortion and terrain height variation. The classification process is then repeated, and the results demonstrate that there are no significant improvements achieved in the overall accuracy. PMID:28445432
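
    As a rough illustration of the point-based branch, an NDVI can be computed per point from two intensity channels. The sketch below assumes one channel records near-infrared and one records red reflectance, which only loosely matches the actual multispectral LiDAR channels.

    ```python
    import numpy as np

    # Per-point NDVI from two LiDAR intensity channels. Treating one channel
    # as NIR and the other as red is an illustrative assumption; multispectral
    # LiDAR sensors record at e.g. 1550/1064/532 nm.
    def ndvi(nir, red, eps=1e-9):
        nir, red = nir.astype(float), red.astype(float)
        return (nir - red) / (nir + red + eps)

    nir_intensity = np.array([120.0, 30.0, 200.0])
    red_intensity = np.array([40.0, 90.0, 60.0])
    print(ndvi(nir_intensity, red_intensity))  # high values suggest vegetation
    ```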

  18. Multispectral LiDAR Data for Land Cover Classification of Urban Areas.

    PubMed

    Morsy, Salem; Shaker, Ahmed; El-Rabbany, Ahmed

    2017-04-26

    Airborne Light Detection And Ranging (LiDAR) systems usually operate at a monochromatic wavelength measuring the range and the strength of the reflected energy (intensity) from objects. Recently, multispectral LiDAR sensors, which acquire data at different wavelengths, have emerged. This allows for recording of a diversity of spectral reflectance from objects. In this context, we aim to investigate the use of multispectral LiDAR data in land cover classification using two different techniques. The first is image-based classification, where intensity and height images are created from LiDAR points and then a maximum likelihood classifier is applied. The second is point-based classification, where ground filtering and Normalized Difference Vegetation Indices (NDVIs) computation are conducted. A dataset of an urban area located in Oshawa, Ontario, Canada, is classified into four classes: buildings, trees, roads and grass. An overall accuracy of up to 89.9% and 92.7% is achieved from image classification and 3D point classification, respectively. A radiometric correction model is also applied to the intensity data in order to remove the attenuation due to the system distortion and terrain height variation. The classification process is then repeated, and the results demonstrate that there are no significant improvements achieved in the overall accuracy.

  19. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    PubMed Central

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105
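
    The ALC metric reported above integrates classifier quality over the annotation budget; a minimal sketch of that idea, a normalized area under the AUC-versus-samples learning curve, is shown below. The challenge's official scorer may normalize differently.

    ```python
    import numpy as np

    # ALC-style score: area under the learning curve of AUC versus the number
    # of annotated samples, with the sample axis rescaled to [0, 1]. The
    # AUC values and sample counts below are made-up placeholders.
    def alc(auc_per_step, samples_per_step):
        x = np.asarray(samples_per_step, dtype=float)
        y = np.asarray(auc_per_step, dtype=float)
        x = (x - x[0]) / (x[-1] - x[0])   # normalize annotation budget
        return np.trapz(y, x)

    print(alc([0.62, 0.71, 0.76, 0.79], [10, 20, 40, 80]))
    ```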

  20. Hybrid Optimization of Object-Based Classification in High-Resolution Images Using Continuous Ant Colony Algorithm with Emphasis on Building Detection

    NASA Astrophysics Data System (ADS)

    Tamimi, E.; Ebadi, H.; Kiani, A.

    2017-09-01

    Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using additional features can improve accuracy. However, adding features increases the probability of including dependent features, which reduces accuracy. In addition, some parameters must be determined for Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine the classification parameters and select independent features according to the image type, and an optimization algorithm is an efficient way to solve this problem. On the other hand, pixel-based classification faces several challenges, such as salt-and-pepper results and high computational time on high-dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying a continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are a relatively high automation level, independence of image scene and type, reduced post-processing for building edge reconstruction, and accuracy improvement. The proposed method was evaluated against pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved the quality factor and overall accuracy by 17% and 10%, respectively. The Kappa coefficient was improved by 6% relative to RF classification. The processing time of the proposed method was relatively low because the unit of image analysis was the image object. These results show the superiority of the proposed method in terms of time and accuracy.

  1. Heuristic Classification.

    DTIC Science & Technology

    1985-08-01

    learned not only a specific thing but also a model for understanding other things like it that one may encounter (Bruner, 1960). Abstract: A broad...motivation of wanting to formalize what we have learned about building expert systems. How can we classify problems? How can we select problems that are...nor sufficient characteristic of an EDUCATED-PERSON). Illustrating the power of a knowledge-level analysis, we discover that the people and book

  2. The Design of Archives Buildings.

    ERIC Educational Resources Information Center

    Faye, Bernard

    1982-01-01

    Studies specific problems arising from design of archives buildings and examines three main purposes of this type of building, namely conservation, classification and restoration of archives, and the provision of access to them by administrators and research workers. Three references are listed. (Author/EJS)

  3. Design and Implementation of a River Classification Assistant Management System

    NASA Astrophysics Data System (ADS)

    Zhao, Yinjun; Jiang, Wenyuan; Yang, Rujun; Yang, Nan; Liu, Haiyan

    2018-03-01

    In an earlier publication, we proposed a new Decision Classifier (DCF) for Chinese river classification based on river structures. To expand, enhance and promote the application of the DCF, we built a computer system to support river classification, named the River Classification Assistant Management System. Built on the ArcEngine and ArcServer platforms, this system implements functions such as data management, river network extraction, river classification, and results publication under a combined Client/Server and Browser/Server framework.

  4. Localized Segment Based Processing for Automatic Building Extraction from LiDAR Data

    NASA Astrophysics Data System (ADS)

    Parida, G.; Rajan, K. S.

    2017-05-01

    Current methods of object segmentation, extraction, and classification from aerial LiDAR data are manual and tedious. This work proposes a technique for segmenting objects out of LiDAR data. A bottom-up, geometric rule-based approach was used initially to devise a way to segment buildings out of LiDAR datasets. For curved wall surfaces, comparison of localized surface normals was used to segment buildings. The algorithm has been applied to synthetic datasets as well as a real-world dataset of Vaihingen, Germany. Preliminary results show successful segmentation of building objects from a given scene for the synthetic datasets and promising results for the real-world data. An advantage of the proposed work is that it depends on no data other than LiDAR. It is an unsupervised method of building segmentation and thus requires no model training, as seen in supervised techniques. It focuses on extracting the walls of the buildings to construct the footprint, rather than focusing on roofs; this focus on extracting walls to reconstruct buildings from a LiDAR scene is the crux of the proposed method. The current segmentation approach can be used to obtain 2D footprints of buildings, with further scope to generate 3D models. Thus, the proposed method can be used as a tool to obtain building footprints in urban landscapes, helping in urban planning and the smart cities endeavour.

  5. Progressive Classification Using Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) a slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user can halt this reclassification process at any point, thereby obtaining the best possible result for a given amount of computation time. Alternatively, the results can be displayed as they are generated, providing the user with real-time feedback about the current accuracy of classification.
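
    A minimal sketch of the two-model scheme follows, assuming scikit-learn and a binary problem: a fast linear SVM labels everything, the absolute decision-function margin serves as the confidence index, and a slower RBF SVM reclassifies the least confident points first. The flight-software implementation would differ in detail.

    ```python
    import numpy as np
    from sklearn.svm import SVC, LinearSVC

    # Progressive classification with a fast and a slow SVM (binary case for
    # simplicity). `budget` caps how many points the slow model re-examines,
    # mimicking a user halting the refinement early.
    def progressive_classify(X_train, y_train, X_new, budget):
        fast = LinearSVC().fit(X_train, y_train)
        slow = SVC(kernel="rbf").fit(X_train, y_train)
        labels = fast.predict(X_new)                 # coarse first pass
        confidence = np.abs(fast.decision_function(X_new))
        for i in np.argsort(confidence)[:budget]:    # least confident first
            labels[i] = slow.predict(X_new[i:i + 1])[0]
        return labels
    ```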

  6. Cyberpsychology: a human-interaction perspective based on cognitive modeling.

    PubMed

    Emond, Bruno; West, Robert L

    2003-10-01

    This paper argues for the relevance of cognitive modeling and cognitive architectures to cyberpsychology. From a human-computer interaction point of view, cognitive modeling can benefit both theory and model building, and the design and evaluation of sociotechnical systems usability. Cognitive modeling research applied to human-computer interaction has two complementary objectives: (1) to develop theories and computational models of human interactive behavior with information and collaborative technologies, and (2) to use the computational models as building blocks for the design, implementation, and evaluation of interactive technologies. From the perspective of building theories and models, cognitive modeling offers the possibility of anchoring cyberpsychology theories and models in cognitive architectures. From the perspective of the design and evaluation of sociotechnical systems, cognitive models can provide the basis for simulated users, which can play an important role in usability testing. As an example of the application of cognitive modeling to technology design, the paper presents a simulation of interactive behavior with five different adaptive menu algorithms: random, fixed, stacked, frequency based, and activation based. Results of the simulation indicate that fixed menu positions seem to offer the best support for classification-like tasks such as filing e-mails. This research is part of the Human-Computer Interaction and Broadband Visual Communication research programs at the National Research Council of Canada, in collaboration with the Carleton Cognitive Modeling Lab at Carleton University.

  7. [A systematic review of worldwide natural history models of colorectal cancer: classification, transition rate and a recommendation for developing Chinese population-specific model].

    PubMed

    Li, Z F; Huang, H Y; Shi, J F; Guo, C G; Zou, S M; Liu, C C; Wang, Y; Wang, L; Zhu, S L; Wu, S L; Dai, M

    2017-02-10

    Objective: To review worldwide studies on natural history models of colorectal cancer (CRC), and to inform the building of a Chinese population-specific CRC model and the development of a platform for further evaluation of CRC screening and other interventions in the Chinese population. Methods: A structured literature search was conducted in PubMed for publications from January 1995 to December 2014. Information about classification systems for both colorectal cancer and precancerous lesions, and the corresponding transition rates, was extracted and summarized. Indicators were mainly expressed as the medians and ranges of annual progression or regression rates. Results: A total of 24 studies were extracted from 1,022 studies; most were from America (n=9), and 2 were from China including 1 from the mainland. Most were based on Markov models (n=22). Classification systems for adenomas were based on progression risk (n=9) or on adenoma size (n=13, divided in two ways), as follows: 1) In studies where adenoma classification was risk-dependent, the median annual transition rates from 'normal' to 'non-advanced adenoma', 'non-advanced' to 'advanced', and 'advanced adenoma' to CRC were 0.0160 (range: 0.0022-0.0200), 0.020 (range: 0.002-0.177) and 0.044 (range: 0.005-0.063), respectively. 2) In studies where adenomas were classified by size into <10 mm and ≥10 mm (n=7), the median annual transition rates from 'normal' to adenoma <10 mm, from adenoma <10 mm to adenoma ≥10 mm, and from adenoma ≥10 mm to CRC were 0.0167 (range: 0.0150-0.0370), 0.020 (range: 0.015-0.035) and 0.0400 (range: 0.0085-0.0500), respectively. 3) In studies where adenomas were classified by size into diminutive (≤5 mm), small (6-9 mm) and large (≥10 mm) adenomas (n=6), the median annual transition rates from 'normal' to diminutive adenoma, 'diminutive' to 'small', 'small' to 'large', and large adenoma to CRC were 0.013 (range: 0.009-0.019), 0.043 (range: 0.020-0.085), 0.044 (range: 0.020-0.125) and 0.0335 (range: 0.030-0.040), respectively. Staging systems for CRC mainly included LRD (localized/regional/distant, n=10), Dukes' (n=7) and TNM (n=3). Under the LRD classification, the median annual transition rates from 'localized' to 'regional' and 'regional' to 'distant' were 0.28 (range: 0.20-0.33) and 0.40 (range: 0.24-0.63), respectively. Under the Dukes' classification, the median annual transition rates were 0.583 (range: 0.050-0.910), 0.656 (range: 0.280-0.720) and 0.830 (range: 0.630-0.865) from Dukes' A to B, B to C and C to D, respectively. Under the TNM classification, very limited transition rate data were reported. The serrated pathway was described in only one study. Conclusions: Studies on natural history models of colorectal cancer remain limited worldwide. Adenoma appears to be the most common status setting for precancer models, and the risk-dependent classification of adenomas is consistent with the system most commonly used in clinical practice and in major cancer screening programs in China. Given the varied staging systems and the shortage of transition rates based on the TNM classification (commonly used in China), building a Chinese population-specific natural history model of colorectal cancer will be challenging; information from other classification systems could be conditionally applied.

  8. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy many demands on resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements in RF classifier are made: a voting-distribution ranked rule for reducing the influences of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing city, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful to studying many environmental and social problems.

  9. Semantic Building Façade Segmentation from Airborne Oblique Images

    NASA Astrophysics Data System (ADS)

    Lin, Y.; Nex, F.; Yang, M. Y.

    2018-05-01

    With the introduction of airborne oblique camera systems and the improvement of photogrammetric techniques, high-resolution 2D and 3D data can be acquired in urban areas. These high-resolution data allow us to perform detailed investigations of building roofs and façades, which can contribute to LoD3 city modeling. Normally, façade segmentation is achieved from terrestrial views. In this paper, we address the problem from aerial views by using high-resolution oblique aerial images as the data source in urban areas. In addition to traditional image features, such as RGB and SIFT, normal vectors and planarity are also extracted from dense matching point clouds. These 3D geometrical features are then projected back to 2D space to assist façade interpretation. A random forest is trained and applied to label façade pixels. A fully connected conditional random field (CRF), capturing long-range spatial interactions, is used as a post-processing step to refine the classification results. Its pairwise potential is defined by a linear combination of Gaussian kernels, and the CRF model is efficiently solved by mean field approximation. Experiments show that 3D features can significantly improve classification results, and that the fully connected CRF performs well in correcting noisy pixels.

  10. Global Dynamic Exposure and the OpenBuildingMap

    NASA Astrophysics Data System (ADS)

    Schorlemmer, D.; Beutin, T.; Hirata, N.; Hao, K. X.; Wyss, M.; Cotton, F.; Prehn, K.

    2015-12-01

    Detailed understanding of local risk factors regarding natural catastrophes requires in-depth characterization of the local exposure. Current exposure capture techniques have to find the balance between resolution and coverage. We aim at bridging this gap by employing a crowd-sourced approach to exposure capturing focusing on risk related to earthquake hazard. OpenStreetMap (OSM), the rich and constantly growing geographical database, is an ideal foundation for us. More than 2.5 billion geographical nodes, more than 150 million building footprints (growing by ~100'000 per day), and a plethora of information about school, hospital, and other critical facility locations allow us to exploit this dataset for risk-related computations. We will harvest this dataset by collecting exposure and vulnerability indicators from explicitly provided data (e.g. hospital locations), implicitly provided data (e.g. building shapes and positions), and semantically derived data, i.e. interpretation applying expert knowledge. With this approach, we can increase the resolution of existing exposure models from fragility classes distribution via block-by-block specifications to building-by-building vulnerability. To increase coverage, we will provide a framework for collecting building data by any person or community. We will implement a double crowd-sourced approach to bring together the interest and enthusiasm of communities with the knowledge of earthquake and engineering experts. The first crowd-sourced approach aims at collecting building properties in a community by local people and activists. This will be supported by tailored building capture tools for mobile devices for simple and fast building property capturing. The second crowd-sourced approach involves local experts in estimating building vulnerability that will provide building classification rules that translate building properties into vulnerability and exposure indicators as defined in the Building Taxonomy 2.0 developed by the Global Earthquake Model (GEM). These indicators will then be combined with a hazard model using the GEM OpenQuake engine to compute a risk model. The free/open framework we will provide can be used on commodity hardware for local to regional exposure capturing and for communities to understand their earthquake risk.

  11. Using geometrical, textural, and contextual information of land parcels for classification of detailed urban land use

    USGS Publications Warehouse

    Wu, S.-S.; Qiu, X.; Usery, E.L.; Wang, L.

    2009-01-01

    Detailed urban land use data are important to government officials, researchers, and businesspeople for a variety of purposes. This article presents an approach to classifying detailed urban land use based on geometrical, textural, and contextual information of land parcels. An area of 6 by 14 km in Austin, Texas, with land parcel boundaries delineated by the Travis Central Appraisal District of Travis County, Texas, is tested for the approach. We derive fifty parcel attributes from relevant geographic information system (GIS) and remote sensing data and use them to discriminate among nine urban land uses: single family, multifamily, commercial, office, industrial, civic, open space, transportation, and undeveloped. Half of the 33,025 parcels in the study area are used as training data for land use classification and the other half are used as testing data for accuracy assessment. The best result with a decision tree classification algorithm has an overall accuracy of 96 percent and a kappa coefficient of 0.78, and two naive, baseline models based on the majority rule and the spatial autocorrelation rule have overall accuracy of 89 percent and 79 percent, respectively. The algorithm is relatively good at classifying single-family, multifamily, commercial, open space, and undeveloped land uses and relatively poor at classifying office, industrial, civic, and transportation land uses. The most important attributes for land use classification are the geometrical attributes, particularly those related to building areas. Next are the contextual attributes, particularly those relevant to the spatial relationship between buildings, then the textural attributes, particularly the semivariance texture statistic from 0.61-m resolution images.

  12. Probabilistic multiple sclerosis lesion classification based on modeling regional intensity variability and local neighborhood information.

    PubMed

    Harmouche, Rola; Subbanna, Nagesh K; Collins, D Louis; Arnold, Douglas L; Arbel, Tal

    2015-05-01

    In this paper, a fully automatic probabilistic method for multiple sclerosis (MS) lesion classification is presented, whereby the posterior probability density function over healthy tissues and two types of lesions (T1-hypointense and T2-hyperintense) is generated at every voxel. During training, the system explicitly models the spatial variability of the intensity distributions throughout the brain by first segmenting it into distinct anatomical regions and then building regional likelihood distributions for each tissue class based on multimodal magnetic resonance image (MRI) intensities. Local class smoothness is ensured by incorporating neighboring voxel information in the prior probability through Markov random fields. The system is tested on two datasets from real multisite clinical trials consisting of multimodal MRIs from a total of 100 patients with MS. Lesion classification results based on the framework are compared with and without the regional information, as well as with other state-of-the-art methods, against the labels from expert manual raters. The metrics for comparison include Dice overlap, sensitivity, and positive predictive rates for both voxel and lesion classifications. Statistically significant improvements are shown in Dice values, in voxel-based and lesion-based sensitivity values, and in positive predictive rates when the proposed method is compared to the method without regional information and to a widely used method [1]. This holds particularly true in the posterior fossa, an area where classification is very challenging. The proposed method allows us to provide clinicians with accurate tissue labels for T1-hypointense and T2-hyperintense lesions, two types of lesions that differ in appearance and clinical ramifications, and with a confidence level in the classification, which helps clinicians assess the classification results.

  13. Design of an Intelligent Individual Evacuation Model for High-Rise Building Fires Based on Neural Network Within the Scope of 3D GIS

    NASA Astrophysics Data System (ADS)

    Atila, U.; Karas, I. R.; Turan, M. K.; Rahman, A. A.

    2013-09-01

    Fire is without doubt one of the most dangerous disasters threatening the high-rise and complex buildings of today's world, which hold thousands of occupants. Considering the high population and complexity of such buildings, performing a rapid and safe evacuation is clearly hard, and humanity has painful memories of disasters such as the World Trade Center attacks of 9/11. It is therefore very important to design knowledge-based, real-time, interactive evacuation methods instead of classical strategies, which lack flexibility. This paper presents a 3D-GIS implementation that simulates the behaviour of an intelligent indoor pedestrian navigation model proposed for the self-evacuation of a person in case of fire. The model is based on the Multilayer Perceptron (MLP), one of the most preferred artificial neural network architectures for classification and prediction problems. A sample fire scenario following predefined instructions has been performed on a 3D model of the Corporation Complex in Putrajaya (Malaysia), and the intelligent evacuation process has been realized within a proposed 3D-GIS based simulation.

  14. Automatic topic identification of health-related messages in online health community using text classification.

    PubMed

    Lu, Yingjie

    2013-01-01

    To facilitate patient involvement in online health communities and help patients obtain the informational and emotional support they need, a topic identification approach is proposed in this paper for automatically identifying the topics of health-related messages in an online health community, thus assisting patients in reaching the most relevant messages for their queries efficiently. A feature-based classification framework is presented for automatic topic identification. We first collected messages related to some predefined topics in an online health community. We then combined three different types of features, n-gram-based features, domain-specific features and sentiment features, to build four feature sets for health-related text representation. Finally, three different text classification techniques, C4.5, Naïve Bayes and SVM, were adopted to evaluate our topic classification model. By comparing different feature sets and different classification techniques, we found that n-gram-based features, domain-specific features and sentiment features were all effective in distinguishing different types of health-related topics. In addition, a feature reduction technique based on information gain was also effective in improving topic classification performance. In terms of classification techniques, SVM outperformed C4.5 and Naïve Bayes significantly. The experimental results demonstrate that the proposed approach can identify the topics of online health-related messages efficiently.
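
    The n-gram part of such a pipeline is straightforward to sketch with scikit-learn; the toy messages and topic labels below are hypothetical, and the paper's domain-specific and sentiment features would be appended to this representation.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    # N-gram features + SVM for message topic classification (sketch only;
    # domain-specific and sentiment features are omitted).
    topic_clf = Pipeline([
        ("ngrams", TfidfVectorizer(ngram_range=(1, 2))),  # unigrams + bigrams
        ("svm", LinearSVC()),
    ])

    messages = ["does this drug interact with aspirin",
                "feeling very anxious about my surgery next week"]
    topics = ["medication", "emotional support"]          # hypothetical labels
    topic_clf.fit(messages, topics)
    print(topic_clf.predict(["worried about my medication dose"]))
    ```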

  15. SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling.

    PubMed

    Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin

    2013-03-01

    Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context, amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied an SVM-based approach, namely SVM-based recursive feature elimination (SVM-RFE), to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs versus other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in an overall accuracy of ~80% with 10-fold cross-validation, using the 40 top-ranked molecular descriptors selected out of 314 in total. Moreover, the SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptor space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding overfitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-defined set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
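
    A compact sketch of SVM-RFE with scikit-learn is shown below, using random placeholder data in place of the real descriptor matrix; the descriptor counts (314 in, 40 out) follow the abstract, everything else is assumed.

    ```python
    import numpy as np
    from sklearn.feature_selection import RFE
    from sklearn.svm import SVC

    # SVM-RFE: refit a linear SVM repeatedly, dropping the lowest-weighted
    # descriptor each round until 40 of 314 remain. Data here are random
    # placeholders for the real drug descriptor matrix.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 314))       # 100 drugs x 314 descriptors
    y = rng.integers(0, 2, size=100)      # 1 = NDD drug, 0 = other (made up)

    rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=40, step=1)
    rfe.fit(X, y)
    print(rfe.support_.sum())             # -> 40 descriptors retained
    ```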

  16. Decision tree methods: applications for classification and prediction.

    PubMed

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model, and the validation dataset to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure.
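
    A short sketch of the train/validate workflow described above, using scikit-learn's CART implementation and cost-complexity pruning to pick the tree size on a held-out validation set (the dataset is a stand-in):

    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Grow candidate trees along the cost-complexity pruning path on the
    # training set, then keep the size that scores best on validation data.
    X, y = load_breast_cancer(return_X_y=True)   # stand-in clinical dataset
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

    path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)
    best = max(
        (DecisionTreeClassifier(ccp_alpha=a, random_state=0).fit(X_tr, y_tr)
         for a in path.ccp_alphas),
        key=lambda tree: tree.score(X_val, y_val),
    )
    print(best.get_n_leaves(), round(best.score(X_val, y_val), 3))
    ```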

  17. A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.

    PubMed

    Ni, Qianwu; Chen, Lei

    2017-01-01

    Correct prediction of protein structural class is beneficial to the investigation of protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, given the variety of available features, it remains a great challenge to select a proper classification algorithm and extract the essential features to participate in classification. In this study, a feature and algorithm selection method is presented for improving the accuracy of protein structural class prediction. Amino acid compositions and physiochemical features were adopted to represent features, and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features from the list one by one. For each feature set, the thirty-eight algorithms were executed on a dataset in which proteins were represented by the features in the set. The classes predicted by these algorithms and the true class of each protein were collected to construct a dataset, which was analyzed by the mRMR method, yielding an algorithm list. From the algorithm list, algorithms were taken one by one to build an ensemble prediction model. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using a single algorithm and to models that adopt only the feature selection procedure or only the algorithm selection procedure. The feature selection and algorithm selection procedures are genuinely helpful for building an ensemble prediction model that yields better performance. Copyright © Bentham Science Publishers.

  18. Creating a behavioural classification module for acceleration data: using a captive surrogate for difficult to observe species.

    PubMed

    Campbell, Hamish A; Gao, Lianli; Bidder, Owen R; Hunter, Jane; Franklin, Craig E

    2013-12-15

    Distinguishing specific behavioural modes from data collected by animal-borne tri-axial accelerometers can be a time-consuming and subjective process. Data synthesis can be further inhibited when the tri-axial acceleration data cannot be paired with the corresponding behavioural mode through direct observation. Here, we explored the use of a tame surrogate (domestic dog) to build a behavioural classification module, and then used that module to identify and quantify behavioural modes within acceleration data collected from other individuals/species. Tri-axial acceleration data were recorded from a domestic dog whilst it was commanded to walk, run, sit, stand and lie down. Through video synchronisation, each tri-axial acceleration sample was annotated with its associated behavioural mode; the feature vectors were extracted and used to build the classification module through the application of support vector machines (SVMs). This behavioural classification module was then used to identify and quantify the same behavioural modes in acceleration data collected from a range of other species (alligator, badger, cheetah, dingo, echidna, kangaroo and wombat). Evaluation of the module performance, using a binary classification system, showed there was a high capacity (>90%) for behaviour recognition between individuals of the same species. Furthermore, a positive correlation existed between SVM capacity and the similarity of the individual's spinal length-to-height above the ground ratio (SL:SH) to that of the surrogate. The study describes how to build a behavioural classification module and highlights the value of using a surrogate for studying cryptic, rare or endangered species.
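
    The core of such a module can be sketched as windowed feature extraction followed by an SVM; the window length and the mean/std/range features below are simplifying assumptions, as the study's feature set is richer.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    # Summarize tri-axial acceleration in fixed windows, then train an SVM on
    # the annotated behaviour labels. Window size and features are assumptions.
    def window_features(acc, win=50):
        # acc: (n_samples, 3) array of x/y/z acceleration
        feats = []
        for start in range(0, len(acc) - win + 1, win):
            w = acc[start:start + win]
            feats.append(np.hstack([w.mean(axis=0), w.std(axis=0),
                                    np.ptp(w, axis=0)]))
        return np.array(feats)

    # X = window_features(dog_acceleration)   # annotated surrogate data
    # y = per-window labels ("walk", "run", "sit", "stand", "lie down")
    # clf = SVC(kernel="rbf").fit(X, y)       # then apply to other species
    ```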

  19. Classification method, spectral diversity, band combination and accuracy assessment evaluation for urban feature detection

    NASA Astrophysics Data System (ADS)

    Erener, A.

    2013-04-01

    Automatic extraction of urban features from high-resolution satellite images is one of the main applications in remote sensing. It is useful for wide-scale applications, namely: urban planning, urban mapping, disaster management, GIS (geographic information systems) updating, and military target detection. One common approach to detecting urban features from high-resolution images is to use automatic classification methods. This paper has four main objectives with respect to detecting buildings. The first objective is to compare the performance of the most notable supervised classification algorithms, including the maximum likelihood classifier (MLC) and the support vector machine (SVM). In this experiment the primary consideration is the impact of kernel configuration on the performance of the SVM. The second objective of the study is to explore the suitability of integrating additional bands, namely the first principal component (1st PC) and the intensity image, with the original data for multi-classification approaches. The performance evaluation of classification results is done using two different accuracy assessment methods, pixel-based and object-based approaches, which reflects the third aim of the study: to demonstrate the differences in the evaluation of accuracies of classification methods. For consistency, the same set of ground truth data, produced by labeling the building boundaries in the GIS environment, is used for accuracy assessment. Lastly, the fourth aim is to experimentally evaluate variation in the accuracy of classifiers for six different real situations in order to identify the impact of spatial and spectral diversity on results. The method is applied to QuickBird images for various urban complexity levels, extending from simple to complex urban patterns. The simple surface type includes a regular urban area with low density and systematic buildings with brick rooftops. The complex surface type involves almost all kinds of challenges, such as densely built-up areas, regions with bare soil, and small and large buildings with different rooftops, such as concrete, brick, and metal. Using the pixel-based accuracy assessment, it was shown that the percent building detection (PBD) and quality percent (QP) of the MLC and SVM depend on the complexity and texture variation of the region. Generally, PBD values range between 70% and 90% for the MLC and SVM, respectively. No substantial improvements were observed when the SVM and MLC classifications were developed with the addition of more variables instead of using only four bands. The object-based accuracy assessment demonstrated that while the MLC and SVM provide higher rates of correct detection, they also produce higher rates of false alarms.

  20. Predicting Drug-induced Hepatotoxicity Using QSAR and Toxicogenomics Approaches

    PubMed Central

    Low, Yen; Uehara, Takeki; Minowa, Yohsuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro; Sedykh, Alexander; Muratov, Eugene; Fourches, Denis; Zhu, Hao; Rusyn, Ivan; Tropsha, Alexander

    2014-01-01

    Quantitative Structure-Activity Relationship (QSAR) modeling and toxicogenomics are used independently as predictive tools in toxicology. In this study, we evaluated the power of several statistical models for predicting drug hepatotoxicity in rats using different descriptors of drug molecules, namely their chemical descriptors and toxicogenomic profiles. The records were taken from the Toxicogenomics Project rat liver microarray database containing information on 127 drugs (http://toxico.nibio.go.jp/datalist.html). The model endpoint was hepatotoxicity in the rat following 28 days of exposure, established by liver histopathology and serum chemistry. First, we developed multiple conventional QSAR classification models using a comprehensive set of chemical descriptors and several classification methods (k nearest neighbor, support vector machines, random forests, and distance weighted discrimination). With chemical descriptors alone, external predictivity (Correct Classification Rate, CCR) from 5-fold external cross-validation was 61%. Next, the same classification methods were employed to build models using only toxicogenomic data (24h after a single exposure) treated as biological descriptors. The optimized models used only 85 selected toxicogenomic descriptors and had CCR as high as 76%. Finally, hybrid models combining both chemical descriptors and transcripts were developed; their CCRs were between 68 and 77%. Although the accuracy of hybrid models did not exceed that of the models based on toxicogenomic data alone, the use of both chemical and biological descriptors enriched the interpretation of the models. In addition to finding 85 transcripts that were predictive and highly relevant to the mechanisms of drug-induced liver injury, chemical structural alerts for hepatotoxicity were also identified. These results suggest that concurrent exploration of the chemical features and acute treatment-induced changes in transcript levels will both enrich the mechanistic understanding of sub-chronic liver injury and afford models capable of accurate prediction of hepatotoxicity from chemical structure and short-term assay results. PMID:21699217

  1. Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: machine-learning algorithms and validation using national health data from Kuwait--a cohort study.

    PubMed

    Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse

    2013-05-14

    We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness of diabetic patients to develop hypertension and vice versa, and we ascertain the importance of ethnicity (natives vs expatriate migrants) and of using regional data in risk assessment. Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross-validation to obtain generalisation accuracies and errors. Kuwait Health Network (KHN), which integrates data from primary health centres and hospitals in Kuwait. 270 172 hospital visitors (of which 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives and Asian and Arab expatriates. Incident type 2 diabetes, hypertension and comorbidity. Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign 'high' risk to 75% of diabetic patients and to 94% of hypertensive patients; only 5% of diabetic patients are assigned 'low' risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in the hypertensive population, and those of hypertension, are modelled. Two-stage aggregate classification models and risk assessment tools, built by combining both component models on diabetes (or on hypertension), perform better than the individual models. Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case-control models to assess risks. These tools aid in the preliminary, non-intrusive assessment of the population. Ethnicity is significant in the predictive models, and risk assessments need to be developed using regional data, as we demonstrate by examining the applicability of the American Diabetes Association online calculator to data from Kuwait.
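
    The k-NN-with-cross-validation setup can be sketched in a few lines; the synthetic data, scaling step and choice of k below are assumptions, not the study's actual configuration.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # k-NN classifier with fivefold cross-validation on simple non-laboratory
    # parameters (synthetic placeholder data; k=15 is an assumption).
    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15))
    X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
    print(cross_val_score(knn, X, y, cv=5).mean())   # generalisation accuracy
    ```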

  2. Building a Multi-Discipline Digital Library Through Extending the Dienst Protocol

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.; Maly, Kurt; Shen, Stewart N. T.

    1997-01-01

    The purpose of this project is to establish a multi-discipline capability for a unified, canonical digital library service for scientific and technical information (STI). This is accomplished by extending the Dienst protocol to be aware of the subject classification of a server's holdings. We propose a hierarchical, general, and extendible subject classification that can encapsulate existing classification systems.

  3. Risk assessment of storm surge disaster based on numerical models and remote sensing

    NASA Astrophysics Data System (ADS)

    Liu, Qingrong; Ruan, Chengqing; Zhong, Shan; Li, Jian; Yin, Zhonghui; Lian, Xihu

    2018-06-01

    Storm surge is one of the most serious ocean disasters in the world. Risk assessment of storm surge disasters in coastal areas has important implications for planning economic development and reducing disaster losses. Based on risk assessment theory, this paper uses coastal hydrological observations, a numerical storm surge model and multi-source remote sensing data, proposes methods for assessing the hazard and vulnerability components of storm surge risk, and builds a storm surge risk assessment model. Storm surges in different recurrence periods are simulated with the numerical model, and the flooded areas and depths are calculated and used to assess the hazard of storm surge; remote sensing data and GIS technology are used to extract key coastal objects and classify coastal land use, which supports the vulnerability assessment of storm surge disaster. The storm surge risk assessment model is applied to a typical coastal city, and the result shows the reliability and validity of the model. The building and application of the storm surge risk assessment model provides a basic reference for city development planning and strengthens disaster prevention and mitigation.

  4. Classification of Parkinson's disease utilizing multi-edit nearest-neighbor and ensemble learning algorithms with speech samples.

    PubMed

    Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang

    2016-11-16

    The use of speech-based data in the classification of Parkinson disease (PD) has been shown to provide an effective, non-invasive mode of classification in recent years. Thus, there has been increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classification is reducing the noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effective, the ability to invoke instance selection has seldom been examined. In this study, a PD classification algorithm is proposed and examined that combines a multi-edit nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied to select optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to build trained models from the selected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. The proposed method was examined using a recently deposited public dataset and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the largest improvement in classification accuracy (29.44%) compared with the other algorithms examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm exhibited higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method can improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
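
    A rough sketch of the editing idea follows: samples misclassified by a nearest-neighbour model trained on the remaining samples are discarded iteratively, leaving a more separable training set. This follows the classic multi-edit scheme only loosely (it uses leave-one-out editing rather than the original partitioning), so treat it as illustrative.

    ```python
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # Iterative nearest-neighbour editing (loose MENN stand-in): drop samples
    # that a k-NN trained on the remaining samples misclassifies.
    def nn_edit(X, y, k=3, max_iter=10):
        keep = np.arange(len(X))
        for _ in range(max_iter):
            knn = KNeighborsClassifier(n_neighbors=k)
            kept = []
            for i, idx in enumerate(keep):
                rest = np.delete(keep, i)
                knn.fit(X[rest], y[rest])
                if knn.predict(X[idx:idx + 1])[0] == y[idx]:
                    kept.append(idx)
            if len(kept) == len(keep):   # no sample removed: converged
                break
            keep = np.array(kept)
        return keep                      # indices of the edited training set
    ```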

  5. An approach for combining airborne LiDAR and high-resolution aerial color imagery using Gaussian processes

    NASA Astrophysics Data System (ADS)

    Liu, Yansong; Monteiro, Sildomar T.; Saber, Eli

    2015-10-01

    Changes in vegetation cover, building construction, road networks and traffic conditions caused by urban expansion affect the human habitat as well as the natural environment in rapidly developing cities. It is crucial to assess these changes and respond accordingly by identifying man-made and natural structures with accurate classification algorithms. With the increasing use of multi-sensor remote sensing systems, researchers are able to obtain a more complete description of the scene of interest. By utilizing multi-sensor data, the accuracy of classification algorithms can be improved. In this paper, we propose a method for combining 3D LiDAR point clouds and high-resolution color images to classify urban areas using Gaussian processes (GP). GP classification is a powerful non-parametric classification method that yields probabilistic classification results and makes predictions in a way that accounts for real-world uncertainty. In this paper, we attempt to identify man-made and natural objects in urban areas including buildings, roads, trees, grass, water and vehicles. LiDAR features are derived from the 3D point clouds, and the spatial and color features are extracted from RGB images. For classification, we use the Laplace approximation for GP binary classification on the new combined feature space. Multiclass classification is implemented using a one-vs-all binary classification strategy. The results of applying support vector machine (SVM) and logistic regression (LR) classifiers are also provided for comparison. Our experiments show a clear improvement in classification results when using the two sensors combined instead of each sensor separately. We also found that the GP approach handles the uncertainty in the classification result without compromising accuracy compared to SVM, which is considered the state-of-the-art classification method.
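
    As a rough illustration of the classification stage, the sketch below uses scikit-learn's Gaussian process classifier, which implements the Laplace approximation for each binary problem and a one-vs-rest combination for the multiclass case; the fused feature matrices and labels are synthetic stand-ins, not the authors' data or pipeline.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    X_lidar = rng.normal(size=(200, 4))      # stand-in LiDAR-derived features
    X_rgb = rng.normal(size=(200, 6))        # stand-in spatial/color features
    y = rng.integers(0, 3, size=200)         # stand-in labels (building/road/tree)
    X = np.hstack([X_lidar, X_rgb])          # simple feature-level fusion

    # sklearn's GPC applies the Laplace approximation per binary problem and
    # combines the binary models one-vs-rest for the multiclass case
    gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0),
                                    multi_class="one_vs_rest")
    gpc.fit(X, y)
    proba = gpc.predict_proba(rng.normal(size=(10, 10)))  # probabilistic outputs
    ```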

  6. Detection of motor imagery of swallow EEG signals based on the dual-tree complex wavelet transform and adaptive model selection

    NASA Astrophysics Data System (ADS)

    Yang, Huijuan; Guan, Cuntai; Sui Geok Chua, Karen; San Chok, See; Wang, Chuan Chu; Kok Soon, Phua; Tang, Christina Ka Yin; Keng Ang, Kai

    2014-06-01

    Objective. Detection of motor imagery of hand/arm has been extensively studied for stroke rehabilitation. This paper firstly investigates the detection of motor imagery of swallow (MI-SW) and motor imagery of tongue protrusion (MI-Ton) in an attempt to find a novel solution for post-stroke dysphagia rehabilitation. Detection of MI-SW from a simple yet relevant modality such as MI-Ton is then investigated, motivated by the similarity in activation patterns between tongue movements and swallowing and there being fewer movement artifacts in performing tongue movements compared to swallowing. Approach. Novel features were extracted based on the coefficients of the dual-tree complex wavelet transform to build multiple training models for detecting MI-SW. The session-to-session classification accuracy was boosted by adaptively selecting the training model to maximize the ratio of between-classes distances versus within-class distances, using features of training and evaluation data. Main results. Our proposed method yielded averaged cross-validation (CV) classification accuracies of 70.89% and 73.79% for MI-SW and MI-Ton for ten healthy subjects, which are significantly better than the results from existing methods. In addition, averaged CV accuracies of 66.40% and 70.24% for MI-SW and MI-Ton were obtained for one stroke patient, demonstrating the detectability of MI-SW and MI-Ton from the idle state. Furthermore, averaged session-to-session classification accuracies of 72.08% and 70% were achieved for ten healthy subjects and one stroke patient using the MI-Ton model. Significance. These results and the subjectwise strong correlations in classification accuracies between MI-SW and MI-Ton demonstrated the feasibility of detecting MI-SW from MI-Ton models.
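
    The adaptive model-selection criterion lends itself to a compact sketch. The scoring below is a plain between-class versus within-class distance ratio computed on features, which is our reading of the criterion rather than the paper's exact formula; `class_distance_ratio` and the candidate-model structure are illustrative.

    ```python
    import numpy as np

    def class_distance_ratio(X, y):
        """Between-class centroid spread divided by mean within-class spread."""
        classes = np.unique(y)
        centroids = np.array([X[y == c].mean(axis=0) for c in classes])
        grand = X.mean(axis=0)
        between = np.mean([np.linalg.norm(c - grand) for c in centroids])
        within = np.mean([np.linalg.norm(X[y == c] - centroids[i], axis=1).mean()
                          for i, c in enumerate(classes)])
        return between / within

    # candidate_models: list of (features, labels) pairs, one per training model;
    # pick the training model whose feature space best separates the classes
    # best = max(candidate_models, key=lambda m: class_distance_ratio(*m))
    ```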

  7. Automatically updating predictive modeling workflows support decision-making in drug design.

    PubMed

    Muegge, Ingo; Bentzien, Jörg; Mukherjee, Prasenjit; Hughes, Robert O

    2016-09-01

    Using predictive models for early decision-making in drug discovery has become standard practice. We suggest that model building needs to be automated, with minimal input and low technical maintenance requirements. Models perform best when tailored to answering specific compound-optimization questions. If qualitative answers are required, 2-bin classification models are preferred. Integrating predictive modeling results with structural information stimulates better decision making. For in silico models supporting rapid structure-activity relationship cycles, performance deteriorates within weeks. Frequent automated updates of predictive models ensure the best predictions. Consensus between multiple modeling approaches increases prediction confidence. Combining qualified and nonqualified data makes optimal use of all available information. Dose predictions provide a holistic alternative to multiple individual property predictions for reaching complex decisions.

  8. Place-classification analysis of community vulnerability to near-field tsunami threats in the U.S. Pacific Northwest (Invited)

    NASA Astrophysics Data System (ADS)

    Wood, N. J.; Jones, J.; Spielman, S.

    2013-12-01

    Near-field tsunami hazards are credible threats to many coastal communities throughout the world. Along the U.S. Pacific Northwest coast, low-lying areas could be inundated by a series of catastrophic tsunami waves that begin to arrive within minutes of a Cascadia subduction zone (CSZ) earthquake. This presentation summarizes analytical efforts to classify communities with similar characteristics of vulnerability to tsunami hazards. This work builds on past State-focused inventories of community exposure to CSZ-related tsunami hazards in northern California, Oregon, and Washington. Attributes used in the classification, or cluster analysis, include the demography of residents, the spatial extent of the developed footprint based on mid-resolution land cover data, the distribution of the local workforce, and the number and type of public venues, dependent-care facilities, and community-support businesses. Population distributions are also characterized as a function of travel time to safety, based on anisotropic, path-distance geospatial modeling. We used an unsupervised, model-based clustering algorithm and a v-fold cross-validation procedure (v=50) to identify the appropriate number of community types, selecting class solutions that balance parsimony and model fit. The goal of the vulnerability classification is to provide emergency managers with a general sense of the types of communities in tsunami hazard zones based on shared characteristics, instead of only an exhaustive list of attributes for individual communities. This classification scheme can then be used to target and prioritize risk-reduction efforts that address common issues across multiple communities. The presentation includes a discussion of the utility of the proposed place classifications to support regional preparedness and outreach efforts.
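
    For a sense of what unsupervised model-based clustering with model selection looks like in code, the sketch below fits Gaussian mixtures over a range of cluster counts and keeps the solution with the lowest BIC. The authors instead used a 50-fold cross-validation procedure, so this is an approximation of the idea, and the attribute matrix is a synthetic stand-in.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 5))   # stand-in community attributes (demography, ...)

    best_k, best_bic, best_model = None, np.inf, None
    for k in range(2, 11):          # candidate numbers of community types
        gmm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
        bic = gmm.bic(X)
        if bic < best_bic:          # lower BIC = better parsimony/fit balance
            best_k, best_bic, best_model = k, bic, gmm
    labels = best_model.predict(X)  # community-type assignment per community
    ```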

  9. A novel end-to-end classifier using domain transferred deep convolutional neural networks for biomedical images.

    PubMed

    Pang, Shuchao; Yu, Zhezhou; Orgun, Mehmet A

    2017-03-01

    Highly accurate classification of biomedical images is an essential task in the clinical diagnosis of the numerous diseases identified from those images. Traditional image classification methods, which combine hand-crafted image feature descriptors with various classifiers, are not able to effectively improve the accuracy rate and meet the high requirements of biomedical image classification. The same holds true for artificial neural network models directly trained on limited biomedical images, or used as black boxes to extract deep features learned on another, distant dataset. In this study, we propose a highly reliable and accurate end-to-end classifier for all kinds of biomedical images via deep learning and transfer learning. We first apply a domain-transferred deep convolutional neural network to build a deep model, and then develop an overall deep learning architecture trained with supervision on the raw pixels of the original biomedical images. In our model, we need neither manual design of the feature space, nor an effective feature-vector classifier, nor segmentation of specific detection objects and image patches, which are the main technical difficulties in traditional image classification methods. Moreover, we need not be concerned with whether large training sets of annotated biomedical images, affordable parallel computing resources featuring GPUs, or long training times are available, which are the main problems in training deep neural networks for biomedical image classification, as observed in recent works. With a simple data augmentation method and fast convergence, our algorithm achieves the best accuracy rate and outstanding classification ability for biomedical images. We have evaluated our classifier on several well-known public biomedical datasets and compared it with several state-of-the-art approaches. The proposed robust, automated end-to-end classifier, based on a domain-transferred deep convolutional neural network model, shows highly reliable and accurate performance, as confirmed on several public biomedical image datasets. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
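
    A minimal sketch of the domain-transfer step, under the assumption that fine-tuning an ImageNet-pretrained backbone end-to-end on raw biomedical pixels is an acceptable stand-in for the paper's architecture; the backbone (ResNet-18), task head, and hyperparameters below are ours.

    ```python
    import torch
    import torch.nn as nn
    from torchvision import models

    num_classes = 2                               # e.g. benign vs malignant
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task head

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    def train_step(images, labels):
        """One supervised fine-tuning step on a batch of biomedical images."""
        model.train()
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        return loss.item()
    ```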

  10. Analysis of Turbulent Boundary-Layer over Rough Surfaces with Application to Projectile Aerodynamics

    DTIC Science & Technology

    1988-12-01

    II. CLASSIFICATION OF PREDICTION METHODS: Prediction methods can be classified into two main approaches: 1) correlation methodologies ... V. APPLICATION IN COMPONENT BUILD-UP METHODOLOGIES: 1. COMPONENT BUILD-UP IN DRAG: The new correlation can be used for an engineering ...

  11. Identification of Coffee Varieties Using Laser-Induced Breakdown Spectroscopy and Chemometrics.

    PubMed

    Zhang, Chu; Shen, Tingting; Liu, Fei; He, Yong

    2017-12-31

    We linked coffee quality to its different varieties; this is of interest because the identification of coffee varieties should help coffee trading and consumption. Laser-induced breakdown spectroscopy (LIBS) combined with chemometric methods was used to identify coffee varieties. Wavelet transform (WT) was used to reduce noise in the LIBS spectra. Partial least squares-discriminant analysis (PLS-DA), radial basis function neural networks (RBFNN), and support vector machines (SVM) were used to build classification models. Loadings of principal component analysis (PCA) were used to select the spectral variables contributing most to the identification of coffee varieties. Twenty wavelength variables corresponding to C I, Mg I, Mg II, Al II, CN, H, Ca II, Fe I, K I, Na I, N I, and O I were selected. PLS-DA, RBFNN, and SVM models on the selected wavelength variables showed acceptable results. SVM and RBFNN models performed better, with a classification accuracy of over 80% in the prediction set for both the full spectra and the selected variables. The overall results indicate that it is feasible to use LIBS and chemometric methods to identify coffee varieties. For further studies, more samples are needed to produce robust classification models, research is needed on methods for selecting the spectral peaks corresponding to the elements that contribute most to identification, and methods for acquiring stable spectra should also be studied.
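
    The loading-based variable selection is easy to reproduce in outline. The sketch below ranks wavelength variables by their largest absolute PCA loading and trains an SVM on the top 20 variables, mirroring the count reported above; the spectra and labels are synthetic stand-ins, and the ranking rule is one common reading of "loadings of PCA", not necessarily the authors' exact procedure.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC

    rng = np.random.default_rng(2)
    X = rng.normal(size=(120, 1000))          # stand-in LIBS spectra
    y = rng.integers(0, 4, size=120)          # stand-in coffee-variety labels

    pca = PCA(n_components=5).fit(X)
    # importance of each wavelength = largest absolute loading across the PCs
    importance = np.abs(pca.components_).max(axis=0)
    top = np.argsort(importance)[-20:]        # keep 20 variables, as in the paper

    svm = SVC(kernel="rbf", C=10).fit(X[:, top], y)
    acc = svm.score(X[:, top], y)             # use a held-out set in practice
    ```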

  13. Classification of male lower torso for underwear design

    NASA Astrophysics Data System (ADS)

    Cheng, Z.; Kuzmichev, V. E.

    2017-10-01

    By means of scanning technology we obtained new information about the morphology of male bodies and revised the classification of men's underwear to adapt it to consumer demands. Building the new classification around the characteristic factors of the male lower torso, we developed an underwear design method that yields accurate and convenient products for consumers.

  14. Simulation of earthquake caused building damages for the development of fast reconnaissance techniques

    NASA Astrophysics Data System (ADS)

    Schweier, C.; Markus, M.; Steinle, E.

    2004-04-01

    Catastrophic events like strong earthquakes can cause great losses of life and economic value. An increase in the efficiency of reconnaissance techniques could help to reduce the loss of life, as many victims die after, and not during, the event. A basic prerequisite for improving rescue teams' work is better planning of the measures, which can only be done on the basis of reliable and detailed information about the actual situation in the affected regions. Therefore, a bundle of projects at Karlsruhe University aims at the development of a tool for fast information retrieval after strong earthquakes. The focus is on urban areas, as most losses occur there. In this paper, an approach for damage analysis of buildings is presented. It consists of an automatic methodology to model buildings in three dimensions, a comparison of pre- and post-event models to detect changes, and a subsequent classification of the changes into damage types. The process is based on information extraction from airborne laserscanning data, i.e., digital surface models (DSM) acquired by scanning an area with pulsed laser light. To date, no laserscanning-derived DSMs acquired over earthquake-damaged areas are available to the authors. Therefore, it was necessary to simulate such data for the development of the damage detection methodology. In this paper, two different methodologies used for simulating the data are presented. The first is to create CAD models of undamaged buildings based on their construction plans and alter them artificially as if they had suffered serious damage. A laserscanning data set is then simulated from these models, which can be compared with real laserscanning data acquired of the buildings in their intact state. The other approach is to use measurements of actually damaged buildings and simulate their intact state. The geometrical structure of these damaged buildings can be modeled from digital photography taken after the event, evaluated with photogrammetric methods. The intact state of the buildings is simulated based on on-site investigations, and finally laserscanning data are simulated for both states.

  15. Automated 3D Phenotype Analysis Using Data Mining

    PubMed Central

    Plyusnin, Ilya; Evans, Alistair R.; Karme, Aleksis; Gionis, Aristides; Jernvall, Jukka

    2008-01-01

    The ability to analyze and classify three-dimensional (3D) biological morphology has lagged behind the analysis of other biological data types such as gene sequences. Here, we introduce the techniques of data mining to the study of 3D biological shapes to bring the analyses of phenomes closer to the efficiency of studying genomes. We compiled five training sets of highly variable morphologies of mammalian teeth from the MorphoBrowser database. Samples were labeled either by dietary class or by conventional dental types (e.g. carnassial, selenodont). We automatically extracted a multitude of topological attributes using Geographic Information Systems (GIS)-like procedures that were then used in several combinations of feature selection schemes and probabilistic classification models to build and optimize classifiers for predicting the labels of the training sets. In terms of classification accuracy, computational time and size of the feature sets used, non-repeated best-first search combined with 1-nearest neighbor classifier was the best approach. However, several other classification models combined with the same searching scheme proved practical. The current study represents a first step in the automatic analysis of 3D phenotypes, which will be increasingly valuable with the future increase in 3D morphology and phenomics databases. PMID:18320060

  16. Binary Image Classification: A Genetic Programming Approach to the Problem of Limited Training Instances.

    PubMed

    Al-Sahaf, Harith; Zhang, Mengjie; Johnston, Mark

    2016-01-01

    In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.

  17. Knowledge discovery from patients' behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services.

    PubMed

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and ultimately maximize profitability. Many hospitals have large data warehouses containing customer demographic and transaction information, and data mining techniques can be used to analyze these data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (with an added Length parameter), tailored to health care services for a public-sector hospital in Iran, with the idea that patient loyalty differs from customer loyalty, to estimate customer lifetime value (CLV) for each patient. We used Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and identify target, potential and loyal customers for a stronger CRM. Two approaches are used for classification: in the first, the clustering result is taken as the decision attribute in the classification process; in the second, the segmentation based on the CLV of patients (estimated by RFML) is taken as the decision attribute. Finally, the results of the CHAID algorithm reveal significant hidden rules and identify existing patterns among hospital consumers.

  18. MALDI Mass Spectrometry Imaging: A Novel Tool for the Identification and Classification of Amyloidosis.

    PubMed

    Winter, Martin; Tholey, Andreas; Kristen, Arnt; Röcken, Christoph

    2017-11-01

    Amyloidosis is a group of diseases caused by extracellular accumulation of fibrillar polypeptide aggregates. So far, diagnosis is performed by Congo red staining of tissue sections in combination with polarization microscopy. Subsequent identification of the causative protein by immunohistochemistry harbors some difficulties regarding sensitivity and specificity. Mass spectrometry based approaches have been demonstrated to be a reliable method to supplement the typing of amyloidosis, but they still depend on Congo red staining. In the present study, we used matrix-assisted laser desorption/ionization mass spectrometry imaging coupled with ion mobility separation (MALDI-IMS MSI) to investigate amyloid deposits in formalin-fixed, paraffin-embedded tissue samples. Utilizing a novel peptide filter method, we found a universal peptide signature for amyloidoses. Furthermore, differences in the peptide composition of ALλ and ATTR amyloid were revealed and used to build a reliable classification model. Integrating the peptide filter into MALDI-IMS MSI analysis, we developed a bioinformatics workflow that facilitates the identification and classification of amyloidosis in a less time- and sample-consuming experimental setup. Our findings also demonstrate the feasibility of investigating the amyloid's protein composition, thus paving the way to establishing classification models for the diverse types of amyloidoses and shedding further light on the complex process of amyloidogenesis. © 2017 The Authors. Proteomics published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. 3D shape representation with spatial probabilistic distribution of intrinsic shape keypoints

    NASA Astrophysics Data System (ADS)

    Ghorpade, Vijaya K.; Checchin, Paul; Malaterre, Laurent; Trassoudaine, Laurent

    2017-12-01

    The accelerated advancement in modeling, digitizing, and visualizing techniques for 3D shapes has led to increasing creation and usage of 3D models, thanks to 3D sensors that are readily available and easy to use. As a result, determining the similarity between 3D shapes has become consequential and is a fundamental task in shape-based recognition, retrieval, clustering, and classification. Several decades of research in Content-Based Information Retrieval (CBIR) have produced diverse techniques for 2D and 3D shape or object classification/retrieval and many benchmark data sets. In this article, a novel technique for 3D shape representation and object classification is proposed, based on analyses of the spatial, geometric distributions of 3D keypoints. These distributions capture the intrinsic geometric structure of 3D objects. The result of the approach is a probability distribution function (PDF) produced from the spatial disposition of 3D keypoints, keypoints that are stable on the object surface and invariant to pose changes. Each class or instance of an object can be uniquely represented by a PDF. The shape representation is robust yet conceptually simple, easy to implement, and fast to compute. Both Euclidean and topological distances on the object's surface are considered in building the PDFs; topology-based geodesic distances between keypoints exploit the non-planar surface properties of the object. The performance of the novel shape signature is tested through object classification accuracy. The classification efficacy of the new shape analysis method is evaluated on a new dataset acquired with a Time-of-Flight camera, and a comparative evaluation against state-of-the-art methods is performed on a standard benchmark dataset. Experimental results demonstrate superior classification performance of the new approach on the RGB-D dataset and depth data.
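
    As an illustration of the signature, the sketch below histograms pairwise distances between keypoints and normalizes the histogram into a discrete PDF. It uses Euclidean distances only; the geodesic variant in the article would additionally require surface connectivity (a mesh or neighborhood graph), which is omitted here. Function and parameter names are ours.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist

    def shape_pdf(keypoints, bins=64, r_max=1.0):
        """keypoints: (n, 3) array of stable 3D keypoints on the object surface."""
        d = pdist(keypoints)                      # all pairwise Euclidean distances
        d = d / d.max()                           # scale-normalize for size changes
        hist, _ = np.histogram(d, bins=bins, range=(0.0, r_max), density=True)
        return hist / hist.sum()                  # discrete PDF over distance bins

    # two shapes can then be compared by any PDF distance, e.g. an L1 distance:
    # dissimilarity = np.abs(shape_pdf(kp_a) - shape_pdf(kp_b)).sum()
    ```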

  1. Cross-cultural differences in processing of architectural ranking: evidence from an event-related potential study.

    PubMed

    Mecklinger, Axel; Kriukova, Olga; Mühlmann, Heiner; Grunwald, Thomas

    2014-01-01

    Visual object identification is modulated by perceptual experience. In a cross-cultural ERP study we investigated whether cultural expertise determines how buildings that vary in their ranking between high and low according to the Western architectural decorum are perceived. Two groups of German and Chinese participants performed an object classification task in which high- and low-ranking Western buildings had to be discriminated from everyday objects. ERP results indicate that an early stage of visual object identification (i.e., object model selection) is facilitated for high-ranking buildings in the German participants only. At a later stage of object identification, in which object knowledge is complemented by information from semantic and episodic long-term memory, no ERP evidence for cultural differences was obtained. These results suggest that the identification of architectural ranking is modulated by culturally specific expertise with Western-style architecture already at an early processing stage.

  2. Protein Kinase Classification with 2866 Hidden Markov Models and One Support Vector Machine

    NASA Technical Reports Server (NTRS)

    Weber, Ryan; New, Michael H.; Fonda, Mark (Technical Monitor)

    2002-01-01

    The main application considered in this paper is discriminating true kinases from randomly permuted kinases that share the same lengths and amino acid distributions as the true kinases. Numerous methods already exist for this classification task, such as HMMs, motif matchers, and sequence comparison algorithms. We build on some of these efforts by creating a vector from the outputs of thousands of structurally based HMMs, built offline from Pfam-A seed alignments using SAM-T99, which must then be combined into an overall classification for the protein. We then use a support vector machine with polynomial and chi-squared kernels to classify this large ensemble Pfam-vector. In particular, the chi-squared kernel SVM performs better in some respects than the HMMs and the BLAST pairwise comparisons when predicting true from false kinases, but no one algorithm is best for all purposes or in all instances, so we consider the particular strengths and weaknesses of each.
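
    A minimal sketch of the final classification stage, assuming each protein is represented by a vector of 2866 HMM scores: scikit-learn's `chi2_kernel` builds the chi-squared Gram matrix (it requires non-negative inputs, which the synthetic scores here satisfy), and an SVM is trained on it as a precomputed kernel. The data and gamma value are illustrative.

    ```python
    import numpy as np
    from sklearn.metrics.pairwise import chi2_kernel
    from sklearn.svm import SVC

    rng = np.random.default_rng(3)
    X = rng.random((100, 2866))        # stand-in: one score per HMM, per protein
    y = rng.integers(0, 2, size=100)   # 1 = true kinase, 0 = permuted decoy

    K = chi2_kernel(X, gamma=1.0)      # precomputed chi-squared Gram matrix
    svm = SVC(kernel="precomputed").fit(K, y)

    # scoring new sequences: kernel between test vectors and training vectors
    # K_test = chi2_kernel(X_test, X, gamma=1.0); y_pred = svm.predict(K_test)
    ```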

  3. Classification of jet fuels by fuzzy rule-building expert systems applied to three-way data by fast gas chromatography--fast scanning quadrupole ion trap mass spectrometry.

    PubMed

    Sun, Xiaobo; Zimmermann, Carolyn M; Jackson, Glen P; Bunker, Christopher E; Harrington, Peter B

    2011-01-30

    A fast method that can classify unknown jet fuel types or detect possible changes in jet fuel physical properties is of paramount interest to national defense and the airline industries. While fast gas chromatography (GC) has been used with conventional mass spectrometry (MS) to study jet fuels, here fast GC was combined with fast scanning MS and used for the first time to classify jet fuels by lot number or origin using fuzzy rule-building expert system (FuRES) classifiers. In building the classifiers, the data were pretreated with and without wavelet transformation and evaluated with respect to performance. Principal component transformation was used to compress the two-way data images prior to classification. Jet fuel samples were successfully classified with 99.8 ± 0.5% accuracy both with and without wavelet compression. Ten bootstrapped Latin partitions were used to validate the generalized prediction accuracy. Optimized partial least squares (o-PLS) regression results were used as positively biased references for comparing the FuRES prediction results, and the prediction results of the two methods were compared statistically. The projected difference resolution (PDR) method was also used to evaluate the fast GC and fast MS data. Two batches of aliquots of ten new samples were prepared and run independently 4 days apart to evaluate the robustness of the method. The only change in classification parameters was the use of polynomial retention-time alignment to correct for drift that occurred during the 4-day span of the two collections. FuRES achieved perfect classifications for four models of uncompressed three-way data. This fast GC/fast MS method offers high speed, accuracy, and robustness, and may be useful as a monitoring tool to track changes in the chemical composition of fuels that may also lead to property changes. Copyright © 2010 Elsevier B.V. All rights reserved.

  4. Multisensor multiresolution data fusion for improvement in classification

    NASA Astrophysics Data System (ADS)

    Rubeena, V.; Tiwari, K. C.

    2016-04-01

    The rapid advancement of technology has facilitated the easy availability of multisensor and multiresolution remote sensing data. Multisensor, multiresolution data contain complementary information, and fusing such data may yield application-dependent information which may otherwise remain trapped within. The present work aims at improving classification by fusing features of coarse-resolution (1 m) hyperspectral LWIR and fine-resolution (20 cm) RGB data. The classification map comprises eight classes: Road, Trees, Red Roof, Grey Roof, Concrete Roof, Vegetation, Bare Soil and Unclassified. The processing methodology for the hyperspectral LWIR data comprises dimensionality reduction, resampling of the data by interpolation to register the two images at the same spatial resolution, and extraction of spatial features to improve classification accuracy. For the fine-resolution RGB data, a vegetation index is computed for classifying the vegetation class and a morphological building index is calculated for buildings. To extract textural features, occurrence and co-occurrence statistics are considered, and the features are extracted from all three bands of the RGB data. After feature extraction, Support Vector Machines (SVMs) are used for training and classification. To increase classification accuracy, post-processing steps such as the removal of spurious noise (e.g., salt-and-pepper noise) are applied, followed by majority-voting filtering within objects for better object classification.

  5. Influence of crisp values on the object-based data extraction procedure from LiDAR data

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Rousell, Adam

    2014-05-01

    Nowadays a plethora of approaches attempt to automate the process of object extraction from LiDAR data. However, the majority of these methods require the fusion of the LiDAR dataset with other information such as photogrammetric imagery. The approach used as the basis for this paper is a novel method which makes use of human knowledge and the CNL modelling language to automatically extract buildings solely from LiDAR point cloud data in a transferable way. A number of rules are implemented to generate an artificial intelligence algorithm which is used for the object extraction. Although the single-dataset method has been found to successfully extract building footprints from the point cloud, at this initial stage it has one restriction that may limit its effectiveness: a number of the rules are based on crisp boundary values. If, for example, the slope of the ground surface is used as a rule for determining objects, then the slope value of a pixel is assessed to determine whether it is suitable for a building structure by checking whether it lies below or above a threshold value. In reality, however, such a crisp classification process is unlikely to be a true reflection of real-world scenarios. Using crisp methods, a difference of 1° in slope could result in one region of a dataset being deemed suitable and its neighboring region being seen as unsuitable, even though there is in reality likely little difference in the actual suitability of the two neighboring regions. A more suitable classification process may be the use of fuzzy set theory, whereby each region has a degree of membership in a number of sets (or classifications). In the above example, the two regions would likely have very similar membership values in the different sets, although this obviously depends on factors such as the extent of each region. The purpose of this study is to identify what effect the use of explicit boundary values has on the overall extracted building footprint dataset. By performing the analysis multiple times using differing threshold values for the rules, it is possible to compare the resulting datasets and thus identify the impact of such classification procedures. A significant difference between the resulting datasets would highlight that the use of crisp methods in the extraction process may not be optimal, and that a future enhancement of the method would be to consider fuzzy classification methods.
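
    The contrast between crisp and fuzzy rules can be made concrete in a few lines. In the sketch below, a hard slope threshold is replaced by a linear membership ramp, so neighboring regions with nearly equal slopes receive nearly equal suitability; the breakpoints are illustrative, not taken from the study.

    ```python
    import numpy as np

    def crisp_suitable(slope_deg, threshold=10.0):
        """Crisp rule: suitable iff slope is below the threshold."""
        return slope_deg < threshold

    def fuzzy_suitable(slope_deg, full=8.0, none=14.0):
        """Linear fuzzy membership: 1 below `full`, 0 above `none`, ramp between."""
        return np.clip((none - slope_deg) / (none - full), 0.0, 1.0)

    slopes = np.array([7.5, 9.5, 10.5, 13.0])
    print(crisp_suitable(slopes))   # [ True  True False False] - hard jump at 10
    print(fuzzy_suitable(slopes))   # about [1.  0.75  0.58  0.17] - graded
    ```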

  6. A Renovation Decision-Support Model for Evaluating the Functional Condition of Army Facilities

    DTIC Science & Technology

    1994-04-01

    Keywords: Buildings--Remodeling for other use; cost effectiveness; Army facilities; RENMOD. FOREWORD: This research was conducted for the Assistant Chief of Staff for ... it means any home improvement. To an economist, it is any investment designed to forestall the capital depreciation of a structure. To an architect ...

  7. Nursing Classification Systems

    PubMed Central

    Henry, Suzanne Bakken; Mead, Charles N.

    1997-01-01

    Abstract Our premise is that from the perspective of maximum flexibility of data usage by computer-based record (CPR) systems, existing nursing classification systems are necessary, but not sufficient, for representing important aspects of “what nurses do.” In particular, we have focused our attention on those classification systems that represent nurses' clinical activities through the abstraction of activities into categories of nursing interventions. In this theoretical paper, we argue that taxonomic, combinatorial vocabularies capable of coding atomic-level nursing activities are required to effectively capture in a reproducible and reversible manner the clinical decisions and actions of nurses, and that, without such vocabularies and associated grammars, potentially important clinical process data is lost during the encoding process. Existing nursing intervention classification systems do not fulfill these criteria. As background to our argument, we first present an overview of the content, methods, and evaluation criteria used in previous studies whose focus has been to evaluate the effectiveness of existing coding and classification systems. Next, using the Ingenerf typology of taxonomic vocabularies, we categorize the formal type and structure of three existing nursing intervention classification systems—Nursing Interventions Classification, Omaha System, and Home Health Care Classification. Third, we use records from home care patients to show examples of lossy data transformation, the loss of potentially significant atomic data, resulting from encoding using each of the three systems. Last, we provide an example of the application of a formal representation methodology (conceptual graphs) which we believe could be used as a model to build the required combinatorial, taxonomic vocabulary for representing nursing interventions. PMID:9147341

  8. Authentication of whisky due to its botanical origin and way of production by instrumental analysis and multivariate classification methods

    NASA Astrophysics Data System (ADS)

    Wiśniewska, Paulina; Boqué, Ricard; Borràs, Eva; Busto, Olga; Wardencki, Waldemar; Namieśnik, Jacek; Dymerski, Tomasz

    2017-02-01

    Headspace mass spectrometry (HS-MS), mid-infrared (MIR) and UV-vis spectroscopy were used to authenticate whisky samples of different origins and ways of production (Irish, Spanish, Bourbon, Tennessee Whisky and Scotch). The collected spectra were processed with partial least-squares discriminant analysis (PLS-DA) to build the classification models. In all cases the five groups of whiskies were distinguished, but the best results were obtained by HS-MS, which indicates that the biggest differences between whisky types lie in their aroma. Differences were also found within groups, showing that not only the raw material but also the way of production is important for discriminating samples. The methodology is quick, easy and does not require sample preparation.

  9. Local classification: Locally weighted-partial least squares-discriminant analysis (LW-PLS-DA).

    PubMed

    Bevilacqua, Marta; Marini, Federico

    2014-08-01

    The possibility of devising a simple, flexible and accurate non-linear classification method, by extending the locally weighted partial least squares (LW-PLS) approach to the cases where the algorithm is used in a discriminant way (partial least squares discriminant analysis, PLS-DA), is presented. In particular, to assess which category an unknown sample belongs to, the proposed algorithm operates by identifying which training objects are most similar to the one to be predicted and building a PLS-DA model using these calibration samples only. Moreover, the influence of the selected training samples on the local model can be further modulated by adopting a non-uniform distance-based weighting scheme which allows the farthest calibration objects to have less impact than the closest ones. The performance of the proposed locally weighted partial least squares discriminant analysis (LW-PLS-DA) algorithm was tested on three simulated data sets characterized by varying degrees of non-linearity: in all cases, a classification accuracy higher than 99% on external validation samples was achieved. Moreover, when applied to a real data set (classification of rice varieties) characterized by a high extent of non-linearity, the proposed method provided an average correct classification rate of about 93% on the test set. Based on the preliminary results shown in this paper, the performance of the proposed LW-PLS-DA approach proved to be comparable to, and in some cases better than, that obtained by other non-linear methods (k nearest neighbors, kernel-PLS-DA and, in the case of rice, counterpropagation neural networks). Copyright © 2014 Elsevier B.V. All rights reserved.
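
    A minimal sketch of the local modeling idea, assuming PLS-DA is implemented as PLS regression on a one-hot class matrix: for each query, the k most similar training samples are selected, distance-weighted (row scaling by the square root of the weights, which is our assumption rather than the paper's exact scheme), and a local model is fit on them alone.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    def lw_plsda_predict(x, X, Y, k=30, n_comp=2):
        """x: query vector; X: training data; Y: one-hot class matrix (float)."""
        d = np.linalg.norm(X - x, axis=1)            # similarity = distance to x
        idx = np.argsort(d)[:k]                      # k most similar samples
        w = 1.0 / (1.0 + d[idx])                     # closer samples weigh more
        sw = np.sqrt(w)[:, None]
        pls = PLSRegression(n_components=n_comp)
        pls.fit(X[idx] * sw, Y[idx] * sw)            # weighted local PLS-DA fit
        return np.argmax(pls.predict(x[None, :])[0]) # class with largest response
    ```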

  10. 1961-1968 New Construction Report.

    ERIC Educational Resources Information Center

    National Association of Physical Plant Administrators of Universities and Colleges, Richmond, IN.

    137 NAPPA colleges and universities provided data for this summary. Projects are summarized by thirteen building classifications. Under each classification the following information headings are used--(1) name of institution, (2) project completion date, (3) gross square feet, (4) net assignable area, (5) construction costs, (6) number of stories,…

  11. EEG-based Affect and Workload Recognition in a Virtual Driving Environment for ASD Intervention

    PubMed Central

    Wade, Joshua W.; Key, Alexandra P.; Warren, Zachary E.; Sarkar, Nilanjan

    2017-01-01

    Objective. To build group-level classification models capable of recognizing affective states and mental workload of individuals with autism spectrum disorder (ASD) during driving skill training. Methods. Twenty adolescents with ASD participated in a six-session experiment based on a virtual reality driving simulator, during which their electroencephalogram (EEG) data were recorded alongside driving events and a therapist's rating of their affective states and mental workload. Five feature generation approaches, including statistical features, fractal dimension features, higher order crossings (HOC)-based features, power features from frequency bands, and power features from bins (Δf = 2 Hz), were applied to extract relevant features. Individual differences were removed with a two-step feature calibration method. Finally, binary classification results based on the k-nearest neighbors algorithm and a univariate feature selection method were evaluated by leave-one-subject-out nested cross-validation to compare feature types and identify discriminative features. Results. The best classification results were achieved using power features from bins for engagement (0.95) and boredom (0.78), and HOC-based features for enjoyment (0.90), frustration (0.88), and workload (0.86). Conclusion. Offline EEG-based group-level classification models are feasible for recognizing binary low and high intensities of affect and workload of individuals with ASD in the context of driving. However, while promising, the models require further development before they can be applied in an online adaptive driving task. Significance. The developed models provide a basis for an EEG-based passive brain-computer interface system that has the potential to benefit individuals with ASD through an affect- and workload-based individualized driving skill training intervention. PMID:28422647
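
    The subject-independent evaluation protocol is worth seeing in code. The sketch below runs k-nearest-neighbors with univariate feature selection inside a leave-one-subject-out loop; the paper's nested variant adds an inner loop for hyperparameter tuning, omitted here, and the features, labels, and group ids are synthetic stand-ins.

    ```python
    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(4)
    X = rng.normal(size=(400, 60))           # stand-in EEG features per trial
    y = rng.integers(0, 2, size=400)         # e.g. low vs high engagement
    subjects = np.repeat(np.arange(20), 20)  # 20 trials per participant

    clf = make_pipeline(SelectKBest(f_classif, k=10),  # selection stays inside CV
                        KNeighborsClassifier(n_neighbors=5))
    scores = cross_val_score(clf, X, y, groups=subjects, cv=LeaveOneGroupOut())
    print(scores.mean())                     # chance-level on random stand-in data
    ```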

  12. Building Extraction from Remote Sensing Data Using Fully Convolutional Networks

    NASA Astrophysics Data System (ADS)

    Bittner, K.; Cui, S.; Reinartz, P.

    2017-05-01

    Building detection and footprint extraction are in high demand for many remote sensing applications. Though most previous works have shown promising results, the automatic extraction of building footprints remains a nontrivial topic, especially in complex urban areas. Recently developed extensions of the CNN framework have made it possible to perform dense pixel-wise classification of input images. Based on these abilities, we propose a methodology which automatically generates a full-resolution binary building mask from a Digital Surface Model (DSM) using a Fully Convolutional Network (FCN) architecture. The advantage of using depth information is that it provides geometrical silhouettes, allows a better separation of buildings from background, and is invariant to illumination and color variations. The proposed framework has two main steps. First, the FCN is trained on a large set of patches consisting of normalized DSMs (nDSM) as inputs and available ground-truth building masks as target outputs. Second, the predictions generated by the FCN are used as unary terms for a Fully connected Conditional Random Field (FCRF), which produces the final binary building mask. A series of experiments demonstrates that our methodology is able to extract accurate building footprints which are close to the buildings' original shapes to a high degree. The quantitative and qualitative analysis shows significant improvements of the results in contrast to the multi-layer fully connected network from our previous work.
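
    A compact sketch of the first step, assuming a toy encoder-decoder FCN mapping a one-channel nDSM patch to per-pixel building logits is an acceptable stand-in; the layer sizes are ours, and the FCRF refinement from the second step is omitted.

    ```python
    import torch
    import torch.nn as nn

    class TinyFCN(nn.Module):
        """Downsample-upsample FCN mapping a 1-channel nDSM to a building mask."""
        def __init__(self):
            super().__init__()
            self.encode = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            )
            self.decode = nn.Sequential(
                nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                nn.Conv2d(16, 1, 1),            # per-pixel building logit
            )

        def forward(self, x):
            return self.decode(self.encode(x))

    net = TinyFCN()
    ndsm = torch.randn(4, 1, 128, 128)          # batch of nDSM patches
    mask_logits = net(ndsm)                     # (4, 1, 128, 128) building logits
    # train with nn.BCEWithLogitsLoss() against ground-truth building masks
    ```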

  13. Towards Automatic Classification of Exoplanet-Transit-Like Signals: A Case Study on Kepler Mission Data

    NASA Astrophysics Data System (ADS)

    Valizadegan, Hamed; Martin, Rodney; McCauliff, Sean D.; Jenkins, Jon Michael; Catanzarite, Joseph; Oza, Nikunj C.

    2015-08-01

    Building new catalogues of planetary candidates, astrophysical false alarms, and non-transiting phenomena is a challenging task that currently requires a reviewing team of astrophysicists and astronomers. These scientists need to examine more than 100 diagnostic metrics and associated graphics for each candidate exoplanet-transit-like signal to classify it into one of the three classes. Considering that the NASA Explorer Program's TESS mission and ESA's PLATO mission will survey an even larger area of space, the classification of their transit-like signals will be more time-consuming for human agents and a bottleneck to constructing the new catalogues in a timely manner. This encourages the building of automatic classification tools that can quickly and reliably classify the new signal data from these missions. The standard tool for building automatic classification systems is supervised machine learning, which requires a large set of highly accurate labeled examples in order to build an effective classifier. This requirement cannot easily be met for classifying transit-like signals because not only are existing labeled signals very limited, but the current labels may also be unreliable (since labeling is a subjective task). Our experiments with using different supervised classifiers to categorize transit-like signals verify that the labeled signals are not rich enough to give the classifier the power to generalize well beyond the observed cases (e.g., to unseen or test signals). That motivated us to utilize a new category of learning techniques, so-called semi-supervised learning, which combines the label information from the costly labeled signals with distribution information from the cheaply available unlabeled signals in order to construct more effective classifiers. Our study on the Kepler Mission data shows that semi-supervised learning can significantly improve the results of multiple base classifiers (e.g., Support Vector Machines, AdaBoost, and Decision Trees) and is a good technique for the automatic classification of exoplanet-transit-like signals.
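
    The semi-supervised setup can be sketched with scikit-learn's self-training wrapper, one member of this category of techniques and not necessarily the authors' exact method: unlabeled signals are marked with -1 and the base classifier iteratively labels the confident ones. The data here are synthetic stand-ins for the diagnostic metrics.

    ```python
    import numpy as np
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(5)
    X = rng.normal(size=(1000, 12))          # diagnostic metrics per signal
    y = rng.integers(0, 3, size=1000)        # candidate / false alarm / non-transit
    mask = rng.random(1000) < 0.9            # pretend 90% of labels are unknown
    y_train = y.copy()
    y_train[mask] = -1                       # -1 marks unlabeled examples

    base = DecisionTreeClassifier(max_depth=5)
    model = SelfTrainingClassifier(base, threshold=0.8).fit(X, y_train)
    print(model.score(X[mask], y[mask]))     # evaluated on the held-back labels
    ```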

  14. High-rise architecture in Ufa, Russia, based on crystallography canons

    NASA Astrophysics Data System (ADS)

    Narimanovich Sabitov, Ildar; Radikovna Kudasheva, Dilara; Yaroslavovich Vdovin, Denis

    2018-03-01

    The article considers the fundamental steps in the formation of stylistic tendencies in high-rise architecture, based on the studies of C. Willis and M. A. Korotich. Crystallographic shaping is singled out as a direction on the basis of Korotich's classification. This direction is examined in detail, and the main aspects of high-rise architectural form-making based on the forming principles of natural polycrystals are identified. The article describes the transformation of crystal forms into an architectural composition, analyzes structural systems within the CTBUH (Council on Tall Buildings and Urban Habitat) classification, and identifies one of its types as the most suitable for use in crystal-like buildings. The last stage of the research is the approbation of the theoretical principles in an experimental design of a high-rise building in Ufa, with a description of the aspects of its contextual siting.

  15. Revision of seismic design codes corresponding to building damages in the ``5.12'' Wenchuan earthquake

    NASA Astrophysics Data System (ADS)

    Wang, Yayong

    2010-06-01

    A large number of buildings were seriously damaged or collapsed in the “5.12” Wenchuan earthquake. Based on field surveys and studies of damage to different types of buildings, seismic design codes have been updated. This paper briefly summarizes some of the major revisions that have been incorporated into the “Standard for classification of seismic protection of building constructions GB50223-2008” and “Code for Seismic Design of Buildings GB50011-2001.” The definition of seismic fortification class for buildings has been revisited, and as a result, the seismic classifications for schools, hospitals and other buildings that hold large populations such as evacuation shelters and information centers have been upgraded in the GB50223-2008 Code. The main aspects of the revised GB50011-2001 code include: (a) modification of the seismic intensity specified for the Provinces of Sichuan, Shanxi and Gansu; (b) basic conceptual design for retaining walls and building foundations in mountainous areas; (c) regularity of building configuration; (d) integration of masonry structures and pre-cast RC floors; (e) requirements for calculating and detailing stair shafts; and (f) limiting the use of single-bay RC frame structures. Some significant examples of damage in the epicenter areas are provided as a reference in the discussion on the consequences of collapse, the importance of duplicate structural systems, and the integration of RC and masonry structures.

  16. Low-power wearable respiratory sound sensing.

    PubMed

    Oletic, Dinko; Arsenali, Bruno; Bilas, Vedran

    2014-04-09

    Building upon the findings from the field of automated recognition of respiratory sound patterns, we propose a wearable wireless sensor implementing on-board respiratory sound acquisition and classification, to enable continuous monitoring of symptoms, such as asthmatic wheezing. Low-power consumption of such a sensor is required in order to achieve long autonomy. Considering that the power consumption of its radio is kept minimal if transmitting only upon (rare) occurrences of wheezing, we focus on optimizing the power consumption of the digital signal processor (DSP). Based on a comprehensive review of asthmatic wheeze detection algorithms, we analyze the computational complexity of common features drawn from short-time Fourier transform (STFT) and decision tree classification. Four algorithms were implemented on a low-power TMS320C5505 DSP. Their classification accuracies were evaluated on a dataset of prerecorded respiratory sounds in two operating scenarios of different detection fidelities. The execution times of all algorithms were measured. The best classification accuracy of over 92%, while occupying only 2.6% of the DSP's processing time, is obtained for the algorithm featuring the time-frequency tracking of shapes of crests originating from wheezing, with spectral features modeled using energy.
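
    As an illustration of the kind of low-complexity front end the abstract reviews, the sketch below derives log band energies from an STFT and leaves classification to a shallow decision tree; the band edges, frame length, and tree depth are our illustrative choices (a coarse emphasis on the roughly 100-1000 Hz wheeze range), not the algorithms implemented on the TMS320C5505.

    ```python
    import numpy as np
    from scipy.signal import stft
    from sklearn.tree import DecisionTreeClassifier

    def frame_features(x, fs=4000):
        """Per-frame log band energies from the STFT of a respiratory sound x."""
        f, t, Z = stft(x, fs=fs, nperseg=256)
        power = np.abs(Z) ** 2
        bands = [(100, 400), (400, 700), (700, 1000)]    # coarse wheeze bands
        feats = [power[(f >= lo) & (f < hi)].sum(axis=0) for lo, hi in bands]
        return np.log(np.stack(feats, axis=1) + 1e-12)   # (n_frames, n_bands)

    # training: stack features from labeled wheeze / non-wheeze recordings
    # X = np.vstack([frame_features(x) for x in recordings])
    # tree = DecisionTreeClassifier(max_depth=4).fit(X, frame_labels)
    ```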

  17. Structural classification of proteins using texture descriptors extracted from the cellular automata image.

    PubMed

    Kavianpour, Hamidreza; Vasighi, Mahdi

    2017-02-01

    Nowadays, knowledge about the cellular attributes of proteins plays an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of a protein's structural class is used by various methods for better understanding its functionality and folding patterns. Computational methods and intelligent systems can play an important role in the structural classification of proteins. Most protein sequences are stored in databanks as character strings, and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acid alphabets according to a surrounding hydrophobicity index. Many important features hidden in these long binary sequences can be clearly displayed through their cellular automata images. The features extracted from these images are used to build a classification model with a support vector machine. Compared to previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help reveal inherent features deeply hidden in protein sequences and improve the quality of predicting protein structural class.

  18. Comparing supervised and unsupervised multiresolution segmentation approaches for extracting buildings from very high resolution imagery.

    PubMed

    Belgiu, Mariana; Drăguţ, Lucian

    2014-10-01

    Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.

  19. Mapping land cover in urban residential landscapes using fine resolution imagery and object-oriented classification

    USDA-ARS?s Scientific Manuscript database

    A knowledge of different types of land cover in urban residential landscapes is important for building social and economic city-wide policies including landscape ordinances and water conservation programs. Urban landscapes are typically heterogeneous, so classification of land cover in these areas ...

  20. Classification of suicide attempters in schizophrenia using sociocultural and clinical features: A machine learning approach.

    PubMed

    Hettige, Nuwan C; Nguyen, Thai Binh; Yuan, Chen; Rajakulendran, Thanara; Baddour, Jermeen; Bhagwat, Nikhil; Bani-Fatemi, Ali; Voineskos, Aristotle N; Mallar Chakravarty, M; De Luca, Vincenzo

    2017-07-01

    Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently, it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability that a patient with schizophrenia will attempt suicide. We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms, using regularized regression, random forest, elastic net and support vector machine models, with sociocultural and clinical variables as features to train the models. All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. The support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated an accuracy of 65% and an AUC of 0.71. Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts. Copyright © 2017 Elsevier Inc. All rights reserved.
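
    A minimal sketch of the four-model comparison, assuming scikit-learn and synthetic stand-ins for the sociocultural and clinical features; elastic net is expressed here as a logistic regression with an elastic-net penalty.

    ```python
    # Hedged sketch: four classifiers scored by accuracy and AUC via
    # cross-validated probability estimates on placeholder data.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import accuracy_score, roc_auc_score

    X, y = make_classification(n_samples=345, n_features=20, random_state=0)
    models = {
        "ridge logistic": LogisticRegression(penalty="l2", max_iter=1000),
        "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "elastic net": LogisticRegression(penalty="elasticnet", solver="saga",
                                          l1_ratio=0.5, max_iter=5000),
        "SVC": SVC(probability=True),
    }
    for name, m in models.items():
        proba = cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
        print(f"{name}: acc={accuracy_score(y, proba > 0.5):.2f}, "
              f"AUC={roc_auc_score(y, proba):.2f}")
    ```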

  1. Raster Vs. Point Cloud LiDAR Data Classification

    NASA Astrophysics Data System (ADS)

    El-Ashmawy, N.; Shaker, A.

    2014-09-01

    Airborne laser scanning systems using light detection and ranging (LiDAR) technology are among the fastest and most accurate 3D point data acquisition techniques. Generating accurate digital terrain and/or surface models (DTM/DSM) is the main application of collecting LiDAR range data. Recently, LiDAR range and intensity data have also been used for land cover classification. Range and intensity (the strength of the backscattered signals measured by the LiDAR system) are affected by the flying height, the ground elevation, the scanning angle and the physical characteristics of the object surfaces. These effects may lead to an uneven distribution of the point cloud, or to gaps that may affect the classification process. Researchers have investigated the conversion of LiDAR range point data to raster images for terrain modelling. Interpolation techniques have been used to achieve the best representation of surfaces and to fill the gaps between the LiDAR footprints. Interpolation methods have also been investigated for generating LiDAR range and intensity image data for land cover classification. In this paper, a different approach is followed to classify the LiDAR data (range and intensity) for land cover mapping. The methodology relies on classifying the point cloud data based on range and intensity and then converting the classified points into a raster image. The gaps in the data are filled based on the classes of the nearest neighbours. Land cover maps are produced using two approaches: (a) the conventional raster image data based on point interpolation; and (b) the proposed point data classification. A study area covering an urban district in Burnaby, British Columbia, Canada, was selected to compare the results of the two approaches. Five different land cover classes can be distinguished in that area: buildings; roads and parking areas; trees; low vegetation (grass); and bare soil. The results show that an improvement of around 10% in the classification results can be achieved by using the proposed approach.
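
    The point-first workflow can be sketched as below, with synthetic points, an illustrative threshold rule in place of the actual point classifier, and a nearest-neighbour gap fill on the rasterized labels.

    ```python
    # Hedged sketch: label LiDAR returns from height/intensity, rasterize
    # the labels, then fill empty cells from the nearest classified point.
    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(1)
    xy = rng.uniform(0, 100, size=(5000, 2))   # point positions (m)
    height = rng.uniform(0, 20, size=5000)     # range-derived height
    intensity = rng.uniform(0, 1, size=5000)

    # Placeholder point classifier: 0 ground, 1 vegetation, 2 building.
    labels = np.where(height < 0.5, 0, np.where(intensity > 0.5, 1, 2))

    # Rasterize at 1 m: each cell takes the label of one of its points.
    grid = np.full((100, 100), -1)
    ij = np.clip(xy.astype(int), 0, 99)
    grid[ij[:, 1], ij[:, 0]] = labels

    # Fill gaps from the nearest classified point (nearest-neighbour fill).
    gaps = np.argwhere(grid == -1)
    if len(gaps):
        tree = cKDTree(xy)
        _, idx = tree.query(gaps[:, ::-1] + 0.5)  # cell centres, (x, y) order
        grid[gaps[:, 0], gaps[:, 1]] = labels[idx]
    ```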

  2. IET. Tank building (TAN627). Plans, elevation, details. shows position of ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    IET. Tank building (TAN-627). Plans, elevation, details. Shows position of tanks within building and concrete supports. Ralph M. Parsons 902-4-ANP-627-A&S 420. Date: February 1954. Approved by INEEL Classification Office for public release. INEEL index code no. 035-0627-00-693-106975 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  3. Non-linear molecular pattern classification using molecular beacons with multiple targets.

    PubMed

    Lee, In-Hee; Lee, Seung Hwan; Park, Tai Hyun; Zhang, Byoung-Tak

    2013-12-01

    In vitro pattern classification has been highlighted as an important future application of DNA computing. Previous work has demonstrated the feasibility of linear classifiers using DNA-based molecular computing. However, complex tasks require non-linear classification capability. Here we design a molecular beacon that can interact with multiple targets and experimentally show that its fluorescent signals form a complex radial-basis function, enabling it to be used as a building block for non-linear molecular classification in vitro. The proposed method was successfully applied to solving artificial and real-world classification problems: XOR and microRNA expression patterns. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  4. Exposure to traffic pollution: comparison between measurements and a model.

    PubMed

    Alili, F; Momas, I; Callais, F; Le Moullec, Y; Sacre, C; Chiron, M; Flori, J P

    2001-01-01

    French researchers from the Building Scientific and Technical Center have produced a traffic-exposure index. To achieve this, they used an air pollution dispersion model that enabled them to calculate automobile pollutant concentrations in front of subjects' residences and places of work. The researchers tested this model at 27 Paris canyon street sites, comparing nitrogen oxide measurements obtained with passive samplers during a 6-wk period with calculations derived from the model. There was a highly significant correlation (r = .83) between the 2 series of values; their mean concentrations were not significantly different. The results suggested that the aforementioned model could be a useful epidemiological tool for the classification of city dwellers by present, or even cumulative, exposure to automobile air pollution.

  5. Intrusion detection using rough set classification.

    PubMed

    Zhang, Lian-hua; Zhang, Guan-hua; Zhang, Jie; Bai, Ying-cai

    2004-09-01

    Recently, machine learning-based intrusion detection approaches have been the subject of extensive research because they can detect both misuse and anomalies. In this paper, rough set classification (RSC), a modern learning algorithm, is used to rank the features extracted for detecting intrusions and to generate intrusion detection models. Feature ranking is a very critical step when building the model. RSC performs feature ranking before generating rules, converting the feature ranking task into a minimal hitting set problem that is addressed using a genetic algorithm (GA). In classical approaches using Support Vector Machines (SVM), this is done by executing many iterations, each of which removes one useless feature. Compared with those methods, our method avoids many iterations. In addition, a hybrid genetic algorithm is proposed to increase the convergence speed and decrease the training time of RSC. The models generated by RSC take the form of "IF-THEN" rules, which have the advantage of being explainable. Tests and comparison of RSC with SVM on DARPA benchmark data showed that for Probe and DoS attacks both RSC and SVM yielded highly accurate results (greater than 99% accuracy on the testing set).

  6. Analysis of x-ray hand images for bone age assessment

    NASA Astrophysics Data System (ADS)

    Serrat, Joan; Vitria, Jordi M.; Villanueva, Juan J.

    1990-09-01

    In this paper we describe a model-based system for the assessment of skeletal maturity on hand radiographs by the TW2 method. The problem consists in classifying a set of bones appearing in an image into one of several stages described in an atlas. A first approach, consisting of independent pre-processing, segmentation and classification phases, is also presented. However, it is only well suited for well-contrasted, low-noise images without superimposed bones, where edge detection by zero crossings of second directional derivatives is able to extract all bone contours, albeit with small gaps and a few false edges on the background. Hence, the use of all available knowledge about the problem domain is needed to build a reasonably general system. We have designed a rule-based system to narrow down the range of possible stages for each bone and to guide the analysis process. It calls procedures written in conventional languages for matching stage models against the image and extracting the features needed in the classification process.

  7. Raman spectroscopy coupled with advanced statistics for differentiating menstrual and peripheral blood.

    PubMed

    Sikirzhytskaya, Aliaksandra; Sikirzhytski, Vitali; Lednev, Igor K

    2014-01-01

    Body fluids are a common and important type of forensic evidence. In particular, the identification of menstrual blood stains is often a key step during the investigation of rape cases. Here, we report on the application of near-infrared Raman microspectroscopy for differentiating menstrual blood from peripheral blood. We observed that the menstrual and peripheral blood samples have similar but distinct Raman spectra. Advanced statistical analysis of the multiple Raman spectra that were automatically acquired (Raman mapping) from the 40 dried blood stains (20 donors for each group) allowed us to build a classification model with maximum (100%) sensitivity and specificity. We also demonstrated that, despite certain common constituents, menstrual blood can be readily distinguished from vaginal fluid. All of the classification models were verified using cross-validation methods. The proposed method overcomes the problems associated with currently used biochemical methods, which are destructive, time consuming and expensive. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Application of the Coastal and Marine Ecological Classification Standard Using Satellite-Derived and Modeled Data Products for Pelagic Habitats in the Northern Gulf of Mexico

    DTIC Science & Technology

    2013-12-10

    intertidal vegetation. Comments from resource managers requested products incorporating bathymetry and sediment data. To further build on the... and availability of intertidal vegetation are other key factors in successful movement into the estuary for brown shrimp; both of these data were... distribution of intertidal vegetation. The NWI classes EEM1 and EEM2 are the two classes into which intertidal vegetation falls in Galveston. On the ground

  9. Classification of buildings mold threat using electronic nose

    NASA Astrophysics Data System (ADS)

    Łagód, Grzegorz; Suchorab, Zbigniew; Guz, Łukasz; Sobczuk, Henryk

    2017-07-01

    Mold is considered one of the most important features of Sick Building Syndrome and is a significant problem in the current building industry. In many cases it is caused by rising moisture on the surfaces of building envelopes and by excessive humidity of indoor air. In historical buildings it is mostly caused by outdated construction techniques, among them the absence of horizontal insulation against moisture and the hygroscopic materials used for construction. Recent buildings also face mold risk, in many cases caused by airtight construction, which leads to improper performance of gravity ventilation systems and creates suitable conditions for mold development. Based on our research, we propose a method for classifying the mold threat of buildings using an electronic nose built on a gas sensor array of MOS (metal oxide semiconductor) sensors. Such devices are frequently applied for air quality assessment in environmental engineering. The presented results show the interpretation of e-nose readouts of indoor air sampled in rooms threatened by mold development, in comparison with clean reference rooms and synthetic air. The obtained multivariate data were processed, visualized and classified using PCA (Principal Component Analysis) and ANN (Artificial Neural Network) methods. The described investigation confirmed that an electronic nose, i.e. a gas sensor array supported by data processing, makes it possible to classify air samples taken from different rooms affected by mold.
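
    A minimal sketch of the processing chain, assuming scikit-learn and random stand-ins for the MOS sensor array readouts: standardize, reduce with PCA, then classify with a small neural network.

    ```python
    # Hedged sketch: PCA for dimensionality reduction of e-nose readouts,
    # then an MLP classifier separating mould-affected from reference air.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(60, 8))     # 60 air samples x 8 MOS sensors
    y = np.repeat([0, 1], 30)        # 0 = reference room, 1 = mould-affected

    clf = make_pipeline(StandardScaler(), PCA(n_components=3),
                        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                      random_state=0))
    print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
    ```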

  10. On the difficulty to delimit disease risk hot spots

    NASA Astrophysics Data System (ADS)

    Charras-Garrido, M.; Azizi, L.; Forbes, F.; Doyle, S.; Peyrard, N.; Abrial, D.

    2013-06-01

    Representing the health state of a region is a helpful tool to highlight spatial heterogeneity and localize high-risk areas. For ease of interpretation and to determine where to apply control procedures, we need to clearly identify and delineate homogeneous regions in terms of disease risk, and in particular disease risk hot spots. However, even if practical purposes require the delineation of different risk classes, such a classification does not correspond to a reality and is thus difficult to estimate. Working with grouped data, a first natural choice is to apply disease mapping models. We apply a standard disease mapping model, producing continuous estimates of the risks, which requires a post-processing classification step to obtain clearly delimited risk zones. We also apply a risk partition model that builds a classification of the risk levels in a one-step procedure. Working with point data, we focus on the scan statistic clustering method. We illustrate our article with a real example concerning bovine spongiform encephalopathy (BSE), an animal disease whose zones at risk are well known to epidemiologists. We show that in this difficult case of a rare disease and a very heterogeneous population, the different methods provide risk zones that are globally coherent. But, reflecting the dichotomy between the need and the reality, the exact delimitation of the risk zones, as well as the corresponding estimated risks, are quite different.

  11. GPS-based microenvironment tracker (MicroTrac) model to estimate time–location of individuals for air pollution exposure assessments: Model evaluation in central North Carolina

    PubMed Central

    Breen, Michael S.; Long, Thomas C.; Schultz, Bradley D.; Crooks, James; Breen, Miyuki; Langstaff, John E.; Isaacs, Kristin K.; Tan, Yu-Mei; Williams, Ronald W.; Cao, Ye; Geller, Andrew M.; Devlin, Robert B.; Batterman, Stuart A.; Buckley, Timothy J.

    2014-01-01

    A critical aspect of air pollution exposure assessment is the estimation of the time spent by individuals in various microenvironments (ME). Accounting for the time spent in different ME with different pollutant concentrations can reduce exposure misclassifications, while failure to do so can add uncertainty and bias to risk estimates. In this study, a classification model, called MicroTrac, was developed to estimate time of day and duration spent in eight ME (indoors and outdoors at home, work, school; inside vehicles; other locations) from global positioning system (GPS) data and geocoded building boundaries. Based on a panel study, MicroTrac estimates were compared with 24-h diary data from nine participants, with corresponding GPS data and building boundaries of home, school, and work. MicroTrac correctly classified the ME for 99.5% of the daily time spent by the participants. The capability of MicroTrac could help to reduce the time–location uncertainty in air pollution exposure models and exposure metrics for individuals in health studies. PMID:24619294

  12. Discrimination of raw and processed Dipsacus asperoides by near infrared spectroscopy combined with least squares-support vector machine and random forests

    NASA Astrophysics Data System (ADS)

    Xin, Ni; Gu, Xiao-Feng; Wu, Hao; Hu, Yu-Zhu; Yang, Zhong-Lin

    2012-04-01

    Most herbal medicines can be processed to fulfil the different requirements of therapy. The purpose of this study was to discriminate between raw and processed Dipsacus asperoides, a common traditional Chinese medicine, based on their near infrared (NIR) spectra. Least squares-support vector machine (LS-SVM) and random forests (RF) were employed for full-spectrum classification. Three types of kernels, including the linear kernel, polynomial kernel and radial basis function (RBF) kernel, were checked for optimization of the LS-SVM model. For comparison, a linear discriminant analysis (LDA) model was built for classification, and the successive projections algorithm (SPA) was executed prior to building the LDA model to choose an appropriate subset of wavelengths. The three methods were applied to a dataset containing 40 raw herbs and 40 corresponding processed herbs. We ran 50 runs of 10-fold cross validation to evaluate the models' efficiency. The performance of the LS-SVM with the RBF kernel (RBF LS-SVM) was better than that with the other two kernels. The RF, RBF LS-SVM and SPA-LDA models successfully classified all test samples. The mean error rates for the 50 runs of 10-fold cross validation were 1.35% for RBF LS-SVM, 2.87% for RF, and 2.50% for SPA-LDA. The best classification results were obtained by using the LS-SVM with the RBF kernel, while RF was fastest in training and making predictions.
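
    A hedged sketch of the kernel comparison: scikit-learn has no LS-SVM, so a standard SVC stands in, and the 50 runs of 10-fold cross-validation mirror the evaluation protocol; the spectra are random placeholders for NIR data.

    ```python
    # Hedged sketch: compare SVM kernels with repeated 10-fold CV.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

    rng = np.random.default_rng(3)
    X = rng.normal(size=(80, 50))   # 80 herb samples x 50 NIR variables
    y = np.repeat([0, 1], 40)       # raw vs. processed

    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=50, random_state=0)
    for kernel in ("linear", "poly", "rbf"):
        err = 1 - cross_val_score(SVC(kernel=kernel), X, y, cv=cv).mean()
        print(f"{kernel}: mean error rate {err:.2%}")
    ```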

  13. In silico prediction of multiple-category classification model for cytochrome P450 inhibitors and non-inhibitors using machine-learning method.

    PubMed

    Lee, J H; Basith, S; Cui, M; Kim, B; Choi, S

    2017-10-01

    The cytochrome P450 (CYP) enzyme superfamily is involved in phase I metabolism, which chemically modifies a variety of substrates via oxidative reactions to make them more water-soluble and easier to eliminate. Inhibition of these enzymes leads to undesirable effects, including toxic drug accumulation and adverse drug-drug interactions. Hence, it is necessary to develop in silico models that can predict the inhibition potential of compounds for different CYP isoforms. This study focused on five major CYP isoforms, CYP1A2, 2C9, 2C19, 2D6 and 3A4, which are responsible for more than 90% of the metabolism of clinical drugs. The main aim of this study is to develop a multiple-category classification model (MCM) for the major CYP isoforms using a Laplacian-modified naïve Bayesian method. The dataset, composed of more than 4500 compounds, was collected from the PubChem Bioassay database. VolSurf+ descriptors and the FCFP_8 fingerprint were used as input features to build the classification models. The results demonstrated that the MCM developed using the Laplacian-modified naïve Bayesian method successfully classified inhibitors and non-inhibitors for each CYP isoform. Moreover, the accuracy, sensitivity and specificity values for both training and test sets were above 80%, with satisfactory area under the receiver operating characteristic curve and Matthews correlation coefficient values.

  14. Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: machine-learning algorithms and validation using national health data from Kuwait—a cohort study

    PubMed Central

    Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse

    2013-01-01

    Objective We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (natives vs expatriate migrants) and of using regional data in risk assessment. Design Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Setting Kuwait Health Network (KHN), which integrates data from primary health centres and hospitals in Kuwait. Participants 270 172 hospital visitors (of whom 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives and Asian and Arab expatriates. Outcome measures Incident type 2 diabetes, hypertension and comorbidity. Results Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign 'high' risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are assigned 'low' risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in the hypertensive population, and those of hypertension, are modelled. Two-stage aggregate classification models and risk assessment tools, built by combining both component models on diabetes (or on hypertension), perform better than the individual models. Conclusions Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case–control models to assess risks. These tools aid in the preliminary non-intrusive assessment of the population. Ethnicity is seen to be significant in the predictive models. Risk assessments need to be developed using regional data, as we demonstrate by applying the American Diabetes Association online calculator to data from Kuwait. PMID:23676796
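
    The risk assessment idea can be sketched with a k-NN classifier whose neighbourhood vote is read as a risk band rather than a hard label; the thresholds, features and data below are illustrative assumptions.

    ```python
    # Hedged sketch: k-NN class probabilities mapped to risk bands.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
    knn = KNeighborsClassifier(n_neighbors=15).fit(X[:800], y[:800])

    proba = knn.predict_proba(X[800:])[:, 1]       # neighbourhood vote
    risk = np.select([proba >= 0.7, proba >= 0.3], ["high", "medium"], "low")
    print(list(zip(proba[:5].round(2), risk[:5])))
    ```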

  15. Intelligent quotient estimation of mental retarded people from different psychometric instruments using artificial neural networks.

    PubMed

    Di Nuovo, Alessandro G; Di Nuovo, Santo; Buono, Serafino

    2012-02-01

    The estimation of a person's intelligence quotient (IQ) by means of psychometric tests is indispensable in the application of psychological assessment to several fields. When complex tests such as the Wechsler scales, which are the most commonly used and universally recognized instruments for the diagnosis of degrees of retardation, are not applicable, it is necessary to use other psycho-diagnostic tools better suited to the subject's specific condition. But to ensure a homogeneous diagnosis it is necessary to reach a common metric; thus, the aim of our work is to build models able to estimate the Wechsler IQ accurately and reliably, starting from different psycho-diagnostic tools. Four different psychometric tests (the Leiter international performance scale; the coloured progressive matrices test; the mental development scale; the psycho-educational profile), along with the Wechsler scale, were administered to a group of 40 mentally retarded subjects with various pathologies, and to control persons. The obtained database is used to evaluate Wechsler IQ estimation models starting from the scores obtained in the other tests. Five modelling methods, two statistical and three machine learning methods belonging to the family of artificial neural networks (ANNs), are employed to build the estimator. Several error metrics for estimated IQ and for retardation level classification are defined to compare the performance of the various models with univariate and multivariate analyses. Eight empirical studies show that, after ten-fold cross-validation, the best average estimation error is 3.37 IQ points and the mental retardation level classification error is 7.5%. Furthermore, our experiments prove the superior performance of ANN methods over statistical regression, because in all cases considered the ANN models show the lowest estimation error (from 0.12 to 0.9 IQ points) and the lowest classification error (from 2.5% to 10%). Since the estimation performance is better than the confidence interval of the Wechsler scales (five IQ points), we consider the models built to be accurate and reliable, and they can be used to help clinical diagnosis. Accordingly, computer software based on the results of our work is currently used in a clinical centre, and empirical trials confirm its validity. Furthermore, positive results in our multivariate studies suggest new approaches for clinicians. Copyright © 2011 Elsevier B.V. All rights reserved.
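
    A minimal sketch of the estimation task, assuming scikit-learn: a small neural network maps scores from other instruments to an estimated Wechsler IQ, with mean absolute error from ten-fold cross-validation; the data and network size are placeholders.

    ```python
    # Hedged sketch: ANN regression of Wechsler IQ from other test scores.
    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)
    X = rng.normal(size=(40, 4))                 # scores on four other tests
    iq = 70 + X @ np.array([5.0, 3.0, 2.0, 1.0]) + rng.normal(0, 2, 40)

    mae = -cross_val_score(MLPRegressor(hidden_layer_sizes=(8,),
                                        max_iter=5000, random_state=0),
                           X, iq, cv=10, scoring="neg_mean_absolute_error")
    print(f"mean absolute IQ error: {mae.mean():.2f} points")
    ```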

  16. Fully Convolutional Network Based Shadow Extraction from GF-2 Imagery

    NASA Astrophysics Data System (ADS)

    Li, Z.; Cai, G.; Ren, H.

    2018-04-01

    There are many shadows in high spatial resolution satellite images, especially in urban areas. Although shadows on imagery severely affect the information extraction of land cover or land use, they provide auxiliary information for building extraction, which is hard to achieve with satisfactory accuracy through image classification alone. This paper focuses on building shadow extraction by designing a fully convolutional network and training it with samples collected from GF-2 satellite imagery over the urban region of Changchun city. By means of spatial filtering and the calculation of adjacency relationships along the sunlight direction, small patches from vegetation or bridges were eliminated from the preliminarily extracted shadows. Finally, the building shadows were separated. The building shadow information extracted by the proposed method was compared with the results of traditional object-oriented supervised classification algorithms. The comparison showed that the deep learning network approach can improve the accuracy to a large extent.

  17. Object-oriented regression for building predictive models with high dimensional omics data from translational studies.

    PubMed

    Zhao, Lue Ping; Bolouri, Hamid

    2016-04-01

    Maturing omics technologies enable researchers to generate high dimension omics data (HDOD) routinely in translational clinical studies. In the field of oncology, The Cancer Genome Atlas (TCGA) provided funding support to researchers to generate different types of omics data on a common set of biospecimens with accompanying clinical data, and has made the data available for the research community to mine. One important application, and the focus of this manuscript, is to build predictive models for prognostic outcomes based on HDOD. To complement prevailing regression-based approaches, we propose to use an object-oriented regression (OOR) methodology to identify exemplars specified by HDOD patterns and to assess their associations with prognostic outcome. Through computing a patient's similarities to these exemplars, the OOR-based predictive model produces a risk estimate using the patient's HDOD. The primary advantages of OOR are twofold: reducing the penalty of high dimensionality and retaining interpretability for clinical practitioners. To illustrate its utility, we apply OOR to gene expression data from non-small cell lung cancer patients in TCGA and build a predictive model for prognostic survivorship among stage I patients, i.e., we stratify these patients by their prognostic survival risks beyond histological classifications. Identification of these high-risk patients helps oncologists to develop effective treatment protocols and post-treatment disease management plans. Using the TCGA data, the total sample is divided into training and validation data sets. After building up a predictive model in the training set, we compute risk scores from the predictive model, and validate associations of risk scores with prognostic outcome in the validation data (P-value=0.015). Copyright © 2016 Elsevier Inc. All rights reserved.

  18. Object-Oriented Regression for Building Predictive Models with High Dimensional Omics Data from Translational Studies

    PubMed Central

    Zhao, Lue Ping; Bolouri, Hamid

    2016-01-01

    Maturing omics technologies enable researchers to generate high dimension omics data (HDOD) routinely in translational clinical studies. In the field of oncology, The Cancer Genome Atlas (TCGA) provided funding support to researchers to generate different types of omics data on a common set of biospecimens with accompanying clinical data and to make the data available for the research community to mine. One important application, and the focus of this manuscript, is to build predictive models for prognostic outcomes based on HDOD. To complement prevailing regression-based approaches, we propose to use an object-oriented regression (OOR) methodology to identify exemplars specified by HDOD patterns and to assess their associations with prognostic outcome. Through computing a patient’s similarities to these exemplars, the OOR-based predictive model produces a risk estimate using the patient’s HDOD. The primary advantages of OOR are twofold: reducing the penalty of high dimensionality and retaining interpretability for clinical practitioners. To illustrate its utility, we apply OOR to gene expression data from non-small cell lung cancer patients in TCGA and build a predictive model for prognostic survivorship among stage I patients, i.e., we stratify these patients by their prognostic survival risks beyond histological classifications. Identification of these high-risk patients helps oncologists to develop effective treatment protocols and post-treatment disease management plans. Using the TCGA data, the total sample is divided into training and validation data sets. After building up a predictive model in the training set, we compute risk scores from the predictive model, and validate associations of risk scores with prognostic outcome in the validation data (p=0.015). PMID:26972839

  19. A three-way approach for protein function classification

    PubMed Central

    2017-01-01

    The knowledge of protein functions plays an essential role in understanding biological cells and has a significant impact on human life in areas such as personalized medicine, better crops and improved therapeutic interventions. Due to the expense and inherent difficulty of biological experiments, intelligent methods are generally relied upon for automatic assignment of functions to proteins. The technological advancements in the field of biology are improving our understanding of biological processes and are regularly resulting in new features and characteristics that better describe the role of proteins. Such anticipated features cannot be neglected or overlooked when designing more effective classification techniques. A key issue in this context, which is not being sufficiently addressed, is how to build effective classification models and approaches for protein function prediction by incorporating and taking advantage of the ever-evolving biological information. In this article, we propose a three-way decision making approach which makes provision for seeking and incorporating future information. We considered probabilistic rough set based models such as Game-Theoretic Rough Sets (GTRS) and Information-Theoretic Rough Sets (ITRS) for inducing three-way decisions. An architecture for protein function classification with probabilistic rough set based three-way decisions is proposed and explained. Experiments were carried out on the Saccharomyces cerevisiae species dataset obtained from the Uniprot database, with the corresponding functional classes extracted from the Gene Ontology (GO) database. The results indicate that as the level of biological information increases, the number of deferred cases is reduced while maintaining a similar level of accuracy. PMID:28234929

  20. A hierarchical anatomical classification schema for prediction of phenotypic side effects

    PubMed Central

    Kanji, Rakesh

    2018-01-01

    Prediction of adverse drug reactions is an important problem in drug discovery endeavors which can be addressed with data-driven strategies. SIDER is one of the most reliable and frequently used datasets for identification of key features as well as for building machine learning models for side effects prediction. The inherently unbalanced nature of this data presents a difficult multi-label, multi-class problem for the prediction of drug side effects. We highlight the intrinsic issues with SIDER data and the methodological flaws in relying on performance measures such as AUC while attempting to predict side effects. We argue for the use of metrics that are robust to class imbalance for the evaluation of classifiers. Importantly, we present a ‘hierarchical anatomical classification schema’ which aggregates side effects into organs, sub-systems, and systems. With the help of a weighted performance measure, using 5-fold cross-validation we show that this strategy facilitates biologically meaningful side effects prediction at different levels of the anatomical hierarchy. By implementing various machine learning classifiers we show that the Random Forest model yields the best classification accuracy at each level of coarse-graining. The manually curated, hierarchical schema for side effects can also serve as the basis of future studies towards prediction of adverse reactions and identification of key features linked to specific organ systems. Our study provides a strategy for hierarchical classification of side effects rooted in the anatomy and can pave the way for calibrated expert systems for multi-level prediction of side effects. PMID:29494708

  1. A three-way approach for protein function classification.

    PubMed

    Ur Rehman, Hafeez; Azam, Nouman; Yao, JingTao; Benso, Alfredo

    2017-01-01

    The knowledge of protein functions plays an essential role in understanding biological cells and has a significant impact on human life in areas such as personalized medicine, better crops and improved therapeutic interventions. Due to the expense and inherent difficulty of biological experiments, intelligent methods are generally relied upon for automatic assignment of functions to proteins. The technological advancements in the field of biology are improving our understanding of biological processes and are regularly resulting in new features and characteristics that better describe the role of proteins. Such anticipated features cannot be neglected or overlooked when designing more effective classification techniques. A key issue in this context, which is not being sufficiently addressed, is how to build effective classification models and approaches for protein function prediction by incorporating and taking advantage of the ever-evolving biological information. In this article, we propose a three-way decision making approach which makes provision for seeking and incorporating future information. We considered probabilistic rough set based models such as Game-Theoretic Rough Sets (GTRS) and Information-Theoretic Rough Sets (ITRS) for inducing three-way decisions. An architecture for protein function classification with probabilistic rough set based three-way decisions is proposed and explained. Experiments were carried out on the Saccharomyces cerevisiae species dataset obtained from the Uniprot database, with the corresponding functional classes extracted from the Gene Ontology (GO) database. The results indicate that as the level of biological information increases, the number of deferred cases is reduced while maintaining a similar level of accuracy.

  2. A hierarchical anatomical classification schema for prediction of phenotypic side effects.

    PubMed

    Wadhwa, Somin; Gupta, Aishwarya; Dokania, Shubham; Kanji, Rakesh; Bagler, Ganesh

    2018-01-01

    Prediction of adverse drug reactions is an important problem in drug discovery endeavors which can be addressed with data-driven strategies. SIDER is one of the most reliable and frequently used datasets for identification of key features as well as for building machine learning models for side effects prediction. The inherently unbalanced nature of this data presents a difficult multi-label, multi-class problem for the prediction of drug side effects. We highlight the intrinsic issues with SIDER data and the methodological flaws in relying on performance measures such as AUC while attempting to predict side effects. We argue for the use of metrics that are robust to class imbalance for the evaluation of classifiers. Importantly, we present a 'hierarchical anatomical classification schema' which aggregates side effects into organs, sub-systems, and systems. With the help of a weighted performance measure, using 5-fold cross-validation we show that this strategy facilitates biologically meaningful side effects prediction at different levels of the anatomical hierarchy. By implementing various machine learning classifiers we show that the Random Forest model yields the best classification accuracy at each level of coarse-graining. The manually curated, hierarchical schema for side effects can also serve as the basis of future studies towards prediction of adverse reactions and identification of key features linked to specific organ systems. Our study provides a strategy for hierarchical classification of side effects rooted in the anatomy and can pave the way for calibrated expert systems for multi-level prediction of side effects.

  3. Remote Sensing Image Classification Applied to the First National Geographical Information Census of China

    NASA Astrophysics Data System (ADS)

    Yu, Xin; Wen, Zongyong; Zhu, Zhaorong; Xia, Qiang; Shun, Lan

    2016-06-01

    Although image classification has been studied for almost half a century, it still has a long way to go. Researchers have obtained many results in the image classification domain, but there is still a considerable distance between theory and practice. However, some new methods from the artificial intelligence domain will be absorbed into the image classification domain, each drawing on the strengths of the other to offset its weaknesses, which will open up new prospects. Networks often play the role of a high-level language, as seen in artificial intelligence and statistics, because they are used to build complex models from simple components. In recent years, Bayesian networks, a class of probabilistic networks, have become a powerful data mining technique for handling uncertainty in complex domains. In this paper, we apply Tree Augmented Naive Bayesian networks (TAN) to texture classification of high-resolution remote sensing images and propose a new method to construct the network topology in terms of training accuracy based on the training samples. Since 2013, the Chinese government has carried out the first national geographical information census project, which mainly interprets geographical information based on high-resolution remote sensing images. Therefore, this paper applies Bayesian networks to remote sensing image classification in order to improve image interpretation in the first national geographical information census project. In the experiment, we chose remote sensing images of Beijing. Experimental results demonstrate that TAN outperforms the Naive Bayesian Classifier (NBC) and the Maximum Likelihood Classification (MLC) method in overall classification accuracy. In addition, the proposed method can reduce the workload of field workers and improve work efficiency. Although it is time-consuming, it will be an attractive and effective method for assisting the office operations of image interpretation.
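
    A hedged sketch of TAN learning, assuming the pgmpy library; its TreeSearch estimator with a "tan" mode and its BayesianNetwork class are assumed available in recent releases, and the discretized texture features are placeholders.

    ```python
    # Hedged sketch: learn a Tree Augmented Naive Bayes structure with pgmpy
    # (assumed API), fit its parameters, and predict the class variable.
    import numpy as np
    import pandas as pd
    from pgmpy.estimators import TreeSearch, MaximumLikelihoodEstimator
    from pgmpy.models import BayesianNetwork

    rng = np.random.default_rng(5)
    df = pd.DataFrame(rng.integers(0, 3, size=(200, 4)),
                      columns=["contrast", "entropy", "mean", "label"])

    # "tan" augments naive Bayes with a tree over the feature variables.
    dag = TreeSearch(df, root_node="contrast").estimate(
        estimator_type="tan", class_node="label")
    model = BayesianNetwork(dag.edges())
    model.fit(df, estimator=MaximumLikelihoodEstimator)
    print(model.predict(df.drop(columns="label").iloc[:5]))
    ```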

  4. Maritime Pre-Positioning Force-Future: Bill Payer or Sea Basing Enabler?

    DTIC Science & Technology

    2008-03-25

    Ship Building Plan, UAV. CLASSIFICATION: Unclassified. Actions at sea no longer suffice to influence world events; actions from the sea must... in amphibious ships or fall victim to an untenable Navy ship building plan. Premature consideration of cost issues hindered MPF-F program... fiscal environment and an illusory Navy ship building plan. Given the demonstrated capability and success of the current Maritime Pre-positioning

  5. Spring Ankle with Regenerative Kinetics to Build a New Generation of Transtibial Prostheses

    DTIC Science & Technology

    2008-07-31

    form factor that is portable to the wearer. The objective is to build a transtibial prosthesis that will support a Military amputee's return to... active duty. SUBJECT TERMS: Transtibial Prosthesis, regenerative, spring, wearable robot... "Spring Ankle with Regenerative Kinetics" to build a new generation of transtibial prostheses. Keywords: Transtibial Prosthesis, regenerative, spring, wearable robot

  6. Quantifying physical characteristics of wildland fuels using the fuel characteristic classification system.

    Treesearch

    Cynthia L. Riccardi; Susan J. Prichard; David V. Sandberg; Roger D. Ottmar

    2007-01-01

    Wildland fuel characteristics are used in many operational fire prediction applications and for understanding fire effects and behaviour. Even so, there is a shortage of information on basic fuel properties and the physical characteristics of wildland fuels. The Fuel Characteristic Classification System (FCCS) builds and catalogues fuelbed descriptions based on...

  7. A One-Versus-All Class Binarization Strategy for Bearing Diagnostics of Concurrent Defects

    PubMed Central

    Ng, Selina S. Y.; Tse, Peter W.; Tsui, Kwok L.

    2014-01-01

    In bearing diagnostics using a data-driven modeling approach, a concern is the need for data from all possible scenarios to build a practical model for all operating conditions. This paper is a study on bearing diagnostics with the concurrent occurrence of multiple defect types. The authors are not aware of any work in the literature that studies this practical problem. A strategy based on one-versus-all (OVA) class binarization is proposed to improve fault diagnostics accuracy while reducing the number of scenarios for data collection, by predicting concurrent defects from training data of normal and single defects. The proposed OVA diagnostic approach is evaluated with empirical analysis using support vector machine (SVM) and C4.5 decision tree, two popular classification algorithms frequently applied to system health diagnostics and prognostics. Statistical features are extracted from the time domain and the frequency domain. Prediction performance of the proposed strategy is compared with that of a simple multi-class classification, as well as that of random guess and worst-case classification. We have verified the potential of the proposed OVA diagnostic strategy in performance improvements for single-defect diagnosis and predictions of BPFO plus BPFI concurrent defects using two laboratory-collected vibration data sets. PMID:24419162
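
    A minimal sketch of the OVA idea, assuming scikit-learn: one binary detector per class, with per-class margins read jointly so that two positive defect margins suggest a concurrent defect; features and labels are synthetic placeholders.

    ```python
    # Hedged sketch: one-versus-all SVMs whose per-class decision margins
    # are inspected jointly to flag concurrent defects.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.multiclass import OneVsRestClassifier

    rng = np.random.default_rng(6)
    X = rng.normal(size=(300, 10))   # time/frequency-domain vibration features
    y = rng.integers(0, 3, 300)      # 0 = normal, 1 = BPFO, 2 = BPFI

    ova = OneVsRestClassifier(SVC()).fit(X, y)
    # One margin per binary detector; positive margins on both defect
    # detectors would suggest BPFO plus BPFI occurring together.
    margins = ova.decision_function(X[:5])
    print((margins > 0).astype(int))
    ```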

  8. Predicting human liver microsomal stability with machine learning techniques.

    PubMed

    Sakiyama, Yojiro; Yuki, Hitomi; Moriya, Takashi; Hattori, Kazunari; Suzuki, Misaki; Shimada, Kaoru; Honma, Teruki

    2008-02-01

    To ensure a continuing pipeline in pharmaceutical research, lead candidates must possess appropriate metabolic stability in the drug discovery process. In vitro ADMET (absorption, distribution, metabolism, elimination, and toxicity) screening provides useful information regarding the metabolic stability of compounds. However, before the synthesis stage, an efficient process is required in order to deal with the vast quantity of data from large compound libraries and high-throughput screening. Here we have derived a relationship between chemical structure and metabolic stability for a data set of in-house compounds by means of various in silico machine learning methods such as random forest, support vector machine (SVM), logistic regression, and recursive partitioning. For model building, 1952 proprietary compounds comprising two classes (stable/unstable) were used, with 193 descriptors calculated by the Molecular Operating Environment. The results using test compounds demonstrated that all classifiers yielded satisfactory results (accuracy > 0.8, sensitivity > 0.9, specificity > 0.6, and precision > 0.8). Above all, classification by random forest as well as SVM yielded kappa values of approximately 0.7 in an independent validation set, slightly higher than the other classification tools. These results suggest that nonlinear/ensemble-based classification methods might prove useful in the area of in silico ADME modeling.
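
    A minimal sketch of the stable/unstable classification, assuming scikit-learn and random stand-ins for the 193 MOE descriptors, scored by accuracy and Cohen's kappa on a held-out set.

    ```python
    # Hedged sketch: random forest on descriptor vectors, evaluated with
    # accuracy and Cohen's kappa on a held-out validation split.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import cohen_kappa_score, accuracy_score

    X, y = make_classification(n_samples=1952, n_features=193, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)

    rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(Xtr, ytr)
    pred = rf.predict(Xte)
    print(f"accuracy={accuracy_score(yte, pred):.2f}, "
          f"kappa={cohen_kappa_score(yte, pred):.2f}")
    ```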

  9. A one-versus-all class binarization strategy for bearing diagnostics of concurrent defects.

    PubMed

    Ng, Selina S Y; Tse, Peter W; Tsui, Kwok L

    2014-01-13

    In bearing diagnostics using a data-driven modeling approach, a concern is the need for data from all possible scenarios to build a practical model for all operating conditions. This paper is a study on bearing diagnostics with the concurrent occurrence of multiple defect types. The authors are not aware of any work in the literature that studies this practical problem. A strategy based on one-versus-all (OVA) class binarization is proposed to improve fault diagnostics accuracy while reducing the number of scenarios for data collection, by predicting concurrent defects from training data of normal and single defects. The proposed OVA diagnostic approach is evaluated with empirical analysis using support vector machine (SVM) and C4.5 decision tree, two popular classification algorithms frequently applied to system health diagnostics and prognostics. Statistical features are extracted from the time domain and the frequency domain. Prediction performance of the proposed strategy is compared with that of a simple multi-class classification, as well as that of random guess and worst-case classification. We have verified the potential of the proposed OVA diagnostic strategy in performance improvements for single-defect diagnosis and predictions of BPFO plus BPFI concurrent defects using two laboratory-collected vibration data sets.

  10. Micro-Raman spectroscopy of natural and synthetic indigo samples.

    PubMed

    Vandenabeele, Peter; Moens, Luc

    2003-02-01

    In this work, indigo samples from three different sources are studied using Raman spectroscopy: the synthetic pigment and pigments from the woad (Isatis tinctoria) and the indigo plant (Indigofera tinctoria). Twenty-one samples were obtained from eight suppliers; for each sample, five Raman spectra were recorded and used for further chemometric analysis. Principal components analysis (PCA) was performed as a data reduction method before applying hierarchical cluster analysis. Linear discriminant analysis (LDA) was implemented as a non-hierarchical supervised pattern recognition method to build a classification model. In order to avoid broad-shaped interference from the fluorescence background, the influence of first and second derivatives on the classification was studied using cross-validation. Although the samples are chemically identical, it is shown that Raman spectroscopy in combination with suitable chemometric methods has the potential to discriminate between synthetic and natural indigo samples.
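
    A minimal sketch of the chemometric chain, assuming scikit-learn and SciPy: a Savitzky-Golay first derivative to suppress the fluorescence background, PCA for data reduction, then LDA; the spectra are random placeholders.

    ```python
    # Hedged sketch: derivative preprocessing, PCA reduction, LDA classifier.
    import numpy as np
    from scipy.signal import savgol_filter
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(7)
    spectra = rng.normal(size=(105, 600))   # 21 samples x 5 spectra each
    origin = np.repeat([0, 1, 2], 35)       # synthetic, woad, Indigofera

    # First derivative removes broad fluorescence baselines.
    deriv = savgol_filter(spectra, window_length=11, polyorder=3, deriv=1)
    clf = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
    print("CV accuracy:", cross_val_score(clf, deriv, origin, cv=5).mean())
    ```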

  11. Authentication of whisky due to its botanical origin and way of production by instrumental analysis and multivariate classification methods.

    PubMed

    Wiśniewska, Paulina; Boqué, Ricard; Borràs, Eva; Busto, Olga; Wardencki, Waldemar; Namieśnik, Jacek; Dymerski, Tomasz

    2017-02-15

    Headspace mass spectrometry (HS-MS), mid-infrared (MIR) and UV-vis spectroscopy were used to authenticate whisky samples of different origins and ways of production (Irish, Spanish, Bourbon, Tennessee Whisky and Scotch). The collected spectra were processed with partial least-squares discriminant analysis (PLS-DA) to build the classification models. In all cases the five groups of whiskies were distinguished, but the best results were obtained by HS-MS, which indicates that the biggest differences between the types of whisky lie in their aroma. Differences were also found within groups, showing that not only the raw material but also the way of production is important for discriminating samples. The methodology is quick, easy and does not require sample preparation. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Unsupervised classification of variable stars

    NASA Astrophysics Data System (ADS)

    Valenzuela, Lucas; Pichara, Karim

    2018-03-01

    During the past 10 years, a considerable amount of effort has been made to develop algorithms for the automatic classification of variable stars. That has been achieved primarily by applying machine learning methods to photometric data sets in which objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human effort. Every time data come from new surveys, the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific to light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.

  13. Improved wavelet packet classification algorithm for vibrational intrusions in distributed fiber-optic monitoring systems

    NASA Astrophysics Data System (ADS)

    Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo

    2015-05-01

    An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, the components with the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is fed to a radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals were recorded with a modified Sagnac interferometer. The performance of the improved classification algorithm was evaluated in classification experiments with the RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than that of other common algorithms. The classification results show that this improved algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
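
    The entropy feature step can be sketched with PyWavelets as below; the wavelet, decomposition level and signals are illustrative assumptions, and the RBF-network stage is omitted.

    ```python
    # Hedged sketch: wavelet packet decomposition, Shannon entropy per
    # terminal node, and background subtraction to form the feature vector.
    import numpy as np
    import pywt

    def wp_entropy(signal, wavelet="db4", level=3):
        wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
        feats = []
        for node in wp.get_level(level, order="freq"):
            e = node.data ** 2
            p = e / e.sum()                       # normalized energy
            feats.append(-np.sum(p * np.log2(p + 1e-12)))  # Shannon entropy
        return np.array(feats)

    rng = np.random.default_rng(8)
    intrusion = rng.normal(size=1024)             # placeholder signals
    background = rng.normal(size=1024)
    feature_vector = wp_entropy(intrusion) - wp_entropy(background)
    print(feature_vector.round(3))
    ```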

  14. Region-Based Building Rooftop Extraction and Change Detection

    NASA Astrophysics Data System (ADS)

    Tian, J.; Metzlaff, L.; d'Angelo, P.; Reinartz, P.

    2017-09-01

    Automatic extraction of building changes is important for many applications, such as disaster monitoring and city planning. Although a lot of research based on 2D as well as 3D data is available, improvements in accuracy and efficiency are still needed. The introduction of digital surface models (DSMs) to building change detection has strongly improved the resulting accuracy. In this paper, a post-classification approach is proposed for building change detection using satellite stereo imagery. Firstly, DSMs are generated from satellite stereo imagery and further refined by using a segmentation result obtained from the Sobel gradients of the panchromatic image. Besides the refined DSMs, the panchromatic image and the pansharpened multispectral image are used as input features for mean-shift segmentation. The DSM is used to calculate the nDSM, from which the initial building candidate regions are extracted. The candidate mask is further refined by morphological filtering and by excluding shadow regions. Following this, all segments that overlap with a building candidate region are determined. A building-oriented segment merging procedure is introduced to generate the final building rooftop mask. As the last step, object-based change detection is performed by directly comparing the building rooftops extracted from the pre- and post-event imagery and by fusing the change indicators with the rooftop region map. A quantitative and qualitative assessment of the proposed approach is provided using WorldView-2 satellite data from Istanbul, Turkey.
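
    The nDSM step can be sketched in a few lines of NumPy; the arrays, the 2.5 m height threshold and the NDVI cutoff are illustrative assumptions.

    ```python
    # Hedged sketch: nDSM = DSM - DTM, thresholded for initial building
    # candidates, with an NDVI mask to drop vegetated pixels.
    import numpy as np

    rng = np.random.default_rng(9)
    dsm = rng.uniform(0, 30, size=(512, 512))    # surface heights (m)
    dtm = np.zeros_like(dsm)                     # terrain model (flat here)
    ndvi = rng.uniform(-1, 1, size=(512, 512))   # from the multispectral image

    ndsm = dsm - dtm
    candidates = (ndsm > 2.5) & (ndvi < 0.3)     # tall and not vegetation
    print(f"candidate building pixels: {candidates.sum()}")
    ```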

  15. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were: KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care, this technology may improve the standard of burn care for patients without access to specialized facilities.

  16. The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases.

    PubMed

    Udatha, D B R K Gupta; Kouskoumvekaki, Irene; Olsson, Lisbeth; Panagiotou, Gianni

    2011-01-01

    One of the most intriguing groups of enzymes, the feruloyl esterases (FAEs), is ubiquitous in both simple and complex organisms. FAEs have gained importance in the biofuel, medicine and food industries due to their capability of acting on a large range of substrates, cleaving ester bonds and synthesizing high-added-value molecules through esterification and transesterification reactions. During the past two decades extensive studies have been carried out on the production and partial characterization of FAEs from fungi, while much less is known about FAEs of bacterial or plant origin. Initial classification studies on FAEs were restricted to sequence similarity and substrate specificity on just four model substrates, and considered only a handful of FAEs belonging to the fungal kingdom. This study centers on the descriptor-based classification and structural analysis of experimentally verified and putative FAEs; nevertheless, the framework presented here is applicable to any poorly characterized enzyme family. 365 FAE-related sequences of fungal, bacterial and plant origin were collected and clustered, using Self Organizing Maps followed by k-means clustering, into distinct groups based on amino acid composition and physico-chemical composition descriptors derived from the respective amino acid sequences. A Support Vector Machine model was subsequently constructed for the classification of new FAEs into the pre-assigned clusters. The model successfully recognized 98.2% of the training sequences and all the sequences of the blind test. The underlying functionality of the 12 proposed FAE families was validated against a combination of prediction tools and published experimental data. Another important aspect of the present work involves the development of pharmacophore models for the new FAE families for which sufficient information on known substrates existed. Knowing the pharmacophoric features of a small molecule that are essential for binding to the members of a certain family opens a window of opportunities for tailored applications of FAEs. Copyright © 2010 Elsevier Inc. All rights reserved.
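    A rough sketch of the descriptor-clustering-then-classification workflow follows; scikit-learn's KMeans stands in for the paper's Self Organizing Map plus k-means stage, and the random descriptor matrix is purely illustrative (the reported 98.2% recognition applies to the real data, not to this toy).

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.model_selection import cross_val_score
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.random((365, 25))                 # hypothetical 365 sequences x 25 descriptors
        X = StandardScaler().fit_transform(X)

        # KMeans stands in for the SOM + k-means stage of the paper
        families = KMeans(n_clusters=12, n_init=10, random_state=0).fit_predict(X)

        # SVM trained to assign new sequences to the pre-assigned clusters
        print("recognition:", cross_val_score(SVC(kernel="rbf"), X, families, cv=5).mean())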

  17. SU-G-BRC-13: Model Based Classification for Optimal Position Selection for Left-Sided Breast Radiotherapy: Free Breathing, DIBH, Or Prone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, H; Liu, T; Xu, X

    Purpose: There are clinical decision challenges in selecting the optimal treatment position for left-sided breast cancer patients - supine free breathing (FB), supine Deep Inspiration Breath Hold (DIBH), and prone free breathing (prone). Physicians often make the decision based on experience and trials, which might not always result in optimal organ-at-risk (OAR) doses. We herein propose a mathematical model to predict the lowest OAR doses among these three positions, providing a quantitative tool for the corresponding clinical decision. Methods: Patients were scanned in the FB, DIBH, and prone positions under an IRB-approved protocol. Tangential beam plans were generated for each position, and OAR doses were calculated. The position with the least OAR doses is defined as the optimal position. The following features were extracted from each scan to build the model: heart, ipsilateral lung, and breast volumes; in-field heart and ipsilateral lung volumes; distance between heart and target; laterality of heart; and doses to the heart and ipsilateral lung. Principal Components Analysis (PCA) was applied to remove the co-linearity of the input data and to lower the data dimensionality. Feature selection, another method to reduce dimensionality, was applied as a comparison. A Support Vector Machine (SVM) was then used for classification. Thirty-seven patient datasets were acquired; up to now, five patient plans were available. K-fold cross validation was used to validate the accuracy of the classifier model with the small training size. Results: The classification results and K-fold cross validation demonstrated that the model is capable of predicting the optimal position for patients. The accuracy of K-fold cross validation reached 80%. Compared to PCA, feature selection allows the causal features of dose to be determined, which provides more clinical insight. Conclusion: The proposed classification system appears to be feasible. We are generating plans for the rest of the 37 patient images, and more statistically significant results are to be presented.

  18. Ensemble Feature Learning of Genomic Data Using Support Vector Machine

    PubMed Central

    Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R.; Braytee, Ali; Kennedy, Paul J.

    2016-01-01

    The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest, which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention, and mostly for classification, not gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) method for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy that is the rationale of the RFE algorithm. The rationale is that building ensemble SVM models on randomly drawn bootstrap samples from the training set produces different feature rankings, which are subsequently aggregated into one feature ranking. As a result, the decision to eliminate a feature is based upon the rankings of multiple SVM models instead of one particular model. Moreover, this approach addresses the problem of imbalanced datasets by constructing nearly balanced bootstrap samples. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over the random forest based approach. The genes selected by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD), which reveals significant clusters within the selected data. PMID:27304923
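    A compact sketch of the bootstrap-and-aggregate ranking idea is given below; the synthetic data, the number of replicates, and the plain (not class-balanced) bootstrap are illustrative simplifications of the ESVM-RFE procedure.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import RFE
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=100, n_features=200, n_informative=10, random_state=0)
        rng = np.random.default_rng(0)
        rankings = []
        for b in range(25):                                       # 25 bootstrap replicates
            idx = rng.choice(len(y), size=len(y), replace=True)   # bootstrap sample
            rfe = RFE(SVC(kernel="linear"), n_features_to_select=1, step=0.1)
            rfe.fit(X[idx], y[idx])                               # backward elimination on this sample
            rankings.append(rfe.ranking_)                         # per-model feature ranking
        mean_rank = np.mean(rankings, axis=0)                     # aggregate into one ranking
        print("consensus top features:", np.argsort(mean_rank)[:10])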

  19. Can We Speculate Running Application With Server Power Consumption Trace?

    PubMed

    Li, Yuanlong; Hu, Han; Wen, Yonggang; Zhang, Jun

    2018-05-01

    In this paper, we propose to detect the applications running on a server by classifying the observed power consumption series, for the purpose of data center energy consumption monitoring and analysis. The time series classification problem has been extensively studied and various distance measurements have been developed; recently, deep learning-based sequence models have also proved promising. In this paper, we propose a novel distance measurement and build a time series classification algorithm hybridizing the nearest neighbor and long short term memory (LSTM) neural network approaches. More specifically, we first propose a new distance measurement, termed local time warping (LTW), which utilizes a user-specified index set for local warping and is designed to be noncommutative and to avoid dynamic programming. Second, we hybridize 1-nearest neighbor (1NN)-LTW and LSTM together. In particular, we combine the prediction probability vectors of 1NN-LTW and LSTM to determine the label of the test cases. Finally, using power consumption data from a real data center, we show that the proposed LTW can improve the classification accuracy of dynamic time warping (DTW) from about 84% to 90%. Our experimental results show that the proposed LTW is competitive on our data set with existing DTW variants and that its noncommutative feature is indeed beneficial. We also test a linear version of LTW and find that it performs similarly to the state-of-the-art DTW-based method while running as fast as linear runtime lower-bound methods like LB_Keogh for our problem. With the hybrid algorithm, we achieve an accuracy of up to about 93% for the power series classification task. Our research can inspire more studies on time series distance measurement and on hybrids of deep learning models with other traditional models.
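    The label-fusion step - combining the two models' class-probability vectors - can be illustrated as below; the weighted average and the value of alpha are assumptions for illustration, since the record does not spell out the exact combination rule.

        import numpy as np

        def hybrid_label(p_nn, p_lstm, alpha=0.5):
            """Blend 1NN-LTW and LSTM class-probability vectors (alpha is assumed)."""
            p = alpha * np.asarray(p_nn) + (1.0 - alpha) * np.asarray(p_lstm)
            return int(np.argmax(p))

        # Three candidate applications; the two models disagree, the blend decides:
        print(hybrid_label([0.6, 0.3, 0.1], [0.2, 0.7, 0.1]))   # prints 1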

  20. Classification-Assisted Memetic Algorithms for Equality-Constrained Optimization Problems

    NASA Astrophysics Data System (ADS)

    Handoko, Stephanus Daniel; Kwoh, Chee Keong; Ong, Yew Soon

    Regression has been successfully incorporated into memetic algorithms (MAs) to build surrogate models for the objective or constraint landscape of optimization problems. This helps to alleviate the need for expensive fitness function evaluations by performing local refinements on the approximated landscape. Classification can alternatively be used to assist an MA in choosing the individuals that will undergo refinement. Support-vector-assisted MAs were recently proposed to alleviate the need for function evaluations in inequality-constrained optimization problems by distinguishing regions of feasible solutions from those of infeasible ones based on some past solutions, such that search efforts can be focused on potential regions only. For problems having equality constraints, however, the feasible space is obviously extremely small. It is thus extremely difficult for the global search component of the MA to produce feasible solutions, and the classification of feasible and infeasible space becomes ineffective. In this paper, a novel strategy to overcome this limitation is proposed, particularly for problems having one and only one equality constraint. The raw constraint value of an individual, instead of its feasibility class, is utilized in this work.

  1. Classification of document page images based on visual similarity of layout structures

    NASA Astrophysics Data System (ADS)

    Shin, Christian K.; Doermann, David S.

    1999-12-01

    Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout of a document contains a significant amount of information that can be used to classify a document's type in the absence of domain-specific models. A document type or genre can be defined by the user based primarily on layout structure. Our classification approach is based on the 'visual similarity' of layout structure, building a supervised classifier given examples of each class. We use image features such as the percentages of text and non-text (graphics, image, table, and ruling) content regions, column structures, variations in the point size of fonts, the density of content area, and various statistics on features of connected components, all of which can be derived from class samples without class knowledge. In order to obtain class labels for training samples, we conducted a user relevance test in which subjects ranked UW-I document images with respect to the 12 representative images. We implemented our classification scheme using OC1, a decision tree classifier, and report our findings.

  2. Fuzzy Classification of Ocean Color Satellite Data for Bio-optical Algorithm Constituent Retrievals

    NASA Technical Reports Server (NTRS)

    Campbell, Janet W.

    1998-01-01

    The ocean has traditionally been viewed as a two-class system. Morel and Prieur (1977) classified ocean water according to the dominant absorbent particle suspended in the water column. Case 1 is described as having a high concentration of phytoplankton (and detritus) relative to other particles. Conversely, case 2 is described as having inorganic particles, such as suspended sediments, in high concentrations. Little work has gone into the problem of mixing bio-optical models for these different water types. An approach is put forth here to blend bio-optical algorithms based on a fuzzy classification scheme. This scheme involves two procedures. First, a clustering procedure identifies classes and builds class statistics from in-situ optical measurements. Next, a classification procedure assigns satellite pixels partial memberships to these classes based on their ocean color reflectance signature. These membership assignments can be used as the basis for weighting retrievals from class-specific bio-optical algorithms. This technique is demonstrated with in-situ optical measurements and an image from the SeaWiFS ocean color satellite.
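    The membership-weighted blending can be written in a few lines; the arrays below (memberships for two water classes and per-class chlorophyll retrievals) are hypothetical values used only to show the mechanics.

        import numpy as np

        def blended_retrieval(memberships, estimates):
            """Blend class-specific bio-optical retrievals using fuzzy memberships.

            memberships: (n_pixels, n_classes) partial memberships summing to 1 per pixel
            estimates:   (n_pixels, n_classes) retrievals from each class-specific algorithm
            """
            return np.sum(memberships * estimates, axis=1)

        m = np.array([[0.8, 0.2], [0.3, 0.7]])      # e.g. case 1 vs case 2 memberships
        chl = np.array([[1.2, 3.5], [0.9, 4.1]])    # hypothetical chlorophyll retrievals
        print(blended_retrieval(m, chl))             # weighted per-pixel estimates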

  3. Towards exaggerated emphysema stereotypes

    NASA Astrophysics Data System (ADS)

    Chen, C.; Sørensen, L.; Lauze, F.; Igel, C.; Loog, M.; Feragen, A.; de Bruijne, M.; Nielsen, M.

    2012-03-01

    Classification is widely used in the context of medical image analysis, and in order to illustrate the mechanism of a classifier, we introduce the notion of an exaggerated image stereotype based on the training data and the trained classifier. The stereotype of an image class of interest should emphasize/exaggerate the characteristic patterns of that class and visualize the information the employed classifier relies on. This is useful for gaining insight into the classification and enables comparison with biological models of disease. In this work, we build exaggerated image stereotypes by optimizing an objective function which consists of a discriminative term based on the classification accuracy and a generative term based on the class distributions. A gradient descent method based on iterated conditional modes (ICM) is employed for optimization. We use this idea with Fisher's linear discriminant rule and assume a multivariate normal distribution for samples within a class. The proposed framework is applied to computed tomography (CT) images of lung tissue with emphysema. The synthesized stereotypes illustrate the exaggerated patterns of lung tissue with emphysema, which is underpinned by three different quantitative evaluation methods.

  4. Effects of Vaporized Decontamination Systems on Selected Building Interior Materials: Vaporized Hydrogen Peroxide

    DTIC Science & Technology

    2009-01-01

    surfaces in buildings following a terrorist attack using CB agents. Vaporized hydrogen peroxide (VHP) and ClO2 are decontamination technologies that...decontaminant. The focus of this technical report is the evaluation of the building interior materials and the Steris VHP technology. Subject terms: material compatibility, VHP, vaporized hydrogen peroxide.

  5. Evaluation of air quality zone classification methods based on ambient air concentration exposure.

    PubMed

    Freeman, Brian; McBean, Ed; Gharabaghi, Bahram; Thé, Jesse

    2017-05-01

    Air quality zones are used by regulatory authorities to implement ambient air standards in order to protect human health. Air quality measurements at discrete air monitoring stations are critical tools for determining whether an air quality zone complies with local air quality standards or is noncompliant. This study presents a novel approach for evaluating air quality zone classification methods by breaking the concentration distribution of a pollutant measured at an air monitoring station into compliance and exceedance probability density functions (PDFs) and then using Monte Carlo analysis with the Central Limit Theorem to estimate long-term exposure. The purpose of this paper is to compare the risk associated with selecting one ambient air classification approach over another by testing the possible exposure an individual living within a zone may face. The chronic daily intake (CDI) is utilized to compare pollutant exposures over the classification duration of 3 years between two classification methods. Historical data collected from air monitoring stations in Kuwait are used to build representative models of 1-hr NO2 and 8-hr O3 within a zone that meets the compliance requirements of each method. The first method, the "3 Strike" method, is a conservative winner-take-all approach common to most compliance classification methods, while the second, the 99% Rule method, allows for more robust analyses and incorporates long-term trends. A Monte Carlo analysis is used to model the CDI for each pollutant and each method, with the zone at a single station and with multiple stations. The model assumes that the zone is already in compliance with air quality standards over the 3 years under the different classification methodologies. The model shows that while the CDI of the two methods differs by 2.7% over the exposure period for the single-station case, the large number of samples taken over the duration impacts the sensitivity of the statistical tests, causing the null hypothesis to fail. Local air quality managers can use either methodology to classify the compliance of an air zone, but must accept that the 99% Rule method may allow exposures that are statistically more significant than those under the 3 Strike method. A novel method using the Central Limit Theorem and Monte Carlo analysis is used to directly compare different air standard compliance classification methods by estimating the chronic daily intake of pollutants. This method allows air quality managers to rapidly see how individual classification methods may impact individual population groups, as well as to evaluate different pollutants based on dosage and exposure when complete health impacts are not known.
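    The Monte Carlo CDI comparison can be sketched as follows; the two-part concentration model, the intake-rate and body-weight constants, and all numeric parameters are hypothetical placeholders rather than the study's Kuwait data.

        import numpy as np

        rng = np.random.default_rng(0)

        def simulate_cdi(mean_conc, sd_conc, p_exceed, exceed_scale, n=100_000,
                         intake_rate=20.0, body_weight=70.0):
            """Monte Carlo chronic daily intake from a two-part concentration model:
            a compliance PDF plus an exceedance tail (all parameters hypothetical)."""
            compliant = rng.normal(mean_conc, sd_conc, n).clip(min=0.0)
            exceed = rng.random(n) < p_exceed                  # exceedance draws
            conc = np.where(exceed, compliant * exceed_scale, compliant)
            return conc * intake_rate / body_weight            # intake per kg body weight

        cdi_3strike = simulate_cdi(0.040, 0.010, 0.001, 3.0)   # stricter zone
        cdi_99rule  = simulate_cdi(0.040, 0.010, 0.010, 3.0)   # more permissive zone
        print((cdi_99rule.mean() - cdi_3strike.mean()) / cdi_3strike.mean())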

  6. Fire potential rating for wildland fuelbeds using the Fuel Characteristic Classification System.

    Treesearch

    David V. Sandberg; Cynthia L. Riccardi; Mark D. Schaff

    2007-01-01

    The Fuel Characteristic Classification System (FCCS) is a systematic catalog of inherent physical properties of wildland fuelbeds that allows land managers, policymakers, and scientists to build and calculate fuel characteristics with complete or incomplete information. The FCCS is equipped with a set of equations to calculate the potential of any real-world or...

  7. Local-aggregate modeling for big data via distributed optimization: Applications to neuroimaging.

    PubMed

    Hu, Yue; Allen, Genevera I

    2015-12-01

    Technological advances have led to a proliferation of structured big data that have matrix-valued covariates. We are specifically motivated to build predictive models for multi-subject neuroimaging data based on each subject's brain imaging scans. This is an ultra-high-dimensional problem that consists of a matrix of covariates (brain locations by time points) for each subject; few methods currently exist to fit supervised models directly to this tensor data. We propose a novel modeling and algorithmic strategy for applying generalized linear models (GLMs) to this massive tensor data in which one set of variables is associated with locations. Our method begins by fitting GLMs to each location separately, and then builds an ensemble by blending information across locations through regularization with what we term an aggregating penalty. Our so-called Local-Aggregate Model can be fit in a completely distributed manner over the locations using an Alternating Direction Method of Multipliers (ADMM) strategy, and thus greatly reduces the computational burden. Furthermore, we propose to select the appropriate model through a novel sequence of faster algorithmic solutions that is similar to regularization paths. We demonstrate both the computational and predictive modeling advantages of our methods via simulations and an EEG classification problem. © 2015, The International Biometric Society.

  8. Inferring anatomical therapeutic chemical (ATC) class of drugs using shortest path and random walk with restart algorithms.

    PubMed

    Chen, Lei; Liu, Tao; Zhao, Xian

    2018-06-01

    The anatomical therapeutic chemical (ATC) classification system is a widely accepted drug classification scheme. This system comprises five levels and includes several classes in each level. Drugs are classified into classes according to their therapeutic effects and characteristics. The first level includes 14 main classes. In this study, we proposed two network-based models to infer novel potential chemicals deemed to belong in the first level of ATC classification. To build these models, two large chemical networks were constructed using the chemical-chemical interaction information retrieved from the Search Tool for Interactions of Chemicals (STITCH). Two classic network algorithms, shortest path (SP) and random walk with restart (RWR) algorithms, were executed on the corresponding network to mine novel chemicals for each ATC class using the validated drugs in a class as seed nodes. Then, the obtained chemicals yielded by these two algorithms were further evaluated by a permutation test and an association test. The former can exclude chemicals produced by the structure of the network, i.e., false positive discoveries. By contrast, the latter identifies the most important chemicals that have strong associations with the ATC class. Comparisons indicated that the two models can provide quite dissimilar results, suggesting that the results yielded by one model can be essential supplements for those obtained by the other model. In addition, several representative inferred chemicals were analyzed to confirm the reliability of the results generated by the two models. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang. Copyright © 2017 Elsevier B.V. All rights reserved.
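    The random walk with restart used to rank candidate chemicals can be sketched in a few lines; the tiny adjacency matrix and restart probability below are illustrative, not the STITCH network or the authors' parameter choices.

        import numpy as np

        def rwr(adj, seeds, restart=0.3, tol=1e-8):
            """Random walk with restart on a chemical-chemical interaction network.

            adj:   (n, n) symmetric weighted adjacency matrix
            seeds: indices of validated drugs of one ATC class
            """
            W = adj / adj.sum(axis=0, keepdims=True)      # column-normalised transitions
            p0 = np.zeros(adj.shape[0])
            p0[list(seeds)] = 1.0 / len(seeds)            # restart distribution over seeds
            p = p0.copy()
            while True:
                p_next = (1 - restart) * W @ p + restart * p0
                if np.abs(p_next - p).sum() < tol:
                    return p_next                          # stationary relevance scores
                p = p_next

        adj = np.array([[0, 1, 1, 0],
                        [1, 0, 1, 1],
                        [1, 1, 0, 0],
                        [0, 1, 0, 0]], dtype=float)
        print(rwr(adj, seeds=[0]))                         # chemicals ranked by affinity to the seed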

  9. Angular difference feature extraction for urban scene classification using ZY-3 multi-angle high-resolution satellite imagery

    NASA Astrophysics Data System (ADS)

    Huang, Xin; Chen, Huijun; Gong, Jianya

    2018-01-01

    Spaceborne multi-angle images with high resolution are capable of simultaneously providing spatial details and three-dimensional (3D) information to support detailed and accurate classification of complex urban scenes. In recent years, satellite-derived digital surface models (DSMs) have been increasingly utilized to provide height information to complement spectral properties for urban classification. However, in such a way, the multi-angle information is not effectively exploited, which is mainly due to the errors and difficulties of multi-view image matching and the inaccuracy of the generated DSM over complex and dense urban scenes. Therefore, it is still a challenging task to effectively exploit the available angular information from high-resolution multi-angle images. In this paper, we investigate the potential for classifying urban scenes based on local angular properties characterized from high-resolution ZY-3 multi-view images. Specifically, three categories of angular difference features (ADFs) are proposed to describe the angular information at three levels (i.e., pixel, feature, and label levels): (1) ADF-pixel: the angular information is directly extracted by pixel comparison between the multi-angle images; (2) ADF-feature: the angular differences are described in the feature domains by comparing the differences between the multi-angle spatial features (e.g., morphological attribute profiles (APs)); and (3) ADF-label: label-level angular features are proposed based on a group of urban primitives (e.g., buildings and shadows), in order to describe the specific angular information related to the types of primitive classes. In addition, we utilize spatial-contextual information to refine the multi-level ADF features using superpixel segmentation, for the purpose of alleviating the effects of salt-and-pepper noise and representing the main angular characteristics within a local area. The experiments on ZY-3 multi-angle images confirm that the proposed ADF features can effectively improve the accuracy of urban scene classification, with a significant increase in overall accuracy (3.8-11.7%) compared to using the spectral bands alone. Furthermore, the results indicate the superiority of the proposed ADFs in distinguishing between the spectrally similar and complex man-made classes, including roads and various types of buildings (e.g., high buildings, urban villages, and residential apartments).

  10. Intended Use of a Building in Terms of Updating the Cadastral Database and Harmonizing the Data with other Public Records

    NASA Astrophysics Data System (ADS)

    Buśko, Małgorzata

    2017-06-01

    According to the original wording of the 2001 Regulation on the register of land and buildings, the real estate cadastre contained one attribute associated with the use of a building structure - its intended use - which remained applicable until the Regulation was amended in 2013. Additional attributes were then added, i.e. the type of the building according to the Classification of Fixed Assets (KST) and the class of the building according to the Polish Classification of Types of Constructions (PKOB), while the main functional use and other functions of the building remained in the Regulation as well. The record data on buildings are captured for the real estate cadastre from other data sets, for example those maintained by architectural and construction authorities. At the same time, the data contained in the cadastre, once entered or changed in the database, are transferred to other registers, such as tax records or land and mortgage court registers. This study results from an analysis of the laws applicable to the particular units and registers. A list of discrepancies between the attributes occurring in the different registers was prepared. The practical part of the study pays particular attention to the legal bases and procedures for entering the function of a building in the real estate cadastre, which is extremely significant, as it is the attribute determining the property tax basis.

  11. Development of algorithms for building inventory compilation through remote sensing and statistical inferencing

    NASA Astrophysics Data System (ADS)

    Sarabandi, Pooya

    Building inventories are one of the core components of disaster vulnerability and loss estimation models, and as such, play a key role in providing decision support for risk assessment, disaster management and emergency response efforts. In many parts of the world, inclusive building inventories suitable for use in catastrophe models cannot be found. Furthermore, there are serious shortcomings in the existing building inventories, including incomplete or out-dated information on critical attributes as well as missing or erroneous attribute values. In this dissertation, a set of methodologies for updating the spatial and geometric information of buildings from single and multiple high-resolution optical satellite images is presented. Basic concepts, terminologies and fundamentals of 3-D terrain modeling from satellite images are first introduced. Different sensor projection models are then presented, and sources of optical noise such as lens distortions are discussed. An algorithm for extracting height and creating 3-D building models from a single high-resolution satellite image is formulated. The proposed algorithm is a semi-automated supervised method capable of extracting attributes such as longitude, latitude, height, square footage, perimeter, irregularity index, etc. The errors associated with the interactive nature of the algorithm are quantified, and solutions for minimizing the human-induced errors are proposed. The height extraction algorithm is validated against independent survey data and the results are presented. The validation results show that an average height modeling accuracy of 1.5% can be achieved using this algorithm. Furthermore, the concept of cross-sensor data fusion for the purpose of 3-D scene reconstruction using quasi-stereo images is developed in this dissertation. The developed algorithm utilizes two or more single satellite images acquired from different sensors and provides the means to construct 3-D building models in a more economical way. A terrain-dependent search algorithm is formulated to facilitate the search for correspondences in a quasi-stereo pair of images. The heights calculated for sample buildings using the cross-sensor data fusion algorithm show an average coefficient of variation of 1.03%. In order to infer the structural type and occupancy type, i.e. the engineering attributes, of buildings from the spatial and geometric attributes of 3-D models, a statistical data analysis framework is formulated. Applications of "Classification Trees" and "Multinomial Logistic Models" in modeling the marginal probabilities of class membership of engineering attributes are investigated. Adaptive statistical models that incorporate different spatial and geometric attributes of buildings while inferring the engineering attributes are developed in this dissertation. The inferred engineering attributes, in conjunction with the spatial and geometric attributes derived from the imagery, can be used to augment regional building inventories and therefore enhance the results of catastrophe models. In the last part of the dissertation, a set of empirically derived motion-damage relationships based on the correlation of observed building performance with measured ground-motion parameters from the 1994 Northridge and 1999 Chi-Chi, Taiwan earthquakes is developed. Fragility functions in the form of cumulative lognormal distributions and damage probability matrices for several classes of buildings (wood, steel and concrete), as well as a number of ground-motion intensity measures, are developed and compared to currently used motion-damage relationships.

  12. High-throughput screening of chemicals as functional ...

    EPA Pesticide Factsheets

    Identifying chemicals that provide a specific function within a product, yet have minimal impact on the human body or environment, is the goal of most formulation chemists and engineers practicing green chemistry. We present a methodology to identify potential chemical functional substitutes from large libraries of chemicals using machine learning based models. We collect and analyze publicly available information on the function of chemicals in consumer products or industrial processes to identify a suite of harmonized function categories suitable for modeling. We use structural and physicochemical descriptors for these chemicals to build 41 quantitative structure–use relationship (QSUR) models for harmonized function categories using random forest classification. We apply these models to screen a library of nearly 6400 chemicals with available structure information for potential functional substitutes. Using our Functional Use database (FUse), we could identify uses for 3121 chemicals; 4412 predicted functional uses had a probability of 80% or greater. We demonstrate the potential application of the models to high-throughput (HT) screening for “candidate alternatives” by merging the valid functional substitute classifications with hazard metrics developed from HT screening assays for bioactivity. A descriptor set could be obtained for 6356 Tox21 chemicals that have undergone a battery of HT in vitro bioactivity screening assays. By applying QSURs, we wer
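    A rough sketch of a single QSUR screen - a random forest trained on chemical descriptors and then used to flag high-probability functional substitutes - is given below; the synthetic descriptors, the 0.8 probability cutoff (mirroring the 80% threshold above), and the class imbalance are illustrative assumptions.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier

        # Hypothetical stand-in for one harmonised function category:
        # X holds structural/physicochemical descriptors, y whether a chemical performs it
        X, y = make_classification(n_samples=500, n_features=30, weights=[0.9], random_state=0)
        qsur = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

        # Screen a library: keep chemicals predicted to perform the function with >= 0.8 probability
        proba = qsur.predict_proba(X)[:, 1]
        candidates = (proba >= 0.8).nonzero()[0]
        print(len(candidates), "candidate functional substitutes")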

  13. Building a common pipeline for rule-based document classification.

    PubMed

    Patterson, Olga V; Ginter, Thomas; DuVall, Scott L

    2013-01-01

    Instance-based classification of clinical text is a widely used natural language processing task employed as a step in patient classification, document retrieval, or information extraction. Rule-based approaches rely on concept identification and context analysis in order to determine the appropriate class. We propose a five-step process that enables even small research teams to develop simple but powerful rule-based NLP systems by taking advantage of a common UIMA AS-based pipeline for classification. Our proposed methodology, coupled with the general-purpose solution, provides researchers with access to the data locked in clinical text in cases of limited human resources and compact timelines.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zarate, M.A.; Slotnick, J.; Ramos, M.

    The development and implementation of a solid waste management program served to build local capacity in San Mateo Ixtatan between 2002 and 2003 as part of a public health action plan. The program was developed and implemented in two phases: (1) the identification and education of a working team from the community; and (2) the completion of a solid waste classification and quantification study. Social capital and the water cycle were two public health approaches utilized to build a sustainable program. The activities accomplished gained support from the community and municipal authorities. A description of the tasks completed and findings of the solid waste classification and quantification performed by a local working group are presented in this paper.

  15. Building a robust vehicle detection and classification module

    NASA Astrophysics Data System (ADS)

    Grigoryev, Anton; Khanipov, Timur; Koptelov, Ivan; Bocharov, Dmitry; Postnikov, Vassily; Nikolaev, Dmitry

    2015-12-01

    The growing adoption of intelligent transportation systems (ITS) and autonomous driving requires robust real-time solutions for various event and object detection problems. Most real-world systems still cannot rely on computer vision algorithms and employ a wide range of costly additional hardware such as LIDARs. In this paper we explore the engineering challenges encountered in building a highly robust visual vehicle detection and classification module that works under a broad range of environmental and road conditions. The resulting technology is competitive with traditional non-visual means of traffic monitoring. The main focus of the paper is on software and hardware architecture, algorithm selection and domain-specific heuristics that help the computer vision system avoid implausible answers.

  16. Intelligent tutoring systems as tools for investigating individual differences in learning

    NASA Technical Reports Server (NTRS)

    Shute, Valerie J.

    1987-01-01

    The ultimate goal of this research is to build an improved model-based selection and classification system for the United States Air Force. Researchers are developing innovative approaches to ability testing. The Learning Abilities Measurement Program (LAMP) examines individual differences in learning abilities, seeking answers to the questions of why some people learn more and better than others and whether there are basic cognitive processes applicable across tasks and domains that are predictive of successful performance (or whether there are more complex problem solving behaviors involved).

  17. Comparing two metabolic profiling approaches (liquid chromatography and gas chromatography coupled to mass spectrometry) for extra-virgin olive oil phenolic compounds analysis: A botanical classification perspective.

    PubMed

    Bajoub, Aadil; Pacchiarotta, Tiziana; Hurtado-Fernández, Elena; Olmo-García, Lucía; García-Villalba, Rocío; Fernández-Gutiérrez, Alberto; Mayboroda, Oleg A; Carrasco-Pancorbo, Alegría

    2016-01-08

    Over the last decades, the phenolic compounds of virgin olive oil (VOO) have become the subject of intensive research because of their biological activities and their influence on some of the most relevant attributes of this interesting matrix. Developing metabolic profiling approaches to determine them in monovarietal virgin olive oils could help to gain deeper insight into olive oil phenolic compound composition as well as to promote their use for botanical origin tracing purposes. To this end, two approaches were comparatively investigated (LC-ESI-TOF MS and GC-APCI-TOF MS) to evaluate their capacity to properly classify 25 olive oil samples belonging to five different varieties (Arbequina, Cornicabra, Hojiblanca, Frantoio and Picual), using the entire chromatographic phenolic profiles combined with chemometrics (principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA)). The application of PCA to the LC-MS and GC-MS data showed the natural clustering of the samples, revealing that two varieties (Arbequina and Frantoio) dominated the models and suppressed any possible discrimination among the other cultivars. Afterwards, PLS-DA was used to build four different efficient predictive models for the varietal classification of the samples under study. The varietal markers pointed out by each platform were compared. In general, with the exception of one GC-MS model, all exhibited proper quality parameters. The models constructed using the LC-MS data demonstrated superior classification ability. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. ADM. Service Building (TAN603). Floor plan. Names of functional areas. ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    ADM. Service Building (TAN-603). Floor plan. Names of functional areas. Ralph M. Parsons 902-2-ANY-603-A 43. Date: December 1952. Approved by INEEL Classification Office for public release. INEEL index code no. 033-0603-00-693-106718 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  19. FET. Control and equipment building (TAN630). Sections. Ralph M. Parsons ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Control and equipment building (TAN-630). Sections. Ralph M. Parsons 1229-2 ANP/GE-5-630-A-4. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0630-00-693-107083 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  20. FET. Chlorination building, TAN637. Elevations, section. Ralph M. Parsons 12292 ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Chlorination building, TAN-637. Elevations, section. Ralph M. Parsons 1229-2 ANP/GE-5-637-A-S-H&V-1. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0637-00-693-107148 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  1. IET. Control and equipment building (TAN620). Blast roof details. Ralph ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    IET. Control and equipment building (TAN-620). Blast roof details. Ralph M. Parsons 902-4-ANP-620-A-323. Date: February 1954. Approved by INEEL Classification Office for public release. INEEL index code no. 035-620-00-693-106908 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  2. IET. Control and equipment building (TAN620). Details and room finish ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    IET. Control and equipment building (TAN-620). Details and room finish schedule. Ralph M. Parsons 902-4-ANP-620-A 322. Approved by INEEL Classification Office for public release. INEEL index code no. 035-0629-00-693-106907 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  3. ADM. Administration Building (TAN602). Early room layout, door and room ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    ADM. Administration Building (TAN-602). Early room layout, door and room schedules. Ralph M. Parsons 902-2-ANP-602-A 31. Date: December 1952. Approved by INEEL Classification Office for public release. INEEL index code no. 033-0602-00-693-106710 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  4. Building Performance Optimization while Empowering Occupants Toward Environmentally Sustainable Behavior through Continuous Monitoring and Diagnostics

    DTIC Science & Technology

    2016-12-01

    Report subject terms: energy conservation, building occupant comfort and satisfaction.

  5. Robustness of neuroprosthetic decoding algorithms.

    PubMed

    Serruya, Mijail; Hatsopoulos, Nicholas; Fellows, Matthew; Paninski, Liam; Donoghue, John

    2003-03-01

    We assessed the ability of two algorithms to predict hand kinematics from neural activity as a function of the amount of data used to determine the algorithm parameters. Using chronically implanted intracortical arrays, single- and multineuron discharge was recorded during trained step tracking and slow continuous tracking tasks in macaque monkeys. The effect of increasing the amount of data used to build a neural decoding model on the ability of that model to predict hand kinematics accurately was examined. We evaluated how well a maximum-likelihood model classified discrete reaching directions and how well a linear filter model reconstructed continuous hand positions over time within and across days. For each of these two models we asked two questions: (1) How does classification performance change as the amount of data the model is built upon increases? (2) How does varying the time interval between the data used to build the model and the data used to test the model affect reconstruction? Less than 1 min of data for the discrete task (8 to 13 neurons) and less than 3 min (8 to 18 neurons) for the continuous task were required to build optimal models. Optimal performance was defined by a cost function we derived that reflects both the ability of the model to predict kinematics accurately and the cost of taking more time to build such models. For both the maximum-likelihood classifier and the linear filter model, increasing the duration between the time of building and testing the model within a day did not cause any significant trend of degradation or improvement in performance. Linear filters built on one day and tested on neural data on a subsequent day generated error-measure distributions that were not significantly different from those generated when the linear filters were tested on neural data from the initial day (p<0.05, Kolmogorov-Smirnov test). These data show that only a small amount of data from a limited number of cortical neurons appears to be necessary to construct robust models to predict kinematic parameters for the subsequent hours. Motor-control signals derived from neurons in motor cortex can be reliably acquired for use in neural prosthetic devices. Adequate decoding models can be built rapidly from small numbers of cells and maintained with daily calibration sessions.
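    A toy version of the linear filter model - regressing hand position on lagged firing rates via least squares - is sketched below on synthetic data; the bin counts, lag depth, and noise level are arbitrary stand-ins for the recorded sessions.

        import numpy as np

        rng = np.random.default_rng(0)
        n_bins, n_cells, n_lags = 2000, 12, 5
        rates = rng.poisson(3.0, size=(n_bins, n_cells)).astype(float)   # binned spike counts

        def lagged(rates, n_lags):
            """Stack the previous n_lags bins of every cell, plus a bias column."""
            rows = [rates[t - n_lags:t].ravel() for t in range(n_lags, len(rates))]
            X = np.asarray(rows)
            return np.hstack([X, np.ones((len(X), 1))])

        X = lagged(rates, n_lags)
        true_w = rng.normal(size=X.shape[1])
        pos = X @ true_w + rng.normal(scale=0.5, size=len(X))   # synthetic hand position
        w, *_ = np.linalg.lstsq(X, pos, rcond=None)             # fit the linear filter
        print("decoding R^2:", 1 - np.var(pos - X @ w) / np.var(pos))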

  6. Classification Accuracy Increase Using Multisensor Data Fusion

    NASA Astrophysics Data System (ADS)

    Makarau, A.; Palubinskas, G.; Reinartz, P.

    2011-09-01

    The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.), but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to confusion of materials such as different roofs, pavements, roads, etc., and therefore may result in wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but their low spatial resolution (compared to multispectral data) restricts their usage for many applications. Further improvement can be achieved by fusion of multisensor data, since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single-polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of combining multisource data following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the dimensionality reduction step. Fusion of single-polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allows different types of urban objects to be classified into predefined classes of interest with increased accuracy. A comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided, and the numerical evaluation of the method against other established methods illustrates the advantage in classification accuracy for many classes such as buildings, low vegetation, sport objects, forest, roads, rail roads, etc.

  7. Classifying patents based on their semantic content.

    PubMed

    Bergeaud, Antonin; Potiron, Yoann; Raimbault, Juste

    2017-01-01

    In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, which in particular is designed to be suitable for big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine the full abstract and extract the relevant keywords accordingly. We refer to this classification as the semantic approach, in contrast with the more common technological approach, which consists in taking the topology of the US Patent office technological classes. Moreover, we document that the two approaches have highly different topological measures and provide strong statistical evidence that they feature different models. This suggests that our method is a useful tool for extracting endogenous information.

  8. Study on Ecological Risk Assessment of Guangxi Coastal Zone Based on 3s Technology

    NASA Astrophysics Data System (ADS)

    Zhong, Z.; Luo, H.; Ling, Z. Y.; Huang, Y.; Ning, W. Y.; Tang, Y. B.; Shao, G. Z.

    2018-05-01

    Taking the Guangxi coastal zone as the study area and following land-use classification standards, this paper divides the ecological landscape of the coastal zone into seven landscape types, including woodland, farmland, grassland, water, urban land and wetland. Using 15 years of TM data covering 2000-2015 and the CART decision tree algorithm, the remote sensing image characteristics of each landscape type are analysed, decision-tree classification rules are built, and the classification information is extracted. Analysing the evolution of the landscape pattern of the Guangxi coastal zone over these 15 years reveals its distribution characteristics and rules of change. Combined with natural disaster data, landscape indices and the related risk interference degree are used to construct an ecological risk evaluation model of the Guangxi coastal zone and to produce its ecological risk assessment results.

  9. Classifying patents based on their semantic content

    PubMed Central

    2017-01-01

    In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, which in particular is designed to be suitable for big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine the full abstract and extract the relevant keywords accordingly. We refer to this classification as the semantic approach, in contrast with the more common technological approach, which consists in taking the topology of the US Patent office technological classes. Moreover, we document that the two approaches have highly different topological measures and provide strong statistical evidence that they feature different models. This suggests that our method is a useful tool for extracting endogenous information. PMID:28445550

  10. A robust dataset-agnostic heart disease classifier from Phonocardiogram.

    PubMed

    Banerjee, Rohan; Dutta Choudhury, Anirban; Deshpande, Parijat; Bhattacharya, Sakyajit; Pal, Arpan; Mandana, K M

    2017-07-01

    Automatic classification of normal and abnormal heart sounds is a popular area of research. However, building a robust algorithm unaffected by signal quality and patient demography is a challenge. In this paper we analyse a wide range of phonocardiogram (PCG) features in the time and frequency domains, along with morphological and statistical features, to construct a robust and discriminative feature set for dataset-agnostic classification of normal and cardiac patients. The large, open-access database made available in the PhysioNet 2016 challenge was used for feature selection, internal validation and creation of training models. A second dataset of 41 PCG segments, collected using our in-house smartphone-based digital stethoscope from an Indian hospital, was used for performance evaluation. Our proposed methodology yielded sensitivity and specificity scores of 0.76 and 0.75 respectively on the test dataset in classifying cardiovascular diseases. The methodology also outperformed three popular prior art approaches when applied on the same dataset.

  11. Mapping the function of neuronal ion channels in model and experiment

    PubMed Central

    Podlaski, William F; Seeholzer, Alexander; Groschner, Lukas N; Miesenböck, Gero; Ranjan, Rajnish; Vogels, Tim P

    2017-01-01

    Ion channel models are the building blocks of computational neuron models. Their biological fidelity is therefore crucial for the interpretation of simulations. However, the number of published models, and the lack of standardization, make the comparison of ion channel models with one another and with experimental data difficult. Here, we present a framework for the automated large-scale classification of ion channel models. Using annotated metadata and responses to a set of voltage-clamp protocols, we assigned 2378 models of voltage- and calcium-gated ion channels coded in NEURON to 211 clusters. The IonChannelGenealogy (ICGenealogy) web interface provides an interactive resource for the categorization of new and existing models and experimental recordings. It enables quantitative comparisons of simulated and/or measured ion channel kinetics, and facilitates field-wide standardization of experimentally-constrained modeling. DOI: http://dx.doi.org/10.7554/eLife.22152.001 PMID:28267430

  12. Informal settlement classification using point-cloud and image-based features from UAV data

    NASA Astrophysics Data System (ADS)

    Gevaert, C. M.; Persello, C.; Sliuzas, R.; Vosselman, G.

    2017-03-01

    Unmanned Aerial Vehicles (UAVs) are capable of providing very high resolution and up-to-date information to support informal settlement upgrading projects. In order to provide accurate basemaps, urban scene understanding through the identification and classification of buildings and terrain is imperative. However, common characteristics of informal settlements such as small, irregular buildings with heterogeneous roof material and large presence of clutter challenge state-of-the-art algorithms. Furthermore, it is of interest to analyse which fundamental attributes are suitable for describing these objects in different geographic locations. This work investigates how 2D radiometric and textural features, 2.5D topographic features, and 3D geometric features obtained from UAV imagery can be integrated to obtain a high classification accuracy in challenging classification problems for the analysis of informal settlements. UAV datasets from informal settlements in two different countries are compared in order to identify salient features for specific objects in heterogeneous urban environments. Findings show that the integration of 2D and 3D features leads to an overall accuracy of 91.6% and 95.2% respectively for informal settlements in Kigali, Rwanda and Maldonado, Uruguay.

  13. Hyperspectral imaging of polymer banknotes for building and analysis of spectral library

    NASA Astrophysics Data System (ADS)

    Lim, Hoong-Ta; Murukeshan, Vadakke Matham

    2017-11-01

    The use of counterfeit banknotes increases crime rates and cripples the economy. New countermeasures are required to stop counterfeiters who use advancing technologies with criminal intent. Many countries have started adopting polymer banknotes to replace paper notes, as polymer notes are more durable and of better quality. Research on authenticating such banknotes is of much interest to forensic investigators. Hyperspectral imaging can be employed to build a spectral library of polymer notes, which can then be used for classification to authenticate these notes. This is, however, not widely reported and has become a research interest in forensic identification. This paper focuses on the use of hyperspectral imaging of polymer notes to build spectral libraries, using a pushbroom hyperspectral imager which has been previously reported. As an initial study, a spectral library is built from three arbitrarily chosen regions of interest of five circulated genuine polymer notes. Principal component analysis is used for dimension reduction and to convert the information in the spectral library to principal components. A 99% confidence ellipse is formed around the cluster of principal component scores of each class and then used as the classification criterion. The potential of the adopted methodology is demonstrated by the classification of the imaged regions as training samples.
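    The ellipse-based decision rule can be sketched as a Mahalanobis-distance test against the chi-square 99% quantile in PC space; the region names, the synthetic spectra, and the two-component projection below are hypothetical.

        import numpy as np
        from scipy.stats import chi2
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(0)
        # Hypothetical spectral library: 3 regions of interest x 50 pixels x 120 bands
        library = {roi: rng.normal(loc=i, scale=0.5, size=(50, 120))
                   for i, roi in enumerate(["serial-number", "window", "portrait"])}

        pca = PCA(n_components=2).fit(np.vstack(list(library.values())))
        cutoff = chi2.ppf(0.99, df=2)                 # 99% confidence ellipse in PC space

        stats = {}
        for roi, spectra in library.items():
            scores = pca.transform(spectra)
            stats[roi] = (scores.mean(axis=0), np.linalg.inv(np.cov(scores, rowvar=False)))

        def classify(spectrum):
            s = pca.transform(spectrum.reshape(1, -1))[0]
            for roi, (mu, inv_cov) in stats.items():
                d2 = (s - mu) @ inv_cov @ (s - mu)    # squared Mahalanobis distance
                if d2 <= cutoff:
                    return roi                         # inside this class's ellipse
            return "unclassified"

        print(classify(library["window"][0]))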

  14. Indoor transformer stations and ELF magnetic field exposure: use of transformer structural characteristics to improve exposure assessment.

    PubMed

    Okokon, Enembe Oku; Roivainen, Päivi; Kheifets, Leeka; Mezei, Gabor; Juutilainen, Jukka

    2014-01-01

    Previous studies have shown that populations of multiapartment buildings with indoor transformer stations may serve as a basis for improved epidemiological studies on the relationship between childhood leukaemia and extremely-low-frequency (ELF) magnetic fields (MFs). This study investigated whether classification based on structural characteristics of the transformer stations would improve ELF MF exposure assessment. The data included MF measurements in apartments directly above transformer stations ("exposed" apartments) in 30 buildings in Finland, and reference apartments in the same buildings. Transformer structural characteristics (type and location of low-voltage conductors) were used to classify exposed apartments into high-exposure (HE) and intermediate-exposure (IE) categories. An exposure gradient was observed: both the time-average MF and time above a threshold (0.4 μT) were highest in the HE apartments and lowest in the reference apartments, showing a statistically significant trend. The differences between HE and IE apartments, however, were not statistically significant. A simulation exercise showed that the three-category classification did not perform better than a two-category classification (exposed and reference apartments) in detecting the existence of an increased risk. However, data on the structural characteristics of transformers is potentially useful for evaluating exposure-response relationship.

  15. Dem Reconstruction Using Light Field and Bidirectional Reflectance Function from Multi-View High Resolution Spatial Images

    NASA Astrophysics Data System (ADS)

    de Vieilleville, F.; Ristorcelli, T.; Delvit, J.-M.

    2016-06-01

    This paper presents a method for dense DSM reconstruction from high-resolution, mono-sensor, passive spatial panchromatic image sequences. The interest of our approach is four-fold. Firstly, we extend the core of light field approaches using an explicit BRDF model from the image synthesis community which is more realistic than the Lambertian model. The chosen model is the Cook-Torrance BRDF, which enables us to model rough surfaces with specular effects using specific material parameters. Secondly, we extend light field approaches to non-pinhole sensors and non-rectilinear motion by using a proper geometric transformation on the image sequence. Thirdly, we produce a 3D cost volume embodying all the tested possible heights and filter it using simple methods such as volume cost filtering or variational optimal methods. We have tested our method on Pleiades image sequences over various locations with dense urban buildings and report encouraging results with respect to classic multi-label methods such as MIC-MAC, or more recent pipelines such as S2P. Last but not least, our method also produces maps of material parameters on the estimated points, allowing us to simplify building classification or road extraction.

  16. Point clouds segmentation as base for as-built BIM creation

    NASA Astrophysics Data System (ADS)

    Macher, H.; Landes, T.; Grussenmeyer, P.

    2015-08-01

    In this paper, a three-step segmentation approach is proposed to create 3D models from point clouds acquired by TLS inside buildings. The three scales of segmentation are floors, rooms, and the planes composing the rooms. First, floor segmentation is performed based on an analysis of the point distribution along the Z axis. Then, for each floor, room segmentation is achieved by considering a slice of the point cloud at ceiling level. Finally, planes are segmented for each room, and the planes corresponding to ceilings and floors are identified. The results of each step are analysed and potential improvements are proposed. Based on the segmented point clouds, the creation of as-built BIM is considered in a future-work section. Not only is the classification of planes into several categories proposed, but the potential use of point clouds acquired outside buildings is also considered.
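
    The floor-segmentation step lends itself to a short illustration. The sketch below, on a synthetic two-storey cloud, finds candidate floor/ceiling heights as peaks in the point-count histogram along Z; the bin count and peak threshold are assumptions for illustration, not the authors' parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic indoor cloud: two horizontal slabs (floors) plus a wall.
floor1 = rng.uniform([0, 0, 0.0], [10, 10, 0.05], size=(5000, 3))
floor2 = rng.uniform([0, 0, 3.0], [10, 10, 3.05], size=(5000, 3))
walls  = rng.uniform([0, 0, 0.0], [10, 0.1, 3.0], size=(2000, 3))
cloud = np.vstack([floor1, floor2, walls])

# Histogram of point counts along Z: horizontal slabs (floors/ceilings)
# show up as strong peaks, which bound the per-floor sub-clouds.
counts, edges = np.histogram(cloud[:, 2], bins=100)
peak_bins = np.where(counts > counts.mean() + 3 * counts.std())[0]
slab_z = [(edges[i] + edges[i + 1]) / 2 for i in peak_bins]
print("candidate floor/ceiling heights:", np.round(slab_z, 2))
```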

  17. Simulation Technology Laboratory Building 970 hazards assessment document

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wood, C.L.; Starr, M.D.

    1994-11-01

    The Department of Energy Order 5500.3A requires that facility-specific hazards assessments be prepared, maintained, and used for emergency planning purposes. This hazards assessment document describes the chemical and radiological hazards associated with the Simulation Technology Laboratory, Building 970. The entire inventory was screened according to the potential airborne impact on onsite and offsite individuals. The air dispersion model ALOHA estimated pollutant concentrations downwind from the source of a release, taking into consideration the toxicological and physical characteristics of the release site, the atmospheric conditions, and the circumstances of the release. The greatest distances at which a postulated facility event will produce consequences exceeding the ERPG-2 and Early Severe Health Effects thresholds are 78 and 46 meters, respectively. The highest emergency classification is a Site Area Emergency. The Emergency Planning Zone is 100 meters.

  18. Pathway activity inference for multiclass disease classification through a mathematical programming optimisation framework.

    PubMed

    Yang, Lingjian; Ainali, Chrysanthi; Tsoka, Sophia; Papageorgiou, Lazaros G

    2014-12-05

    Applying machine learning methods to microarray gene expression profiles for disease classification problems is a popular approach to deriving biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches, in which the expression of genes is treated independently, suffer from low prediction accuracy and difficulty of biological interpretation. Current research efforts focus on integrating information on protein interactions, through biochemical pathway datasets, with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. As most of the pathway activity inference methods in the literature are either unsupervised or applied to two-class datasets, there is good scope to address such limitations with novel methodologies. A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway, the expression patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of the expression of the constituent genes. Gene weights are determined by the optimisation model in such a way that the resulting pathway activity has optimal discriminative power with regard to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile. The model was evaluated on a variety of published gene expression profiles covering different types of disease. We show that not only does it improve classification accuracy, but it can also perform well on multiclass disease datasets, a limitation of other approaches from the literature. Desirable features of the model include the ability to control the maximum number of genes that may participate in determining pathway activity, which may be pre-specified by the user. Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis.
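
    The core idea (summarising each pathway's genes into one activity feature via a weighted linear sum with maximally discriminative weights) can be sketched as follows. The paper determines the weights with a mathematical programming model; this sketch substitutes scikit-learn's LDA as a simple stand-in optimiser, and the expression data are synthetic:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
n_samples, n_genes = 60, 8                # one hypothetical pathway
y = np.repeat([0, 1, 2], 20)              # multiclass disease phenotypes
X = rng.normal(size=(n_samples, n_genes)) + y[:, None] * 0.8

# Gene weights chosen to maximise class separation (LDA direction here,
# a mathematical program in the paper), then collapsed to one feature.
lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
weights = lda.scalings_[:, 0]             # gene weights for this pathway
activity = X @ weights                    # one composite feature per sample
print("pathway activity shape:", activity.shape)
```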

  19. Classification of jet fuel properties by near-infrared spectroscopy using fuzzy rule-building expert systems and support vector machines.

    PubMed

    Xu, Zhanfeng; Bunker, Christopher E; Harrington, Peter de B

    2010-11-01

    Monitoring changes in the physical properties of jet fuel is important because fuel used in high-performance aircraft must meet rigorous specifications. Near-infrared (NIR) spectroscopy is a fast method to characterize fuels. Because of the complexity of NIR spectral data, chemometric techniques are used to extract relevant information from the spectra to accurately classify the physical properties of complex fuel samples. In this work, discrimination of fuel types and classification of the flash point, freezing point, boiling point (10%, v/v), boiling point (50%, v/v), and boiling point (90%, v/v) of jet fuels (JP-5, JP-8, Jet A, and Jet A1) were investigated. Each physical property was divided into three classes (low, medium, and high ranges) using two evaluations with different class boundary definitions. The class boundaries function as the threshold for an alarm when the fuel properties change. Optimal partial least squares discriminant analysis (oPLS-DA), a fuzzy rule-building expert system (FuRES), and support vector machines (SVM) were used to build the calibration models between the NIR spectra and the classes of each physical property of the jet fuels. oPLS-DA, FuRES, and SVM were compared with respect to prediction accuracy. Validation of the calibration models was conducted by applying bootstrap Latin partitions (BLP), which give a measure of precision. Prediction accuracies of 97 ± 2% for the flash point, 94 ± 2% for the freezing point, 99 ± 1% for the boiling point (10%, v/v), 98 ± 2% for the boiling point (50%, v/v), and 96 ± 1% for the boiling point (90%, v/v) were obtained by FuRES under one of the boundary definitions. Both FuRES and SVM obtained statistically better prediction accuracies than oPLS-DA. The results indicate that, combined with chemometric classifiers, NIR spectroscopy could be a fast method to monitor changes in the physical properties of jet fuel.
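
    Of the three classifiers compared, the PLS-DA family is the simplest to illustrate (the paper uses an optimised variant, oPLS-DA; the sketch below is plain PLS-DA, with synthetic stand-ins for the NIR spectra and the three property-range classes):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(3)
y = np.repeat([0, 1, 2], 30)              # low / medium / high property range
X = rng.normal(size=(90, 150)) + y[:, None] * 0.3   # mock NIR spectra

Y = np.eye(3)[y]                          # one-hot encode the three classes
pls = PLSRegression(n_components=5).fit(X, Y)
pred = pls.predict(X).argmax(axis=1)      # class = largest predicted score
print("training accuracy:", (pred == y).mean())
```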

  20. About decomposition approach for solving the classification problem

    NASA Astrophysics Data System (ADS)

    Andrianova, A. A.

    2016-11-01

    This article describes the application of decomposition methods to the binary classification problem of constructing a linear classifier based on the Support Vector Machine method. Applying decomposition reduces the volume of calculations, in particular because it opens up possibilities for building parallel versions of the algorithm, which is a very important advantage for solving problems with big data. The article analyses the results of computational experiments conducted using the decomposition approach; the experiments use a well-known dataset for the binary classification problem.

  1. Statistical classification of drug incidents due to look-alike sound-alike mix-ups.

    PubMed

    Wong, Zoie Shui Yee

    2016-06-01

    It has been recognised that medication names that look or sound similar are a cause of medication errors. This study builds statistical classifiers for identifying medication incidents due to look-alike sound-alike mix-ups. A total of 227 patient safety incident advisories related to medication were obtained from the Canadian Patient Safety Institute's Global Patient Safety Alerts system. Eight feature selection strategies based on frequent terms, frequent drug terms and constituent terms were performed. Statistical text classifiers based on logistic regression, support vector machines with linear, polynomial, radial-basis and sigmoid kernels, and decision trees were trained and tested. The models developed achieved an average accuracy of above 0.8 across all model settings. The receiver operating characteristic curves indicated that the classifiers performed reasonably well. The results obtained in this study suggest that statistical text classification can be a feasible method for identifying medication incidents due to look-alike sound-alike mix-ups based on a database of advisories from Global Patient Safety Alerts. © The Author(s) 2014.
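
    A minimal sketch of such a statistical text classifier, using TF-IDF features and logistic regression from scikit-learn; the advisories, labels, and feature choice are invented for illustration and do not reproduce the paper's eight feature-selection strategies:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented advisory texts; label 1 = look-alike sound-alike mix-up.
docs = ["patient given hydroxyzine instead of hydralazine",
        "infusion pump alarm ignored during night shift"]
labels = [1, 0]

# Bag of unigrams and bigrams weighted by TF-IDF, then a linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(docs, labels)
print(clf.predict(["celebrex dispensed in place of celexa"]))
```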

  2. Discrimination of lymphoma using laser-induced breakdown spectroscopy conducted on whole blood samples

    PubMed Central

    Chen, Xue; Li, Xiaohui; Yang, Sibo; Yu, Xin; Liu, Aichun

    2018-01-01

    Lymphoma is a significant cancer that affects the human lymphatic and hematopoietic systems. In this work, discrimination of lymphoma using laser-induced breakdown spectroscopy (LIBS) conducted on whole blood samples is presented. The whole blood samples collected from lymphoma patients and healthy controls are deposited onto standard quantitative filter papers and ablated with a 1064 nm Q-switched Nd:YAG laser. 16 atomic and ionic emission lines of calcium (Ca), iron (Fe), magnesium (Mg), potassium (K) and sodium (Na) are selected to discriminate the disease. Chemometric methods, including principal component analysis (PCA), linear discriminant analysis (LDA) classification, and k nearest neighbor (kNN) classification, are used to build the discrimination models. Both the LDA and kNN models achieved very good discrimination performance for lymphoma, with an accuracy of over 99.7%, a sensitivity of over 0.996, and a specificity of over 0.997. These results demonstrate that the whole-blood-based LIBS technique in combination with chemometric methods can serve as a fast, less invasive, and accurate method for the detection and discrimination of human malignancies. PMID:29541503

  3. T-RMSD: a fine-grained, structure-based classification method and its application to the functional characterization of TNF receptors.

    PubMed

    Magis, Cedrik; Stricher, François; van der Sloot, Almer M; Serrano, Luis; Notredame, Cedric

    2010-07-16

    This study addresses the relation between structural and functional similarity in proteins. We introduce a novel method named tree based on root mean square deviation (T-RMSD), which uses distance RMSD (dRMSD) variations to build fine-grained structure-based classifications of proteins. The main improvement of the T-RMSD over similar methods, such as Dali, is its capacity to produce the equivalent of a bootstrap value for each cluster node. We validated our approach on two domain families studied extensively for their role in many biological and pathological pathways: the small GTPase RAS superfamily and the cysteine-rich domains (CRDs) associated with the tumor necrosis factor receptors (TNFRs) family. Our analysis showed that T-RMSD is able to automatically recover and refine existing classifications. In the case of the small GTPase ARF subfamily, T-RMSD can distinguish GTP- from GDP-bound states, while in the case of CRDs it can identify two new subgroups associated with well defined functional features (ligand binding and formation of ligand pre-assembly complex). We show how hidden Markov models (HMMs) can be built on these new groups and propose a methodology to use these models simultaneously in order to do fine-grained functional genomic annotation without known 3D structures. T-RMSD, an open source freeware incorporated in the T-Coffee package, is available online. 2010 Elsevier Ltd. All rights reserved.

  4. Identification of cigarette smoke inhalations from wearable sensor data using a Support Vector Machine classifier.

    PubMed

    Lopez-Meyer, Paulo; Tiffany, Stephen; Sazonov, Edward

    2012-01-01

    This study presents a subject-independent model for the detection of smoke inhalations from wearable sensors capturing characteristic hand-to-mouth gestures and changes in breathing patterns during cigarette smoking. Wearable sensors were used to detect the proximity of the hand to the mouth and to acquire the respiratory patterns. The waveforms of the sensor signals were used as features to build a Support Vector Machine classification model. Across a data set of 20 enrolled participants, the precision of correct identification of smoke inhalations was found to be >87%, with a resulting recall of >80%. These results suggest that it is possible to analyze smoking behavior by means of a wearable and non-invasive sensor system.

  5. A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork.

    PubMed

    Xu, Yi; Chen, Quansheng; Liu, Yan; Sun, Xin; Huang, Qiping; Ouyang, Qin; Zhao, Jiewen

    2018-04-01

    This study proposed a rapid microscopic examination method for pork freshness evaluation using a self-assembled hyperspectral microscopic imaging (HMI) system together with feature extraction algorithms and pattern recognition methods. Pork samples were stored for 0 to 5 days, and the freshness of the samples was divided into three levels determined by total volatile basic nitrogen (TVB-N) content. Hyperspectral microscopic images of the samples were acquired by the HMI system and processed in the following steps for further analysis. Firstly, characteristic hyperspectral microscopic images were extracted using principal component analysis (PCA), and texture features were then selected based on the gray level co-occurrence matrix (GLCM). Next, the dimensionality of the feature data was reduced by Fisher discriminant analysis (FDA) for building the classification model. Finally, compared with the linear discriminant analysis (LDA) and support vector machine (SVM) models, the back-propagation artificial neural network (BP-ANN) model obtained the best freshness classification, with a 100% accuracy rating on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms is able to evaluate the freshness of pork accurately at the microscopic level, which plays an important role in animal food quality control.
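
    The GLCM texture step can be sketched briefly. The example below computes co-occurrence features with scikit-image (graycomatrix/graycoprops, as named in scikit-image 0.19 and later); the random 8-bit band stands in for a characteristic PCA band of the hyperspectral microscopic image:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(4)
band = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # mock PCA band

# Co-occurrence matrix at distance 1, horizontal and vertical directions.
glcm = graycomatrix(band, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
features = [graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")]
print(np.round(features, 3))
```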

  6. A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork

    PubMed Central

    Xu, Yi; Chen, Quansheng; Liu, Yan; Sun, Xin; Huang, Qiping; Ouyang, Qin; Zhao, Jiewen

    2018-01-01

    Abstract This study proposed a rapid microscopic examination method for pork freshness evaluation using a self-assembled hyperspectral microscopic imaging (HMI) system together with feature extraction algorithms and pattern recognition methods. Pork samples were stored for 0 to 5 days, and the freshness of the samples was divided into three levels determined by total volatile basic nitrogen (TVB-N) content. Hyperspectral microscopic images of the samples were acquired by the HMI system and processed in the following steps for further analysis. Firstly, characteristic hyperspectral microscopic images were extracted using principal component analysis (PCA), and texture features were then selected based on the gray level co-occurrence matrix (GLCM). Next, the dimensionality of the feature data was reduced by Fisher discriminant analysis (FDA) for building the classification model. Finally, compared with the linear discriminant analysis (LDA) and support vector machine (SVM) models, the back-propagation artificial neural network (BP-ANN) model obtained the best freshness classification, with a 100% accuracy rating on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms is able to evaluate the freshness of pork accurately at the microscopic level, which plays an important role in animal food quality control. PMID:29805285

  7. FET. Tank Building, TAN631. Elevations, sections, details. Tank pads and ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Tank Building, TAN-631. Elevations, sections, details. Tank pads and saddles. Ralph M. Parsons 1229-2 ANP/GE-5-631-A-1. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0631-00-693-107142 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  8. IET. Control and equipment building (TAN620) floor plan. Schedule of ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    IET. Control and equipment building (TAN-620) floor plan. Schedule of furniture and equipment. Ralph M. Parsons 902-4-ANP-A 320. Date: February 1954. Approved by INEEL Classification Office for public release. INEEL index code no. 035-0620-00-693-106905 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  9. FINAL REPORT: Building Performance Optimization while Empowering Occupants Toward Environmentally Sustainable Behavior through Continuous Monitoring and Diagnostics

    DTIC Science & Technology

    2016-12-05

    [Report documentation page and table-of-contents fragments only; recoverable topics include energy conservation, building occupant comfort and satisfaction, facility monitoring and diagnostics, and performance objectives such as "PO-VII: Increase in occupant satisfaction".]

  10. Building Extraction Based on Openstreetmap Tags and Very High Spatial Resolution Image in Urban Area

    NASA Astrophysics Data System (ADS)

    Kang, L.; Wang, Q.; Yan, H. W.

    2018-04-01

    How to derive the contours of buildings from VHR images is the essential problem in automatic building extraction in urban areas. To solve this problem, OSM data are introduced to offer vector contour information for buildings, which is hard to obtain from VHR images. First, we import the OSM data into a database; the line string data of OSM with tags of building, amenity, office, etc. are selected and combined into complete contours. Second, the accuracy of the building contours is confirmed by comparison with the real buildings in Google Earth. Third, maximum likelihood classification is conducted with the confirmed building contours, and the result demonstrates that the proposed approach is effective and accurate. The approach offers a new way for automatic interpretation of VHR images.
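
    The tag-based selection step might look like the sketch below, which operates on already-parsed OSM way elements; the element dicts, IDs, and the exact tag set are hypothetical stand-ins for a real OSM database import:

```python
# Tags the paper mentions as building-related; treated here as a filter set.
BUILDING_TAGS = {"building", "amenity", "office"}

# Hypothetical parsed OSM ways: tags plus an ordered node-ID list.
elements = [
    {"id": 1, "tags": {"building": "yes"}, "nodes": [101, 102, 103, 101]},
    {"id": 2, "tags": {"highway": "residential"}, "nodes": [104, 105]},
    {"id": 3, "tags": {"amenity": "school"}, "nodes": [106, 107, 108, 106]},
]

def building_contours(elems):
    """Keep ways tagged as buildings whose node list forms a closed ring."""
    for e in elems:
        if BUILDING_TAGS & e["tags"].keys() and e["nodes"][0] == e["nodes"][-1]:
            yield e["id"], e["nodes"]

print(list(building_contours(elements)))
```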

  11. Global Dynamic Exposure and the OpenBuildingMap - Communicating Risk and Involving Communities

    NASA Astrophysics Data System (ADS)

    Schorlemmer, Danijel; Beutin, Thomas; Hirata, Naoshi; Hao, Ken; Wyss, Max; Cotton, Fabrice; Prehn, Karsten

    2017-04-01

    Detailed understanding of local risk factors regarding natural catastrophes requires in-depth characterization of the local exposure. Current exposure capture techniques have to find a balance between resolution and coverage. We aim at bridging this gap by employing a crowd-sourced approach to exposure capturing, focusing on risk related to earthquake hazard. OpenStreetMap (OSM), the rich and constantly growing geographical database, is an ideal foundation for this task. More than 3.5 billion geographical nodes, more than 200 million building footprints (growing by 100'000 per day), and a plethora of information about schools, hospitals, and other critical facilities allow us to exploit this dataset for risk-related computations. We are combining the strengths of crowd-sourced data collection with the knowledge of experts in extracting the most information from these data. Besides relying on the very active OpenStreetMap community and the Humanitarian OpenStreetMap Team, which are collecting building information at a high pace, we are providing a tailored building capture tool for mobile devices. This tool facilitates simple and fast building property capturing for OpenStreetMap by any person or interested community. With our OpenBuildingMap system, we are harvesting this dataset by processing every building in near-realtime. We are collecting exposure and vulnerability indicators from explicitly provided data (e.g. hospital locations), implicitly provided data (e.g. building shapes and positions), and semantically derived data, i.e. interpretation applying expert knowledge. The expert knowledge is needed to translate the simple building properties captured by OpenStreetMap users into vulnerability and exposure indicators and subsequently into building classifications as defined in the Building Taxonomy 2.0 developed by the Global Earthquake Model (GEM) and the European Macroseismic Scale (EMS98). With this approach, we increase the resolution of existing exposure models from aggregated exposure information to building-by-building vulnerability. We report on our method, on the software development for the mobile application and the server-side analysis system, and on the OpenBuildingMap (www.openbuildingmap.org), our global Tile Map Service focusing on building properties. The free/open framework we provide can be used on commodity hardware for local to regional exposure capturing, for stakeholders in disaster management and mitigation for communicating risk, and for communities to understand their risk.

  12. Formalizing Resources for Planning

    NASA Technical Reports Server (NTRS)

    Bedrax-Weiss, Tania; McGann, Conor; Ramakrishnan, Sailesh

    2003-01-01

    In this paper we present a classification scheme which circumscribes a large class of resources found in the real world. Building on the work of others we also define key properties of resources that allow formal expression of the proposed classification. Furthermore, operations that change the state of a resource are formalized. Together, properties and operations go a long way in formalizing the representation and reasoning aspects of resources for planning.

  13. [Management of chemical products and European standards: new classification criteria according to the 1272/2008 (CLP) regulation].

    PubMed

    Fanghella, Paola Di Prospero; Aliberti, Ludovica Malaguti

    2013-01-01

    The European Union adopted Regulations (EC) 1907/2006 (REACH) and (EC) 1272/2008 (CLP) to manage chemicals. REACH provides for the evaluation and management of risks connected to the use of chemical substances, while CLP provides for the classification, labelling and packaging of dangerous substances and mixtures, implementing in the EU the UN Globally Harmonised System of Classification and Labelling and applying the building block approach, that is, taking on board the hazard classes and categories which are close to the existing EU system in order to maintain the level of protection of human health and the environment. This regulation also provides for the notification of the classification and labelling of substances to the Classification & Labelling Inventory established by the European Chemicals Agency (ECHA). Some European downstream regulations that make reference to the classification criteria, such as health and safety laws at the workplace, need to be adapted to these regulations.

  14. Site Classification using Multichannel Analysis of Surface Waves (MASW) method on Soft and Hard Ground

    NASA Astrophysics Data System (ADS)

    Ashraf, M. A. M.; Kumar, N. S.; Yusoh, R.; Hazreek, Z. A. M.; Aziman, M.

    2018-04-01

    Site classification utilizing the average shear wave velocity down to 30 meters depth (Vs(30)) is a typical parameter. Numerous geophysical methods have been proposed for the estimation of shear wave velocity, utilizing an assortment of testing configurations, processing methods, and inversion algorithms. The Multichannel Analysis of Surface Waves (MASW) method has been practised by numerous specialists and professionals in geotechnical engineering for local site characterization and classification. This study aims to determine the site classification on soft and hard ground using the MASW method. The subsurface classification was made utilizing the National Earthquake Hazards Reduction Program (NEHRP) and International Building Code (IBC) classifications. Two sites were chosen to acquire the shear wave velocity: one in the state of Pulau Pinang for soft soil and one in Perlis for hard rock. The results suggest that the MASW technique can be utilized to map the spatial distribution of shear wave velocity (Vs(30)) in soil and rock to characterize areas.
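
    The Vs(30) computation and the class lookup are simple enough to show directly. A minimal sketch under the usual NEHRP/IBC boundaries (roughly A > 1500 m/s down to E < 180 m/s); the three-layer velocity profile is invented:

```python
def vs30(thicknesses_m, velocities_ms):
    """Time-averaged shear wave velocity over the top 30 m."""
    assert abs(sum(thicknesses_m) - 30.0) < 1e-9
    travel_time = sum(h / v for h, v in zip(thicknesses_m, velocities_ms))
    return 30.0 / travel_time

def nehrp_class(v):
    # Standard NEHRP site-class boundaries in m/s.
    for limit, label in [(1500, "A"), (760, "B"), (360, "C"), (180, "D")]:
        if v > limit:
            return label
    return "E"

v = vs30([5, 10, 15], [180, 300, 500])    # hypothetical 3-layer profile
print(round(v, 1), nehrp_class(v))        # -> 329.3 D
```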

  15. Development of structure-activity relationship for metal oxide nanoparticles

    NASA Astrophysics Data System (ADS)

    Liu, Rong; Zhang, Hai Yuan; Ji, Zhao Xia; Rallo, Robert; Xia, Tian; Chang, Chong Hyun; Nel, Andre; Cohen, Yoram

    2013-05-01

    Nanomaterial structure-activity relationships (nano-SARs) for metal oxide nanoparticles (NPs) toxicity were investigated using metrics based on dose-response analysis and consensus self-organizing map clustering. The NP cellular toxicity dataset included toxicity profiles consisting of seven different assays for human bronchial epithelial (BEAS-2B) and murine myeloid (RAW 264.7) cells, over a concentration range of 0.39-100 mg L-1 and exposure time up to 24 h, for twenty-four different metal oxide NPs. Various nano-SAR building models were evaluated, based on an initial pool of thirty NP descriptors. The conduction band energy and ionic index (often correlated with the hydration enthalpy) were identified as suitable NP descriptors that are consistent with suggested toxicity mechanisms for metal oxide NPs and metal ions. The best performing nano-SAR with the above two descriptors, built with support vector machine (SVM) model and of validated robustness, had a balanced classification accuracy of ~94%. An applicability domain for the present data was established with a reasonable confidence level of 80%. Given the potential role of nano-SARs in decision making, regarding the environmental impact of NPs, the class probabilities provided by the SVM nano-SAR enabled the construction of decision boundaries with respect to toxicity classification under different acceptance levels of false negative relative to false positive predictions.

  16. Enhancing navigation in biomedical databases by community voting and database-driven text classification

    PubMed Central

    Duchrow, Timo; Shtatland, Timur; Guettler, Daniel; Pivovarov, Misha; Kramer, Stefan; Weissleder, Ralph

    2009-01-01

    Background The breadth of biological databases and their information content continues to increase exponentially. Unfortunately, our ability to query such sources is still often suboptimal. Here, we introduce and apply community voting, database-driven text classification, and visual aids as means to incorporate distributed expert knowledge, to automatically classify database entries and to efficiently retrieve them. Results Using a previously developed peptide database as an example, we compared several machine learning algorithms in their ability to classify abstracts of published literature results into categories relevant to peptide research, such as related or not related to cancer, angiogenesis, molecular imaging, etc. Ensembles of bagged decision trees met the requirements of our application best. No other algorithm consistently performed better in comparative testing. Moreover, we show that the algorithm produces meaningful class probability estimates, which can be used to visualize the confidence of automatic classification during the retrieval process. To allow viewing of long lists of search results enriched by automatic classifications, we added a dynamic heat map to the web interface. We take advantage of community knowledge by enabling users to cast votes in Web 2.0 style in order to correct automated classification errors, which triggers reclassification of all entries. We used a novel framework in which the database "drives" the entire vote aggregation and reclassification process to increase speed while conserving computational resources and keeping the method scalable. In our experiments, we simulate community voting by adding various levels of noise to nearly perfectly labelled instances, and show that, under such conditions, classification can be improved significantly. Conclusion Using PepBank as a model database, we show how to build a classification-aided retrieval system that gathers training data from the community, is completely controlled by the database, scales well with concurrent change events, and can be adapted to add text classification capability to other biomedical databases. The system can be accessed online. PMID:19799796
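
    A minimal sketch of the winning classifier family, bagged decision trees whose class-probability estimates feed the confidence heat map, using scikit-learn (the `estimator` keyword assumes scikit-learn 1.2 or later); the feature matrix is synthetic, standing in for abstract-derived features:

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 20))                       # mock abstract features
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # e.g. cancer-related

# Ensemble of bagged decision trees; averaging over trees yields the
# class-probability estimates visualised as a heat map in the paper.
model = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50)
model.fit(X, y)
proba = model.predict_proba(X[:3])
print(np.round(proba, 2))
```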

  17. Assessment of Pansharpening Methods Applied to WorldView-2 Imagery Fusion.

    PubMed

    Li, Hui; Jing, Linhai; Tang, Yunwei

    2017-01-05

    Since WorldView-2 (WV-2) images are widely used in various fields, there is a high demand for high-quality pansharpened WV-2 images for different application purposes. Given the novelty of the WV-2 multispectral (MS) and panchromatic (PAN) bands, the performances of eight state-of-the-art pansharpening methods for WV-2 imagery were assessed in this study on six datasets from three WV-2 scenes, using both quality indices and information indices along with visual inspection. The normalized difference vegetation index, normalized difference water index, and morphological building index, which are widely used in applications related to land cover classification and the extraction of vegetation areas, buildings, and water bodies, were employed in this work to evaluate the performance of the different pansharpening methods in terms of information presentation ability. The experimental results show that the Haze- and Ratio-based method, adaptive Gram-Schmidt, and the Generalized Laplacian pyramid (GLP) methods with the enhanced spectral distortion minimal model and the enhanced context-based decision model are good choices for producing fused WV-2 images for image interpretation and the extraction of urban buildings. The two GLP-based methods are better choices than the other methods if the fused images are to be used for applications related to vegetation and water bodies.

  18. Assessment of Pansharpening Methods Applied to WorldView-2 Imagery Fusion

    PubMed Central

    Li, Hui; Jing, Linhai; Tang, Yunwei

    2017-01-01

    Since WorldView-2 (WV-2) images are widely used in various fields, there is a high demand for high-quality pansharpened WV-2 images for different application purposes. Given the novelty of the WV-2 multispectral (MS) and panchromatic (PAN) bands, the performances of eight state-of-the-art pansharpening methods for WV-2 imagery were assessed in this study on six datasets from three WV-2 scenes, using both quality indices and information indices along with visual inspection. The normalized difference vegetation index, normalized difference water index, and morphological building index, which are widely used in applications related to land cover classification and the extraction of vegetation areas, buildings, and water bodies, were employed in this work to evaluate the performance of the different pansharpening methods in terms of information presentation ability. The experimental results show that the Haze- and Ratio-based method, adaptive Gram-Schmidt, and the Generalized Laplacian pyramid (GLP) methods with the enhanced spectral distortion minimal model and the enhanced context-based decision model are good choices for producing fused WV-2 images for image interpretation and the extraction of urban buildings. The two GLP-based methods are better choices than the other methods if the fused images are to be used for applications related to vegetation and water bodies. PMID:28067770

  19. Building block extraction and classification by means of aerial images fused with super-resolution reconstructed elevation data

    NASA Astrophysics Data System (ADS)

    Panagiotopoulou, Antigoni; Bratsolis, Emmanuel; Charou, Eleni; Perantonis, Stavros

    2017-10-01

    The detailed three-dimensional modeling of buildings utilizing elevation data, such as those provided by light detection and ranging (LiDAR) airborne scanners, is increasingly demanded today. There are certain application requirements and available datasets to which any research effort has to be adapted. Our dataset includes aerial orthophotos with a spatial resolution of 20 cm, and a digital surface model generated from LiDAR with a spatial resolution of 1 m and an elevation resolution of 20 cm, from an area of Athens, Greece. The aerial images are fused with the LiDAR data, and we classify these data with a multilayer feedforward neural network for building block extraction. The innovation of our approach lies in the preprocessing step, in which the original LiDAR data are super-resolution (SR) reconstructed by means of a stochastic regularized technique before their fusion with the aerial images takes place. The Lorentzian estimator combined with bilateral total variation regularization performs the SR reconstruction. We evaluate the performance of our approach against that of fusing unprocessed LiDAR data with aerial images. We present the classified images and the statistical measures: confusion matrix, kappa coefficient, and overall accuracy. The results demonstrate that our approach predominates over that of fusing unprocessed LiDAR data with aerial images.

  20. Semantic Segmentation of Indoor Point Clouds Using Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Babacan, K.; Chen, L.; Sohn, G.

    2017-11-01

    As Building Information Modelling (BIM) thrives, geometry alone is no longer sufficient; an ever increasing variety of semantic information is needed to express an indoor model adequately. On the other hand, for existing buildings, automatically generating semantically enriched BIM from point cloud data is in its infancy. Previous research to enhance the semantic content relies on frameworks in which specific rules and/or features are hand-coded by specialists. These methods inherently lack generalization and easily break in different circumstances. On this account, a generalized framework is urgently needed to automatically and accurately generate semantic information. We therefore propose to employ deep learning techniques for the semantic segmentation of point clouds into meaningful parts. More specifically, we build a volumetric data representation in order to efficiently generate the high number of training samples needed to train a convolutional neural network architecture. Feedforward propagation is used in such a way as to perform the classification at the voxel level, achieving semantic segmentation. The method is tested both on a mobile laser scanner point cloud and on larger scale synthetically generated data. We also demonstrate a case study in which our method can be effectively used to leverage the extraction of planar surfaces in challenging cluttered indoor environments.
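
    The volumetric data representation can be illustrated with a simple occupancy-grid voxelisation; the point cloud and voxel size below are invented, and a real pipeline would feed such grids (or crops of them) to the convolutional network:

```python
import numpy as np

rng = np.random.default_rng(6)
points = rng.uniform(0, 4.0, size=(10000, 3))      # mock indoor scan (metres)
voxel_size = 0.1

# Map each point to an integer voxel index and mark that voxel occupied.
origin = points.min(axis=0)
idx = np.floor((points - origin) / voxel_size).astype(int)
grid_shape = idx.max(axis=0) + 1
occupancy = np.zeros(grid_shape, dtype=np.uint8)
occupancy[idx[:, 0], idx[:, 1], idx[:, 2]] = 1

print("grid:", occupancy.shape, "occupied voxels:", int(occupancy.sum()))
```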

  1. Consensus models to predict endocrine disruption for all ...

    EPA Pesticide Factsheets

    Humans are potentially exposed to tens of thousands of man-made chemicals in the environment. It is well known that some environmental chemicals mimic natural hormones and thus have the potential to be endocrine disruptors. Most of these environmental chemicals have never been tested for their ability to disrupt the endocrine system, in particular, their ability to interact with the estrogen receptor. EPA needs tools to prioritize thousands of chemicals, for instance in the Endocrine Disruptor Screening Program (EDSP). The Collaborative Estrogen Receptor Activity Prediction Project (CERAPP) was intended to be a demonstration of the use of predictive computational models on HTS data, including ToxCast and Tox21 assays, to prioritize a large chemical universe of 32464 unique structures for one specific molecular target – the estrogen receptor. CERAPP combined multiple computational models for the prediction of estrogen receptor activity, and used the predicted results to build a unique consensus model. Models were developed in collaboration between 17 groups in the U.S. and Europe and applied to predict the common set of chemicals. Structure-based techniques such as docking and several QSAR modeling approaches were employed, mostly using a common training set of 1677 compounds provided by U.S. EPA, to build a total of 42 classification models and 8 regression models for binding, agonist and antagonist activity. All predictions were evaluated on ToxCast data and on an external validation set.

  2. Fusion of Terrestrial and Airborne Laser Data for 3D modeling Applications

    NASA Astrophysics Data System (ADS)

    Mohammed, Hani Mahmoud

    This thesis deals with the 3D modeling phase of as-built large BIM projects. Among several means of BIM data capturing, such as photogrammetric or range tools, laser scanners have long been one of the most efficient and practical tools. They can generate high-resolution point clouds for 3D models that meet today's market demands. Current 3D modeling projects of as-built BIMs are mainly focused on using one type of laser scanner data, such as airborne or terrestrial. According to the literature, few efforts have been made towards the fusion of heterogeneous laser scanner data despite its importance. The importance of fusing heterogeneous data arises from the fact that no single type of laser data can provide all the information about a BIM, especially for large BIM projects that extend over a large area, such as university buildings or heritage places. Terrestrial laser scanners are able to map facades of buildings and other terrestrial objects; however, they lack the ability to map roofs or higher parts in the BIM project. Airborne laser scanners, on the other hand, can map roofs of buildings efficiently but only small parts of the facades. Short range laser scanners can map the interiors of BIM projects, while long range scanners are used for mapping wide exterior areas. In this thesis, the long range laser scanner data obtained in the stop-and-go mapping mode, the short range laser scanner data obtained in a fully static mapping mode, and the airborne laser data are all fused together to provide a complete, effective solution for a large BIM project. Working towards the 3D modeling of BIM projects, the thesis framework starts with the registration of the data, for which a new fast automatic registration algorithm was developed. The next step is to recognize the different objects in the BIM project (classification) and obtain 3D models of the buildings. The last step is the development of an occlusion removal algorithm to efficiently recover parts of the buildings occluded by surrounding objects such as trees, vehicles, or street poles.

  3. A Review of Major Nursing Vocabularies and the Extent to Which They Have the Characteristics Required for Implementation in Computer-based Systems

    PubMed Central

    Henry, Suzanne Bakken; Warren, Judith J.; Lange, Linda; Button, Patricia

    1998-01-01

    Building on the work of previous authors, the Computer-based Patient Record Institute (CPRI) Work Group on Codes and Structures has described features of a classification scheme for implementation within a computer-based patient record. The authors of the current study reviewed the evaluation literature related to six major nursing vocabularies (the North American Nursing Diagnosis Association Taxonomy 1, the Nursing Interventions Classification, the Nursing Outcomes Classification, the Home Health Care Classification, the Omaha System, and the International Classification for Nursing Practice) to determine the extent to which the vocabularies include the CPRI features. None of the vocabularies met all criteria. The Omaha System, Home Health Care Classification, and International Classification for Nursing Practice each included five features. Criteria not fully met by any systems were clear and non-redundant representation of concepts, administrative cross-references, syntax and grammar, synonyms, uncertainty, context-free identifiers, and language independence. PMID:9670127

  4. Mapping urban impervious surface using object-based image analysis with WorldView-3 satellite imagery

    NASA Astrophysics Data System (ADS)

    Iabchoon, Sanwit; Wongsai, Sangdao; Chankon, Kanoksuk

    2017-10-01

    Land use and land cover (LULC) data are important to monitor and assess environmental change. LULC classification using satellite images is a method widely used on global and local scales. In particular, urban areas, which contain various LULC types, are important components of the urban landscape and ecosystem. This study aims to classify urban LULC using WorldView-3 (WV-3) very high spatial resolution satellite imagery and the object-based image analysis method. A decision rule set was applied to classify the WV-3 images in Kathu subdistrict, Phuket province, Thailand. The main steps were as follows: (1) the image was ortho-rectified with ground control points and using the digital elevation model; (2) multiscale image segmentation was applied to group pixels into image objects; (3) a decision ruleset for LULC classification was developed using spectral bands, spectral indices, and spatial and contextual information; and (4) accuracy assessment was computed using testing data sampled by statistical random sampling. The results show that seven LULC classes (water, vegetation, open space, road, residential, building, and bare soil) were successfully classified, with an overall classification accuracy of 94.14% and a kappa coefficient of 92.91%.
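
    A fragment of such a spectral-index decision rule set, sketched with invented band reflectances and thresholds; the actual WV-3 band assignments and the full rule cascade are the paper's and are not reproduced here:

```python
def classify_object(green, red, nir):
    """Toy rule set over per-object mean reflectances (invented thresholds)."""
    ndvi = (nir - red) / (nir + red)      # vegetation index
    ndwi = (green - nir) / (green + nir)  # water index
    if ndwi > 0.3:
        return "water"
    if ndvi > 0.4:
        return "vegetation"
    return "built-up or bare soil"

print(classify_object(green=0.08, red=0.06, nir=0.45))   # -> vegetation
```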

  5. A Q-backpropagated time delay neural network for diagnosing severity of gait disturbances in Parkinson's disease.

    PubMed

    Nancy Jane, Y; Khanna Nehemiah, H; Arputharaj, Kannan

    2016-04-01

    Parkinson's disease (PD) is a movement disorder that affects the patient's nervous system, and health-care applications mostly use wearable sensors to collect the relevant data. Since these sensors generate time-stamped data, analyzing gait disturbances in PD becomes a challenging task. The objective of this paper is to develop an effective clinical decision-making system (CDMS) that aids the physician in diagnosing the severity of gait disturbances in PD-affected patients. This paper presents a Q-backpropagated time delay neural network (Q-BTDNN) classifier that builds a temporal classification model, which performs the task of classification and prediction in the CDMS. The proposed Q-learning induced backpropagation (Q-BP) training algorithm trains the Q-BTDNN by generating a reinforced error signal. The network's weights are adjusted by backpropagating the generated error signal. For experimentation, the proposed work uses a PD gait database, which contains gait measures collected through wearable sensors in three different PD research studies. The experimental results demonstrate the efficiency of Q-BP in terms of improved classification accuracies of 91.49%, 92.19% and 90.91% on the three datasets respectively, compared to other neural network training algorithms. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Object Based Image Analysis Combining High Spatial Resolution Imagery and Laser Point Clouds for Urban Land Cover

    NASA Astrophysics Data System (ADS)

    Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong

    2016-06-01

    With the rapid development of sensor technology, high spatial resolution imagery and airborne Lidar point clouds can be captured nowadays, which makes classification, extraction, evaluation and analysis of a broad range of object features available. High resolution imagery, Lidar datasets and parcel maps can be widely used as information carriers for classification. Refinement of object classification is therefore made possible for urban land cover. The paper presents an approach to object based image analysis (OBIA) combining high spatial resolution imagery and airborne Lidar point clouds. The advanced workflow for urban land cover is designed with four components. Firstly, a colour-infrared TrueOrtho photo and laser point clouds were pre-processed to derive the parcel map of water bodies and the nDSM respectively. Secondly, image objects are created via multi-resolution image segmentation integrating the scale parameter and the colour and shape properties with a compactness criterion, so that the image can be subdivided into separate object regions. Thirdly, image object classification is performed on the basis of the segmentation and a rule set of knowledge decision trees. The image objects are classified into six classes: water bodies, low vegetation/grass, tree, low building, high building and road. Finally, in order to assess the validity of the classification results for the six classes, accuracy assessment is performed by comparing randomly distributed reference points of the TrueOrtho imagery with the classification results, forming the confusion matrix and calculating overall accuracy and the Kappa coefficient. The study area focuses on the test site Vaihingen/Enz, and a patch of test data comes from the benchmark of the ISPRS WG III/4 test project. The classification results show high overall accuracy for most types of urban land cover. Overall accuracy is 89.5% and the Kappa coefficient equals 0.865. The OBIA approach provides an effective and convenient way to combine high resolution imagery and Lidar ancillary data for the classification of urban land cover.

  7. A novel approach to internal crown characterization for coniferous tree species classification

    NASA Astrophysics Data System (ADS)

    Harikumar, A.; Bovolo, F.; Bruzzone, L.

    2016-10-01

    Knowledge about individual trees in a forest is highly beneficial in forest management. High density, small footprint, multi-return airborne Light Detection and Ranging (LiDAR) data can provide very accurate information about the structural properties of individual trees in forests. Every tree species has a unique set of crown structural characteristics that can be used for tree species classification. In this paper, we use both the internal and external crown structural information of a conifer tree crown, derived from a high density small footprint multi-return LiDAR data acquisition, for species classification. Considering the fact that branches are the major building blocks of a conifer tree crown, we obtain the internal crown structural information using a branch-level analysis. The structure of each conifer branch is represented using clusters in the LiDAR point cloud. We propose the joint use of k-means clustering and geometric shape fitting, on the LiDAR data projected onto a novel 3-dimensional space, to identify branch clusters. After mapping the identified clusters back to the original space, six internal geometric features are estimated using a branch-level analysis. The external crown characteristics are modeled by using six least correlated features based on cone fitting and the convex hull. Species classification is performed using a sparse Support Vector Machines (sparse SVM) classifier.

  8. Building confidence and credibility into CAD with belief decision trees

    NASA Astrophysics Data System (ADS)

    Affenit, Rachael N.; Barns, Erik R.; Furst, Jacob D.; Rasin, Alexander; Raicu, Daniela S.

    2017-03-01

    Creating classifiers for computer-aided diagnosis in the absence of ground truth is a challenging problem. Using experts' opinions as reference truth is difficult because the variability in the experts' interpretations introduces uncertainty in the labeled diagnostic data. This uncertainty translates into noise, which can significantly affect the performance of any classifier on test data. To address this problem, we propose a new label set weighting approach to combine the experts' interpretations and their variability, as well as a selective iterative classification (SIC) approach based on conformal prediction. Using the NIH/NCI Lung Image Database Consortium (LIDC) dataset, in which four radiologists interpreted the lung nodule characteristics, including the degree of malignancy, we illustrate the benefits of the proposed approach. Our results show that the proposed 2-label-weighted approach significantly improves on the accuracy of the original 5-label and 2-label-unweighted classification approaches, by 39.9% and 7.6%, respectively. We also found that the weighted 2-label models produce higher skewness values, by 1.05 and 0.61 for non-SIC and SIC respectively, on the root mean square error (RMSE) distributions. When each approach was combined with selective iterative classification, this further improved the accuracy of classification for the 2-weighted-label approach by 7.5% over the original, and improved the skewness of the 5-label and 2-unweighted-label approaches by 0.22 and 0.44 respectively.

  9. Large-area settlement pattern recognition from Landsat-8 data

    NASA Astrophysics Data System (ADS)

    Wieland, Marc; Pittore, Massimiliano

    2016-09-01

    The study presents an image processing and analysis pipeline that combines object-based image analysis with a Support Vector Machine to derive a multi-layered settlement product from Landsat-8 data over large areas. 43 image scenes are processed over large parts of Central Asia (Southern Kazakhstan, Kyrgyzstan, Tajikistan and Eastern Uzbekistan). The main tasks tackled by this work include built-up area identification, settlement type classification and urban structure type pattern recognition. Besides commonly used accuracy assessments of the resulting map products, thorough performance evaluations are carried out under varying conditions to tune algorithm parameters and assess their applicability for the given tasks. As part of this, several research questions are addressed. In particular, the influence of the improved spatial and spectral resolution of Landsat-8 on the SVM performance in identifying built-up areas and urban structure types is evaluated. The influence of an extended feature space including digital elevation model features is also tested for mountainous regions. Moreover, the spatial distribution of classification uncertainties is analyzed and compared to the heterogeneity of the building stock within the computational unit of the segments. The study concludes that the information content of Landsat-8 images is sufficient for the tested classification tasks, and even detailed urban structures could be extracted with satisfying accuracy. Freely available ancillary settlement point location data could further improve the built-up area classification. Digital elevation features and pan-sharpening could, however, not significantly improve the classification results. The study highlights the importance of dynamically tuned classifier parameters and underlines the use of Shannon entropy computed from the soft answers of the SVM as a valid measure of the spatial distribution of classification uncertainties.
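
    The uncertainty measure is easy to make concrete: Shannon entropy over the per-segment class probabilities (the SVM's "soft answers"). A sketch with invented probability rows:

```python
import numpy as np

def shannon_entropy(p, eps=1e-12):
    """Entropy in bits of each row of class probabilities."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log2(p + eps), axis=-1)

proba = np.array([[0.9, 0.05, 0.05],      # confident segment -> low entropy
                  [0.4, 0.35, 0.25]])     # ambiguous segment -> high entropy
print(np.round(shannon_entropy(proba), 3))
```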

  10. [Identification of varieties of textile fibers by using Vis/NIR infrared spectroscopy technique].

    PubMed

    Wu, Gui-Fang; He, Yong

    2010-02-01

    The aim of the present paper was to provide new insight into the Vis/NIR spectroscopic analysis of textile fibers. In order to achieve rapid identification of fiber varieties, the authors selected 5 kinds of fibers (cotton, flax, wool, silk and tencel) for a study with Vis/NIR spectroscopy. Firstly, the spectra of each kind of fiber were scanned by a spectrometer, and the principal component analysis (PCA) method was used to analyze the characteristic patterns of the Vis/NIR spectra. A principal component scores scatter plot (PC1 x PC2 x PC3) indicated the classification effect for the five varieties of fibers. The first 6 principal components (PCs) were selected according to the number and size of the PCs. The PCA classification model was optimized by using the least-squares support vector machines (LS-SVM) method. The authors used the 6 PCs extracted by PCA as the inputs of the LS-SVM, and a PCA-LS-SVM model was built to achieve variety validation as well as mathematical model building and optimization analysis. Two hundred samples (40 samples for each variety of fibers) of the five varieties of fibers were used for calibration of the PCA-LS-SVM model, and the other 50 samples (10 samples for each variety of fibers) were used for validation. The validation results showed that the Vis/NIR spectroscopy technique based on PCA-LS-SVM has a powerful classification capability. It provides a new method for identifying varieties of fibers rapidly and in real time, and so has important significance for protecting the rights of consumers, ensuring the quality of textiles, and rationalizing the production and transaction of textile materials and products.

  11. Probability of identification: adulteration of American Ginseng with Asian Ginseng.

    PubMed

    Harnly, James; Chen, Pei; Harrington, Peter De B

    2013-01-01

    The AOAC INTERNATIONAL guidelines for validation of botanical identification methods were applied to the detection of Asian Ginseng [Panax ginseng (PG)] as an adulterant for American Ginseng [P. quinquefolius (PQ)] using spectral fingerprints obtained by flow injection mass spectrometry (FIMS). Samples of 100% PQ and 100% PG were physically mixed to provide 90, 80, and 50% PQ. The multivariate FIMS fingerprint data were analyzed using soft independent modeling of class analogy (SIMCA) based on 100% PQ. The Q statistic, a measure of the degree of non-fit of the test samples with the calibration model, was used as the analytical parameter. FIMS was able to discriminate between 100% PQ and 100% PG, and between 100% PQ and 90, 80, and 50% PQ. The probability of identification (POI) curve was estimated based on the SD of 90% PQ. A digital model of adulteration, obtained by mathematically summing the experimentally acquired spectra of 100% PQ and 100% PG in the desired ratios, agreed well with the physical data and provided an easy and more accurate method for constructing the POI curve. Two chemometric modeling methods, SIMCA and fuzzy optimal associative memories, and two classification methods, partial least squares-discriminant analysis and fuzzy rule-building expert systems, were applied to the data. The modeling methods correctly identified the adulterated samples; the classification methods did not.
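
    The digital adulteration model and the Q statistic can be sketched together: mixtures are simulated as weighted sums of pure-class spectra, and Q is the squared residual left after projection onto a PCA model of the 100% PQ class. All spectra below are synthetic stand-ins for the FIMS fingerprints, and the component count is an assumption:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
pq = rng.normal(1.0, 0.02, size=(40, 300))      # "100% PQ" training spectra
pg = rng.normal(1.3, 0.02, size=(40, 300))      # "100% PG" spectra

pca = PCA(n_components=3).fit(pq)                # SIMCA model of the PQ class

def q_statistic(spectra):
    """Squared reconstruction residual against the PQ PCA model."""
    residual = spectra - pca.inverse_transform(pca.transform(spectra))
    return (residual ** 2).sum(axis=1)

for frac in (1.0, 0.9, 0.5):                     # digital 100/90/50% PQ mixes
    mix = frac * pq[:5] + (1 - frac) * pg[:5]
    print(f"{int(frac * 100)}% PQ -> mean Q = {q_statistic(mix).mean():.3f}")
```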

  12. Development and Comparison of hERG Blocker Classifiers: Assessment on Different Datasets Yields Markedly Different Results.

    PubMed

    Marchese Robinson, Richard L; Glen, Robert C; Mitchell, John B O

    2011-05-16

    In recent years, considerable effort has been invested in the development of classification models for prospective hERG inhibitors, due to the implications of hERG blockade for cardiotoxicity and the low throughput of functional hERG assays. We present novel approaches for binary classification which seek to separate strong inhibitors (IC50 <1 µM) from 'non-blockers' exhibiting moderate (1-10 µM) or weak (IC50 ≥10 µM) inhibition, as required by the pharmaceutical industry. Our approaches are based on (discretized) 2D descriptors, selected using Winnow, with additional models generated using Random Forest (RF) and Support Vector Machines (SVMs). We compare our models to those previously developed by Thai and Ecker and by Dubus et al. The purpose of this paper is twofold: 1. To propose that our approaches (with Matthews Correlation Coefficients from 0.40 to 0.87 on truly external test sets, when extrapolation beyond the applicability domain was not evident and sufficient quantities of data were available for training) are competitive with those currently proposed in the literature. 2. To highlight key issues associated with building and assessing truly predictive models, in particular the considerable variation in model performance when training and testing on different datasets. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. FET. Control and equipment building (TAN630). Basement floor plan. Tunnel ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Control and equipment building (TAN-630). Basement floor plan. Tunnel to hangar (TAN-629). Electrical and chemical services. Ralph M. Parsons 1229-2 ANP/GE-630-A-1. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0630-00-693-107080 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  14. IET. Fuel transfer pumping building (TAN625). Elevations, foundation. Detail of ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    IET. Fuel transfer pumping building (TAN-625). Elevations, foundation. Detail of access stairway to coupling station. Ralph M. Parsons 902-a-ANY-620-625-A&S 414. Date: February 1954. Approved by INEEL Classification Office for public release. INEEL index code no. 035-0625-00-693-106971 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  15. FET. Control and equipment building (TAN630). East elevation and section. ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Control and equipment building (TAN-630). East elevation and section. Shielded roadway and personnel entrances. Ralph M. Parsons 1229-2 ANP/GE-5-630-A-5. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0630-00-693-107084 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  16. FET. Control and equipment building, TAN630. Main floor plan. Control ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Control and equipment building, TAN-630. Main floor plan. Control room. Room numbers and functions. Ralph M. Parsons. 1229-2-ANP/GE-5-630-A-2. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0630-00-693-107081 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  17. FET. Control and equipment building (TAN630). Sections. Earth cover. Shielded ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FET. Control and equipment building (TAN-630). Sections. Earth cover. Shielded access entries for personnel and vehicles. Ralph M. Parsons 1229-2 ANP/GE-5-630-A-3. Date: March 1957. Approved by INEEL Classification Office for public release. INEEL index code no. 036-0630-00-693-107082 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  18. ADM. Administration Building (TAN602). Elevations, sections, details. Shows areas that ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    ADM. Administration Building (TAN-602). Elevations, sections, details. Shows areas that were soon remodeled or added onto. Ralph M. Parsons 902-2-ANP-602-A 32 Date: August 1955. Approved by INEEL Classification Office for public release. INEEL index code no. 033-0602-00-693-106711 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  19. Bayesian network modelling of upper gastrointestinal bleeding

    NASA Astrophysics Data System (ADS)

    Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri

    2013-09-01

    Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding (GIB) into an upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under the curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
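
    A minimal sketch of the reported evaluation (classification accuracy plus area under the ROC curve) follows, with scikit-learn's plain GaussianNB standing in for the tree augmented naive Bayes network (a dedicated Bayesian network library would be needed for a true TAN) and synthetic stand-ins for the clinical features.

```python
# Accuracy and ROC-AUC for an upper/lower GI bleeding classifier.
# GaussianNB stands in for the TAN model.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(2)
# Synthetic stand-ins for features such as stool colour, bleeding history,
# consistency, and the BUN/creatinine ratio.
X = rng.normal(size=(300, 4))
y = rng.integers(0, 2, size=300)          # 1 = upper source, 0 = lower

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```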

  20. Conditional Density Estimation with HMM Based Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Hu, Fasheng; Liu, Zhenqiu; Jia, Chunxin; Chen, Dechang

    Conditional density estimation is very important in financial engineering, risk management, and other engineering computing problems. However, most regression models make the latent assumption that the probability density is a Gaussian distribution, which is not necessarily true in many real-life applications. In this paper, we give a framework for estimating or predicting the conditional density mixture dynamically. By combining the Input-Output HMM with SVM regression and building an SVM model in each state of the HMM, we can estimate a conditional density mixture instead of a single Gaussian. With an SVM in each node, this model can be applied not only to regression but to classification as well. We applied this model to denoising ECG data. The proposed method has the potential to be applied to other time series, such as stock market return prediction.

  1. Using classification tree modelling to investigate drug prescription practices at health facilities in rural Tanzania.

    PubMed

    Kajungu, Dan K; Selemani, Majige; Masanja, Irene; Baraka, Amuri; Njozi, Mustafa; Khatib, Rashid; Dodoo, Alexander N; Binka, Fred; Macq, Jean; D'Alessandro, Umberto; Speybroeck, Niko

    2012-09-05

    Drug prescription practices depend on several factors related to the patient, the health worker and the health facility. A better understanding of the factors influencing prescription patterns is essential to develop strategies to mitigate the negative consequences associated with poor practices in both the public and private sectors. A cross-sectional study was conducted in rural Tanzania among patients attending health facilities and among health workers. Patient-, health worker- and health facility-related factors with the potential to influence drug prescription patterns were used to build a model of key predictors. The standard data mining methodology of classification tree analysis was used to determine the importance of the different factors for prescription patterns. This analysis included 1,470 patients and 71 health workers practicing in 30 health facilities. Patients were mostly treated in dispensaries. Twenty-two variables were used to construct two classification tree models: one for polypharmacy (prescription of ≥3 drugs) on a single clinic visit and one for co-prescription of artemether-lumefantrine (AL) with antibiotics. The most important predictor of polypharmacy was the diagnosis of several illnesses. Polypharmacy was also associated with little or no supervision of the health workers, administration of AL, and private facilities. Co-prescription of AL with antibiotics was more frequent in children under five years of age; the other important predictors were transmission season, mode of diagnosis and the location of the health facility. Standard data mining methodology is an easy-to-implement analytical approach that can be useful for decision-making. Polypharmacy is mainly due to the diagnosis of multiple illnesses.
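
    A hedged sketch of classification tree analysis of this kind, using scikit-learn; the predictor names and the synthetic outcome rule are illustrative assumptions based on the abstract, not the study's actual variable coding.

```python
# Classification tree for a binary prescription outcome (polypharmacy),
# with feature importances as a rough analogue of predictor importance.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
feature_names = ["n_diagnoses", "supervision", "AL_given", "private_facility"]
X = rng.integers(0, 4, size=(1470, 4))
y = (X[:, 0] >= 2).astype(int)   # synthetic rule: polypharmacy driven by multiple illnesses

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
for name, imp in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {imp:.2f}")
```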

  2. Classification and virtual screening of androgen receptor antagonists.

    PubMed

    Li, Jiazhong; Gramatica, Paola

    2010-05-24

    Computational tools, such as quantitative structure-activity relationship (QSAR), are highly useful as screening support for prioritization of substances of very high concern (SVHC). From the practical point of view, QSAR models should be effective to pick out more active rather than inactive compounds, expressed as sensitivity in classification works. This research investigates the classification of a big data set of endocrine-disrupting chemicals (EDCs)-androgen receptor (AR) antagonists, mainly aiming to improve the external sensitivity and to screen for potential AR binders. The kNN, lazy IB1, and ADTree methods and the consensus approach were used to build different models, which improve the sensitivity on external chemicals from 57.1% (literature) to 76.4%. Additionally, the models' predictive abilities were further validated on a blind collected data set (sensitivity: 85.7%). Then the proposed classifiers were used: (i) to distinguish a set of AR binders into antagonists and agonists; (ii) to screen a combined estrogen receptor binder database to find out possible chemicals that can bind to both AR and ER; and (iii) to virtually screen our in-house environmental chemical database. The in silico screening results suggest: (i) that some compounds can affect the normal endocrine system through a complex mechanism binding both to ER and AR; (ii) new EDCs, which are nonER binders, but can in silico bind to AR, are recognized; and (iii) about 20% of compounds in a big data set of environmental chemicals are predicted as new AR antagonists. The priority should be given to them to experimentally test the binding activities with AR.

  3. Accuracy assessments and areal estimates using two-phase stratified random sampling, cluster plots, and the multivariate composite estimator

    Treesearch

    Raymond L. Czaplewski

    2000-01-01

    Consider the following example of an accuracy assessment. Landsat data are used to build a thematic map of land cover for a multicounty region. The map classifier (e.g., a supervised classification algorithm) assigns each pixel into one category of land cover. The classification system includes 12 different types of forest and land cover: black spruce, balsam fir,...

  4. Synergistic Use of WorldView-2 Imagery and Airborne LiDAR Data for Urban Land Cover Classification

    NASA Astrophysics Data System (ADS)

    Wu, M. F.; Sun, Z. C.; Yang, B.; Yu, S. S.

    2017-02-01

    There are many challenges in deriving urban land cover types from high resolution optical imagery because of the spectral similarity of different objects, mixed pixels, and the shadows of buildings and large tree crowns. To reduce these uncertainties, the classification of urban land cover from multi-source sensors has recently become a trend in urban remote sensing. In this study, a hierarchical support vector machine (SVM) classification method was applied to urban land cover mapping, using WorldView-2 imagery and airborne Light Detection and Ranging (LiDAR) data. The results showed that: (1) the overall accuracy (OA) and overall kappa (OK) were 72.92% and 0.66 for WorldView-2 imagery alone, while the OA and OK improved to 89.44% and 0.87 with the synergistic use of the two data sources. (2) Buildings and roads/parking lots extracted from the fused data were more precise and better shaped; these two classes were classified with higher producer's and user's accuracy than with WorldView-2 imagery alone. Trees were also more easily separated from grasslands when the airborne LiDAR data were added. (3) The fused data reduced the effects of the differing spectral character of complex, detailed objects, and also helped address the problem of shadows from high-rise buildings. The results from this study indicate that the synergistic use of high resolution optical imagery and airborne LiDAR data can be an efficient approach to improving the classification of urban land cover.
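
    One simple reading of the fusion step is feature-level stacking of spectral bands with a LiDAR-derived height layer before a single SVM; the sketch below, on synthetic data, computes the OA and kappa metrics reported above.

```python
# Fuse spectral bands with a LiDAR-derived height feature, train one SVM,
# and report overall accuracy (OA) and kappa (OK).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(4)
spectral = rng.normal(size=(1000, 8))   # 8 WorldView-2 bands (synthetic)
height = rng.normal(size=(1000, 1))     # normalized height from LiDAR
y = rng.integers(0, 5, size=1000)       # 5 land cover classes

X = np.hstack([spectral, height])       # simple feature-level fusion
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
pred = SVC(kernel="rbf").fit(X_tr, y_tr).predict(X_te)
print("OA:", accuracy_score(y_te, pred))
print("OK:", cohen_kappa_score(y_te, pred))
```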

  5. Link prediction boosted psychiatry disorder classification for functional connectivity network

    NASA Astrophysics Data System (ADS)

    Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang

    2017-02-01

    The functional connectivity network (FCN) is an effective tool in psychiatric disorder classification, and represents the cross-correlation of regional blood oxygenation level dependent signals. However, FCNs are often incomplete, suffering from missing and spurious edges. To accurately classify psychiatric disorders versus healthy controls with incomplete FCNs, we first 'repair' the FCN with link prediction, and then extract the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers to improve classification accuracy. Our method was tested on three psychiatric disorder datasets, covering Alzheimer's Disease, Schizophrenia and Attention Deficit Hyperactivity Disorder. The experimental results show that our method not only significantly improves classification accuracy, but also efficiently reconstructs the incomplete FCN.
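
    A hedged sketch of the feature-extraction and boosting stages follows, using per-region clustering coefficients from networkx and AdaBoost; the link-prediction repair step is omitted, and the connectivity threshold and array sizes are illustrative assumptions.

```python
# Clustering coefficients of each brain region as features for boosting.
# The link-prediction 'repair' step is omitted for brevity.
import numpy as np
import networkx as nx
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(5)

def fcn_features(n_regions=30):
    """Threshold a random symmetric 'connectivity' matrix into a graph
    and return its per-node clustering coefficients."""
    c = rng.uniform(size=(n_regions, n_regions))
    c = (c + c.T) / 2
    np.fill_diagonal(c, 0)                    # no self-loops
    g = nx.from_numpy_array((c > 0.7).astype(int))
    return np.array([nx.clustering(g, v) for v in g.nodes])

X = np.array([fcn_features() for _ in range(100)])   # 100 subjects
y = rng.integers(0, 2, size=100)                     # 1 = patient, 0 = control

clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```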

  6. Dynamic Human Body Modeling Using a Single RGB Camera.

    PubMed

    Zhu, Haiyu; Yu, Yao; Zhou, Yu; Du, Sidan

    2016-03-18

    In this paper, we present a novel automatic pipeline to build personalized parametric models of dynamic people using a single RGB camera. Compared to previous approaches that use monocular RGB images, our system can model a 3D human body automatically and incrementally, taking advantage of human motion. Based on coarse 2D and 3D poses estimated from image sequences, we first perform a kinematic classification of human body parts to refine the poses and obtain reconstructed body parts. Next, a personalized parametric human model is generated by driving a general template to fit the body parts and calculating the non-rigid deformation. Experimental results show that our shape estimation method achieves comparable accuracy with reconstructed models using depth cameras, yet requires neither user interaction nor any dedicated devices, leading to the feasibility of using this method on widely available smart phones.

  7. Dynamic Human Body Modeling Using a Single RGB Camera

    PubMed Central

    Zhu, Haiyu; Yu, Yao; Zhou, Yu; Du, Sidan

    2016-01-01

    In this paper, we present a novel automatic pipeline to build personalized parametric models of dynamic people using a single RGB camera. Compared to previous approaches that use monocular RGB images, our system can model a 3D human body automatically and incrementally, taking advantage of human motion. Based on coarse 2D and 3D poses estimated from image sequences, we first perform a kinematic classification of human body parts to refine the poses and obtain reconstructed body parts. Next, a personalized parametric human model is generated by driving a general template to fit the body parts and calculating the non-rigid deformation. Experimental results show that our shape estimation method achieves comparable accuracy with reconstructed models using depth cameras, yet requires neither user interaction nor any dedicated devices, leading to the feasibility of using this method on widely available smart phones. PMID:26999159

  8. Molecule kernels: a descriptor- and alignment-free quantitative structure-activity relationship approach.

    PubMed

    Mohr, Johannes A; Jain, Brijnesh J; Obermayer, Klaus

    2008-09-01

    Quantitative structure-activity relationship (QSAR) analysis is traditionally based on extracting a set of molecular descriptors and using them to build a predictive model. In this work, we propose a QSAR approach based directly on the similarity between the 3D structures of a set of molecules, measured by a so-called molecule kernel, which is independent of the spatial prealignment of the compounds. Predictors can be built using the molecule kernel in conjunction with the potential support vector machine (P-SVM), a recently proposed machine learning method for dyadic data. The resulting models make direct use of the structural similarities between the compounds in the test set and a subset of the training set and do not require an explicit descriptor construction. We evaluated the predictive performance of the proposed method on one classification and four regression QSAR datasets and compared its results to the results reported in the literature for several state-of-the-art descriptor-based and 3D QSAR approaches. In this comparison, the proposed molecule kernel method performed better than the other QSAR methods.
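
    The descriptor-free idea maps naturally onto scikit-learn's precomputed-kernel interface; in the sketch below a toy linear kernel over random vectors stands in for the molecule kernel over 3D structures, and SVC stands in for the P-SVM, which scikit-learn does not implement.

```python
# Kernel-based prediction without explicit descriptors: supply the
# pairwise kernel matrix directly to SVC(kernel="precomputed").
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(6)
X_train, X_test = rng.normal(size=(80, 10)), rng.normal(size=(20, 10))
y_train = rng.integers(0, 2, size=80)

def kernel(a, b):
    """Toy similarity; a molecule kernel would compare 3D structures."""
    return a @ b.T

clf = SVC(kernel="precomputed").fit(kernel(X_train, X_train), y_train)
pred = clf.predict(kernel(X_test, X_train))   # rows: test, cols: train
print(pred)
```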

  9. Modeling Liver-Related Adverse Effects of Drugs Using kNN QSAR Method

    PubMed Central

    Rodgers, Amie D.; Zhu, Hao; Fourches, Dennis; Rusyn, Ivan; Tropsha, Alexander

    2010-01-01

    Adverse effects of drugs (AEDs) continue to be a major cause of drug withdrawals both in development and post-marketing. While liver-related AEDs are a major concern for drug safety, there are few in silico models for predicting human liver toxicity for drug candidates. We have applied the Quantitative Structure Activity Relationship (QSAR) approach to model liver AEDs. In this study, we aimed to construct a QSAR model capable of binary classification (active vs. inactive) of drugs for liver AEDs based on chemical structure. To build QSAR models, we have employed an FDA spontaneous reporting database of human liver AEDs (elevations in activity of serum liver enzymes), which contains data on approximately 500 approved drugs. Approximately 200 compounds with wide clinical data coverage, structural similarity and balanced (40/60) active/inactive ratio were selected for modeling and divided into multiple training/test and external validation sets. QSAR models were developed using the k nearest neighbor method and validated using external datasets. Models with high sensitivity (>73%) and specificity (>94%) for prediction of liver AEDs in external validation sets were developed. To test applicability of the models, three chemical databases (World Drug Index, Prestwick Chemical Library, and Biowisdom Liver Intelligence Module) were screened in silico and the validity of predictions was determined, where possible, by comparing model-based classification with assertions in publicly available literature. Validated QSAR models of liver AEDs based on the data from the FDA spontaneous reporting system can be employed as sensitive and specific predictors of AEDs in pre-clinical screening of drug candidates for potential hepatotoxicity in humans. PMID:20192250

  10. Rapid sample classification using an open port sampling interface coupled with liquid introduction atmospheric pressure ionization mass spectrometry

    DOE PAGES

    Van Berkel, Gary J.; Kertesz, Vilmos

    2016-11-15

    An “Open Access”-like mass spectrometric platform to fully utilize the simplicity of the manual open port sampling interface for rapid characterization of unprocessed samples by liquid introduction atmospheric pressure ionization mass spectrometry has been lacking. The in-house developed integrated software with a simple, small and relatively low-cost mass spectrometry system introduced here fills this void. Software was developed to operate the mass spectrometer, to collect and process mass spectrometric data files, to build a database and to classify samples using such a database. These tasks were accomplished via the vendor-provided software libraries. Sample classification based on spectral comparison utilized the spectral contrast angle method. As a result, using the developed software platform, near real-time sample classification is exemplified using a series of commercially available blue ink rollerball pens and vegetable oils. In the case of the inks, full scan positive and negative ion ESI mass spectra were both used for database generation and sample classification. For the vegetable oils, full scan positive ion mode APCI mass spectra were recorded. The overall accuracy of the employed spectral contrast angle statistical model was 95.3% and 98% in the case of the inks and oils, respectively, using leave-one-out cross-validation. In conclusion, this work illustrates that an open port sampling interface/mass spectrometer combination, with appropriate instrument control and data processing software, is a viable direct liquid extraction sampling and analysis system suitable for the non-expert user and near real-time sample classification via database matching.
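
    The spectral contrast angle between two spectra is the angle between them treated as vectors, theta = arccos(s1·s2 / (|s1||s2|)); a small sketch of nearest-match classification against a database follows, with synthetic spectra and an assumed database layout.

```python
# Spectral contrast angle: classify an unknown spectrum by the smallest
# angle to a database entry.
import numpy as np

def contrast_angle(s1, s2):
    cos = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return np.arccos(np.clip(cos, -1.0, 1.0))

rng = np.random.default_rng(7)
database = {name: rng.uniform(size=200) for name in ("ink_A", "ink_B", "oil_A")}
unknown = database["ink_B"] + rng.normal(scale=0.05, size=200)  # noisy copy

best = min(database, key=lambda k: contrast_angle(unknown, database[k]))
print("closest match:", best)
```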

  11. Rapid sample classification using an open port sampling interface coupled with liquid introduction atmospheric pressure ionization mass spectrometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Van Berkel, Gary J.; Kertesz, Vilmos

    An “Open Access”-like mass spectrometric platform to fully utilize the simplicity of the manual open port sampling interface for rapid characterization of unprocessed samples by liquid introduction atmospheric pressure ionization mass spectrometry has been lacking. The in-house developed integrated software with a simple, small and relatively low-cost mass spectrometry system introduced here fills this void. Software was developed to operate the mass spectrometer, to collect and process mass spectrometric data files, to build a database and to classify samples using such a database. These tasks were accomplished via the vendor-provided software libraries. Sample classification based on spectral comparison utilized the spectral contrast angle method. As a result, using the developed software platform, near real-time sample classification is exemplified using a series of commercially available blue ink rollerball pens and vegetable oils. In the case of the inks, full scan positive and negative ion ESI mass spectra were both used for database generation and sample classification. For the vegetable oils, full scan positive ion mode APCI mass spectra were recorded. The overall accuracy of the employed spectral contrast angle statistical model was 95.3% and 98% in the case of the inks and oils, respectively, using leave-one-out cross-validation. In conclusion, this work illustrates that an open port sampling interface/mass spectrometer combination, with appropriate instrument control and data processing software, is a viable direct liquid extraction sampling and analysis system suitable for the non-expert user and near real-time sample classification via database matching.

  12. Feature selection and classification of multiparametric medical images using bagging and SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Resnick, Susan M.; Davatzikos, Christos

    2008-03-01

    This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.
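
    A minimal sketch of the bagging stage, with bootstrap-aggregated SVM base classifiers scored by ROC AUC on synthetic data; the paper's ROC-based parameter optimization over left-out bootstrap samples is not reproduced.

```python
# Bagging SVM base classifiers and scoring the ensemble by ROC AUC.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 40))     # regional features from MRI/PET (synthetic)
y = rng.integers(0, 2, size=200)   # e.g. sex classification labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ensemble = BaggingClassifier(
    estimator=SVC(probability=True), n_estimators=25, random_state=0)
ensemble.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, ensemble.predict_proba(X_te)[:, 1]))
```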

  13. Implicit Wiener series analysis of epileptic seizure recordings.

    PubMed

    Barbero, Alvaro; Franz, Matthias; van Drongelen, Wim; Dorronsoro, José R; Schölkopf, Bernhard; Grosse-Wentrup, Moritz

    2009-01-01

    Implicit Wiener series are a powerful tool to build Volterra representations of time series with any degree of non-linearity. A natural question is then whether higher order representations yield more useful models. In this work we shall study this question for ECoG data channel relationships in epileptic seizure recordings, considering whether quadratic representations yield more accurate classifiers than linear ones. To do so we first show how to derive statistical information on the Volterra coefficient distribution and how to construct seizure classification patterns over that information. As our results illustrate, a quadratic model seems to provide no advantages over a linear one. Nevertheless, we shall also show that the interpretability of the implicit Wiener series provides insights into the inter-channel relationships of the recordings.

  14. LPT. Chlorination building (TAN643) and water well pumphouse (TAN644). Plans, ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    LPT. Chlorination building (TAN-643) and water well pumphouse (TAN-644). Plans, elevations, sections, and details. Ralph M. Parsons 1229-12 ANP/GE-7-643-A-S-H&V-1. November 1956. Approved by INEEL Classification Office for public release. INEEL index code no. 038-0643/0644-00-693-107307 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  15. Alternative Missions for the Army

    DTIC Science & Technology

    1992-07-17

    ... construction, health care, transportation, and law enforcement. Because they are already located in over 5,000 communities throughout the nation, the ... various railway surveys, effectively building the nation's first railroad, and also developed the country's water resources through building or improving ...

  16. A building extraction approach for Airborne Laser Scanner data utilizing the Object Based Image Analysis paradigm

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Tiede, Dirk; Blaschke, Thomas

    2016-10-01

    In the past two decades Object-Based Image Analysis (OBIA) established itself as an efficient approach for the classification and extraction of information from remote sensing imagery and, increasingly, from non-image based sources such as Airborne Laser Scanner (ALS) point clouds. ALS data is represented in the form of a point cloud with recorded multiple returns and intensities. In our work, we combined OBIA with ALS point cloud data in order to identify and extract buildings as 2D polygons representing roof outlines in a top down mapping approach. We performed rasterization of the ALS data into a height raster for the generation of a Digital Surface Model (DSM) and a derived Digital Elevation Model (DEM). Further objects were generated in conjunction with point statistics from the linked point cloud. With the use of class modelling methods, we generated the final target class of objects representing buildings. The approach was developed for a test area in Biberach an der Riß (Germany). In order to point out the possibilities of adaptation-free transferability to another data set, the algorithm has been applied "as is" to the ISPRS Benchmarking data set of Toronto (Canada). The obtained results show high accuracies for the initial study area (thematic accuracies of around 98%, geometric accuracy of above 80%). The very high performance within the ISPRS Benchmark without any modification of the algorithm and without any adaptation of parameters is particularly noteworthy.

  17. Building a Joint-Service Classification Research Roadmap: Methodological Issues in Selection and Classification

    DTIC Science & Technology

    1994-02-01

    ... Wijting, 1976). However, missing critical job elements may lead the J-coefficient to underestimate validity (Mossholder & Arvey, 1984), and variation ... should be able to approximate the validity estimates derived empirically. Research on the J-coefficient (Dickinson & Wijting, 1976) and the SYNVAL project ... Measurement, 8, 71-82. Dickinson, T. L., & Wijting, J. P. (1976). Policy capturing as a procedure for synthetic validation. Paper presented at the meeting of the ...

  18. Hyperspectral Mapping of the Invasive Species Pepperweed and the Development of a Habitat Suitability Model

    NASA Technical Reports Server (NTRS)

    Nguyen, Andrew; Gole, Alexander; Randall, Jarom; Dlott, Glade; Zhang, Sylvia; Alfaro, Brian; Schmidt, Cindy; Skiles, J. W.

    2011-01-01

    Mapping and predicting the spatial distribution of invasive plant species is central to habitat management, but difficult to implement at landscape and regional scales. Remote sensing techniques can reduce the impact field campaigns have on these ecologically sensitive areas and can provide a regional and multi-temporal view of invasive species spread. Invasive perennial pepperweed (Lepidium latifolium) is now widespread in fragmented estuaries of the South San Francisco Bay, and has been shown to degrade native vegetation in estuaries and adjacent habitats, thereby reducing forage and shelter for wildlife. The purpose of this study is to map the present distribution of pepperweed in estuarine areas of the South San Francisco Bay Salt Pond Restoration Project (Alviso, CA), and to create a habitat suitability model to predict its future spread. Pepperweed reflectance data were collected in situ with a GER 1500 spectroradiometer, along with 88 corresponding pepperweed presence and absence points used for building the statistical models. The spectral angle mapper (SAM) classification algorithm was used to distinguish the reflectance spectrum of pepperweed and map its distribution using an image from EO-1 Hyperion. To map pepperweed, we performed a supervised classification on an ASTER image with a resulting classification accuracy of 71.8%. We generated a weighted overlay analysis model within a geographic information system (GIS) framework to predict the areas in the study site most susceptible to pepperweed colonization. Variables for the model included propensity for disturbance, status of pond restoration, proximity to water channels, and terrain curvature. A Generalized Additive Model (GAM) was also used to generate a probability map and investigate the statistical probability that each variable contributed to predicting pepperweed spread. Results from the GAM revealed that distance to channels, distance to ponds and curvature were statistically significant (p < 0.01) in determining the locations of suitable pepperweed habitats.

  19. Robust Library Building for Autonomous Classification of Downhole Geophysical Logs Using Gaussian Processes

    NASA Astrophysics Data System (ADS)

    Silversides, Katherine L.; Melkumyan, Arman

    2017-03-01

    Machine learning techniques such as Gaussian Processes can be used to identify stratigraphically important features in geophysical logs. The marker shales in the banded iron formation hosted iron ore deposits of the Hamersley Ranges, Western Australia, form distinctive signatures in the natural gamma logs. The identification of these marker shales is important for stratigraphic identification of unit boundaries for the geological modelling of the deposit. Machine learning techniques each have different unique properties that will impact the results. For Gaussian Processes (GPs), the output values are inclined towards the mean value, particularly when there is not sufficient information in the library. The impact that these inclinations have on the classification can vary depending on the parameter values selected by the user. Therefore, when applying machine learning techniques, care must be taken to fit the technique to the problem correctly. This study focuses on optimising the settings and choices for training a GPs system to identify a specific marker shale. We show that the final results converge even when different, but equally valid starting libraries are used for the training. To analyse the impact on feature identification, GP models were trained so that the output was inclined towards a positive, neutral or negative output. For this type of classification, the best results were when the pull was towards a negative output. We also show that the GP output can be adjusted by using a standard deviation coefficient that changes the balance between certainty and accuracy in the results.
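
    A hedged sketch of GP-based classification of gamma-log windows, using scikit-learn's GaussianProcessClassifier with an RBF kernel; the window length and library labels are synthetic assumptions, and the paper's library-building and output-bias analysis are not reproduced.

```python
# Gaussian Process classification of natural gamma log windows:
# 1 = contains the marker shale signature, 0 = does not.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(10)
X = rng.normal(size=(120, 25))     # 25-sample windows from gamma logs
y = rng.integers(0, 2, size=120)   # library labels (synthetic)

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0)).fit(X, y)
probs = gpc.predict_proba(rng.normal(size=(5, 25)))[:, 1]
print(probs)                        # values pulled toward 0.5 when uncertain
```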

  20. Single-Frame Terrain Mapping Software for Robotic Vehicles

    NASA Technical Reports Server (NTRS)

    Rankin, Arturo L.

    2011-01-01

    This software is a component in an unmanned ground vehicle (UGV) perception system that builds compact, single-frame terrain maps for distribution to other systems, such as a world model or an operator control unit, over a local area network (LAN). Each cell in the map encodes an elevation value, terrain classification, object classification, terrain traversability, terrain roughness, and a confidence value into four bytes of memory. The input to this software component is a range image (from a lidar or stereo vision system), and optionally a terrain classification image and an object classification image, both registered to the range image. The single-frame terrain map generates estimates of the support surface elevation, ground cover elevation, and minimum canopy elevation; generates terrain traversability cost; detects low overhangs and high-density obstacles; and can perform geometry-based terrain classification (ground, ground cover, unknown). A new origin is automatically selected for each single-frame terrain map in global coordinates such that it coincides with the corner of a world map cell. That way, single-frame terrain maps correctly line up with the world map, facilitating the merging of map data into the world map. Instead of using 32 bits to store a floating-point elevation for each map cell, the map origin elevation is set to the vehicle elevation, and each cell reports its change in elevation (from the origin elevation) as a number of discrete steps. The single-frame terrain map elevation resolution is 2 cm. At that resolution, terrain elevation from -20.5 to 20.5 m (with respect to the vehicle's elevation) is encoded into 11 bits. For each four-byte map cell, bits are assigned to encode elevation, terrain roughness, terrain classification, object classification, terrain traversability cost, and a confidence value. The vehicle's current position and orientation, the map origin, and the map cell resolution are all included in a header for each map. The map is compressed into a vector prior to delivery to another system.
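
    A hedged sketch of one possible four-byte cell packing follows: 11 bits of biased elevation steps (2 cm resolution spanning roughly -20.5 to +20.5 m), with the remaining bits split among example fields; the actual bit layout used by the software is not specified in the abstract.

```python
# Pack a terrain map cell into 4 bytes: 11-bit elevation step count,
# plus example fields for classification, roughness, cost, confidence.
# This bit layout is an assumption for illustration only.
ELEV_RES = 0.02    # 2 cm steps
ELEV_BITS = 11     # 2048 steps ~ -20.48 .. +20.46 m around the map origin

def pack_cell(dz_m, terrain_cls, obj_cls, rough, cost, conf):
    steps = int(round(dz_m / ELEV_RES)) + (1 << (ELEV_BITS - 1))  # bias to unsigned
    assert 0 <= steps < (1 << ELEV_BITS)
    word = steps                          # bits 0-10: elevation steps
    word |= (terrain_cls & 0x3) << 11     # bits 11-12: terrain class
    word |= (obj_cls & 0x7) << 13         # bits 13-15: object class
    word |= (rough & 0xF) << 16           # bits 16-19: roughness
    word |= (cost & 0xF) << 20            # bits 20-23: traversability cost
    word |= (conf & 0xFF) << 24           # bits 24-31: confidence
    return word.to_bytes(4, "little")

cell = pack_cell(dz_m=1.5, terrain_cls=1, obj_cls=2, rough=3, cost=5, conf=200)
print(cell.hex())
```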

  1. A combined M5P tree and hazard-based duration model for predicting urban freeway traffic accident durations.

    PubMed

    Lin, Lei; Wang, Qian; Sadek, Adel W

    2016-06-01

    The duration of freeway traffic accidents is an important factor affecting traffic congestion, environmental pollution, and secondary accidents. Among previous studies, the M5P algorithm has been shown to be an effective tool for predicting incident duration. M5P builds a tree-based model, like the traditional classification and regression tree (CART) method, but with multiple linear regression models as its leaves. The problem with M5P for accident duration prediction, however, is that whereas linear regression assumes that the conditional distribution of accident durations is normally distributed, the distribution of a "time-to-an-event" is almost certainly nonsymmetrical. A hazard-based duration model (HBDM) is a better choice for this kind of "time-to-event" modeling scenario, and given this, HBDMs have previously been applied to analyze and predict traffic accident durations. Previous research, however, has not applied HBDMs for accident duration prediction in association with clustering or classification of the dataset to minimize data heterogeneity. The current paper proposes a novel approach for accident duration prediction, which improves on the original M5P tree algorithm through the construction of an M5P-HBDM model, in which the leaves of the M5P tree model are HBDMs instead of linear regression models. Such a model offers the advantage of minimizing data heterogeneity through dataset classification, and avoids the incorrect assumption of normality for traffic accident durations. The proposed model was tested on two freeway accident datasets. For each dataset, the first 500 records were used to train three models: (1) an M5P tree; (2) an HBDM; and (3) the proposed M5P-HBDM; the remainder of the data were used for testing. The results show that the proposed M5P-HBDM identified more significant and meaningful variables than either M5P or HBDM alone. Moreover, the M5P-HBDM had the lowest overall mean absolute percentage error (MAPE). Copyright © 2016 Elsevier Ltd. All rights reserved.
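
    For reference, the MAPE used to compare the models is straightforward to compute; a minimal sketch with made-up durations follows.

```python
# Mean absolute percentage error (MAPE) over predicted accident durations.
import numpy as np

def mape(actual, predicted):
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

print(mape([30, 45, 120, 60], [25, 50, 100, 66]))  # minutes, illustrative
```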

  2. Filtering big data from social media--Building an early warning system for adverse drug reactions.

    PubMed

    Yang, Ming; Kiang, Melody; Shang, Wei

    2015-04-01

    Adverse drug reactions (ADRs) are believed to be a leading cause of death in the world. Pharmacovigilance systems are aimed at the early detection of ADRs. With the popularity of social media, Web forums and discussion boards have become important sources of data in which consumers share their drug use experience, and as a result may provide useful information on drugs and their adverse reactions. In this study, we propose an automated mechanism for filtering ADR-related posts using text classification methods. In real-life settings, ADR-related messages are highly dispersed in social media, while non-ADR-related messages are unspecific and topically diverse. It is expensive to manually label a large number of ADR-related messages (positive examples) and non-ADR-related messages (negative examples) to train classification systems. To mitigate this challenge, we examine the use of a partially supervised learning classification method to automate the process. We propose a novel pharmacovigilance system leveraging a Latent Dirichlet Allocation modeling module and a partially supervised classification approach. We selected drugs with more than 500 threads of discussion, and collected all the original posts and comments on these drugs using an automatic Web spidering program as the text corpus. Various classifiers were trained by varying the number of positive examples and the number of topics. The trained classifiers were applied to 3,000 posts published over 60 days. Top-ranked posts from each classifier were pooled and the resulting set of 300 posts was reviewed by a domain expert to evaluate the classifiers. Compared to alternative approaches using supervised learning methods and three general-purpose partially supervised learning methods, our approach performs significantly better in terms of precision, recall, and the F measure (the harmonic mean of precision and recall), based on a computational experiment using online discussion threads from Medhelp. Our design provides satisfactory performance in identifying ADR-related posts for post-marketing drug surveillance. The overall design of our system also points out a potentially fruitful direction for building other early warning systems that need to filter big data from social media networks. Copyright © 2015 Elsevier Inc. All rights reserved.
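
    A minimal sketch of the topic-modeling module, using scikit-learn's LatentDirichletAllocation over a bag-of-words matrix of toy posts; the partially supervised classification stage and the Medhelp corpus are not reproduced.

```python
# LDA topics over forum posts as a first stage before classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "started the drug last week and got a severe headache",
    "my doctor changed the dose and the rash went away",
    "where can I buy this medication online",
    "no side effects so far after two months",
]
X = CountVectorizer(stop_words="english").fit_transform(posts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X))   # per-post topic mixtures, usable as features
```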

  3. Structure of 311 service requests as a signature of urban location

    PubMed Central

    Wang, Lingjing; Qian, Cheng; Kats, Philipp; Kontokosta, Constantine; Sobolevsky, Stanislav

    2017-01-01

    While urban systems demonstrate high spatial heterogeneity, many urban planning, economic and political decisions heavily rely on a deep understanding of local neighborhood contexts. We show that the structure of 311 Service Requests enables one possible way of building a unique signature of the local urban context, thus being able to serve as a low-cost decision support tool for urban stakeholders. Considering examples of New York City, Boston and Chicago, we demonstrate how 311 Service Requests recorded and categorized by type in each neighborhood can be utilized to generate a meaningful classification of locations across the city, based on distinctive socioeconomic profiles. Moreover, the 311-based classification of urban neighborhoods can present sufficient information to model various socioeconomic features. Finally, we show that these characteristics are capable of predicting future trends in comparative local real estate prices. We demonstrate 311 Service Requests data can be used to monitor and predict socioeconomic performance of urban neighborhoods, allowing urban stakeholders to quantify the impacts of their interventions. PMID:29040314

  4. Structure of 311 service requests as a signature of urban location.

    PubMed

    Wang, Lingjing; Qian, Cheng; Kats, Philipp; Kontokosta, Constantine; Sobolevsky, Stanislav

    2017-01-01

    While urban systems demonstrate high spatial heterogeneity, many urban planning, economic and political decisions heavily rely on a deep understanding of local neighborhood contexts. We show that the structure of 311 Service Requests enables one possible way of building a unique signature of the local urban context, thus being able to serve as a low-cost decision support tool for urban stakeholders. Considering examples of New York City, Boston and Chicago, we demonstrate how 311 Service Requests recorded and categorized by type in each neighborhood can be utilized to generate a meaningful classification of locations across the city, based on distinctive socioeconomic profiles. Moreover, the 311-based classification of urban neighborhoods can present sufficient information to model various socioeconomic features. Finally, we show that these characteristics are capable of predicting future trends in comparative local real estate prices. We demonstrate 311 Service Requests data can be used to monitor and predict socioeconomic performance of urban neighborhoods, allowing urban stakeholders to quantify the impacts of their interventions.

  5. The coordinating evaluation and spatial correlation analysis of CSGC: A case study of Henan province, China.

    PubMed

    Xie, Mingxia; Wang, Jiayao; Chen, Ke

    2017-01-01

    This study investigates the basic characteristics and proposes a concept for the complex system of geographical conditions (CSGC). By analyzing the DPSIR model and its correlation with the index system, we selected indexes for geographical conditions according to the resources, ecology, environment, economy and society parameters to build a system. This system consists of four hierarchies: index, classification, element and target levels. We evaluated the elements or indexes of the complex system using the TOPSIS method and a general model coordinating multiple complex systems. On this basis, the coordination analysis experiment of geographical conditions is applied to cities in the Henan province in China. The following conclusions were reached: ①According to the pressure, state and impact of geographical conditions, relatively consistent measures are taken around the city, but with conflicting results. ②The coordination degree of geographical conditions is small among regions showing large differences in classification index value. The degree of coordination of such regions is prone to extreme values; however, the smaller the difference the larger the coordination degree. ③The coordinated development of geographical conditions in the Henan province is at the stage of the point axis.

  6. OPTICAL SPECTROSCOPY OF SDSS J004054.65-0915268: THREE POSSIBLE SCENARIOS FOR THE CLASSIFICATION. A z ∼ 5 BL LACERTAE, A BLUE FSRQ, OR A WEAK EMISSION LINE QUASAR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Landoni, M.; Zanutta, A.; Bianco, A.

    2016-02-15

    The hunt for high-redshift BL Lacertae objects is day by day more compelling in order to firmly understand their intrinsic nature and evolution. SDSS J004054.65-0915268 is, at the moment, one of the most distant BL Lac candidates, at z ∼ 5. We present a new optical-near-IR spectrum obtained with ALFOSC-NOT with a new, custom designed dispersive grating aimed to detect broad emission lines that could disprove this classification. In the obtained spectra, we do not detect any emission features and we provide an upper limit to the luminosity of the C iv broad emission line. Therefore, the nature of the object is then discussed, building the overall spectral energy distribution (SED) and fitting it with three different models. Our fits, based on SED modeling with different possible scenarios, cannot rule out the possibility that this source is indeed a BL Lac object, though the absence of optical variability and the lack of strong radio flux seem to suggest that the observed optical emission originates from a thermalized accretion disk.

  7. Optical Spectroscopy of SDSS J004054.65-0915268: Three Possible Scenarios for the Classification. A z ˜ 5 BL Lacertae, a Blue FSRQ, or a Weak Emission Line Quasar

    NASA Astrophysics Data System (ADS)

    Landoni, M.; Zanutta, A.; Bianco, A.; Tavecchio, F.; Bonnoli, G.; Ghisellini, G.

    2016-02-01

    The hunt for high-redshift BL Lacertae objects is day by day more compelling in order to firmly understand their intrinsic nature and evolution. SDSS J004054.65-0915268 is, at the moment, one of the most distant BL Lac candidates, at z ˜ 5. We present a new optical-near-IR spectrum obtained with ALFOSC-NOT with a new, custom designed dispersive grating aimed to detect broad emission lines that could disprove this classification. In the obtained spectra, we do not detect any emission features and we provide an upper limit to the luminosity of the C IV broad emission line. Therefore, the nature of the object is then discussed, building the overall spectral energy distribution (SED) and fitting it with three different models. Our fits, based on SED modeling with different possible scenarios, cannot rule out the possibility that this source is indeed a BL Lac object, though the absence of optical variability and the lack of strong radio flux seem to suggest that the observed optical emission originates from a thermalized accretion disk.

  8. Multi-Temporal Classification and Change Detection Using Uav Images

    NASA Astrophysics Data System (ADS)

    Makuti, S.; Nex, F.; Yang, M. Y.

    2018-05-01

    In this paper, different methodologies for the classification and change detection of UAV image blocks are explored. The UAV is not only the cheapest platform for image acquisition but also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features has been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes a Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method: in terms of overall accuracy, post-classification reached up to 62.6% while pre-classification change detection reached 46.5%. These results represent a first useful indication for future works and developments.

  9. Automated simultaneous multiple feature classification of MTI data

    NASA Astrophysics Data System (ADS)

    Harvey, Neal R.; Theiler, James P.; Balick, Lee K.; Pope, Paul A.; Szymanski, John J.; Perkins, Simon J.; Porter, Reid B.; Brumby, Steven P.; Bloch, Jeffrey J.; David, Nancy A.; Galassi, Mark C.

    2002-08-01

    Los Alamos National Laboratory has developed and demonstrated a highly capable system, GENIE, for the two-class problem of detecting a single feature against a background of non-feature. In addition to the two-class case, however, a commonly encountered remote sensing task is the segmentation of multispectral image data into a larger number of distinct feature classes or land cover types. To this end we have extended our existing system to allow the simultaneous classification of multiple features/classes from multispectral data. The technique builds on previous work and its core continues to utilize a hybrid evolutionary-algorithm-based system capable of searching for image processing pipelines optimized for specific image feature extraction tasks. We describe the improvements made to the GENIE software to allow multiple-feature classification and describe the application of this system to the automatic simultaneous classification of multiple features from MTI image data. We show the application of the multiple-feature classification technique to the problem of classifying lava flows on Mauna Loa volcano, Hawaii, using MTI image data and compare the classification results with standard supervised multiple-feature classification techniques.

  11. Catheter tracking via online learning for dynamic motion compensation in transcatheter aortic valve implantation.

    PubMed

    Wang, Peng; Zheng, Yefeng; John, Matthias; Comaniciu, Dorin

    2012-01-01

    Dynamic overlay of 3D models onto 2D X-ray images has important applications in image guided interventions. In this paper, we present a novel catheter tracking method for motion compensation in Transcatheter Aortic Valve Implantation (TAVI). To address such challenges as catheter shape and appearance changes, occlusions, and distractions from cluttered backgrounds, we present an adaptive linear discriminant learning method that builds a measurement model online to distinguish catheters from the background. An analytic solution is developed to effectively and efficiently update the discriminant model and to minimize the classification error between the tracked object and the background. The online learned discriminant model is further combined with an offline learned detector and robust template matching in a Bayesian tracking framework. Quantitative evaluations demonstrate the advantages of this method over current state-of-the-art tracking methods in tracking catheters for clinical applications.
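
    The online-update idea can be approximated with any incrementally trainable linear classifier; below is a minimal sketch using scikit-learn's SGDClassifier with partial_fit as a stand-in for the paper's adaptive linear discriminant update, on synthetic per-frame patch features.

```python
# Online linear discriminant learning, approximated by incremental SGD:
# update the catheter-vs-background model as new frames arrive.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(9)
clf = SGDClassifier(loss="log_loss")         # linear model, probabilistic output

for frame in range(20):                      # simulated frame stream
    X = rng.normal(size=(50, 64))            # patch features in this frame
    y = rng.integers(0, 2, size=50)          # 1 = catheter, 0 = background
    clf.partial_fit(X, y, classes=[0, 1])    # online update per frame

print(clf.predict(rng.normal(size=(3, 64))))
```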

  12. Developing interpretable models with optimized set reduction for identifying high risk software components

    NASA Technical Reports Server (NTRS)

    Briand, Lionel C.; Basili, Victor R.; Hetmanski, Christopher J.

    1993-01-01

    Applying equal testing and verification effort to all parts of a software system is not very efficient, especially when resources are limited and scheduling is tight. Therefore, one needs to be able to differentiate low/high fault frequency components so that testing/verification effort can be concentrated where needed. Such a strategy is expected to detect more faults and thus improve the resulting reliability of the overall system. This paper presents the Optimized Set Reduction approach for constructing such models, intended to fulfill specific software engineering needs. Our approach to classification is to measure the software system and build multivariate stochastic models for predicting high risk system components. We present experimental results obtained by classifying Ada components into two classes: is or is not likely to generate faults during system and acceptance test. Also, we evaluate the accuracy of the model and the insights it provides into the error making process.

  13. Exploiting salient semantic analysis for information retrieval

    NASA Astrophysics Data System (ADS)

    Luo, Jing; Meng, Bo; Quan, Changqin; Tu, Xinhui

    2016-11-01

    Recently, many Wikipedia-based methods have been proposed to improve the performance of different natural language processing (NLP) tasks, such as semantic relatedness computation, text classification and information retrieval. Among these methods, salient semantic analysis (SSA) has been proven to be an effective way to generate conceptual representations for words or documents. However, its feasibility and effectiveness in information retrieval is mostly unknown. In this paper, we study how to use SSA efficiently to improve information retrieval performance, and propose an SSA-based retrieval method under the language model framework. First, the SSA model is adopted to build conceptual representations for documents and queries. Then, these conceptual representations and the bag-of-words (BOW) representations are used in combination to estimate the language models of queries and documents. Experimental results on several standard Text REtrieval Conference (TREC) collections show that the proposed models consistently outperform the existing Wikipedia-based retrieval methods.

  14. Evolution of initial discontinuities in the Riemann problem for the Kaup-Boussinesq equation with positive dispersion

    NASA Astrophysics Data System (ADS)

    Congy, T.; Ivanov, S. K.; Kamchatnov, A. M.; Pavloff, N.

    2017-08-01

    We consider the space-time evolution of initial discontinuities of depth and flow velocity for an integrable version of the shallow water Boussinesq system introduced by Kaup. We focus on a specific version of this "Kaup-Boussinesq model" for which a flat water surface is modulationally stable; we refer to it below as the "positive dispersion" model. This model also appears as an approximation to the equations governing the dynamics of polarisation waves in two-component Bose-Einstein condensates. We describe its periodic solutions and the corresponding Whitham modulation equations. The self-similar, one-phase wave structures are composed of different building blocks, which are studied in detail. This makes it possible to establish a classification of all the possible wave configurations evolving from initial discontinuities. The analytic results are confirmed by numerical simulations.

  15. Evolution of initial discontinuities in the Riemann problem for the Kaup-Boussinesq equation with positive dispersion.

    PubMed

    Congy, T; Ivanov, S K; Kamchatnov, A M; Pavloff, N

    2017-08-01

    We consider the space-time evolution of initial discontinuities of depth and flow velocity for an integrable version of the shallow water Boussinesq system introduced by Kaup. We focus on a specific version of this "Kaup-Boussinesq model" for which a flat water surface is modulationally stable; we refer to it below as the "positive dispersion" model. This model also appears as an approximation to the equations governing the dynamics of polarisation waves in two-component Bose-Einstein condensates. We describe its periodic solutions and the corresponding Whitham modulation equations. The self-similar, one-phase wave structures are composed of different building blocks, which are studied in detail. This makes it possible to establish a classification of all the possible wave configurations evolving from initial discontinuities. The analytic results are confirmed by numerical simulations.

  16. Modernism in Belgrade: Classification of Modernist Housing Buildings 1919-1980

    NASA Astrophysics Data System (ADS)

    Dragutinovic, Anica; Pottgiesser, Uta; De Vos, Els; Melenhorst, Michel

    2017-10-01

    Yugoslavian Modernist Architecture, although part of a larger cultural phenomenon, has received hardly any international attention; there are only a few internationally published studies about it. Nevertheless, the Modernist Architecture of Inter-war Yugoslavia (the Kingdom of Yugoslavia), and especially the Modernist Architecture of Post-war Yugoslavia (the Socialist Federal Republic of Yugoslavia under the “reign” of Tito), represents the most important architectural heritage of the 20th century in the former Yugoslavian countries. Belgrade, as the capital city of both newly founded Yugoslavia(s), experienced immediate economic, political and cultural expansion after both wars, as well as a large population increase. The construction of sufficient and appropriate new housing was a major undertaking in both periods (1919-1940 and 1948-1980), though conceived and realized with deeply diverging views. The transition from villas and modest apartment buildings, the main housing typologies of the Inter-war period, to the mass housing of the Post-war period was a result not only of the different socio-political contexts of the two Yugoslavia(s), but also of the country's industrialization, modernization and technological development. Through the classification of Modernist housing buildings in Belgrade, this paper investigates the relations between the transformations of the main housing typologies executed under different socio-political contexts on the one hand, and the development of building technologies, construction systems and materials applied to those buildings on the other. The paper aims to shed light on Yugoslavian Modernist Architecture in order to increase international awareness of its architectural and heritage values. The goal is an integrated re-evaluation of the buildings, a presentation of their current condition, and their potential for future (re)use, with a specific focus on building envelopes and construction.

  17. On the evaluation of the fidelity of supervised classifiers in the prediction of chimeric RNAs.

    PubMed

    Beaumeunier, Sacha; Audoux, Jérôme; Boureux, Anthony; Ruffle, Florence; Commes, Thérèse; Philippe, Nicolas; Alves, Ronnie

    2016-01-01

    High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed specifically in diseases can be used as potential biomarkers in both diagnosis and prognosis. The task of discriminating true chRNAs from false ones poses an interesting machine learning (ML) challenge. First, the sequencing data may contain false reads due to technical artifacts, and during the analysis process bioinformatics tools may generate false positives due to methodological biases. Moreover, even if we succeed in obtaining a proper set of observations (enough sequencing data) about true chRNAs, chances are that the devised model will not generalize beyond it. Like any other machine learning problem, the first big issue is finding good data to build models from. To the best of our knowledge, there is no common benchmark data available for chRNA detection, and the definition of a classification baseline is lacking in the related literature as well. In this work we move towards benchmark data and an evaluation of the fidelity of supervised classifiers in the prediction of chRNAs. We propose a modeling strategy, based on a simulated data generator that permits the continuous integration of new complex chimeric events, which can be used to increase tool performance in the context of chRNA classification. The pipeline incorporates a genome mutation process and simulates RNA-seq data. The reads, at distinct depths, were aligned and analysed by CRAC, which integrates genomic location and local coverage, allowing biological predictions at the read scale. Additionally, these reads were functionally annotated and aggregated to form chRNA events, making it possible to evaluate the performance of ML methods (classifiers) at both the read and event levels. Ensemble learning strategies proved more robust for this classification problem, providing an average AUC performance of 95 % (ACC = 94 %, Kappa = 0.87). The resulting classification models were also tested on real RNA-seq data from a set of twenty-seven patients with acute myeloid leukemia (AML).
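
    As a hedged illustration of the evaluation above (AUC, accuracy and Cohen's kappa for an ensemble classifier), the following sketch computes those metrics under cross-validation with scikit-learn; the synthetic features, class balance and random-forest choice are assumptions for illustration, not the paper's pipeline.

```python
# Sketch: evaluating an ensemble classifier with the metrics reported above
# (AUC, ACC, Cohen's kappa). Synthetic features stand in for read-level
# chRNA features; this illustrates the metric computation only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score, accuracy_score, cohen_kappa_score

X, y = make_classification(n_samples=2000, n_features=30, weights=[0.7, 0.3],
                           random_state=0)  # stand-in for chRNA features

clf = RandomForestClassifier(n_estimators=300, random_state=0)
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
pred = (proba >= 0.5).astype(int)

print(f"AUC   = {roc_auc_score(y, proba):.3f}")
print(f"ACC   = {accuracy_score(y, pred):.3f}")
print(f"Kappa = {cohen_kappa_score(y, pred):.3f}")
```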

  18. A comparative study of deep learning models for medical image classification

    NASA Astrophysics Data System (ADS)

    Dutta, Suvajit; Manideep, B. C. S.; Rai, Shalva; Vijayarajan, V.

    2017-11-01

    Deep learning (DL) techniques are overtaking traditional neural network approaches where huge datasets and applications requiring complex functions demand increased accuracy at lower time complexity. Neuroscience has already exploited DL techniques, portraying itself as an inspirational source for researchers exploring the domain of machine learning. DL work covers the areas of vision, speech recognition, motion planning and NLP, moving back and forth among fields, and concerns building models that can successfully solve a variety of tasks requiring intelligence and distributed representation. Access to faster CPUs, the introduction of GPUs performing complex vector and matrix computations, agile network connectivity, and enhanced software infrastructures for distributed computing have all strengthened the case for DL methodologies. This paper compares DL procedures with traditional approaches, performed manually, for classifying medical images. The medical images used for the study are Diabetic Retinopathy (DR) and computed tomography (CT) emphysema data; diagnosis of both DR and CT data is a difficult task for standard image classification methods. The initial work was carried out with basic image processing along with K-means clustering for the identification of image severity levels. After determining image severity levels, an ANN was applied to the data to obtain a baseline classification result, which was then compared with the results of DNNs (deep neural networks). DNNs performed efficiently because their multiple hidden layers increase accuracy, but the vanishing-gradient problem in DNNs motivated considering convolutional neural networks (CNNs) as well. The CNNs were found to provide better outcomes than the other learning models aimed at the classification of images; CNNs are favoured as they provide better visual processing models, successfully classifying noisy data as well. The work centres on the detection of Diabetic Retinopathy (loss of vision) and the recognition of CT emphysema data, measuring the severity levels in both cases. The paper explores how various machine learning algorithms can be implemented following a supervised approach, so as to obtain accurate results with the least complexity possible.

  19. Application of a neural network for reflectance spectrum classification

    NASA Astrophysics Data System (ADS)

    Yang, Gefei; Gartley, Michael

    2017-05-01

    Traditional reflectance spectrum classification algorithms are based on comparing spectra across the electromagnetic spectrum, anywhere from the ultraviolet to the thermal infrared regions, and analyze reflectance on a pixel-by-pixel basis. Inspired by the high performance that convolutional neural networks (CNNs) have demonstrated in image classification, we applied a neural network to analyze directional reflectance pattern images. By using bidirectional reflectance distribution function (BRDF) data, we can reformulate the four-dimensional reflectance data into two-dimensional images with channels, namely incident direction × reflected direction × channels. Meanwhile, RIT's micro-DIRSIG model is utilized to simulate additional training samples to improve the robustness of the neural network training. Unlike traditional classification using hand-designed feature extraction with a trainable classifier, neural networks stack several layers to learn a feature hierarchy from pixels to classifier, and all layers are trained jointly. Hence, our approach of utilizing angular features differs from traditional methods utilizing spatial features. Although training typically has a large computational cost, simple classifiers work well when subsequently applied to neural-network-generated features. Currently, the most popular neural networks, such as VGG, GoogLeNet and AlexNet, are trained on RGB spatial image data. Our approach aims to build a neural network based on directional reflectance spectra, offering an understanding from another perspective. At the end of this paper, we compare several classifiers and analyze the trade-offs among neural network parameters.

  20. Latent feature representation with stacked auto-encoder for AD/MCI diagnosis

    PubMed Central

    Lee, Seong-Whan

    2014-01-01

    Recently, there has been great interest in computer-aided diagnosis of Alzheimer’s disease (AD) and its prodromal stage, mild cognitive impairment (MCI). Unlike previous methods that considered simple low-level features such as gray matter tissue volumes from MRI and mean signal intensities from PET, in this paper we propose a deep learning-based latent feature representation with a stacked auto-encoder (SAE). We believe that there exist latent, non-linear, complicated patterns inherent in the low-level features, such as relations among features. Combining the latent information with the original features helps build a robust model for AD/MCI classification with high diagnostic accuracy. Furthermore, thanks to the unsupervised nature of pre-training in deep learning, we can benefit from target-unrelated samples to initialize the parameters of the SAE, thus finding optimal parameters in fine-tuning with the target-related samples and further enhancing the classification performance across four binary classification problems: AD vs. healthy normal control (HC), MCI vs. HC, AD vs. MCI, and MCI converter (MCI-C) vs. MCI non-converter (MCI-NC). In our experiments on the ADNI dataset, we validated the effectiveness of the proposed method, showing accuracies of 98.8, 90.7, 83.7, and 83.3 % for AD/HC, MCI/HC, AD/MCI, and MCI-C/MCI-NC classification, respectively. We believe that deep learning can shed new light on neuroimaging data analysis, and our work demonstrates the applicability of this method to brain disease diagnosis. PMID:24363140
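
    A minimal sketch of the two-stage idea described above, assuming a toy setup: an auto-encoder is pre-trained on unlabeled (target-unrelated) samples, and its encoder is then fine-tuned with a classifier head on labeled (target-related) samples. The layer sizes, optimizer settings and random tensors are illustrative assumptions, not the paper's SAE configuration.

```python
# Sketch: unsupervised pre-training of an auto-encoder followed by
# supervised fine-tuning of the encoder plus a classifier head.
import torch
import torch.nn as nn

n_features, n_hidden, n_classes = 93, 32, 2
X_unlab = torch.randn(500, n_features)      # target-unrelated samples
X_lab = torch.randn(100, n_features)        # target-related samples
y_lab = torch.randint(0, n_classes, (100,))

encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.Sigmoid())
decoder = nn.Linear(n_hidden, n_features)

# 1) Pre-train encoder/decoder to reconstruct the unlabeled data.
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X_unlab)), X_unlab)
    loss.backward()
    opt.step()

# 2) Fine-tune encoder + classifier head on the labeled target data.
head = nn.Linear(n_hidden, n_classes)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()),
                       lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(head(encoder(X_lab)), y_lab)
    loss.backward()
    opt.step()
```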

  1. Improving plant functional groups for dynamic models of biodiversity: at the crossroads between functional and community ecology

    PubMed Central

    Isabelle, Boulangeat; Pauline, Philippe; Sylvain, Abdulhak; Roland, Douzet; Luc, Garraud; Sébastien, Lavergne; Sandra, Lavorel; Jérémie, Van Es; Pascal, Vittoz; Wilfried, Thuiller

    2013-01-01

    The pace of on-going climate change calls for reliable plant biodiversity scenarios. Traditional dynamic vegetation models use plant functional types that are summarized to such an extent that they become meaningless for biodiversity scenarios. Hybrid dynamic vegetation models of intermediate complexity (hybrid-DVMs) have recently been developed to address this issue. These models, at the crossroads between phenomenological and process-based models, are able to involve an intermediate number of well-chosen plant functional groups (PFGs). The challenge is to build meaningful PFGs that are representative of plant biodiversity, and consistent with the parameters and processes of hybrid-DVMs. Here, we propose and test a framework based on a few selected traits to define a limited number of PFGs, which are both representative of the diversity (functional and taxonomic) of the flora in the Ecrins National Park, and adapted to hybrid-DVMs. This new classification scheme, together with recent advances in vegetation modeling, constitutes a step forward for mechanistic biodiversity modeling. PMID:24403847

  2. Effect of Clustering Algorithm on Establishing Markov State Model for Molecular Dynamics Simulations.

    PubMed

    Li, Yan; Dong, Zigang

    2016-06-27

    Recently, the Markov state model has been applied for kinetic analysis of molecular dynamics simulations. However, discretization of the conformational space remains a primary challenge in model building, and it is not clear how the space decomposition by distinct clustering strategies exerts influence on the model output. In this work, different clustering algorithms are employed to partition the conformational space sampled in opening and closing of fatty acid binding protein 4 as well as inactivation and activation of the epidermal growth factor receptor. Various classifications are achieved, and Markov models are set up accordingly. On the basis of the models, the total net flux and transition rate are calculated between two distinct states. Our results indicate that geometric and kinetic clustering perform equally well. The construction and outcome of Markov models are heavily dependent on the data traits. Compared to other methods, a combination of Bayesian and hierarchical clustering is feasible in identification of metastable states.
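
    To make the clustering-to-MSM step concrete, here is a hedged sketch: a toy one-dimensional trajectory is discretized with k-means, and a row-normalized transition matrix is estimated at a fixed lag time. The cluster count, lag and random-walk data are assumptions for illustration only.

```python
# Sketch: how a clustering choice feeds into a Markov state model.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=10000)).reshape(-1, 1)  # toy coordinate

n_states, lag = 4, 10
labels = KMeans(n_clusters=n_states, n_init=10, random_state=0).fit_predict(traj)

# Count transitions i -> j separated by `lag` steps, then row-normalize.
C = np.zeros((n_states, n_states))
for i, j in zip(labels[:-lag], labels[lag:]):
    C[i, j] += 1
T = C / C.sum(axis=1, keepdims=True)
print(np.round(T, 3))  # estimated transition matrix
```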

  3. Multi-temporal and multi-source remote sensing image classification by nonlinear relative normalization

    NASA Astrophysics Data System (ADS)

    Tuia, Devis; Marcos, Diego; Camps-Valls, Gustau

    2016-10-01

    Remote sensing image classification exploiting multiple sensors is a very challenging problem: data from different modalities are affected by spectral distortions and mis-alignments of all kinds, which hampers re-using models built for one image in other scenes. In order to adapt and transfer models across image acquisitions, one must be able to cope with datasets that are not co-registered, acquired under different illumination and atmospheric conditions, by different sensors, and with scarce ground references. Traditionally, methods based on histogram matching have been used. However, they fail when densities have very different shapes or when there is no corresponding band to be matched between the images. An alternative builds upon manifold alignment, which performs a multidimensional relative normalization of the data prior to product generation and can cope with data of different dimensionality (e.g. different numbers of bands) and possibly unpaired examples. Aligning data distributions is an appealing strategy, since it provides data spaces that are more similar to each other, regardless of the subsequent use of the transformed data. In this paper, we study a methodology that aligns data from different domains in a nonlinear way through kernelization. We introduce the Kernel Manifold Alignment (KEMA) method, which provides a flexible and discriminative projection map, exploits only a few labeled samples (or semantic ties) in each domain, and reduces to solving a generalized eigenvalue problem. We successfully test KEMA in multi-temporal and multi-source very high resolution classification tasks, as well as on the task of making a model invariant to shadowing for hyperspectral imaging.
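
    The abstract notes that KEMA reduces to solving a generalized eigenvalue problem A v = λ B v. The sketch below shows only that algebraic step with SciPy; the random symmetric matrices are placeholders, not the actual KEMA graph matrices.

```python
# Sketch: the generalized eigenvalue problem A v = lambda B v that methods
# like KEMA reduce to. Placeholder SPD matrices stand in for the real ones.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 20))
A = M @ M.T + 1e-3 * np.eye(20)   # symmetric positive definite placeholder
N = rng.normal(size=(20, 20))
B = N @ N.T + 1e-3 * np.eye(20)

eigvals, eigvecs = eigh(A, B)      # solves A v = lambda B v
proj = eigvecs[:, :3]              # eigenvectors define a projection map
print(eigvals[:3])
```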

  4. Machine-learning in grading of gliomas based on multi-parametric magnetic resonance imaging at 3T.

    PubMed

    Citak-Er, Fusun; Firat, Zeynep; Kovanlikaya, Ilhami; Ture, Ugur; Ozturk-Isik, Esin

    2018-06-15

    The objective of this study was to assess the contribution of multi-parametric (mp) magnetic resonance imaging (MRI) quantitative features in the machine learning-based grading of gliomas with a multi-region-of-interests approach. Forty-three patients who were newly diagnosed as having a glioma were included in this study. The patients were scanned prior to any therapy using a standard brain tumor magnetic resonance (MR) imaging protocol that included T1 and T2-weighted, diffusion-weighted, diffusion tensor, MR perfusion and MR spectroscopic imaging. Three different regions-of-interest were drawn for each subject to encompass tumor, immediate tumor periphery, and distant peritumoral edema/normal. The normalized mp-MRI features were used to build machine-learning models for differentiating low-grade gliomas (WHO grades I and II) from high grades (WHO grades III and IV). In order to assess the contribution of regional mp-MRI quantitative features to the classification models, a support vector machine-based recursive feature elimination method was applied prior to classification. A machine-learning model based on support vector machine algorithm with linear kernel achieved an accuracy of 93.0%, a specificity of 86.7%, and a sensitivity of 96.4% for the grading of gliomas using ten-fold cross validation based on the proposed subset of the mp-MRI features. In this study, machine-learning based on multiregional and multi-parametric MRI data has proven to be an important tool in grading glial tumors accurately even in this limited patient population. Future studies are needed to investigate the use of machine learning algorithms for brain tumor classification in a larger patient cohort. Copyright © 2018. Published by Elsevier Ltd.
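
    A sketch of the classification pipeline described above, under the assumption that scikit-learn stands in for the authors' tooling: SVM-based recursive feature elimination followed by a linear-kernel SVM evaluated with ten-fold cross validation. The synthetic data replace the normalized mp-MRI features.

```python
# Sketch: SVM-RFE feature subset selection + linear SVM with 10-fold CV.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=43, n_features=60, n_informative=8,
                           random_state=0)  # stand-in for mp-MRI features

pipe = make_pipeline(
    RFE(SVC(kernel="linear"), n_features_to_select=10),  # SVM-RFE subset
    SVC(kernel="linear"),
)
scores = cross_val_score(pipe, X, y, cv=10)
print(f"mean accuracy: {scores.mean():.3f}")
```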

  5. The Comparative Experimental Study of Multilabel Classification for Diagnosis Assistant Based on Chinese Obstetric EMRs

    PubMed Central

    Zhang, Kunli; Zhao, Yueshu; Zan, Hongying; Zhuang, Lei

    2018-01-01

    Obstetric electronic medical records (EMRs) contain massive amounts of medical data and health information. The information extraction and diagnosis assistants of obstetric EMRs are of great significance in improving the fertility level of the population. The admitting diagnosis in the first course record of the EMR is reasoned from various sources, such as chief complaints, auxiliary examinations, and physical examinations. This paper treats the diagnosis assistant as a multilabel classification task based on the analyses of obstetric EMRs. The latent Dirichlet allocation (LDA) topic and the word vector are used as features and the four multilabel classification methods, BP-MLL (backpropagation multilabel learning), RAkEL (RAndom k labELsets), MLkNN (multilabel k-nearest neighbor), and CC (chain classifier), are utilized to build the diagnosis assistant models. Experimental results conducted on real cases show that the BP-MLL achieves the best performance with an average precision up to 0.7413 ± 0.0100 when the number of label sets and the word dimensions are 71 and 100, respectively. The result of the diagnosis assistant can be introduced as a supplementary learning method for medical students. Additionally, the method can be used not only for obstetric EMRs but also for other medical records. PMID:29666671
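
    Of the four multilabel methods named above, the chain classifier (CC) is straightforward to sketch with scikit-learn. The synthetic multilabel data stand in for the LDA-topic and word-vector features, and logistic regression is an assumed base learner.

```python
# Sketch: a classifier chain (CC) for multilabel diagnosis-style data.
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import label_ranking_average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain

X, Y = make_multilabel_classification(n_samples=1000, n_features=100,
                                      n_classes=10, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

chain = ClassifierChain(LogisticRegression(max_iter=1000), random_state=0)
chain.fit(X_tr, Y_tr)
scores = chain.predict_proba(X_te)

# Average precision over ranked label sets, akin to the metric reported above.
print(label_ranking_average_precision_score(Y_te, scores))
```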

  6. Land cover and land use mapping of the iSimangaliso Wetland Park, South Africa: comparison of oblique and orthogonal random forest algorithms

    NASA Astrophysics Data System (ADS)

    Bassa, Zaakirah; Bob, Urmilla; Szantoi, Zoltan; Ismail, Riyad

    2016-01-01

    In recent years, the popularity of tree-based ensemble methods for land cover classification has increased significantly. Using WorldView-2 image data, we evaluate the potential of the oblique random forest algorithm (oRF) to classify a highly heterogeneous protected area. In contrast to the random forest (RF) algorithm, the oRF algorithm builds multivariate trees by learning the optimal split using a supervised model. The oRF binary algorithm is adapted to a multiclass land cover and land use application using both the "one-against-one" and "one-against-all" combination approaches. Results show that the oRF algorithms are capable of achieving high classification accuracies (>80%). However, there was no statistical difference in classification accuracies obtained by the oRF algorithms and the more popular RF algorithm. For all the algorithms, user accuracies (UAs) and producer accuracies (PAs) >80% were recorded for most of the classes. Both the RF and oRF algorithms poorly classified the indigenous forest class as indicated by the low UAs and PAs. Finally, the results from this study advocate and support the utility of the oRF algorithm for land cover and land use mapping of protected areas using WorldView-2 image data.

  7. Quality-of-care research in mental health: responding to the challenge.

    PubMed

    McGlynn, E A; Norquist, G S; Wells, K B; Sullivan, G; Liberman, R P

    1988-01-01

    Quality-of-care research in mental health is in the developmental stages, which affords an opportunity to take an integrative approach, building on principles from efficacy, effectiveness, quality assessment, and quality assurance research. We propose an analytic strategy for designing research on the quality of mental health services using an adaptation of the structure, process, and outcome classification scheme. As a concrete illustration of our approach, we discuss research on a particular target population-patients with chronic schizophrenia. Future research should focus on developing models of treatment, establishing criteria and standards for outcomes and processes, and gathering data on community practices.

  8. Rapid sample classification using an open port sampling interface coupled with liquid introduction atmospheric pressure ionization mass spectrometry.

    PubMed

    Van Berkel, Gary J; Kertesz, Vilmos

    2017-02-15

    An "Open Access"-like mass spectrometric platform to fully utilize the simplicity of the manual open port sampling interface for rapid characterization of unprocessed samples by liquid introduction atmospheric pressure ionization mass spectrometry has been lacking. The in-house developed integrated software with a simple, small and relatively low-cost mass spectrometry system introduced here fills this void. Software was developed to operate the mass spectrometer, to collect and process mass spectrometric data files, to build a database and to classify samples using such a database. These tasks were accomplished via the vendor-provided software libraries. Sample classification based on spectral comparison utilized the spectral contrast angle method. Using the developed software platform, near real-time sample classification is exemplified using a series of commercially available blue ink rollerball pens and vegetable oils. In the case of the inks, full scan positive and negative ion ESI mass spectra were both used for database generation and sample classification. For the vegetable oils, full scan positive ion mode APCI mass spectra were recorded. The overall accuracy of the employed spectral contrast angle statistical model was 95.3% and 98% in the case of the inks and oils, respectively, using leave-one-out cross-validation. This work illustrates that an open port sampling interface/mass spectrometer combination, with appropriate instrument control and data processing software, is a viable direct liquid extraction sampling and analysis system suitable for the non-expert user and near real-time sample classification via database matching. Published in 2016. This article is a U.S. Government work and is in the public domain in the USA.
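
    The spectral contrast angle used for database matching has a compact form: treating two spectra as intensity vectors on a common m/z grid, it is the angle between them, and a smaller angle means a closer match. A minimal sketch, with illustrative intensity vectors:

```python
# Sketch: spectral contrast angle between two spectra on a shared m/z grid.
import numpy as np

def spectral_contrast_angle(a: np.ndarray, b: np.ndarray) -> float:
    """Angle (radians) between two intensity vectors; 0 = identical shape."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

ref = np.array([100.0, 40.0, 5.0, 0.0, 12.0])      # library spectrum
unknown = np.array([95.0, 42.0, 7.0, 1.0, 10.0])   # sample spectrum

theta = spectral_contrast_angle(ref, unknown)
print(f"contrast angle = {theta:.4f} rad")
```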

  9. A new adaptive L1-norm for optimal descriptor selection of high-dimensional QSAR classification model for anti-hepatitis C virus activity of thiourea derivatives.

    PubMed

    Algamal, Z Y; Lee, M H

    2017-01-01

    A high-dimensional quantitative structure-activity relationship (QSAR) classification model typically contains a large number of irrelevant and redundant descriptors. In this paper, a new descriptor-selection design for QSAR classification model estimation is proposed by adding a new weight inside the L1-norm. The experimental results of classifying the anti-hepatitis C virus activity of thiourea derivatives demonstrate that the proposed descriptor selection method performs effectively and competitively compared with other existing penalized methods in terms of classification performance on both the training and the testing datasets. Moreover, it is noteworthy that the results obtained from the stability test and the applicability domain provide a robust QSAR classification model. It is evident from the results that the developed QSAR classification model could conceivably be employed for further high-dimensional QSAR classification studies.

  10. Estimation of vulnerability functions based on a global earthquake damage database

    NASA Astrophysics Data System (ADS)

    Spence, R. J. S.; Coburn, A. W.; Ruffle, S. J.

    2009-04-01

    Developing a better approach to the estimation of future earthquake losses, and in particular to the understanding of the inherent uncertainties in loss models, is vital to confidence in modelling potential losses for insurance or mitigation. For most areas of the world there is currently insufficient knowledge of the building stock for vulnerability estimates to be based on calculations of structural performance. In such areas, the most reliable basis for estimating vulnerability is the performance of the building stock in past earthquakes, using damage databases and comparison with consistent estimates of ground motion. This paper presents a new approach to the estimation of vulnerabilities using the recently launched Cambridge University Damage Database (CUEDD). CUEDD is based on data assembled by the Martin Centre at Cambridge University since 1980, complemented by other more recently published and some unpublished data. It assembles, in a single, organised, expandable and web-accessible database, summary information on worldwide post-earthquake building damage surveys carried out since the 1960s. Currently it contains data on the performance of more than 750,000 individual buildings, in 200 surveys following 40 separate earthquakes. The database includes building typologies, damage levels and the location of each survey. It is mounted on a GIS mapping system and links to the USGS Shakemaps of each earthquake, which enables the macroseismic intensity and other ground motion parameters to be defined for each survey and location. Fields of data for each building damage survey include: basic earthquake data and its sources; details of the survey location, intensity and other ground motion observations or assignments at that location; building and damage level classification, and tabulated damage survey results; and photos showing typical examples of damage. In future planned extensions of the database, information on human casualties will also be assembled. The database also contains analytical tools enabling data from similar locations, building classes or ground motion levels to be assembled, and thus vulnerability relationships to be derived for any chosen ground motion parameter, for a given class of building, and for particular countries or regions. The paper presents examples of vulnerability relationships for particular classes of buildings and regions of the world, together with the estimated uncertainty ranges, and discusses the applicability of such vulnerability functions in earthquake loss assessment for insurance purposes or for earthquake risk mitigation.

  11. Modeling occupancy distribution in large spaces with multi-feature classification algorithm

    DOE PAGES

    Wang, Wei; Chen, Jiayu; Hong, Tianzhen

    2018-04-07

    Occupancy information enables robust and flexible control of heating, ventilation, and air-conditioning (HVAC) systems in buildings. In large spaces, multiple HVAC terminals are typically installed to provide cooperative services for different thermal zones, and the occupancy information determines the cooperation among terminals. However, a person count at room level does not adequately optimize HVAC system operation, because the movement of occupants within the room creates an uneven load distribution. Without accurate knowledge of the occupants’ spatial distribution, the uneven distribution of occupants often results in under-cooling/heating or over-cooling/heating in some thermal zones. Therefore, the lack of high-resolution occupancy distribution is often perceived as a bottleneck for future improvements to HVAC operation efficiency. To fill this gap, this study proposes a multi-feature k-Nearest-Neighbors (k-NN) classification algorithm to extract occupancy distribution through reliable, low-cost Bluetooth Low Energy (BLE) networks. An on-site experiment was conducted in a typical office of an institutional building to demonstrate the proposed methods, and the experiment outcomes of three case studies were examined to validate detection accuracy. City Block Distance (CBD) was used to measure the distance between the detected occupancy distribution and ground truth and to assess the results. Finally, the results show the accuracy when CBD = 1 is over 71.4% and the accuracy when CBD = 2 can reach up to 92.9%.
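
    A hedged sketch of the two ingredients above: a multi-feature k-NN classifier assigning observations to zones, and a City Block Distance (CBD) comparison between the detected zone-occupancy distribution and ground truth. The feature dimensions, zone count and random data are assumptions for illustration.

```python
# Sketch: multi-feature k-NN zone classification + CBD assessment.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_zones = 4
X_train = rng.normal(size=(400, 6))           # e.g. BLE-RSSI-derived features
y_train = rng.integers(0, n_zones, size=400)  # zone label per observation
X_test = rng.normal(size=(100, 6))
y_true = rng.integers(0, n_zones, size=100)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Occupant counts per zone, detected vs. ground truth, compared with CBD.
detected = np.bincount(y_pred, minlength=n_zones)
truth = np.bincount(y_true, minlength=n_zones)
cbd = np.abs(detected - truth).sum()
print(f"CBD between detected and true distributions: {cbd}")
```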

  12. Modeling occupancy distribution in large spaces with multi-feature classification algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Wei; Chen, Jiayu; Hong, Tianzhen

    Occupancy information enables robust and flexible control of heating, ventilation, and air-conditioning (HVAC) systems in buildings. In large spaces, multiple HVAC terminals are typically installed to provide cooperative services for different thermal zones, and the occupancy information determines the cooperation among terminals. However, a person count at room level does not adequately optimize HVAC system operation, because the movement of occupants within the room creates an uneven load distribution. Without accurate knowledge of the occupants’ spatial distribution, the uneven distribution of occupants often results in under-cooling/heating or over-cooling/heating in some thermal zones. Therefore, the lack of high-resolution occupancy distribution is often perceived as a bottleneck for future improvements to HVAC operation efficiency. To fill this gap, this study proposes a multi-feature k-Nearest-Neighbors (k-NN) classification algorithm to extract occupancy distribution through reliable, low-cost Bluetooth Low Energy (BLE) networks. An on-site experiment was conducted in a typical office of an institutional building to demonstrate the proposed methods, and the experiment outcomes of three case studies were examined to validate detection accuracy. City Block Distance (CBD) was used to measure the distance between the detected occupancy distribution and ground truth and to assess the results. Finally, the results show the accuracy when CBD = 1 is over 71.4% and the accuracy when CBD = 2 can reach up to 92.9%.

  13. In vivo classification of human skin burns using machine learning and quantitative features captured by optical coherence tomography

    NASA Astrophysics Data System (ADS)

    Singla, Neeru; Srivastava, Vishal; Singh Mehta, Dalip

    2018-02-01

    We report the first fully automated detection of human skin burn injuries in vivo, with the goal of automatic surgical margin assessment based on optical coherence tomography (OCT) images. Our proposed automated procedure entails building a machine-learning-based classifier by extracting quantitative features from normal and burn tissue images recorded by OCT. In this study, 56 samples (28 normal, 28 burned) were imaged by OCT and eight features were extracted. A linear model classifier was trained using 34 samples and 22 samples were used to test the model. Sensitivity of 91.6% and specificity of 90% were obtained. Our results demonstrate the capability of a computer-aided technique for accurately and automatically identifying burn tissue resection margins during surgical treatment.

  14. Data-driven Modeling of Metal-oxide Sensors with Dynamic Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Gosangi, Rakesh; Gutierrez-Osuna, Ricardo

    2011-09-01

    We present a data-driven probabilistic framework to model the transient response of MOX sensors modulated with a sequence of voltage steps. Analytical models of MOX sensors are usually built based on the physico-chemical properties of the sensing materials. Although building these models provides an insight into the sensor behavior, they also require a thorough understanding of the underlying operating principles. Here we propose a data-driven approach to characterize the dynamical relationship between sensor inputs and outputs. Namely, we use dynamic Bayesian networks (DBNs), probabilistic models that represent temporal relations between a set of random variables. We identify a set of control variables that influence the sensor responses, create a graphical representation that captures the causal relations between these variables, and finally train the model with experimental data. We validated the approach on experimental data in terms of predictive accuracy and classification performance. Our results show that DBNs can accurately predict the dynamic response of MOX sensors, as well as capture the discriminatory information present in the sensor transients.

  15. Earthquake Building Damage Mapping Based on Feature Analyzing Method from Synthetic Aperture Radar Data

    NASA Astrophysics Data System (ADS)

    An, L.; Zhang, J.; Gong, L.

    2018-04-01

    Playing an important role in gathering information on damage to social infrastructure, Synthetic Aperture Radar (SAR) remote sensing is a useful tool for monitoring earthquake disasters. With the wide application of this technique, a standard method of comparing post-seismic to pre-seismic data has become common. However, multi-temporal SAR processing is not always achievable, so developing a method for building damage detection that uses post-seismic data only is of great importance. In this paper, the authors initiate an experimental investigation to establish an object-based, feature-analysing classification method for building damage recognition.

  16. Beyond access: a case study on the intersection between accessibility, sustainability, and universal design.

    PubMed

    Gossett, Andrea; Mirza, Mansha; Barnds, Ann Kathleen; Feidt, Daisy

    2009-11-01

    A growing emphasis has been placed on providing equal opportunities for all people, particularly people with disabilities, to support participation. Barriers to participation are represented in part by physical space restrictions. This article explores the decision-making process during the construction of a new office building housing a disability-rights organization. The building project featured in this study was developed on the principles of universal design, maximal accessibility, and sustainability to support access and participation. A qualitative case study approach was used involving collection of data through in-depth interviews with key decision-makers; non-participant observations at design meetings; and on-site tours. Qualitative thematic analysis along with the development of a classification system was used to understand specific building elements and the relevant decision processes from which they resulted. Recording and analyzing the design process revealed several key issues including grassroots involvement of stakeholders; interaction between universal design and sustainable design; addressing diversity through flexibility and universality; and segregationist accessibility versus universal design. This case study revealed complex interactions between accessibility, universal design, and sustainability. Two visual models were proposed to understand and analyze these complexities.

  17. A new tool for post-AGB SED classification

    NASA Astrophysics Data System (ADS)

    Bendjoya, P.; Suarez, O.; Galluccio, L.; Michel, O.

    We present the results of an unsupervised classification method applied to a set of 344 spectral energy distributions (SEDs) of post-AGB stars extracted from the Torun catalogue of Galactic post-AGB stars. The goal is a new, unbiased classification of post-AGB stars based on the information contained in the IR region of the SED (fluxes, IR excess, colours). We used data from the IRAS and MSX satellites and from the 2MASS survey. We applied a classification method based on the construction of a minimal spanning tree (MST) over the dataset using Prim's algorithm. In order to build this tree, different metrics were tested on both fluxes and colour indices. Our method classifies the set of 344 post-AGB stars into 9 distinct groups according to their SEDs.
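
    The tree-building step can be sketched directly: Prim's algorithm grows a minimal spanning tree over a pairwise-distance matrix, and groups emerge when the longest edges are cut. The random feature vectors below stand in for the SED fluxes and colour indices, and the Euclidean metric is just one of the metrics the authors tested.

```python
# Sketch: Prim's algorithm building an MST over a pairwise-distance matrix.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
feats = rng.normal(size=(10, 3))   # stand-in for flux/colour indices
D = cdist(feats, feats)            # the metric is a free parameter

n = len(D)
in_tree = [0]
edges = []
while len(in_tree) < n:
    # Cheapest edge from the current tree to any outside node (Prim's step).
    d, i, j = min((D[i, j], i, j) for i in in_tree
                  for j in range(n) if j not in in_tree)
    edges.append((i, j, d))
    in_tree.append(j)

print(edges)  # MST edges; clusters emerge when the longest edges are cut
```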

  18. Towards Automatic Classification of Wikipedia Content

    NASA Astrophysics Data System (ADS)

    Szymański, Julian

    Wikipedia - the Free Encyclopedia encounters the problem of properly classifying new articles every day. The process of assigning articles to categories is performed manually and is a time-consuming task. It requires knowledge of the Wikipedia structure that is beyond typical editor competence, which leads to human mistakes: omitted or wrong assignments of articles to categories. This article presents the application of an SVM classifier for the automatic classification of documents from the Free Encyclopedia. The classifier was tested using two text representations: inter-document connections (hyperlinks) and word content. The results of the experiments, evaluated on hand-crafted data, show that the Wikipedia classification process can be partially automated. The proposed approach can be used to build a decision support system which suggests to editors the best categories that fit new content entered into Wikipedia.
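
    A minimal sketch of the word-content representation path, assuming TF-IDF features and a linear SVM via scikit-learn; the tiny corpus and category labels are invented for illustration.

```python
# Sketch: TF-IDF word-content features + linear SVM for text categorization.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["the lion is a large cat of the genus Panthera",
        "python is a programming language",
        "the tiger is the largest living cat species",
        "java is a class-based programming language"]
labels = ["animals", "computing", "animals", "computing"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(docs, labels)
print(clf.predict(["the leopard is a big cat"]))  # likely ['animals']
```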

  19. ADM. Service Building (TAN603). Elevations of all facades with door ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    ADM. Service Building (TAN-603). Elevations of all facades with door details and detail of kitchen. Section through garage area shows second level of steel decking. Equipment and laboratory furniture schedule. Ralph M. Parsons 902-2-ANP-603-A 44. Date: December 1952. Approved by INEEL Classification Office for public release. INEEL index code no. 033-0603-00-693-106719 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  20. Research of information classification and strategy intelligence extract algorithm based on military strategy hall

    NASA Astrophysics Data System (ADS)

    Chen, Lei; Li, Dehua; Yang, Jie

    2007-12-01

    Constructing a virtual international strategy environment requires many kinds of information, such as economic, political, military, diplomatic, cultural and scientific information. It is therefore very important to build a highly efficient system for automatic information extraction, classification, recombination and analysis as the foundation and a component of a military strategy hall. This paper first uses an improved boosting algorithm to classify the obtained initial information, and then uses a strategy intelligence extraction algorithm to extract strategy intelligence from the initial information to help strategists analyse it.

  1. Building Combat Strength through Logistics: Translating the New Air Force Logistics Concept of Operations into Action

    DTIC Science & Technology

    1988-03-31

    (OCR residue of a DTIC report documentation page; the recoverable abstract fragment reads:) "…logistics system of the future more capable of supporting the full spectrum of war … scenarios. Today's logistics processes assume wartime …"

  2. Attributes of the Federal Energy Management Program's Federal Site Building Characteristics Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loper, Susan A.; Sandusky, William F.

    2010-12-31

    Typically, the Federal building stock is referred to as a group of about one-half million buildings throughout the United States. Additional information beyond this level is generally limited to distribution of that total by agency and perhaps distribution of the total by state. However, additional characterization of the Federal building stock is required as the Federal sector seeks ways to implement efficiency projects to reduce energy and water use intensity as mandated by legislation and Executive Order. Using a Federal facility database that was assembled for use in a geographic information system tool, additional characterization of the Federal building stock is provided, including information regarding the geographical distribution of sites, building counts and percentage of total by agency, distribution of sites and building totals by agency, distribution of building count and floor space by Federal building type classification by agency, and rank ordering of sites, buildings, and floor space by state. A case study is provided regarding how the building stock has changed for the Department of Energy from 2000 through 2008.

  3. Transfer Learning for Class Imbalance Problems with Inadequate Data.

    PubMed

    Al-Stouhi, Samir; Reddy, Chandan K

    2016-07-01

    A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data are not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that can take advantage of auxiliary data using a transfer learning mechanism while simultaneously building a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient, and in this paper we develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting-based instance-transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply it to healthcare and text classification applications.

  4. Classification of basic facilities for high-rise residential: A survey from 100 housing scheme in Kajang area

    NASA Astrophysics Data System (ADS)

    Ani, Adi Irfan Che; Sairi, Ahmad; Tawil, Norngainy Mohd; Wahab, Siti Rashidah Hanum Abd; Razak, Muhd Zulhanif Abd

    2016-08-01

    High demand for housing and limited land in town areas have increased the provision of high-rise residential schemes. This type of housing has different owners but shares the same land lot and common facilities. Thus, maintenance works on the buildings and common facilities must be well organized. The purpose of this paper is to identify and classify basic facilities for high-rise residential buildings, with the aim of improving the management of such schemes. The method adopted is a survey of 100 high-rise residential schemes, ranging from affordable housing to high-cost housing, selected by snowball sampling. The scope of this research is the Kajang area, which is rapidly being developed with high-rise housing. The objective of the survey is to list all facilities in every sampled scheme. The results confirmed that the 11 pre-determined classifications hold true and provide a realistic classification for high-rise residential schemes. This paper proposes a redefinition of the facilities provided, to create a better management system and give a clear definition of the type of high-rise residential building based on its facilities.

  5. Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

    PubMed

    Rajagopal, Rekha; Ranganathan, Vidhyapriya

    2018-06-05

    Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
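
    A hedged sketch of the two-level hybrid described above: a k-NN makes the first decision, and samples whose neighborhoods are not unanimous (the overlapped cases) are routed to an SVM. Synthetic data replace the Daubechies wavelet features, and the routing rule is an illustrative simplification of the paper's scheme.

```python
# Sketch: level-1 k-NN, with ambiguous samples handed to a level-2 SVM.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, n_features=20, n_classes=5,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
svm = SVC().fit(X_tr, y_tr)                # level 2, for overlapped cases

proba = knn.predict_proba(X_te)
pred = knn.classes_[proba.argmax(axis=1)]
ambiguous = proba.max(axis=1) < 1.0        # neighbor votes not unanimous
pred[ambiguous] = svm.predict(X_te[ambiguous])

print(f"hybrid accuracy: {(pred == y_te).mean():.3f}")
```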

  6. How can machine-learning methods assist in virtual screening for hyperuricemia? A healthcare machine-learning approach.

    PubMed

    Ichikawa, Daisuke; Saito, Toki; Ujita, Waka; Oyama, Hiroshi

    2016-12-01

    Our purpose was to develop a new machine-learning approach (a virtual health check-up) toward identification of those at high risk of hyperuricemia. Applying the system to general health check-ups is expected to reduce medical costs compared with administering an additional test. Data were collected during annual health check-ups performed in Japan between 2011 and 2013 (inclusive). We prepared training and test datasets from the health check-up data to build prediction models; these were composed of 43,524 and 17,789 persons, respectively. Gradient-boosting decision tree (GBDT), random forest (RF), and logistic regression (LR) approaches were trained using the training dataset and were then used to predict hyperuricemia in the test dataset. Undersampling was applied to build the prediction models to deal with the imbalanced class dataset. The results showed that the RF and GBDT approaches afforded the best performances in terms of sensitivity and specificity, respectively. The area under the curve (AUC) values of the models, which reflected the total discriminative ability of the classification, were 0.796 [95% confidence interval (CI): 0.766-0.825] for the GBDT, 0.784 [95% CI: 0.752-0.815] for the RF, and 0.785 [95% CI: 0.752-0.819] for the LR approaches. No significant differences were observed between pairs of each approach. Small changes occurred in the AUCs after applying undersampling to build the models. We developed a virtual health check-up that predicted the development of hyperuricemia using machine-learning methods. The GBDT, RF, and LR methods had similar predictive capability. Undersampling did not remarkably improve predictive power. Copyright © 2016 Elsevier Inc. All rights reserved.
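
    A sketch of the screening-model recipe above, assuming scikit-learn: the majority class is randomly undersampled in the training split only, a gradient-boosted tree model is fitted, and AUC is computed on the untouched test split. The imbalanced synthetic data stand in for the check-up records.

```python
# Sketch: undersampling the majority class + GBDT, evaluated with AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Undersample the majority class in the training split only.
rng = np.random.default_rng(0)
pos = np.flatnonzero(y_tr == 1)
neg = rng.choice(np.flatnonzero(y_tr == 0), size=len(pos), replace=False)
idx = np.concatenate([pos, neg])

gbdt = GradientBoostingClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
auc = roc_auc_score(y_te, gbdt.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.3f}")
```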

  7. Open Dataset for the Automatic Recognition of Sedentary Behaviors.

    PubMed

    Possos, William; Cruz, Robinson; Cerón, Jesús D; López, Diego M; Sierra-Torres, Carlos H

    2017-01-01

    Sedentarism is associated with the development of noncommunicable diseases (NCD) such as cardiovascular diseases (CVD), type 2 diabetes, and cancer. Therefore, the identification of specific sedentary behaviors (TV viewing, sitting at work, driving, relaxing, etc.) is especially relevant for planning personalized prevention programs. Our goal was to build and evaluate a public dataset for the automatic recognition (classification) of sedentary behaviors. The dataset includes data from 30 subjects, who performed 23 sedentary behaviors while wearing a commercial wearable on the wrist, a smartphone on the hip and another on the thigh. Bluetooth Low Energy (BLE) beacons were used in order to improve the automatic classification of the different sedentary behaviors. The study also compared six well-known data mining classification techniques in order to identify the most precise method for solving the classification problem over the 23 defined behaviors. Better classification accuracy was obtained using the Random Forest algorithm and when data were collected from the phone on the hip. Furthermore, the use of beacons as a reference for obtaining the symbolic location of the individual improved the precision of the classification.

  8. The United Nations Framework Classification for World Petroleum Resources

    USGS Publications Warehouse

    Ahlbrandt, T.S.; Blystad, P.; Young, E.D.; Slavov, S.; Heiberg, S.

    2003-01-01

    The United Nations has developed an international framework classification for solid fuels and minerals (UNFC). This is now being extended to petroleum by building on the joint classification of the Society of Petroleum Engineers (SPE), the World Petroleum Congresses (WPC) and the American Association of Petroleum Geologists (AAPG). The UNFC is a 3-dimensional classification, which is necessary in order to migrate accounts of resource quantities that are developed on one or two of the axes to a common basis, and which provides for more precise reporting and analysis; this is particularly useful in analyses of contingent resources. The characteristics of the SPE/WPC/AAPG classification have been preserved and enhanced to facilitate improved international and national petroleum resource management, corporate business process management and financial reporting. A UN intergovernmental committee responsible for extending the UNFC to extractive energy resources (coal, petroleum and uranium) will meet in Geneva on October 30th and 31st to review experiences gained and comments received during 2003. A recommended classification will then be delivered for consideration to the United Nations through the Committee on Sustainable Energy of the Economic Commission for Europe (UN ECE).

  9. Feasibility of Multispectral Airborne Laser Scanning for Land Cover Classification, Road Mapping and Map Updating

    NASA Astrophysics Data System (ADS)

    Matikainen, L.; Karila, K.; Hyyppä, J.; Puttonen, E.; Litkey, P.; Ahokas, E.

    2017-10-01

    This article summarises our first results and experiences on the use of multispectral airborne laser scanner (ALS) data. Optech Titan multispectral ALS data over a large suburban area in Finland were acquired on three different dates in 2015-2016. We investigated the feasibility of the data from the first date for land cover classification and road mapping. Object-based analyses with segmentation and random forests classification were used. The potential of the data for change detection of buildings and roads was also demonstrated. The overall accuracy of land cover classification results with six classes was 96 % compared with validation points. The data also showed high potential for road detection, road surface classification and change detection. The multispectral intensity information appeared to be very important for automated classifications. Compared to passive aerial images, the intensity images have interesting advantages, such as the lack of shadows. Currently, we focus on analyses and applications with the multitemporal multispectral data. Important questions include, for example, the potential and challenges of the multitemporal data for change detection.

  10. Defining environmental flows requirements at regional scale by using meso-scale habitat models and catchments classification

    NASA Astrophysics Data System (ADS)

    Vezza, Paolo; Comoglio, Claudio; Rosso, Maurizio

    2010-05-01

    Alterations of the natural flow regime and in-stream channel modification due to abstraction from watercourses act on biota through a hydraulic template, which is mediated by channel morphology. Modeling channel hydro-morphology is needed in order to evaluate how much habitat is available for selected fauna under specific environmental conditions, and consequently to assist decision makers in planning options for regulated river management. Meso-scale habitat modeling methods (e.g., MesoHABSIM) offer advantages over traditional physical habitat evaluation, involving a larger range of habitat variables, allowing longer lengths of surveyed rivers and enabling an understanding of fish behavior at a larger spatial scale. In this study we defined a bottom-up method for ecological discharge evaluation at the regional scale, focusing on catchments smaller than 50 km2, most of them located within mountainous areas of the Apennines and the Alps in Piedmont (NW Italy). Within the regional study domain we identified 30 representative catchments not affected by water abstraction in order to build up the habitat-flow relationship, to be used as a reference when evaluating regulated watercourses or new projects. For each stream we chose a representative reach and obtained fish data by sampling every single functional habitat (i.e. meso-habitat) within the site, keeping each area separate by using nets. The target species were brown trout (Salmo trutta), marble trout (Salmo trutta marmoratus), bullhead (Cottus gobius), chub (Leuciscus cephalus), barbel (Barbus barbus), vairone (Leuciscus souffia) and other rheophilic Cyprinids. The fish habitat suitability criteria were obtained from observations of habitat use by a selected organism, described with a multivariate relationship between habitat characteristics and fish presence. Habitat type, mean slope, cover, biotic choriotop and substrate, stream depth and velocity, water pH, temperature and percentage of dissolved oxygen were collected for each sampled area and considered as independent variables. According to the MesoHABSIM method, we performed a stepwise forward logistic regression in order to build a biological model identifying the habitat characteristics most used by a target fish. For each stream we predicted changes in habitat area over a range of discharges by building habitat-flow rating curves. Finally, in order to define regional criteria that fulfil environmental flow requirements, we split the study domain according to the regression tree classification criterion, defining homogeneous sub-regions distinct in both environmental flows and catchment characteristics.
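
    Forward stepwise logistic regression, as used for the biological models above, can be sketched generically: variables are added one at a time, keeping whichever most improves the cross-validated score, and stopping when no candidate helps. The synthetic predictors stand in for the habitat variables.

```python
# Sketch: forward stepwise selection around a logistic regression model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           random_state=0)  # stand-in for habitat variables

selected, remaining = [], list(range(X.shape[1]))
best_score = 0.0
while remaining:
    trials = [(cross_val_score(LogisticRegression(max_iter=1000),
                               X[:, selected + [j]], y, cv=5).mean(), j)
              for j in remaining]
    score, j = max(trials)
    if score <= best_score + 1e-4:   # stop when no meaningful improvement
        break
    best_score = score
    selected.append(j)
    remaining.remove(j)

print("selected variables:", selected, f"(CV accuracy {best_score:.3f})")
```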

  11. Comparative study of building footprint estimation methods from LiDAR point clouds

    NASA Astrophysics Data System (ADS)

    Rozas, E.; Rivera, F. F.; Cabaleiro, J. C.; Pena, T. F.; Vilariño, D. L.

    2017-10-01

    Building area calculation from LiDAR points is still a difficult task with no clear solution. The differing characteristics of buildings, such as shape and size, make the process too complex to automate. However, several algorithms and techniques have been used in order to obtain an approximate hull. 3D building reconstruction and urban planning are examples of important applications that benefit from accurate building footprint estimation. In this paper, we have carried out a study of the accuracy of building footprint estimation from LiDAR points. The analysis focuses on the processing steps following object recognition and classification, assuming that the labeling of building points has already been performed. We then perform an in-depth analysis of the influence of point density on the accuracy of the building area estimation. In addition, a set of buildings of different sizes and shapes was manually classified so that they can be used as a benchmark.
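
    The simplest footprint estimator for a set of labeled building points is a convex hull, a common baseline in studies like this one (concave or alpha-shape hulls usually follow). A minimal sketch with synthetic roof points:

```python
# Sketch: convex-hull footprint area from (x, y) points of one building.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
# Stand-in for the (x, y) coordinates of LiDAR returns on one building roof.
pts = rng.uniform(0, 20, size=(500, 2))

hull = ConvexHull(pts)
footprint_area = hull.volume   # for 2-D inputs, `volume` is the enclosed area
boundary = pts[hull.vertices]  # ordered footprint polygon

print(f"estimated footprint area: {footprint_area:.1f}")
```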

  12. Study on the flood simulation techniques for estimation of health risk in Dhaka city, Bangladesh

    NASA Astrophysics Data System (ADS)

    Hashimoto, M.; Suetsugi, T.; Sunada, K.; ICRE

    2011-12-01

    Although some studies have been carried out on the spread of infectious disease associated with flooding, the relation between flooding and the expansion of infection has not yet been clarified. The improvement of the calculation precision of inundation, and its relation to epidemiologically surveyed infectious disease, are therefore investigated in a case study in Dhaka city, Bangladesh. The inundation was computed using a numerical 2D flood simulation model. The "sensitivity to inundation" of hydraulic factors such as the drainage channel, the dike, and the building occupied ratio was examined because of the lack of a digital data set for flood simulation. Each element in the flood simulation model was incorporated progressively, and results were compared, as inspection material, with the inundation classification from an existing study (Mollah et al., 2007). The results show that the influence of the "dyke" and "drainage channel" factors on the water level near each facility is remarkable, and that the inundation level and duration influence wide areas when the "building occupied ratio" is also considered. A correlation between maximum inundation depth and health risk (DALY, mortality, morbidity) was found, but the inundation model for this case has not yet been validated; it needs to be validated against observed inundation depths. Drainage facilities such as the sewer network and the pumping system will also be considered in further research to improve the precision of the inundation model.

  13. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    PubMed

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. Our objective was to explore the use of conditional logistic regression to increase prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and a decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression to each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity, by more than 10%, than the standard classification models, which can be translated to the correct labeling of an additional 400-500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location at discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could also be beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into the future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise awareness of collecting data on additional markers and developing the necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.

  14. A further step toward an optimal ensemble of classifiers for peptide classification, a case study: HIV protease.

    PubMed

    Nanni, Loris; Lumini, Alessandra

    2009-01-01

    This work has two aims: to propose a novel method for building an ensemble of classifiers for peptide classification based on substitution matrices, and to show the importance of selecting a proper set of parameters for the classifiers that form the ensemble of learning systems. The HIV-1 protease cleavage site prediction problem is studied here. Results obtained under a blind testing protocol are reported; comparison with other state-of-the-art ensemble-based approaches quantifies the performance improvement achieved by the systems proposed in this paper. Simulations based on experimentally determined protease cleavage data demonstrate the success of these new ensemble algorithms. It is particularly interesting that, although the HIV-1 protease cleavage site prediction problem is considered linearly separable, the best performance is obtained with an ensemble of non-linear classifiers.

  15. Behavior identification based on geotagged photo data set.

    PubMed

    Liu, Guo-qi; Zhang, Yi-jia; Fu, Ying-mao; Liu, Ying

    2014-01-01

    The popularity of mobile devices has produced image data carrying geographic information, time information, and text descriptions, known as geotagged photo data sets. Partitioning such data by behavior and location can not only identify a user's important locations and daily behaviors but also help users organize their large image collections. This paper proposes a method that builds an index from multiple classification results: the data set is divided multiple times, and labels are assigned to the data according to the estimated probabilities of the classification results, in order to identify users' important locations and daily behaviors. We collected 1400 discrete data sets as experimental data to verify the proposed method. The experimental results show that the index agrees closely with the actual tagging results.

  16. Cascaded deep decision networks for classification of endoscopic images

    NASA Astrophysics Data System (ADS)

    Murthy, Venkatesh N.; Singh, Vivek; Sun, Shanhui; Bhattacharya, Subhabrata; Chen, Terrence; Comaniciu, Dorin

    2017-02-01

    Both traditional and wireless capsule endoscopes can generate tens of thousands of images for each patient. It is desirable to have the majority of irrelevant images filtered out by automatic algorithms during an offline review process, or to have automatic indication of highly suspicious areas during online guidance. This also applies to the newly introduced endomicroscopy, where online indication of tumor classification plays a significant role. Image classification is a standard pattern recognition problem and is well studied in the literature. However, performance on challenging endoscopic images still has room for improvement. In this paper, we present a novel Cascaded Deep Decision Network (CDDN) to improve image classification performance over standard deep neural network based methods. During the learning phase, CDDN automatically builds a network which discards samples that are classified with high confidence scores by a previously trained network and concentrates only on the challenging samples, which are handled by subsequent expert shallow networks. We validate CDDN on two different types of endoscopic imaging: a polyp classification dataset and a tumor classification dataset. On both datasets we show that CDDN can outperform other methods by about 10%. In addition, CDDN can also be applied to other image classification problems.
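
    A minimal sketch of the cascade idea, not the paper's CDDN itself: a first network labels the easy samples, and a second "expert" network is trained only on the samples the first one is unsure about; at prediction time, low-confidence cases are deferred down the cascade. The models, synthetic data, and the 0.9 confidence threshold are illustrative assumptions.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.neural_network import MLPClassifier

      X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

      net1 = MLPClassifier(hidden_layer_sizes=(4,), max_iter=300, random_state=0).fit(X, y)
      conf = net1.predict_proba(X).max(axis=1)

      hard = conf < 0.9                        # samples the first net is unsure about
      if hard.sum() < 20:                      # degenerate case: net1 confident everywhere
          net2 = net1
      else:
          net2 = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                               random_state=0).fit(X[hard], y[hard])

      def cascade_predict(Xnew, thresh=0.9):
          p1 = net1.predict_proba(Xnew)
          out = p1.argmax(axis=1)
          low = p1.max(axis=1) < thresh        # defer uncertain cases to the expert net
          if low.any():
              out[low] = net2.predict(Xnew[low])
          return out

      print((cascade_predict(X) == y).mean())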

  17. JDINAC: joint density-based non-parametric differential interaction network analysis and classification using high-dimensional sparse omics data.

    PubMed

    Ji, Jiadong; He, Di; Feng, Yang; He, Yong; Xue, Fuzhong; Xie, Lei

    2017-10-01

    A complex disease is usually driven by a number of genes interwoven into networks, rather than a single gene product. Network comparison, or differential network analysis, has become an important means of revealing the underlying mechanisms of pathogenesis and identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture the linear relationships among genes, or rely on the assumption of a parametric probability distribution of gene measurements; both assumptions are restrictive in real applications. We propose a new Joint density based non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data, as well as to adjust for confounding factors, without assuming a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset, which includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated tumor from normal samples with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data. R scripts are available at https://github.com/jijiadong/JDINAC.

  18. Conceptual-driven classification for coding advise in health insurance reimbursement.

    PubMed

    Li, Sheng-Tun; Chen, Chih-Chuan; Huang, Fernando

    2011-01-01

    With the non-stop increases in medical treatment fees, the economic survival of a hospital in Taiwan relies on the reimbursements received from the Bureau of National Health Insurance, which in turn depend on the accuracy and completeness of the discharge summaries and the correctness of their International Classification of Diseases (ICD) codes. The purpose of this research is to reinforce the disease classification framework by supporting disease classification specialists in the coding process. This study developed an ICD code advisory system (ICD-AS) that performs knowledge discovery from discharge summaries and suggests ICD codes. Natural language processing and information retrieval techniques based on Zipf's law were applied to process the content of discharge summaries, and fuzzy formal concept analysis was used to analyze and represent the relationships between the medical terms identified by MeSH. In addition, a certainty factor used as a reference during the coding process was calculated to account for uncertainty and strengthen the credibility of the outcome. Two sets of 360 and 2579 textual discharge summaries of patients suffering from cerebrovascular disease were processed to build ICD-AS and to evaluate the prediction performance. A number of experiments were conducted to investigate the impact of system parameters on accuracy and to compare the proposed model with traditional classification techniques, including linear-kernel support vector machines. The comparison results showed that the proposed system achieves better overall performance on several measures. In addition, some useful implication rules were obtained, which improve comprehension of the field of cerebrovascular disease and give insight into the relationships between relevant medical terms. Our system contributes valuable guidance to disease classification specialists in the process of coding discharge summaries, which in turn benefits patients, hospitals, and the healthcare system.

  19. A new PUB-working group on SLope InterComparison Experiments (SLICE)

    NASA Astrophysics Data System (ADS)

    McGuire, K.; Retter, M.; Freer, J.; Troch, P.; McDonnell, J.

    2006-05-01

    The International Association of Hydrological Sciences (IAHS) decade on Prediction in Ungauged Basins (PUB) has the scientific goal of shifting hydrology from calibration-reliant models to new, richly understanding-based models. To support this, six PUB science themes have been developed under the PUB Science Steering group. Theme 1 covers basin inter-comparison and classification. The SLope InterComparison Experiment (SLICE) is a newly formed working group aligned with Theme 1. Its 2-year target is to promote improved understanding of regional hydrological characteristics via hillslope inter-comparison studies and top-down analysis of data from hillslope experiments from around the world. It will further deliver the major building blocks of a catchment classification system. A first workshop of SLICE took place 26-28 September 2005 at the HJ Andrews Experimental Forest, Oregon, USA, with 40 participants from seven countries in attendance. The program consisted of keynote presentations on the state of the art of hillslope hydrology, an outline of a hillslope classification system, and, through small-group discussion, a focus on the following questions: a.) How can we capture flow path heterogeneity at the hillslope scale with residence time distributions? b.) Can networks help characterize hillslope subsurface systems? c.) What patterns are useful to characterize in a hillslope comparison context? d.) How does bedrock permeability condition hillslope response? e.) Can we actually observe pressure waves in the field, and how likely are they to exist at the hillslope continuum scale? The poster presents an overview of the workshop outcomes and directions for future work.

  20. Building the United States National Vegetation Classification

    USGS Publications Warehouse

    Franklin, S.B.; Faber-Langendoen, D.; Jennings, M.; Keeler-Wolf, T.; Loucks, O.; Peet, R.; Roberts, D.; McKerrow, A.

    2012-01-01

    The Federal Geographic Data Committee (FGDC) Vegetation Subcommittee, the Ecological Society of America Panel on Vegetation Classification, and NatureServe have worked together to develop the United States National Vegetation Classification (USNVC). The current standard was accepted in 2008 and fosters consistency across Federal agencies and non-federal partners for the description of each vegetation concept and its hierarchical classification. The USNVC is structured as a dynamic standard, where changes to types at any level may be proposed at any time as new information comes in. But, because much information already exists from previous work, the NVC partners first established methods for screening existing types to determine their acceptability with respect to the 2008 standard. Current efforts include a screening process to assign confidence to Association and Group level descriptions, and a review of the upper three levels of the classification. For the upper levels especially, the expectation is that the review process includes international scientists. Immediate future efforts include the review of remaining levels and the development of a proposal review process.

  1. The 7th lung cancer TNM classification and staging system: Review of the changes and implications.

    PubMed

    Mirsadraee, Saeed; Oswal, Dilip; Alizadeh, Yalda; Caulo, Andrea; van Beek, Edwin

    2012-04-28

    Lung cancer is the most common cause of death from cancer in males, accounting for more than 1.4 million deaths in 2008. It is a growing concern in China and across Asia and Africa as well. Accurate staging of the disease is an important part of management, as it provides an estimate of the patient's prognosis and identifies treatment strategies. It also helps to build a database for future staging projects. A major revision of lung cancer staging was announced with effect from January 2010. The new classification is based on a larger surgical and non-surgical cohort of patients, and is thus more accurate in terms of outcome prediction than the previous classification. Several original papers on this new classification give a comprehensive description of the methodology, the changes in staging and the statistical analysis. This overview is a simplified description of the changes in the new classification and their potential impact on patients' treatment and prognosis.

  2. Graph wavelet alignment kernels for drug virtual screening.

    PubMed

    Smalter, Aaron; Huan, Jun; Lushington, Gerald

    2009-06-01

    In this paper, we introduce a novel statistical modeling technique for target property prediction, with applications to virtual screening and drug design. In our method, we use graphs to model chemical structures and apply a wavelet analysis of graphs to summarize features capturing local graph topology. We design a novel graph kernel function that utilizes these topology features to build predictive models for chemicals via a support vector machine (SVM) classifier. We call the new graph kernel a graph wavelet-alignment kernel. We have evaluated the efficacy of the wavelet-alignment kernel using a set of chemical structure-activity prediction benchmarks. Our results indicate that the use of the kernel function yields performance profiles comparable to, and sometimes exceeding, those of existing state-of-the-art chemical classification approaches. In addition, our results show that the use of wavelet functions significantly decreases the computational cost of graph kernel computation, with a more than tenfold speedup.
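
    A minimal sketch of the mechanics of plugging a custom graph kernel into an SVM. The wavelet-alignment kernel itself is not reproduced; an RBF over stand-in per-graph topology summaries takes its place, and only the precomputed-kernel plumbing mirrors the approach above.

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      feats = rng.normal(size=(100, 16))     # stand-in per-graph topology summaries
      labels = (feats[:, 0] > 0).astype(int)

      def kernel(A, B):
          # alignment-style similarity, here an RBF over the summary vectors
          d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
          return np.exp(-d2 / 16.0)

      K_train = kernel(feats, feats)                   # (n_train, n_train) Gram matrix
      clf = SVC(kernel="precomputed").fit(K_train, labels)

      K_test = kernel(feats[:10], feats)               # rows: test graphs, cols: training graphs
      print(clf.predict(K_test))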

  3. Multivariate analysis of standoff laser-induced breakdown spectroscopy spectra for classification of explosive-containing residues

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Lucia, Frank C. Jr.; Gottfried, Jennifer L.; Munson, Chase A.

    2008-11-01

    A technique being evaluated for standoff explosives detection is laser-induced breakdown spectroscopy (LIBS). LIBS is a real-time sensor technology that uses components that can be configured into a ruggedized standoff instrument. The U.S. Army Research Laboratory has been coupling standoff LIBS spectra with chemometrics for several years now in order to discriminate between explosives and nonexplosives. We have investigated the use of partial least squares discriminant analysis (PLS-DA) for explosives detection. We have extended our study of PLS-DA to more complex sample types, including binary mixtures, different types of explosives, and samples not included in the model. We demonstrate the importance of building the PLS-DA model by iteratively testing it against sample test sets. Independent test sets are used to test the robustness of the final model.
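
    A minimal sketch of PLS-DA as commonly implemented: regress one-hot class indicators on the spectra with PLS, then assign each spectrum to the class with the largest predicted response. The synthetic "spectra" and the 5-component choice are stand-ins, not the laboratory's LIBS data or model.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression

      rng = np.random.default_rng(1)
      spectra = rng.normal(size=(120, 300))        # stand-ins for LIBS spectra (300 channels)
      classes = rng.integers(0, 2, size=120)       # 0 = non-explosive, 1 = explosive residue
      spectra[classes == 1, :50] += 0.5            # inject a weak class signal

      Y = np.eye(2)[classes]                       # one-hot responses for the PLS regression
      pls = PLSRegression(n_components=5).fit(spectra, Y)

      pred = pls.predict(spectra).argmax(axis=1)   # discriminant step: largest response wins
      print((pred == classes).mean())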

  4. Generative Topographic Mapping (GTM): Universal Tool for Data Visualization, Structure-Activity Modeling and Dataset Comparison.

    PubMed

    Kireeva, N; Baskin, I I; Gaspar, H A; Horvath, D; Marcou, G; Varnek, A

    2012-04-01

    Here, the utility of Generative Topographic Maps (GTM) for data visualization, structure-activity modeling and database comparison is evaluated using subsets of the Database of Useful Decoys (DUD). Unlike other popular dimensionality reduction approaches such as Principal Component Analysis, Sammon Mapping or Self-Organizing Maps, the great advantage of GTMs is that they provide data probability distribution functions (PDF), both in the high-dimensional space defined by molecular descriptors and in the 2D latent space. PDFs for the molecules of different activity classes were successfully used to build classification models in the framework of the Bayesian approach. Because PDFs are represented by a mixture of Gaussian functions, the Bhattacharyya kernel has been proposed as a measure of the overlap of datasets, which leads to an elegant method of global comparison of chemical libraries.
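
    A minimal sketch of the Bayesian use of class-conditional densities described above. GTM's per-class probability density functions are stood in for by Gaussian mixtures fitted separately to actives and inactives; the posterior then follows Bayes' rule, p(active|x) ∝ p(x|active) p(active). Data and mixture sizes are illustrative.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(0)
      X_active = rng.normal(loc=1.0, size=(200, 4))    # descriptors of actives
      X_inactive = rng.normal(loc=-1.0, size=(300, 4)) # descriptors of inactives/decoys

      g_act = GaussianMixture(n_components=3, random_state=0).fit(X_active)
      g_inact = GaussianMixture(n_components=3, random_state=0).fit(X_inactive)
      prior_act = len(X_active) / (len(X_active) + len(X_inactive))

      def p_active(x):
          la = g_act.score_samples(x) + np.log(prior_act)        # log p(x|act) + log p(act)
          li = g_inact.score_samples(x) + np.log(1 - prior_act)
          return 1.0 / (1.0 + np.exp(li - la))                   # posterior via Bayes' rule

      print(p_active(rng.normal(loc=1.0, size=(5, 4))))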

  5. Remote measurement of surface roughness, surface reflectance, and body reflectance with LiDAR.

    PubMed

    Li, Xiaolu; Liang, Yu

    2015-10-20

    Light detection and ranging (LiDAR) intensity data are attracting increasing attention because of their great potential for use in a variety of remote sensing applications. To fully investigate this potential for target classification and identification, we carried out a series of experiments on typical urban building materials using our reconstructed laboratory LiDAR system. Received intensity data were analyzed on the basis of the derived bidirectional reflectance distribution function (BRDF) model and the established integration method. With an improved fitting algorithm, the parameters involved in the BRDF model can be obtained to depict the surface characteristics. One of these parameters, related to surface roughness, was converted to a widely used roughness parameter, the arithmetical mean deviation of the roughness profile (Ra), which can be used to validate the feasibility of the BRDF model for surface characterization and performance evaluation.

  6. Stratification of a cityscape using census and land use variables for inventory of building materials

    USGS Publications Warehouse

    Rosenfield, G.H.; Fitzpatrick-Lins, K.; Johnson, T.L.

    1987-01-01

    A cityscape (or any landscape) can be stratified into environmental units using multiple variables of information. For the purposes of sampling building materials, census and land use variables were used to identify similar strata. In the Metropolitan Statistical Area of a cityscape, the census tract is the smallest unit for which census data are summarized and digitized boundaries are available. For purposes of this analysis, census data on total population, total number of housing units, and number of single-unit dwellings were aggregated into variables of persons per square kilometer and proportion of housing units in single-unit dwellings. The level 2 categories of the U.S. Geological Survey's land use and land cover data base were aggregated into variables of proportion of residential land with buildings, proportion of nonresidential land with buildings, and proportion of open land. From these variables, the cityscape was stratified into environmental strata of Urban Central Business District, Urban Livelihood Industrial Commercial, Urban Multi-Family Residential, Urban Single Family Residential, Non-Urban Suburbanizing, and Non-Urban Rural. The New England region was chosen as a region with commonality of building materials, and a procedure was developed for trial classification of census tracts into one of the strata. Final stratification was performed by discriminant analysis using the trial classification and prior probabilities as weights. The procedure was applied to several cities, and the results were analyzed by correlation analysis against a field sample of building materials. The methodology developed for stratification of a cityscape using multiple variables has application to many other types of environmental studies, including forest inventory, hydrologic unit management, waste disposal, transportation studies, and other urban studies. Multivariate analysis techniques have recently been used for urban stratification in England.

  7. QSAR modeling and chemical space analysis of antimalarial compounds

    NASA Astrophysics Data System (ADS)

    Sidorov, Pavel; Viira, Birgit; Davioud-Charvet, Elisabeth; Maran, Uko; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre

    2017-05-01

    Generative topographic mapping (GTM) has been used to visualize and analyze the chemical space of antimalarial compounds as well as to build predictive models linking structure of molecules with their antimalarial activity. For this, a database, including 3000 molecules tested in one or several of 17 anti-Plasmodium activity assessment protocols, has been compiled by assembling experimental data from in-house and ChEMBL databases. GTM classification models built on subsets corresponding to individual bioassays perform similarly to the earlier reported SVM models. Zones preferentially populated by active and inactive molecules, respectively, clearly emerge in the class landscapes supported by the GTM model. Their analysis resulted in identification of privileged structural motifs of potential antimalarial compounds. Projection of marketed antimalarial drugs on this map allowed us to delineate several areas in the chemical space corresponding to different mechanisms of antimalarial activity. This helped us to make a suggestion about the mode of action of the molecules populating these zones.

  8. QSAR modeling and chemical space analysis of antimalarial compounds.

    PubMed

    Sidorov, Pavel; Viira, Birgit; Davioud-Charvet, Elisabeth; Maran, Uko; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre

    2017-05-01

    Generative topographic mapping (GTM) has been used to visualize and analyze the chemical space of antimalarial compounds as well as to build predictive models linking structure of molecules with their antimalarial activity. For this, a database, including ~3000 molecules tested in one or several of 17 anti-Plasmodium activity assessment protocols, has been compiled by assembling experimental data from in-house and ChEMBL databases. GTM classification models built on subsets corresponding to individual bioassays perform similarly to the earlier reported SVM models. Zones preferentially populated by active and inactive molecules, respectively, clearly emerge in the class landscapes supported by the GTM model. Their analysis resulted in identification of privileged structural motifs of potential antimalarial compounds. Projection of marketed antimalarial drugs on this map allowed us to delineate several areas in the chemical space corresponding to different mechanisms of antimalarial activity. This helped us to make a suggestion about the mode of action of the molecules populating these zones.

  9. The Integrated Model of Sustainability Perspective in Spermatophyta Learning Based on Local Wisdom

    NASA Astrophysics Data System (ADS)

    Hartadiyati, E.; Rizqiyah, K.; Wiyanto; Rusilowati, A.; Prasetia, A. P. B.

    2017-09-01

    At present, culture is diminishing and the social order is shifting toward a generation without policy awareness or pro-sustainability values; moreover, advances in science and technology are often applied unwisely, at the expense of local wisdom. It is therefore necessary to explore local wisdom intra-curricularly in schools. This study aims to produce a feasible and effective model integrating sustainability perspectives based on local wisdom into spermatophyta material. The research uses define, design, and develop stages to build this integration model. The resulting product integrates socio-cultural, economic, and environmental sustainability perspectives, formulated through preventive, preserve, and build actions, into spermatophyta material consisting of identification and classification, metagenesis, and the role of spermatophyta in human life. The integrated model of a sustainability perspective in spermatophyta learning based on local wisdom proved effective in raising the sustainability awareness of high school students.

  10. Modeling the Biodegradability of Chemical Compounds Using the Online CHEmical Modeling Environment (OCHEM)

    PubMed Central

    Vorberg, Susann

    2013-01-01

    Biodegradability describes the capacity of substances to be mineralized by free-living bacteria. It is a crucial property in estimating a compound's long-term impact on the environment. The ability to reliably predict biodegradability would reduce the need for laborious experimental testing. However, this endpoint is difficult to model due to unavailability or inconsistency of experimental data. Our approach makes use of the Online Chemical Modeling Environment (OCHEM) and its rich supply of machine learning methods and descriptor sets to build classification models for ready biodegradability. These models were analyzed to determine the relationship between characteristic structural properties and biodegradation activity. The distinguishing feature of the developed models is their ability to estimate the accuracy of prediction for each individual compound. The models developed using seven individual descriptor sets were combined in a consensus model, which provided the highest accuracy. The identified overrepresented structural fragments can be used by chemists to improve the biodegradability of new chemical compounds. The consensus model, the datasets used, and the calculated structural fragments are publicly available at http://ochem.eu/article/31660. PMID:27485201
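
    A minimal sketch of the consensus construction: one classifier per descriptor set, with averaged probabilities as the consensus prediction and the spread across models as a crude per-compound confidence estimate. The descriptor blocks, the choice of random forests, and the labels are stand-ins for OCHEM's methods and descriptor sets.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      rng = np.random.default_rng(0)
      y = rng.integers(0, 2, size=500)                 # 1 = readily biodegradable (stand-in)
      # three descriptor blocks of different widths, each with a weak class signal
      blocks = [rng.normal(size=(500, d)) + y[:, None] for d in (64, 128, 32)]

      models = [RandomForestClassifier(random_state=0).fit(Xb, y) for Xb in blocks]
      per_model = [m.predict_proba(Xb)[:, 1] for m, Xb in zip(models, blocks)]

      proba = np.mean(per_model, axis=0)               # consensus prediction
      spread = np.std(per_model, axis=0)               # disagreement ~ per-compound uncertainty
      print(proba[:5], spread[:5])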

  11. A comprehensive analysis of earthquake damage patterns using high dimensional model representation feature selection

    NASA Astrophysics Data System (ADS)

    Taşkin Kaya, Gülşen

    2013-10-01

    Recently, earthquake damage assessment using satellite images has become a very popular research direction. Especially with the availability of very high resolution (VHR) satellite images, quite detailed damage maps at the building scale have been produced, and various studies have been conducted in the literature. As the spatial resolution of satellite images increases, distinguishing damage patterns becomes more difficult, especially when only spectral information is used during classification. To overcome this difficulty, textural information needs to be incorporated into the classification to improve the visual quality and reliability of the damage map. Many kinds of textural information can be derived from VHR satellite images, depending on the algorithm used. However, extracting and evaluating textural information is generally time consuming, especially for large earthquake-affected areas, due to the size of VHR images. Therefore, to produce a quick damage map, the most useful features describing damage patterns, as well as the redundant features, need to be known in advance. In this study, a very high resolution satellite image acquired after the Bam, Iran earthquake was used to identify earthquake damage. Both spectral and textural information were used during classification. For textural information, second-order Haralick features were extracted from the panchromatic image for the area of interest using gray-level co-occurrence matrices with different window sizes and directions. In addition to using spatial features in classification, the most useful features representing damage characteristics were selected with a novel feature selection method based on high dimensional model representation (HDMR), which gives the sensitivity of each feature during classification. HDMR was recently proposed as an efficient tool to capture input-output relationships in high-dimensional systems for many problems in science and engineering. The method improves the efficiency of deducing high-dimensional behaviors and is formed by a particular organization of low-dimensional component functions, in which each function represents the contribution of one or more input variables to the output variables.
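
    A minimal sketch of the texture-extraction step described above: second-order Haralick-style features from a gray-level co-occurrence matrix computed over an image window at several distances and directions, using scikit-image (versions before 0.19 name these functions greycomatrix/greycoprops). The window is a synthetic stand-in for a panchromatic VHR patch.

      import numpy as np
      from skimage.feature import graycomatrix, graycoprops

      rng = np.random.default_rng(0)
      window = rng.integers(0, 64, size=(32, 32), dtype=np.uint8)   # stand-in VHR patch

      # co-occurrence matrices at 2 distances x 4 directions
      glcm = graycomatrix(window, distances=[1, 2],
                          angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                          levels=64, symmetric=True, normed=True)

      # classic Haralick-style properties, flattened into one feature vector
      feats = np.hstack([graycoprops(glcm, p).ravel()
                         for p in ("contrast", "homogeneity", "energy", "correlation")])
      print(feats.shape)   # one texture vector per window, fed to the classifier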

  12. Do outdoor environmental noise and atmospheric NO2 levels spatially overlap in urban areas?

    PubMed

    Tenailleau, Quentin M; Bernard, Nadine; Pujol, Sophie; Parmentier, Anne-Laure; Boilleaut, Mathieu; Houot, Hélène; Joly, Daniel; Mauny, Frédéric

    2016-07-01

    The urban environment holds numerous emission sources for air and noise pollution, creating optimum conditions for environmental multi-exposure situations. Evaluation of the joint-exposure levels is the main obstacle for multi-exposure studies and one of the biggest challenges of the next decade. The present study aims to describe the noise/NO2 multi-exposure situations in the urban environment by exploring the possible discordant and concordant situations of both exposures. Fine-scale diffusion models were developed in the European medium-sized city of Besançon (France), and a classification method was used to evaluate the multi-exposure situations in the façade perimeter of 10,825 buildings. Although correlated (Pearson's r = 0.64, p < 0.01), urban spatial distributions of the noise and NO2 around buildings do not overlap, and 30% of the buildings were considered to be discordant in terms of the noise and NO2 exposure levels. This discrepancy is spatially structured and associated with variables describing the building's environment. Our results support the presence of several co-existing, multi-exposure situations across the city impacted by both the urban morphology and the emission and diffusion/propagation phases of each pollutant. Identifying the mechanisms of discrepancy and convergence of multi-exposure situations could help improve the health risk assessment and public health.

  13. The SpeX Prism Library Analysis Toolkit: Design Considerations and First Results

    NASA Astrophysics Data System (ADS)

    Burgasser, Adam J.; Aganze, Christian; Escala, Ivana; Lopez, Mike; Choban, Caleb; Jin, Yuhui; Iyer, Aishwarya; Tallis, Melisa; Suarez, Adrian; Sahi, Maitrayee

    2016-01-01

    Various observational and theoretical spectral libraries now exist for galaxies, stars, planets and other objects, which have proven useful for classification, interpretation, simulation and model development. Effective use of these libraries relies on analysis tools, which are often left to users to develop. In this poster, we describe a program to develop a combined spectral data repository and Python-based analysis toolkit for low-resolution spectra of very low mass dwarfs (late M, L and T dwarfs), which enables visualization, spectral index analysis, classification, atmosphere model comparison, and binary modeling for nearly 2000 library spectra and user-submitted data. The SpeX Prism Library Analysis Toolkit (SPLAT) is being constructed as a collaborative, student-centered, learning-through-research model with high school, undergraduate and graduate students and regional science teachers, who populate the database and build the analysis tools through quarterly challenge exercises and summer research projects. In this poster, I describe the design considerations of the toolkit, its current status and development plan, and report the first published results led by undergraduate students. The combined data and analysis tools are ideal for characterizing cool stellar and exoplanetary atmospheres (including direct exoplanetary spectra observations by Gemini/GPI, VLT/SPHERE, and JWST), and the toolkit design can be readily adapted for other spectral datasets as well. This material is based upon work supported by the National Aeronautics and Space Administration under Grant No. NNX15AI75G. SPLAT code can be found at https://github.com/aburgasser/splat.

  14. A comparison of the accuracy of pixel based and object based classifications of integrated optical and LiDAR data

    NASA Astrophysics Data System (ADS)

    Gajda, Agnieszka; Wójtowicz-Nowakowska, Anna

    2013-04-01

    Land cover maps are generally produced on the basis of high-resolution imagery. Recently, LiDAR (Light Detection and Ranging) data have been brought into use in diverse applications, including land cover mapping. In this study we assessed the accuracy of land cover classification using both high-resolution aerial imagery and LiDAR data (airborne laser scanning, ALS), testing two classification approaches: pixel-based classification and object-based image analysis (OBIA). The study was conducted on three test areas (3 km2 each) in the administrative area of Kraków, Poland, along the course of the Vistula River, representing three different dominant land cover types of the Vistula River valley. Test site 1 had semi-natural vegetation with riparian forests and shrubs, test site 2 was a densely built-up area, and test site 3 was an industrial site. ALS point clouds and orthophotomaps were both captured in November 2007. Point cloud density was 16 pt/m2 on average, with additional information about intensity and encoded RGB values. Orthophotomaps had a spatial resolution of 10 cm. From the point clouds, two raster maps were generated: (1) intensity and (2) a normalised Digital Surface Model (nDSM), both with a spatial resolution of 50 cm. A supervised classification approach was selected for the aerial data. Pixel-based classification was carried out in ERDAS Imagine software, using the orthophotomaps together with the intensity and nDSM rasters. Fifteen homogeneous training areas representing each cover class were chosen. Classified pixels were clumped to avoid the salt-and-pepper effect. Object-based image classification was carried out in eCognition software, which handles both optical and ALS data. Elevation layers (intensity, first/last reflection, etc.) were used at the segmentation stage with appropriate weights, yielding more precise and unambiguous segment (object) boundaries. As a result of the classification, five land cover classes (buildings, water, high and low vegetation, and others) were extracted. Both pixel-based image analysis and OBIA were conducted with a minimum mapping unit of 10 m2. Results were validated against a manual classification of random points (80 per test area); the reference data set was interpreted manually using orthophotomaps and expert knowledge of the test site areas.

  15. Using classification models for the generation of disease-specific medications from biomedical literature and clinical data repository.

    PubMed

    Wang, Liqin; Haug, Peter J; Del Fiol, Guilherme

    2017-05-01

    Mining disease-specific associations from existing knowledge resources can be useful for building disease-specific ontologies and supporting knowledge-based applications. Many association mining techniques have been exploited. However, a challenge remains when the extracted associations contain much noise. It is unreliable to determine the relevance of an association by simply setting arbitrary cut-off points on multiple relevance scores, and it would be expensive to ask human experts to manually review a large number of associations. We propose that machine-learning-based classification can be used to separate the signal from the noise, and to provide a feasible approach to create and maintain disease-specific vocabularies. We initially focused on disease-medication associations for the sake of simplicity. For a disease of interest, we extracted potentially treatment-related drug concepts from biomedical literature citations and from a local clinical data repository. Each concept was associated with multiple measures of relevance (i.e., features), such as frequency of occurrence. For machine learning purposes, we formed nine datasets for three diseases, with each disease having two single-source datasets and one combining the previous two. All the datasets were labeled using existing reference standards. Thereafter, we conducted two experiments: (1) to test whether adding features from the clinical data repository would improve the classification performance achieved using features from the biomedical literature only, and (2) to determine whether classifiers trained with known medication-disease data sets would generalize to new diseases. Simple logistic regression and LogitBoost were the two classifiers identified as the preferred models, respectively, for the biomedical-literature datasets and the combined datasets. The performance of classification using combined features provided significant improvement beyond that using biomedical-literature features alone (p-value < 0.001). The performance of the classifier built from known diseases to predict associated concepts for new diseases showed no significant difference from the performance of a classifier built and tested using the new disease's dataset. It is feasible to use classification approaches to automatically predict the relevance of a concept to a disease of interest, and useful to combine features from disparate sources for the task of classification. Classifiers built from known diseases were generalizable to new diseases.

  16. Building Loads Analysis and System Thermodynamics (BLAST) Program Users Manual. Volume One. Supplement (Version 3.0).

    DTIC Science & Technology

    1981-03-01

    [OCR-garbled DTIC record; recoverable content:] AD-A099 054, Construction Engineering Research Laboratory (Army), Champaign, IL. The supplement lists, among the simulated systems, (11) induction unit systems, (12) direct-drive chillers, and (13) purchased steam from utilities; BLAST Version 3.0 also offers the user [...] their BLAST input. The foreword notes the report was prepared for the Air Force Systems [truncated].

  17. A&M. TAN-607. Elevation for second-phase expansion of A&M Building. Work ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    A&M. TAN-607. Elevation for second-phase expansion of A&M Building. Work areas south of the Carpentry Shop. High-bay shop, decontamination room at south-most end. Approved by INEEL Classification Office for public release. Ralph M. Parsons 1299-5-ANP/GE-3-607-A 106. Date: August 1956. INEEL index code no. 034-0607-00-693-107166 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  18. Cascaded VLSI neural network architecture for on-line learning

    NASA Technical Reports Server (NTRS)

    Thakoor, Anilkumar P. (Inventor); Duong, Tuan A. (Inventor); Daud, Taher (Inventor)

    1992-01-01

    High-speed, analog, fully-parallel, and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware-compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A computation-intensive feature classification application was demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as an application-specific coprocessor for solving real-world problems at extremely high data rates.

  19. Cascaded VLSI neural network architecture for on-line learning

    NASA Technical Reports Server (NTRS)

    Duong, Tuan A. (Inventor); Daud, Taher (Inventor); Thakoor, Anilkumar P. (Inventor)

    1995-01-01

    High-speed, analog, fully-parallel and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware-compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A comparison-intensive feature classification application has been demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as application-specific-coprocessors for solving real-world problems at extremely high data rates.

  20. Development Index, A Proposed Pattern for Organizing and Facilitating the Flow of Information Needed By Man in Furthering His Own Development, With Particular Reference to the Development of Buildings and Communities and Other Forms of Environmental Control.

    ERIC Educational Resources Information Center

    Michigan Univ., Ann Arbor.

    The organization of knowledge related to the development of the environment and the building industry is provided in this index which provides a framework or classification system for a broad range of information. Man's development in terms of environmental structuring and control is discussed as development goals, development cycle, and…

  1. A Saliency Guided Semi-Supervised Building Change Detection Method for High Resolution Remote Sensing Images

    PubMed Central

    Hou, Bin; Wang, Yunhong; Liu, Qingjie

    2016-01-01

    Characterization of up-to-date information on the Earth's surface is an important application, providing insights for urban planning, resource monitoring and environmental studies. A large number of change detection (CD) methods have been developed to address this task using remote sensing (RS) images. The advent of high resolution (HR) remote sensing images poses further challenges to traditional CD methods and opportunities for object-based CD methods. While several kinds of geospatial objects are recognized, this manuscript mainly focuses on buildings. Specifically, we propose a novel automatic approach combining pixel-based strategies with object-based ones for detecting building changes in HR remote sensing images. A multiresolution contextual morphological transformation called extended morphological attribute profiles (EMAPs) allows the extraction of geometrical features related to the structures within the scene at different scales. Pixel-based post-classification is executed on EMAPs using hierarchical fuzzy clustering. Subsequently, hierarchical fuzzy frequency vector histograms are formed from the image-objects acquired by simple linear iterative clustering (SLIC) segmentation. Then, saliency and the morphological building index (MBI), extracted on difference images, are used to generate a pseudo training set. Ultimately, object-based semi-supervised classification is implemented on this training set by applying random forest (RF). Most of the important changes are detected by the proposed method in our experiments, and its effectiveness was verified using both visual and numerical evaluation. PMID:27618903

  2. A Saliency Guided Semi-Supervised Building Change Detection Method for High Resolution Remote Sensing Images.

    PubMed

    Hou, Bin; Wang, Yunhong; Liu, Qingjie

    2016-08-27

    Characterization of up-to-date information on the Earth's surface is an important application, providing insights for urban planning, resource monitoring and environmental studies. A large number of change detection (CD) methods have been developed to address this task using remote sensing (RS) images. The advent of high resolution (HR) remote sensing images poses further challenges to traditional CD methods and opportunities for object-based CD methods. While several kinds of geospatial objects are recognized, this manuscript mainly focuses on buildings. Specifically, we propose a novel automatic approach combining pixel-based strategies with object-based ones for detecting building changes in HR remote sensing images. A multiresolution contextual morphological transformation called extended morphological attribute profiles (EMAPs) allows the extraction of geometrical features related to the structures within the scene at different scales. Pixel-based post-classification is executed on EMAPs using hierarchical fuzzy clustering. Subsequently, hierarchical fuzzy frequency vector histograms are formed from the image-objects acquired by simple linear iterative clustering (SLIC) segmentation. Then, saliency and the morphological building index (MBI), extracted on difference images, are used to generate a pseudo training set. Ultimately, object-based semi-supervised classification is implemented on this training set by applying random forest (RF). Most of the important changes are detected by the proposed method in our experiments, and its effectiveness was verified using both visual and numerical evaluation.
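
    A minimal sketch of the final object-based stage described above: SLIC superpixels on a difference image, simple per-object features, and a random forest trained on a pseudo training set. Here the pseudo-labels are random stand-ins; in the paper they come from saliency and the morphological building index.

      import numpy as np
      from skimage.segmentation import slic
      from sklearn.ensemble import RandomForestClassifier

      rng = np.random.default_rng(0)
      diff = rng.random((128, 128, 3))                   # stand-in difference image
      segments = slic(diff, n_segments=200, start_label=0)

      # per-object features: mean difference per channel inside each superpixel
      ids = np.unique(segments)
      feats = np.array([diff[segments == i].mean(axis=0) for i in ids])

      # pseudo-labels would come from saliency + MBI; here they are random stand-ins
      pseudo = rng.integers(0, 2, size=len(ids))         # 1 = changed building
      rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(feats, pseudo)

      change_map = rf.predict(feats)[segments]           # label every pixel via its object
      print(change_map.shape)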

  3. Approximate classification of mining tremors harmfulness based on free-field and building foundation vibrations

    NASA Astrophysics Data System (ADS)

    Kuzniar, Krystyna; Stec, Krystyna; Tatara, Tadeusz

    2018-04-01

    The paper compares the results of an approximate evaluation of mining tremor harmfulness performed on the basis of free-field vibrations and simultaneously measured building foundation vibrations. The focus is on an office building located in the Upper Silesian Basin (USB). The empirical Mining Intensity Scale GSI-GZWKW-2012 has been applied to classify the harmfulness of the rockbursts. This scale is based on measurements of free-field vibrations but, for research purposes, it was also applied to building foundation vibrations. The analysis was carried out using a set of 156 ground-foundation pairs of velocity vibration records, as well as a set of 156 pairs of acceleration records, induced by the same mining tremors.

  4. Comparing the performance of flat and hierarchical Habitat/Land-Cover classification models in a NATURA 2000 site

    NASA Astrophysics Data System (ADS)

    Gavish, Yoni; O'Connell, Jerome; Marsh, Charles J.; Tarantino, Cristina; Blonda, Palma; Tomaselli, Valeria; Kunin, William E.

    2018-02-01

    The increasing need for high quality Habitat/Land-Cover (H/LC) maps has triggered considerable research into novel machine-learning based classification models. In many cases, H/LC classes follow pre-defined hierarchical classification schemes (e.g., CORINE), in which fine H/LC categories are thematically nested within more general categories. However, none of the existing machine-learning algorithms account for this pre-defined hierarchical structure. Here we introduce a novel Random Forest (RF) based application of hierarchical classification, which fits a separate local classification model at every branching point of the thematic tree and then integrates all the different local models into a single global prediction. We applied the hierarchical RF approach in a NATURA 2000 site in Italy, using two land-cover (CORINE, FAO-LCCS) and one habitat classification scheme (EUNIS) that differ from one another in the shape of the class hierarchy. For all 3 classification schemes, both the hierarchical model and a flat model alternative provided accurate predictions, with kappa values mostly above 0.9 (despite using only 2.2-3.2% of the study area as training cells). The flat approach slightly outperformed the hierarchical models when the hierarchy was relatively simple, while the hierarchical model worked better under more complex thematic hierarchies. Most misclassifications came from habitat pairs that are thematically distant yet spectrally similar. In 2 out of 3 classification schemes, the additional constraints of the hierarchical model resulted in fewer such serious misclassifications relative to the flat model. The hierarchical model also provided valuable information on variable importance, which can shed light on "black-box" machine learning algorithms like RF. We suggest various ways by which hierarchical classification models can increase the accuracy and interpretability of H/LC classification maps.
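
    A minimal sketch of the "local classifier per branching point" idea on a two-level hierarchy: a root random forest picks the top-level branch, and a child forest refines the prediction within each branch. The hierarchy, features, and labels are illustrative stand-ins for the CORINE/FAO-LCCS/EUNIS schemes.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      rng = np.random.default_rng(0)
      X = rng.normal(size=(600, 10))
      top = rng.integers(0, 2, size=600)            # e.g., 0 = natural, 1 = artificial
      fine = top * 2 + rng.integers(0, 2, size=600) # two fine classes nested in each branch
      X[:, 0] += 2.0 * top                          # inject separable structure
      X[:, 1] += 2.0 * fine

      root = RandomForestClassifier(random_state=0).fit(X, top)
      children = {b: RandomForestClassifier(random_state=0).fit(X[top == b], fine[top == b])
                  for b in (0, 1)}

      def predict_hierarchical(Xnew):
          branch = root.predict(Xnew)
          out = np.empty(len(Xnew), dtype=int)
          for b, clf in children.items():           # route each cell down its branch
              m = branch == b
              if m.any():
                  out[m] = clf.predict(Xnew[m])
          return out

      print(predict_hierarchical(X[:5]))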

  5. Pastureland ESD concepts and current development

    USDA-ARS?s Scientific Manuscript database

    Bringing pastureland classification into an ecological site framework gives us the opportunity to build on the extensive experience of rangeland scientists and managers with this process. Unlike rangelands, pasture plant communities are dominated by naturalized species and are maintained by manageme...

  6. Classifications for Cesarean Section: A Systematic Review

    PubMed Central

    Torloni, Maria Regina; Betran, Ana Pilar; Souza, Joao Paulo; Widmer, Mariana; Allen, Tomas; Gulmezoglu, Metin; Merialdi, Mario

    2011-01-01

    Background Rising cesarean section (CS) rates are a major public health concern and cause worldwide debates. To propose and implement effective measures to reduce or increase CS rates where necessary requires an appropriate classification. Despite several existing CS classifications, there has not yet been a systematic review of these. This study aimed to 1) identify the main CS classifications used worldwide, 2) analyze advantages and deficiencies of each system. Methods and Findings Three electronic databases were searched for classifications published 1968–2008. Two reviewers independently assessed classifications using a form created based on items rated as important by international experts. Seven domains (ease, clarity, mutually exclusive categories, totally inclusive classification, prospective identification of categories, reproducibility, implementability) were assessed and graded. Classifications were tested in 12 hypothetical clinical case-scenarios. From a total of 2948 citations, 60 were selected for full-text evaluation and 27 classifications identified. Indications classifications present important limitations and their overall score ranged from 2–9 (maximum grade = 14). Degree of urgency classifications also had several drawbacks (overall scores 6–9). Woman-based classifications performed best (scores 5–14). Other types of classifications require data not routinely collected and may not be relevant in all settings (scores 3–8). Conclusions This review and critical appraisal of CS classifications is a methodologically sound contribution to establish the basis for the appropriate monitoring and rational use of CS. Results suggest that women-based classifications in general, and Robson's classification, in particular, would be in the best position to fulfill current international and local needs and that efforts to develop an internationally applicable CS classification would be most appropriately placed in building upon this classification. The use of a single CS classification will facilitate auditing, analyzing and comparing CS rates across different settings and help to create and implement effective strategies specifically targeted to optimize CS rates where necessary. PMID:21283801

  7. A Model Assessment and Classification System for Men and Women in Correctional Institutions.

    ERIC Educational Resources Information Center

    Hellervik, Lowell W.; And Others

    The report describes a manpower assessment and classification system for criminal offenders directed towards making practical training and job classification decisions. The model is not concerned with custody classifications except as they affect occupational/training possibilities. The model combines traditional procedures of vocational…

  8. A bio-optical model for integration into ecosystem models for the Ligurian Sea

    NASA Astrophysics Data System (ADS)

    Bengil, Fethi; McKee, David; Beşiktepe, Sükrü T.; Sanjuan Calzado, Violeta; Trees, Charles

    2016-12-01

    A bio-optical model has been developed for the Ligurian Sea which encompasses both deep, oceanic Case 1 waters and shallow, coastal Case 2 waters. The model builds on earlier Case 1 models for the region and uses field data collected on the BP09 research cruise to establish new relationships for non-biogenic particles and CDOM. The bio-optical model reproduces in situ IOPs accurately and is used to parameterize radiative transfer simulations which demonstrate its utility for modeling underwater light levels and above surface remote sensing reflectance. Prediction of euphotic depth is found to be accurate to within ∼3.2 m (RMSE). Previously published light field models work well for deep oceanic parts of the Ligurian Sea that fit the Case 1 classification. However, they are found to significantly over-estimate euphotic depth in optically complex coastal waters where the influence of non-biogenic materials is strongest. For these coastal waters, the combination of the bio-optical model proposed here and full radiative transfer simulations provides significantly more accurate predictions of euphotic depth.

  9. Engineering Change Management Method Framework in Mechanical Engineering

    NASA Astrophysics Data System (ADS)

    Stekolschik, Alexander

    2016-11-01

    Engineering changes have an impact on different process chains in and outside the company, and account for most error costs and schedule shifts. In fact, 30 to 50 per cent of development costs result from technical changes. Controlling engineering change processes can help avoid errors and risks, and contributes to cost optimization and a shorter time to market. This paper presents a method framework for controlling engineering changes at mechanical engineering companies. The developed classification of engineering changes and the corresponding process requirements form the basis for the method framework. The framework comprises two main areas: special data objects managed in different engineering IT tools, and a process framework. Objects from both areas are building blocks that can be selected for the overall business process based on the engineering process type and change classification. The process framework contains steps for the creation of change objects (both for the overall change and for parts), change implementation, and release. Companies can select single process building blocks from the framework, depending on the product development process and change impact. The developed change framework has been implemented at a division (10,000 employees) of a large German mechanical engineering company.

  10. Mammographic mass classification based on possibility theory

    NASA Astrophysics Data System (ADS)

    Hmida, Marwa; Hamrouni, Kamel; Solaiman, Basel; Boussetta, Sana

    2017-03-01

    Shape and margin features are very important for differentiating between benign and malignant masses in mammographic images. Benign masses are usually round or oval and have smooth contours, whereas malignant tumors generally have irregular shapes and appear lobulated or spiculated at the margins. This knowledge suffers from imprecision and ambiguity. This paper therefore addresses mass classification using shape and margin features while taking into account the uncertainty linked to the degree of truth of the available information and the imprecision related to its content. We propose a novel mass classification approach which provides a possibility-based representation of the extracted shape features and builds a possibility knowledge base in order to evaluate the possibility degree of malignancy and benignity for each mass. For experimentation, the MIAS database was used, and the classification results show the strong performance of our approach despite the use of simple features.

  11. Optical tomographic detection of rheumatoid arthritis with computer-aided classification schemes

    NASA Astrophysics Data System (ADS)

    Klose, Christian D.; Klose, Alexander D.; Netz, Uwe; Beuthan, Jürgen; Hielscher, Andreas H.

    2009-02-01

    A recent research study has shown that combining multiple parameters drawn from optical tomographic images leads to better classification results in identifying human finger joints that are or are not affected by rheumatoid arthritis (RA). Building on the findings of that study, this article presents an advanced computer-aided classification approach for interpreting optical image data to detect RA in finger joints. Additional data are used, including, for example, maximum and minimum values of the absorption coefficient as well as their ratios and image variances. Classification performance obtained by the proposed method was evaluated in terms of sensitivity, specificity, Youden index and area under the curve (AUC). Results were compared to different benchmarks ("gold standards"): magnetic resonance, ultrasound and clinical evaluation. Maximum accuracy (AUC = 0.88) was reached when combining minimum/maximum ratios and image variances and using ultrasound as the gold standard.

  12. A Comprehensive Analysis on Wearable Acceleration Sensors in Human Activity Recognition.

    PubMed

    Janidarmian, Majid; Roshan Fekr, Atena; Radecka, Katarzyna; Zilic, Zeljko

    2017-03-07

    Sensor-based motion recognition integrates the emerging area of wearable sensors with novel machine learning techniques to make sense of low-level sensor data and provide rich contextual information in real-life applications. Although the Human Activity Recognition (HAR) problem has been drawing the attention of researchers, it is still a subject of much debate due to the diverse nature of human activities and their tracking methods. Finding the best predictive model for this problem while considering different sources of heterogeneity is very difficult to analyze theoretically, which stresses the need for an experimental study. Therefore, in this paper, we first create the most complete dataset, focusing on accelerometer sensors, with various sources of heterogeneity. We then conduct an extensive analysis of feature representations and classification techniques (the most comprehensive comparison yet, with 293 classifiers) for activity recognition. Principal component analysis is applied to reduce the feature vector dimension while keeping essential information. The average classification accuracy across eight sensor positions is 96.44% ± 1.62% with 10-fold evaluation, whereas an accuracy of 79.92% ± 9.68% is reached in the subject-independent evaluation. This study presents significant evidence that we can build predictive models for the HAR problem under more realistic conditions and still achieve highly accurate results.
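
    A minimal sketch of the evaluation pipeline described above: PCA for dimensionality reduction followed by a classifier under 10-fold cross-validation. The features, labels, and the choice of k-NN as the classifier are placeholders for the windowed accelerometer features and the 293 classifiers compared in the paper.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 60))          # e.g., 60 features per sliding window
    y = rng.integers(0, 6, 500)             # six hypothetical activity classes

    clf = make_pipeline(StandardScaler(),
                        PCA(n_components=0.95),   # keep 95% of the variance
                        KNeighborsClassifier())
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"10-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")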

  13. Finding New Perovskite Halides via Machine learning

    NASA Astrophysics Data System (ADS)

    Pilania, Ghanshyam; Balachandran, Prasanna V.; Kim, Chiho; Lookman, Turab

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identifying and discovering the optimal materials for a specific application is a non-trivial task because of the vastness of the chemical search space, with its enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach to the rational design of new materials, learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) by building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is an F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify the ionic radii, tolerance factor and octahedral factor as the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.
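
    A minimal sketch of a descriptor-based SVM classifier of this kind, assuming each ABX3 entry is reduced to its Goldschmidt tolerance factor t = (rA + rX) / (sqrt(2) * (rB + rX)) and octahedral factor mu = rB / rX. The radii, the labeling rule, and the SVM settings are synthetic placeholders, not the paper's 181-compound dataset or tuned model.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    rA = rng.uniform(1.0, 1.8, 200)          # monovalent A-site radii (placeholder)
    rB = rng.uniform(0.5, 1.1, 200)          # divalent B-site radii (placeholder)
    rX = rng.uniform(1.2, 2.2, 200)          # halide anion radii (placeholder)

    t = (rA + rX) / (np.sqrt(2) * (rB + rX))   # Goldschmidt tolerance factor
    mu = rB / rX                               # octahedral factor
    X = np.column_stack([t, mu])
    # Toy labeling rule: formable roughly when 0.8 < t < 1.1 and mu > 0.41
    y = ((t > 0.8) & (t < 1.1) & (mu > 0.41)).astype(int)

    svm = SVC(kernel="rbf", C=10.0)
    print("CV accuracy:", cross_val_score(svm, X, y, cv=5).mean())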

  14. Knowledge mining from clinical datasets using rough sets and backpropagation neural network.

    PubMed

    Nahato, Kindie Biredagn; Harichandran, Khanna Nehemiah; Arputharaj, Kannan

    2015-01-01

    The availability of clinical datasets and knowledge mining methodologies encourages researchers to pursue the extraction of knowledge from clinical datasets. Different data mining techniques have been used for mining rules, and mathematical models have been developed to assist clinicians in decision making. The objective of this research is to build a classifier that predicts the presence or absence of a disease by learning from a minimal set of attributes extracted from the clinical dataset. In this work, a rough set indiscernibility relation method combined with a backpropagation neural network (RS-BPNN) is used. The work has two stages. The first stage handles missing values to obtain a smooth dataset and selects appropriate attributes from the clinical dataset by the indiscernibility relation method. The second stage performs classification using a backpropagation neural network on the selected reducts of the dataset. The classifier has been tested on the hepatitis, Wisconsin breast cancer, and Statlog heart disease datasets from the University of California at Irvine (UCI) machine learning repository. The accuracies obtained are 97.3%, 98.6%, and 90.4% for hepatitis, breast cancer, and heart disease, respectively. The proposed system provides an effective classification model for clinical datasets.
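
    A minimal sketch of the two-stage idea: missing-value handling and attribute reduction, followed by a backpropagation network. For illustration the rough-set indiscernibility reduct is replaced by a generic univariate filter, and the data are synthetic placeholders for a clinical dataset.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.impute import SimpleImputer               # stage 1: handle missing values
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.neural_network import MLPClassifier       # stage 2: backprop network
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    X = rng.normal(size=(300, 19))                         # e.g., 19 hepatitis attributes
    X[rng.random(X.shape) < 0.05] = np.nan                 # sprinkle missing values
    y = rng.integers(0, 2, 300)                            # disease present / absent

    clf = make_pipeline(SimpleImputer(strategy="median"),
                        SelectKBest(f_classif, k=8),       # stand-in for the rough-set reduct
                        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000))
    print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())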

  15. An FPGA-Based Rapid Wheezing Detection System

    PubMed Central

    Lin, Bor-Shing; Yen, Tian-Shiue

    2014-01-01

    Wheezing is often treated as a crucial indicator in the diagnosis of obstructive pulmonary diseases, and a rapid wheezing detection system may help physicians to monitor patients over the long term. In this study, a portable wheezing detection system based on a field-programmable gate array (FPGA) is proposed. This system accelerates wheezing detection and can be used either as a stand-alone system or as an integrated part of another biomedical signal detection system. The system segments sound signals into 2-second units. A short-time Fourier transform is used to determine the relationship between the time and frequency components of the wheezing sound data. The spectrogram is processed using 2D bilateral filtering, edge detection, multithreshold image segmentation, morphological image processing, and image labeling, to extract wheezing features according to computerized respiratory sound analysis (CORSA) standards. These features are then used to train a support vector machine (SVM) and build the classification models, and the trained model is used to analyze sound data to detect wheezing. The system runs on a Xilinx Virtex-6 FPGA ML605 platform. The experimental results revealed excellent wheezing recognition performance (0.912) at a clock frequency of 51.97 MHz, enabling rapid wheezing classification. PMID:24481034
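
    A minimal sketch of the front end described above: segmenting the audio into 2-second units and computing a short-time Fourier transform spectrogram, the input to the subsequent image-processing stages. The signal and sampling rate are synthetic placeholders for a lung-sound recording.

    import numpy as np
    from scipy.signal import stft

    fs = 8000                                                # assumed sampling rate (Hz)
    sound = np.random.default_rng(4).normal(size=fs * 10)    # placeholder 10 s recording

    seg_len = 2 * fs                                         # 2-second analysis units
    for start in range(0, len(sound) - seg_len + 1, seg_len):
        segment = sound[start:start + seg_len]
        f, t, Z = stft(segment, fs=fs, nperseg=256, noverlap=128)
        spectrogram = np.abs(Z)          # |STFT| fed to 2D filtering / edge detection
        print(f"segment at {start / fs:.0f}s -> spectrogram {spectrogram.shape}")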

  16. An Anomalous Noise Events Detector for Dynamic Road Traffic Noise Mapping in Real-Life Urban and Suburban Environments.

    PubMed

    Socoró, Joan Claudi; Alías, Francesc; Alsina-Pagès, Rosa Ma

    2017-10-12

    One of the main aspects affecting the quality of life of people living in urban and suburban areas is their continued exposure to high Road Traffic Noise (RTN) levels. Until now, noise measurements in cities have been performed by professionals, recording data at certain locations to build a noise map afterwards. However, the deployment of Wireless Acoustic Sensor Networks (WASN) has enabled automatic noise mapping in smart cities. In order to obtain a reliable picture of the RTN levels affecting citizens, Anomalous Noise Events (ANE) unrelated to road traffic should be removed from the noise map computation. To this end, this paper introduces an Anomalous Noise Event Detector (ANED) designed to differentiate between RTN and ANE in real time, within a predefined interval, running on the distributed low-cost acoustic sensors of a WASN. The proposed ANED follows a two-class audio event detection and classification approach, instead of multi-class or one-class classification schemes, taking advantage of the collection of representative acoustic data in real-life environments. The experiments conducted within the DYNAMAP project, implemented on ARM-based acoustic sensors, show the feasibility of the proposal both in terms of computational cost and classification performance using standard Mel cepstral coefficients and Gaussian Mixture Models (GMM). The two-class GMM core classifier improves the F1 measure of the baseline one-class universal GMM classifier by a relative 18.7% and 31.8% for suburban and urban environments, respectively, within the 1-s integration interval. Nevertheless, the classification performance of the current ANED implementation still has room for improvement.
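
    A minimal sketch of a two-class GMM decision rule of this kind: one mixture fit to road-traffic-noise frames and one to anomalous-event frames, with each frame labeled by the higher log-likelihood. The random vectors stand in for the Mel cepstral features; the component count and dimensions are illustrative assumptions.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(5)
    mfcc_rtn = rng.normal(0.0, 1.0, size=(1000, 13))   # placeholder RTN training frames
    mfcc_ane = rng.normal(2.0, 1.5, size=(200, 13))    # placeholder ANE training frames

    gmm_rtn = GaussianMixture(n_components=8, covariance_type="diag").fit(mfcc_rtn)
    gmm_ane = GaussianMixture(n_components=8, covariance_type="diag").fit(mfcc_ane)

    def classify(frames):
        """Label each feature frame as RTN (0) or ANE (1) by log-likelihood."""
        return (gmm_ane.score_samples(frames) > gmm_rtn.score_samples(frames)).astype(int)

    print(classify(rng.normal(2.0, 1.5, size=(5, 13))))   # expect mostly 1s (ANE)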

  17. Quarry identification of historical building materials by means of laser induced breakdown spectroscopy, X-ray fluorescence and chemometric analysis

    NASA Astrophysics Data System (ADS)

    Colao, F.; Fantoni, R.; Ortiz, P.; Vazquez, M. A.; Martin, J. M.; Ortiz, R.; Idris, N.

    2010-08-01

    To characterize historical building materials according to the geographic origin of the quarries from which they were mined, the relative content of major and trace elements was determined by means of Laser Induced Breakdown Spectroscopy (LIBS) and X-ray Fluorescence (XRF). 48 specimens were studied, and the entire sample set was divided into two groups: the first, used as the reference set, was composed of samples mined from eight different quarries in Seville province; the second was composed of specimens of unknown provenance collected from several historical buildings and churches in the city of Seville. Data reduction and analysis of the LIBS and XRF measurements were performed using multivariate statistical approaches, namely Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA). A clear separation among reference materials mined from different quarries was observed in the Principal Component (PC) score plots; a supervised SIMCA classifier was then trained and run to assess the provenance of the unknown samples according to their elemental content. The results were compared with provenance assignments made on the basis of petrographical description. This work gives experimental evidence that LIBS measurements on a relatively small set of elements are a fast and effective method for origin identification.
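
    A minimal SIMCA-style sketch: one PCA model per reference quarry, with unknowns assigned to the quarry whose principal component subspace reconstructs their elemental profile with the smallest residual. The profiles, quarry count, and component count are synthetic placeholders for the LIBS/XRF element intensities.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(6)
    quarries = {q: rng.normal(q, 1.0, size=(20, 12)) for q in range(3)}   # 3 reference sets

    models = {q: PCA(n_components=3).fit(X) for q, X in quarries.items()}

    def simca_assign(x):
        """Assign a sample to the class with minimal PCA reconstruction error."""
        residuals = {}
        for q, pca in models.items():
            recon = pca.inverse_transform(pca.transform(x.reshape(1, -1)))
            residuals[q] = np.linalg.norm(x - recon.ravel())
        return min(residuals, key=residuals.get)

    unknown = rng.normal(2, 1.0, size=12)      # sample resembling quarry 2
    print("assigned quarry:", simca_assign(unknown))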

  18. Optimized hardware framework of MLP with random hidden layers for classification applications

    NASA Astrophysics Data System (ADS)

    Zyarah, Abdullah M.; Ramesh, Abhishek; Merkel, Cory; Kudithipudi, Dhireesha

    2016-05-01

    Multilayer Perceptron (MLP) networks with random hidden layers are very efficient at automatic feature extraction and offer significant performance improvements in the training process. They essentially employ a large collection of fixed, random features and are expedient for form-factor-constrained embedded platforms. In this work, a reconfigurable and scalable architecture is proposed for MLPs with random hidden layers, with a customized building block based on the CORDIC algorithm. The proposed architecture also exploits fixed-point operations for area efficiency. The design is validated for classification on two different datasets: an accuracy of ~90% was observed on the MNIST dataset and 75% for gender classification on the LFW dataset. The hardware achieves a 299× speed-up over the corresponding software realization.
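
    A minimal software sketch of an MLP with a fixed random hidden layer: the hidden weights are drawn once and never trained, and only the linear output layer is solved, here by least squares. The hardware version in the paper evaluates the nonlinearity with CORDIC and fixed-point arithmetic; this floating-point sketch only mirrors the topology, with placeholder data.

    import numpy as np

    rng = np.random.default_rng(7)
    X = rng.normal(size=(1000, 784))                 # e.g., flattened MNIST-like inputs
    y = rng.integers(0, 10, 1000)
    Y = np.eye(10)[y]                                # one-hot targets

    n_hidden = 500
    W = rng.normal(size=(784, n_hidden))             # fixed random hidden weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                           # random feature expansion

    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)     # train the output layer only
    pred = np.argmax(H @ beta, axis=1)
    print("training accuracy:", (pred == y).mean())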

  19. Disaster damage detection through synergistic use of deep learning and 3D point cloud features derived from very high resolution oblique aerial images, and multiple-kernel-learning

    NASA Astrophysics Data System (ADS)

    Vetrivel, Anand; Gerke, Markus; Kerle, Norman; Nex, Francesco; Vosselman, George

    2018-06-01

    Oblique aerial images offer views of both building roofs and façades, and thus have been recognized as a potential source to detect severe building damages caused by destructive disaster events such as earthquakes. Therefore, they represent an important source of information for first responders or other stakeholders involved in the post-disaster response process. Several automated methods based on supervised learning have already been demonstrated for damage detection using oblique airborne images. However, they often do not generalize well when data from new unseen sites need to be processed, hampering their practical use. Reasons for this limitation include image and scene characteristics, though the most prominent one relates to the image features being used for training the classifier. Recently, features based on deep learning approaches, such as convolutional neural networks (CNNs), have been shown to be more effective than conventional hand-crafted features, and have become the state-of-the-art in many domains, including remote sensing. Moreover, oblique images are often captured with high block overlap, facilitating the generation of dense 3D point clouds, an ideal source from which to derive geometric characteristics. We hypothesized that the use of CNN features, either independently or in combination with 3D point cloud features, would yield improved performance in damage detection. To this end we used CNN and 3D features, both independently and in combination, using images from manned and unmanned aerial platforms over several geographic locations that vary significantly in terms of image and scene characteristics. A multiple-kernel-learning framework, an effective way of integrating features from different modalities, was used for combining the two sets of features for classification. The results are encouraging: while CNN features produced an average classification accuracy of about 91%, the integration of 3D point cloud features led to an additional improvement of about 3% (i.e. an average classification accuracy of 94%). The significance of 3D point cloud features becomes more evident in the model transferability scenario (i.e., training and testing samples from different sites that vary slightly in the aforementioned characteristics), where the integration of CNN and 3D point cloud features improved the model transferability accuracy by up to a maximum of 7% compared with the accuracy achieved by CNN features alone. Overall, an average accuracy of 85% was achieved for the model transferability scenario across all experiments. Our main conclusion is that such an approach qualifies for practical use.
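
    A minimal sketch of feature fusion with a combined kernel: a linear kernel is computed separately for the CNN features and the 3D point cloud features, and their weighted sum is passed to an SVM. True multiple-kernel learning optimizes the kernel weights; here they are fixed for illustration, and the features and labels are synthetic placeholders.

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(8)
    n = 200
    F_cnn = rng.normal(size=(n, 4096))          # e.g., CNN descriptors per image patch
    F_3d = rng.normal(size=(n, 30))             # e.g., point cloud geometry features
    y = rng.integers(0, 2, n)                   # damaged / undamaged labels

    K = 0.7 * (F_cnn @ F_cnn.T) + 0.3 * (F_3d @ F_3d.T)   # fixed kernel weights
    svm = SVC(kernel="precomputed").fit(K, y)

    # At test time the kernel is computed between test and training samples:
    K_test = 0.7 * (F_cnn[:5] @ F_cnn.T) + 0.3 * (F_3d[:5] @ F_3d.T)
    print(svm.predict(K_test))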

  20. Compensatory neurofuzzy model for discrete data classification in biomedical

    NASA Astrophysics Data System (ADS)

    Ceylan, Rahime

    2015-03-01

    Biomedical data fall into two main categories: signals and discrete data. Studies in this area accordingly concern either biomedical signal classification or biomedical discrete data classification. Artificial intelligence models exist for classifying ECG, EMG or EEG signals; likewise, many models in the literature classify discrete data, such as sample values obtained from blood analysis or biopsy in the medical process. No single algorithm, however, has achieved a high accuracy rate on both signals and discrete data. In this study, a compensatory neurofuzzy network model is presented for the classification of discrete data in the biomedical pattern recognition area. The compensatory neurofuzzy network is a hybrid, binary classifier in which the parameters of the fuzzy systems are updated by a backpropagation algorithm. The classifier was evaluated on two benchmark datasets (the Wisconsin Breast Cancer dataset and the Pima Indian Diabetes dataset). Experimental studies show that the model achieved a 96.11% accuracy rate on the breast cancer dataset and 69.08% on the diabetes dataset with only 10 iterations.

  1. Concept for Classifying Facade Elements Based on Material, Geometry and Thermal Radiation Using Multimodal UAV Remote Sensing

    NASA Astrophysics Data System (ADS)

    Ilehag, R.; Schenk, A.; Hinz, S.

    2017-08-01

    This paper presents a concept for the classification of facade elements based on the material, the geometry and the thermal radiation of the facade, using a multimodal Unmanned Aerial Vehicle (UAV) system. Once the concept is finalized and functional, the workflow can be used for building energy demand estimation by exploiting existing methods for estimating the heat transfer coefficient and the transmitted heat loss. The multimodal system consists of a thermal, a hyperspectral and an optical sensor, all operable from a UAV. The challenges of dealing with sensors that operate in different spectral ranges and with different technical specifications, such as radiometric and geometric resolution, are presented. Different approaches to data fusion are addressed, such as image registration, generation of 3D models by image matching, and classification based on either object geometry or pixel values. As a first step towards realizing the concept, the result of a geometric calibration with a purpose-designed multimodal calibration pattern is presented.

  2. Seismic facies analysis based on self-organizing map and empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Du, Hao-kun; Cao, Jun-xing; Xue, Ya-juan; Wang, Xing-jian

    2015-01-01

    Seismic facies analysis plays an important role in seismic interpretation and reservoir model building by offering an effective way to identify changes in geofacies between wells. The selection of input seismic attributes and their time window has an obvious effect on the validity of the classification and requires iterative experimentation and prior knowledge. In general, clustering is sensitive to noise when the waveform serves as the input data, especially with a narrow window. To overcome this limitation, the Empirical Mode Decomposition (EMD) method is introduced into waveform classification based on the self-organizing map (SOM): we first de-noise the seismic data using EMD and then cluster the data using a 1D grid SOM. The main advantages of this method are resolution enhancement and noise reduction. 3D seismic data from the western Sichuan basin, China, were collected for validation. The application results show that seismic facies analysis is improved and better supports interpretation. Its strong tolerance for noise makes the proposed method a better seismic facies analysis tool than the classical 1D grid SOM method, especially for waveform clustering with a narrow window.
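
    A minimal 1D-grid SOM sketch for waveform clustering, assuming each trace window has already been EMD-denoised. The classical update moves the best-matching unit and its grid neighbours toward each sample with a decaying learning rate and shrinking neighbourhood; the window length, grid size, and schedules are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(9)
    traces = rng.normal(size=(500, 40))          # placeholder denoised waveform windows

    n_units, n_iter = 8, 2000
    W = rng.normal(size=(n_units, traces.shape[1]))
    for it in range(n_iter):
        x = traces[rng.integers(len(traces))]
        bmu = np.argmin(np.linalg.norm(W - x, axis=1))      # best-matching unit
        lr = 0.5 * (1 - it / n_iter)                        # decaying learning rate
        sigma = 1 + (n_units / 2) * (1 - it / n_iter)       # shrinking neighbourhood
        dist = np.abs(np.arange(n_units) - bmu)             # distance on the 1D grid
        h = np.exp(-(dist ** 2) / (2 * sigma ** 2))
        W += lr * h[:, None] * (x - W)

    facies = np.array([np.argmin(np.linalg.norm(W - t, axis=1)) for t in traces])
    print("facies label counts:", np.bincount(facies, minlength=n_units))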

  3. Quantitative Laughter Detection, Measurement, and Classification-A Critical Survey.

    PubMed

    Cosentino, Sarah; Sessa, Salvatore; Takanishi, Atsuo

    2016-01-01

    The study of human nonverbal social behaviors has taken a more quantitative and computational approach in recent years due to the development of smart interfaces and virtual agents or robots able to interact socially. One of the most interesting nonverbal social behaviors, producing a characteristic vocal signal, is laughing. Laughter is produced in several different situations: in response to external physical, cognitive, or emotional stimuli; to negotiate social interactions; and also, pathologically, as a consequence of neural damage. For this reason, laughter has attracted researchers from many disciplines. A consequence of this multidisciplinarity is the absence of a holistic vision of this complex behavior: the methods of analysis and classification of laughter, as well as the terminology used, are heterogeneous, and the findings are sometimes contradictory and poorly documented. This survey collects and presents objective measurement methods and results from a variety of studies in different fields, to contribute to building a unified model and taxonomy of laughter. Such a model could support advances in several fields, from artificial intelligence and human-robot interaction to medicine and psychiatry.

  4. Improvement of information fusion-based audio steganalysis

    NASA Astrophysics Data System (ADS)

    Kraetzer, Christian; Dittmann, Jana

    2010-01-01

    In this paper, we extend an existing information-fusion-based audio steganalysis approach with three different kinds of evaluations. The first addresses the previously neglected sensor-level fusion; our results show that this fusion removes content dependency while achieving classification rates similar to those of single classifiers (especially for the considered global features) on the three audio data hiding algorithms tested. The second extends the observations on fusion from segmental features alone to combinations of segmental and global features, reducing the computational complexity of testing by about two orders of magnitude while maintaining the same degree of accuracy. The third builds a basis for estimating the plausibility of the introduced steganalysis approach by measuring the sensitivity of the models used in supervised classification of steganographic material to typical signal modification operations such as de-noising or 128 kbit/s MP3 encoding. Our results show that for some of the tested classifiers the probability of false alarms rises dramatically after such modifications.

  5. Learning a Mahalanobis Distance-Based Dynamic Time Warping Measure for Multivariate Time Series Classification.

    PubMed

    Mei, Jiangyuan; Liu, Meizhu; Wang, Yuan-Fang; Gao, Huijun

    2016-06-01

    Multivariate time series (MTS) datasets exist broadly in numerous fields, including health care, multimedia, finance, and biometrics. How to classify MTS accurately has become a hot research topic, since it is an important element in many computer vision and pattern recognition applications. In this paper, we propose a Mahalanobis distance-based dynamic time warping (DTW) measure for MTS classification. The Mahalanobis distance builds an accurate relationship between each variable and its corresponding category, and is used to calculate the local distance between vectors in the MTS. DTW is then used to align MTS that are out of synchronization or have different lengths. How to learn an accurate Mahalanobis distance function then becomes the key problem; this paper establishes a LogDet divergence-based metric learning model with triplet constraints that learns the Mahalanobis matrix with high precision and robustness. The proposed method is applied to nine MTS datasets selected from the University of California, Irvine machine learning repository and Robert T. Olszewski's homepage, and the results demonstrate the improved performance of the proposed approach.
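
    A minimal sketch of DTW with a Mahalanobis local distance: a matrix M defines the distance between aligned vectors, and dynamic programming finds the optimal warping cost. In the paper M comes from LogDet metric learning with triplet constraints; here it is an identity placeholder and the series are synthetic.

    import numpy as np

    def mahalanobis_dtw(A, B, M):
        """DTW cost between MTS A (n x d) and B (m x d) under metric matrix M."""
        n, m = len(A), len(B)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                diff = A[i - 1] - B[j - 1]
                cost = np.sqrt(diff @ M @ diff)        # Mahalanobis local distance
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    rng = np.random.default_rng(10)
    A, B = rng.normal(size=(30, 4)), rng.normal(size=(45, 4))
    M = np.eye(4)                                      # placeholder for the learned metric
    print("DTW distance:", mahalanobis_dtw(A, B, M))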

  6. All complete intersection Calabi-Yau four-folds

    NASA Astrophysics Data System (ADS)

    Gray, James; Haupt, Alexander S.; Lukas, Andre

    2013-07-01

    We present an exhaustive, constructive, classification of the Calabi-Yau four-folds which can be described as complete intersections in products of projective spaces. A comprehensive list of 921,497 configuration matrices which represent all topologically distinct types of complete intersection Calabi-Yau four-folds is provided and can be downloaded from http://www-thphys.physics.ox.ac.uk/projects/CalabiYau/Cicy4folds/index.html. The manifolds have non-negative Euler characteristics in the range 0 ≤ χ ≤ 2610. This data set will be of use in a wide range of physical and mathematical applications. Nearly all of these four-folds are elliptically fibered and are thus of interest for F-theory model building.

  7. Supervised Machine Learning for Regionalization of Environmental Data: Distribution of Uranium in Groundwater in Ukraine

    NASA Astrophysics Data System (ADS)

    Govorov, Michael; Gienko, Gennady; Putrenko, Viktor

    2018-05-01

    In this paper, several supervised machine learning algorithms were explored to define homogeneous regions of concentration of uranium in surface waters in Ukraine using multiple environmental parameters. A previous study focused on finding the primary environmental parameters related to uranium in ground waters using several methods of spatial statistics and unsupervised classification. At this step, we refine the regionalization using Artificial Neural Network (ANN) techniques, including the Multilayer Perceptron (MLP), Radial Basis Function (RBF) networks, and Convolutional Neural Networks (CNN). The study focuses on building local ANN models, which may significantly improve the prediction results of machine learning algorithms by taking into consideration non-stationarity and autocorrelation in spatial data.

  8. A rock physics and seismic reservoir characterization study of the Rock Springs Uplift, a carbon dioxide sequestration site in Southwestern Wyoming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grana, Dario; Verma, Sumit; Pafeng, Josiane

    We present a reservoir geophysics study, including rock physics modeling and seismic inversion, of a carbon dioxide sequestration site in Southwestern Wyoming, namely the Rock Springs Uplift, and build a petrophysical model for the potential injection reservoirs for carbon dioxide sequestration. Our objectives include the facies classification and the estimation of the spatial model of porosity and permeability for two sequestration targets of interest, the Madison Limestone and the Weber Sandstone. The available dataset includes a complete set of well logs at the location of the borehole available in the area, a set of 110 core samples, and a seismic survey acquired in the area around the well. The proposed study includes a formation evaluation analysis and facies classification at the well location, the calibration of a rock physics model to link petrophysical properties and elastic attributes using well log data and core samples, the elastic inversion of the pre-stack seismic data, and the estimation of the reservoir model of facies, porosity and permeability conditioned by seismic inverted elastic attributes and well log data. In particular, the rock physics relations are facies-dependent and include granular media equations for clean and shaley sandstone, and inclusion models for the dolomitized limestone. The permeability model has been computed by applying a facies-dependent porosity-permeability relation calibrated using core sample measurements. Finally, the study shows that both formations show good storage capabilities. The Madison Limestone includes a homogeneous layer of high-porosity high-permeability dolomite; the Weber Sandstone is characterized by a lower average porosity but the layer is thicker than the Madison Limestone.

  9. A rock physics and seismic reservoir characterization study of the Rock Springs Uplift, a carbon dioxide sequestration site in Southwestern Wyoming

    DOE PAGES

    Grana, Dario; Verma, Sumit; Pafeng, Josiane; ...

    2017-06-20

    We present a reservoir geophysics study, including rock physics modeling and seismic inversion, of a carbon dioxide sequestration site in Southwestern Wyoming, namely the Rock Springs Uplift, and build a petrophysical model for the potential injection reservoirs for carbon dioxide sequestration. Our objectives include the facies classification and the estimation of the spatial model of porosity and permeability for two sequestration targets of interest, the Madison Limestone and the Weber Sandstone. The available dataset includes a complete set of well logs at the location of the borehole available in the area, a set of 110 core samples, and a seismic survey acquired in the area around the well. The proposed study includes a formation evaluation analysis and facies classification at the well location, the calibration of a rock physics model to link petrophysical properties and elastic attributes using well log data and core samples, the elastic inversion of the pre-stack seismic data, and the estimation of the reservoir model of facies, porosity and permeability conditioned by seismic inverted elastic attributes and well log data. In particular, the rock physics relations are facies-dependent and include granular media equations for clean and shaley sandstone, and inclusion models for the dolomitized limestone. The permeability model has been computed by applying a facies-dependent porosity-permeability relation calibrated using core sample measurements. Finally, the study shows that both formations show good storage capabilities. The Madison Limestone includes a homogeneous layer of high-porosity high-permeability dolomite; the Weber Sandstone is characterized by a lower average porosity but the layer is thicker than the Madison Limestone.

  10. Texture operator for snow particle classification into snowflake and graupel

    NASA Astrophysics Data System (ADS)

    Nurzyńska, Karolina; Kubo, Mamoru; Muramoto, Ken-ichiro

    2012-11-01

    In order to improve the estimation of precipitation, the coefficients of the Z-R relation should be determined for each snow type, which makes it necessary to identify the type of falling snow. This research therefore addresses the problem of automatically classifying snow particles into snowflakes and graupel (the most common types in the study region). Once precipitation events are correctly classified, it should be possible to estimate the related parameters accurately. The automatic classification system presented here describes the images with texture operators. Some are well known from the literature: first-order features, the co-occurrence matrix, the grey-tone difference matrix, the run length matrix, and the local binary pattern. In addition, a novel approach to designing simple local statistic operators is introduced: the mean histogram, the min-max histogram, and the mean-variance histogram. Moreover, building a feature vector based on the intermediate structures created in many of the mentioned algorithms is also suggested. For classification, the k-nearest neighbour classifier was applied. The results showed that correct classification accuracy above 80% is achievable with most of the techniques; the best single-operator result, 86.06%, was achieved for an operator built from the intermediate structure of the co-occurrence matrix calculation. Describing an image with two texture operators was found not to improve the classification results considerably; in the best case, the correct classification efficiency was 87.89% for a pair of operators created from the local binary pattern and the intermediate structure of the grey-tone difference matrix calculation. This suggests that the information gathered by the individual texture operators is redundant. Therefore, principal component analysis was applied to remove the unnecessary information and additionally reduce the length of the feature vectors. With feature vectors retaining 99% of the initial information, improvement of the correct classification efficiency to up to 100% is possible for the min-max histogram, the operator built from the intermediate co-occurrence matrix structure, the operator built from the intermediate grey-tone difference matrix structure, and the histogram-based operator.
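
    A minimal sketch of a texture-based pipeline of this kind: co-occurrence and local-binary-pattern descriptors per particle image, followed by a k-NN classifier. The images and labels are synthetic placeholders, and scikit-image >= 0.19 is assumed (older releases spell the functions greycomatrix/greycoprops).

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(11)
    images = rng.integers(0, 256, size=(80, 32, 32), dtype=np.uint8)
    labels = rng.integers(0, 2, 80)                    # 0 = snowflake, 1 = graupel

    def texture_features(img):
        glcm = graycomatrix(img, distances=[1], angles=[0], levels=256, normed=True)
        lbp = local_binary_pattern(img, P=8, R=1.0)
        hist, _ = np.histogram(lbp, bins=10, density=True)
        glcm_props = [graycoprops(glcm, p)[0, 0]
                      for p in ("contrast", "homogeneity", "energy")]
        return np.concatenate([glcm_props, hist])

    X = np.array([texture_features(im) for im in images])
    knn = KNeighborsClassifier(n_neighbors=5).fit(X, labels)
    print("training accuracy:", knn.score(X, labels))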

  11. LPT. Plot plan and site layout. Includes shield test pool/EBOR ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    LPT. Plot plan and site layout. Includes shield test pool/EBOR facility (TAN-645 and -646), low power test building (TAN-640 and -641), water storage tanks, guard house (TAN-642), pump house (TAN-644), driveways, well, chlorination building (TAN-643), septic system. Ralph M. Parsons 1229-12 ANP/GE-7-102. November 1956. Approved by INEEL Classification Office for public release. INEEL index code no. 038-0102-00-693-107261 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  12. IET. Movable test cell building (TAN624). Plans, sections, and elevations ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    IET. Movable test cell building (TAN-624). Plans, sections, and elevations show trapezoidal shape of front/rear elevations, vertical sliding door panels, wheels, periscope and camera locations, fixed concrete wall, and relationship to coupling station (TAN-620) and rail track. Ralph M. Parsons 902-4-ANP-624-A 329. Date: February 1954. Approved by INEEL Classification Office for public release. INEEL Index code no. 035-0624-00-693-106911 - Idaho National Engineering Laboratory, Test Area North, Scoville, Butte County, ID

  13. Automatic identification of bird targets with radar via patterns produced by wing flapping.

    PubMed

    Zaugg, Serge; Saporta, Gilbert; van Loon, Emiel; Schmaljohann, Heiko; Liechti, Felix

    2008-09-06

    Bird identification with radar is important for bird migration research, environmental impact assessments (e.g. wind farms), aircraft security and radar meteorology. In a study on bird migration, radar signals from birds, insects and ground clutter were recorded and labelled by experts into four classes: BIRD, INSECT, CLUTTER and UFO (unidentifiable signals). Signals from birds show a typical pattern due to wing flapping. We present a classification algorithm aimed at automatic recognition of bird targets. Variables related to signal intensity and wing flapping pattern were extracted (via the continuous wavelet transform), and support vector classifiers were used to build predictive models. Classification performance was estimated via cross-validation on four datasets. When data from the same dataset were used for training and testing the classifier, the classification performance was moderately to extremely high. When data from one dataset were used for training and the three remaining datasets were used as test sets, the performance was lower but still moderately to extremely high, showing that the method generalizes well across different locations and times. Our method provides a substantial time saving when birds must be identified in large collections of radar signals, and it represents a first substantial step towards a real-time bird identification radar system. We provide some guidelines and ideas for future research.

  14. Streamflow characterization using functional data analysis of the Potomac River

    NASA Astrophysics Data System (ADS)

    Zelmanow, A.; Maslova, I.; Ticlavilca, A. M.; McKee, M.

    2013-12-01

    Flooding and droughts are extreme hydrological events that affect the United States economically and socially. The severity and unpredictability of flooding have caused billions of dollars in damage and the loss of lives in the eastern United States. In this context, there is an urgent need to build a firm scientific basis for adaptation by developing and applying new modeling techniques for accurate streamflow characterization and reliable hydrological forecasting. The goal of this analysis is to use numerical streamflow characteristics to classify, model, and estimate the likelihood of extreme events in the eastern United States, mainly on the Potomac River. Functional data analysis techniques are used to study yearly streamflow patterns, with the extreme streamflow events characterized via functional principal component analysis. These methods are merged with more classical techniques such as cluster analysis, classification analysis, and time series modeling. The developed functional data analysis approach is used to model continuous streamflow hydrographs, and its forecasting potential is explored by incorporating climate factors to produce a yearly streamflow outlook.
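
    A minimal functional-PCA sketch: yearly hydrographs are treated as discretized curves, and the leading principal component functions and per-year scores summarize inter-annual variation. The curves below are synthetic placeholders for daily Potomac streamflow.

    import numpy as np

    rng = np.random.default_rng(12)
    days = np.arange(365)
    base = 1.0 + 0.5 * np.sin(2 * np.pi * days / 365)         # seasonal shape
    curves = base + 0.2 * rng.normal(size=(40, 365))          # 40 placeholder years

    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    pc_functions = Vt[:3]                   # leading principal component functions
    scores = centered @ pc_functions.T      # per-year scores, usable for clustering
    explained = (s[:3] ** 2) / (s ** 2).sum()
    print("variance explained by first 3 FPCs:", explained.round(3))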

  15. Providing an empirical basis for optimizing the verification and testing phases of software development

    NASA Technical Reports Server (NTRS)

    Briand, Lionel C.; Basili, Victor R.; Hetmanski, Christopher J.

    1992-01-01

    Applying equal testing and verification effort to all parts of a software system is not very efficient, especially when resources are limited and scheduling is tight. One therefore needs to be able to differentiate low from high fault density components so that the testing/verification effort can be concentrated where needed; such a strategy is expected to detect more faults and thus improve the resulting reliability of the overall system. This paper presents an alternative approach for constructing such fault density classification models, one intended to fulfill specific software engineering needs, i.e. dealing with partial/incomplete information and creating models that are easy to interpret. Our approach to classification is as follows: (1) measure the software system to be considered, and (2) build multivariate stochastic models for prediction. We present experimental results obtained by classifying FORTRAN components developed at NASA/GSFC into two fault density classes: low and high. We also evaluate the accuracy of the model and the insights it provides into the software process.

  16. Enhancement of global flood damage assessments using building material based vulnerability curves

    NASA Astrophysics Data System (ADS)

    Englhardt, Johanna; de Ruiter, Marleen; de Moel, Hans; Aerts, Jeroen

    2017-04-01

    This study discusses the development of an enhanced approach to flood damage and risk assessment using vulnerability curves based on building material information. The approach draws upon common practice in earthquake vulnerability assessment and is an alternative to the land-use or building-occupancy approaches of flood risk assessment models. It is of particular importance where building materials vary widely, as in large-scale studies or studies in developing countries. A case study of Ethiopia is used to demonstrate the impact of the different methodological approaches on direct damage assessments due to flooding. Generally, flood damage assessments use damage curves for different land-use or occupancy types (e.g. urban or residential and commercial classes). However, these categories do not necessarily relate directly to the vulnerability to damage by flood waters; construction type and building material may be more important, as used in earthquake risk assessments. For this study, we use building material classification data from the PAGER project (http://earthquake.usgs.gov/data/pager/) to define new building-material-based vulnerability classes for flood damage. This approach is compared to widely applied land-use-based vulnerability curves such as those used by De Moel et al. (2011). The case of Ethiopia demonstrates the feasibility of this novel flood vulnerability method at the country level, with the potential to be scaled up to the global level. The study shows that flood vulnerability based on building material also allows for better differentiation between flood damage in urban and rural settings, opening the door to better links with poverty studies when such exposure data are available. Furthermore, the new approach paves the way for enhanced multi-risk assessments, as it enables the comparison of vulnerability across different natural hazard types that also use material-based vulnerability curves. Finally, it allows for more accuracy in estimating losses from direct damage.

  17. A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network

    PubMed Central

    Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy

    2014-01-01

    Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of implementing a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained by combining these technologies for the development of classification models; such a combination creates an opportunity to exploit the strengths of each algorithm and compensate for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, feature ranking techniques such as Fisher's discriminant ratio and class separability criteria were first used to prioritize features. Second, the resulting arrays of top-ranked features were used as the initial population of a genetic algorithm to produce optimum feature arrays. Third, using a modified k-Nearest Neighbor method as well as an improved backpropagation neural network, classification proceeded based on the optimum feature arrays selected by the genetic algorithm. The performance of the proposed model was compared with thirteen well-known classification models on seven datasets, with statistical analysis performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the proposed hybrid model resulted in significantly better classification performance than all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best results reported for state-of-the-art classifiers in terms of classification accuracy on the same datasets. This comprehensive comparative study revealed that the performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive with existing state-of-the-art classification models. PMID:25419659
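
    A minimal genetic-algorithm feature-selection sketch with a k-NN fitness function, in the spirit of the hybrid model above. The paper additionally seeds the GA with filter-ranked features and uses a modified k-NN and improved BPNN; both refinements are simplified away here, and the data are synthetic.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(13)
    X = rng.normal(size=(200, 30))
    y = (X[:, :3].sum(axis=1) > 0).astype(int)     # only 3 features are informative

    def fitness(mask):
        """Cross-validated k-NN accuracy on the selected feature subset."""
        if mask.sum() == 0:
            return 0.0
        return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

    pop = rng.random((20, 30)) < 0.3               # initial population of feature masks
    for gen in range(15):
        fit = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(fit)[-10:]]       # truncation selection
        children = []
        for _ in range(10):
            a, b = parents[rng.integers(10, size=2)]
            cut = rng.integers(1, 30)
            child = np.concatenate([a[:cut], b[cut:]])        # one-point crossover
            child ^= rng.random(30) < 0.05                    # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])

    best = pop[np.argmax([fitness(ind) for ind in pop])]
    print("selected features:", np.flatnonzero(best))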

  18. Assessing Electronic Cigarette-Related Tweets for Sentiment and Content Using Supervised Machine Learning

    PubMed Central

    Cole-Lewis, Heather; Varghese, Arun; Sanders, Amy; Schwarz, Mary; Pugatch, Jillian

    2015-01-01

    Background Electronic cigarettes (e-cigarettes) continue to be a growing topic among social media users, especially on Twitter. The ability to analyze conversations about e-cigarettes in real-time can provide important insight into trends in the public’s knowledge, attitudes, and beliefs surrounding e-cigarettes, and subsequently guide public health interventions. Objective Our aim was to establish a supervised machine learning algorithm to build predictive classification models that assess Twitter data for a range of factors related to e-cigarettes. Methods Manual content analysis was conducted for 17,098 tweets. These tweets were coded for five categories: e-cigarette relevance, sentiment, user description, genre, and theme. Machine learning classification models were then built for each of these five categories, and word groupings (n-grams) were used to define the feature space for each classifier. Results Predictive performance scores for classification models indicated that the models correctly labeled the tweets with the appropriate variables between 68.40% and 99.34% of the time, and the percentage of maximum possible improvement over a random baseline that was achieved by the classification models ranged from 41.59% to 80.62%. Classifiers with the highest performance scores that also achieved the highest percentage of the maximum possible improvement over a random baseline were Policy/Government (performance: 0.94; % improvement: 80.62%), Relevance (performance: 0.94; % improvement: 75.26%), Ad or Promotion (performance: 0.89; % improvement: 72.69%), and Marketing (performance: 0.91; % improvement: 72.56%). The most appropriate word-grouping unit (n-gram) was 1 for the majority of classifiers. Performance continued to marginally increase with the size of the training dataset of manually annotated data, but eventually leveled off. Even at low dataset sizes of 4000 observations, performance characteristics were fairly sound. Conclusions Social media outlets like Twitter can uncover real-time snapshots of personal sentiment, knowledge, attitudes, and behavior that are not as accessible, at this scale, through any other offline platform. Using the vast data available through social media presents an opportunity for social science and public health methodologies to utilize computational methodologies to enhance and extend research and practice. This study was successful in automating a complex five-category manual content analysis of e-cigarette-related content on Twitter using machine learning techniques. The study details machine learning model specifications that provided the best accuracy for data related to e-cigarettes, as well as a replicable methodology to allow extension of these methods to additional topics. PMID:26307512
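
    A minimal sketch of one of the five per-category classifiers described above: unigram (n = 1) features from tweet text feeding a linear model. The tiny inline corpus and labels are placeholders for the 17,098 manually coded tweets, and logistic regression stands in for whichever learner scored best per category.

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    tweets = ["vaping everywhere at the concert", "quit smoking with e-cigs",
              "new e-cig flavors on sale now", "policy debate on e-cigarette tax"]
    sentiment = [0, 1, 1, 0]          # placeholder sentiment codes

    model = make_pipeline(CountVectorizer(ngram_range=(1, 1)),  # unigram feature space
                          LogisticRegression(max_iter=1000))
    model.fit(tweets, sentiment)
    print(model.predict(["e-cigs helped me quit"]))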

  19. Assessing Electronic Cigarette-Related Tweets for Sentiment and Content Using Supervised Machine Learning.

    PubMed

    Cole-Lewis, Heather; Varghese, Arun; Sanders, Amy; Schwarz, Mary; Pugatch, Jillian; Augustson, Erik

    2015-08-25

    Electronic cigarettes (e-cigarettes) continue to be a growing topic among social media users, especially on Twitter. The ability to analyze conversations about e-cigarettes in real-time can provide important insight into trends in the public's knowledge, attitudes, and beliefs surrounding e-cigarettes, and subsequently guide public health interventions. Our aim was to establish a supervised machine learning algorithm to build predictive classification models that assess Twitter data for a range of factors related to e-cigarettes. Manual content analysis was conducted for 17,098 tweets. These tweets were coded for five categories: e-cigarette relevance, sentiment, user description, genre, and theme. Machine learning classification models were then built for each of these five categories, and word groupings (n-grams) were used to define the feature space for each classifier. Predictive performance scores for classification models indicated that the models correctly labeled the tweets with the appropriate variables between 68.40% and 99.34% of the time, and the percentage of maximum possible improvement over a random baseline that was achieved by the classification models ranged from 41.59% to 80.62%. Classifiers with the highest performance scores that also achieved the highest percentage of the maximum possible improvement over a random baseline were Policy/Government (performance: 0.94; % improvement: 80.62%), Relevance (performance: 0.94; % improvement: 75.26%), Ad or Promotion (performance: 0.89; % improvement: 72.69%), and Marketing (performance: 0.91; % improvement: 72.56%). The most appropriate word-grouping unit (n-gram) was 1 for the majority of classifiers. Performance continued to marginally increase with the size of the training dataset of manually annotated data, but eventually leveled off. Even at low dataset sizes of 4000 observations, performance characteristics were fairly sound. Social media outlets like Twitter can uncover real-time snapshots of personal sentiment, knowledge, attitudes, and behavior that are not as accessible, at this scale, through any other offline platform. Using the vast data available through social media presents an opportunity for social science and public health methodologies to utilize computational methodologies to enhance and extend research and practice. This study was successful in automating a complex five-category manual content analysis of e-cigarette-related content on Twitter using machine learning techniques. The study details machine learning model specifications that provided the best accuracy for data related to e-cigarettes, as well as a replicable methodology to allow extension of these methods to additional topics.

  20. Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function.

    PubMed

    Taslimitehrani, Vahid; Dong, Guozhu; Pereira, Naveen L; Panahiazar, Maryam; Pathak, Jyotishman

    2016-04-01

    Computerized survival prediction in healthcare, identifying the risk of disease mortality, helps healthcare providers to effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) with a probabilistic loss function, to develop and validate prognostic risk models predicting 1-, 2-, and 5-year survival in heart failure (HF) using data from electronic health records (EHRs) at Mayo Clinic. CPXR(Log) constructs a pattern-aided logistic regression model defined by several patterns and corresponding local logistic regression models. One of the models generated by CPXR(Log) achieved an AUC and accuracy of 0.94 and 0.91, respectively, significantly outperforming prognostic models reported in prior studies. Data extracted from EHRs allowed the incorporation of patient co-morbidities into our models, which improved the performance of the CPXR(Log) models (a 15.9% AUC improvement), although it did not improve the accuracy of the models built by other classifiers. We also propose a probabilistic loss function to determine the large-error and small-error instances; the new loss function outperforms the functions used in previous studies by a 1% improvement in AUC. This study revealed that using EHR data to build prediction models can be very challenging with existing classification methods due to the high dimensionality and complexity of EHR data. The risk models developed by CPXR(Log) also reveal that HF is a highly heterogeneous disease, i.e., different subgroups of HF patients require different types of considerations in their diagnosis and treatment. Our risk models provide two valuable insights for the application of predictive modeling techniques in biomedicine: logistic risk models often make systematic prediction errors, and it is prudent to use subgroup-based prediction models such as those given by CPXR(Log) when investigating heterogeneous diseases.

  1. Looking at the ICF and human communication through the lens of classification theory.

    PubMed

    Walsh, Regina

    2011-08-01

    This paper explores the insights that classification theory can provide about the application of the International Classification of Functioning, Disability and Health (ICF) to communication. It first considers the relationship between conceptual models and classification systems, highlighting that classification systems in speech-language pathology (SLP) have not historically been based on conceptual models of human communication. It then gives an overview of the key concepts and criteria of classification theory. Applying classification theory to the ICF and communication raises a number of issues, some previously highlighted through clinical application. Six focus questions from classification theory are used to explore these issues, and to propose the creation of an ICF-related conceptual model of communicating for the field of communication disability, which would address some of the issues raised. Developing a conceptual model of communication for SLP purposes, closely articulated with the ICF, would foster productive intra-professional discourse, while at the same time allowing the profession to continue to use the ICF in inter-disciplinary discourse. The paper concludes by suggesting that the insights of classification theory can assist professionals to apply the ICF to communication with the necessary rigour, and to work further on developing a conceptual model of human communication.

  2. Third Generation Nigerian University Libraries.

    ERIC Educational Resources Information Center

    Agboola, A. T.

    1993-01-01

    Examines the development of Nigerian university libraries and the political factors that created them and continue to affect their development, with a focus on those established between 1980 and 1984. Users, governance, finance, buildings, staffing, collection development, services, cataloging and classification, and automation are described.…

  3. 77 FR 49991 - Small Business Size Standards; Adoption of 2012 North American Industry Classification System for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-08-20

    Excerpt from the size standards table: 322215 Nonfolding Sanitary Food Container Manufacturing (750 employees); 327113 Porcelain Electrical Supply Manufacturing (500 employees); 327120 Clay Building Material and Refractories Manufacturing (750 employees); 327121 Brick and Structural Clay Tile Manufacturing (500 employees).

  4. 48 CFR 1845.7101-1 - Property classification.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... aeronautical and space programs, which are capable of stand-alone operation. Examples include research aircraft... characteristics. (ii) Examples of NASA heritage assets include buildings and structures designated as National...., it no longer provides service to NASA operations). Examples of obsolete property are items in...

  5. 48 CFR 1845.7101-1 - Property classification.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... aeronautical and space programs, which are capable of stand-alone operation. Examples include research aircraft... characteristics. (ii) Examples of NASA heritage assets include buildings and structures designated as National...., it no longer provides service to NASA operations). Examples of obsolete property are items in...

  6. 48 CFR 1845.7101-1 - Property classification.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... aeronautical and space programs, which are capable of stand-alone operation. Examples include research aircraft... characteristics. (ii) Examples of NASA heritage assets include buildings and structures designated as National...., it no longer provides service to NASA operations). Examples of obsolete property are items in...

  7. 48 CFR 1845.7101-1 - Property classification.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... aeronautical and space programs, which are capable of stand-alone operation. Examples include research aircraft... characteristics. (ii) Examples of NASA heritage assets include buildings and structures designated as National...., it no longer provides service to NASA operations). Examples of obsolete property are items in...

  8. 48 CFR 1845.7101-1 - Property classification.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... aeronautical and space programs, which are capable of stand-alone operation. Examples include research aircraft... characteristics. (ii) Examples of NASA heritage assets include buildings and structures designated as National...., it no longer provides service to NASA operations). Examples of obsolete property are items in...

  9. 23 CFR 750.703 - Definitions.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... classifications. (b) Erect means to construct, build, raise, assemble, place, affix, attach, create, paint, draw... frontage roads, turning roadways, or parking areas. (i) Sign, display or device, hereinafter referred to as “sign,” means an outdoor advertising sign, light, display, device, figure, painting, drawing, message...

  10. 23 CFR 750.703 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... classifications. (b) Erect means to construct, build, raise, assemble, place, affix, attach, create, paint, draw... frontage roads, turning roadways, or parking areas. (i) Sign, display or device, hereinafter referred to as “sign,” means an outdoor advertising sign, light, display, device, figure, painting, drawing, message...

  11. 23 CFR 750.703 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... classifications. (b) Erect means to construct, build, raise, assemble, place, affix, attach, create, paint, draw... frontage roads, turning roadways, or parking areas. (i) Sign, display or device, hereinafter referred to as “sign,” means an outdoor advertising sign, light, display, device, figure, painting, drawing, message...

  12. 23 CFR 750.703 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... classifications. (b) Erect means to construct, build, raise, assemble, place, affix, attach, create, paint, draw... frontage roads, turning roadways, or parking areas. (i) Sign, display or device, hereinafter referred to as “sign,” means an outdoor advertising sign, light, display, device, figure, painting, drawing, message...

  13. 23 CFR 750.703 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... classifications. (b) Erect means to construct, build, raise, assemble, place, affix, attach, create, paint, draw... frontage roads, turning roadways, or parking areas. (i) Sign, display or device, hereinafter referred to as “sign,” means an outdoor advertising sign, light, display, device, figure, painting, drawing, message...

  14. Plant or Animal?

    ERIC Educational Resources Information Center

    Bowman, Frank; Matthews, Catherine E.

    1996-01-01

    Presents activities that use marine organisms with plant-like appearances to help students build classification skills and illustrate some of the less obvious differences between plants and animals. Compares mechanisms by which sessile plants and animals deal with common problems such as obtaining energy, defending themselves, successfully…

  15. Inlining 3d Reconstruction, Multi-Source Texture Mapping and Semantic Analysis Using Oblique Aerial Imagery

    NASA Astrophysics Data System (ADS)

    Frommholz, D.; Linkiewicz, M.; Poznanska, A. M.

    2016-06-01

    This paper proposes an in-line method for the simplified reconstruction of city buildings from nadir and oblique aerial images that are simultaneously used for multi-source texture mapping with minimal resampling. Further, the resulting unrectified texture atlases are analyzed for façade elements like windows, which are reintegrated into the original 3D models. Tests on real-world data of Heligoland/Germany comprising more than 800 buildings showed a median positional deviation of 0.31 m at the façades compared to the cadastral map, a correctness of 67% for the detected windows and good visual quality when rendered with GPU-based perspective correction. As part of the process, building reconstruction takes the oriented input images and transforms them into dense point clouds by semi-global matching (SGM). The point sets undergo local RANSAC-based regression and topology analysis to detect adjacent planar surfaces and determine their semantics. Based on this information, the detected roof, wall and ground surfaces are intersected and limited in their extension to form a closed 3D building hull. For texture mapping, the hull polygons are projected into each possible input bitmap to find suitable color sources regarding coverage and resolution. Occlusions are detected by ray-casting a full-scale digital surface model (DSM) of the scene and stored in pixel-precise visibility maps. These maps are used to derive overlap statistics and radiometric adjustment coefficients, which are applied when the visible image parts for each building polygon are copied into a compact texture atlas without resampling whenever possible. The atlas bitmap is passed to a commercial object-based image analysis (OBIA) tool running a custom rule set to identify windows on the contained façade patches. Following multi-resolution segmentation and classification based on brightness and contrast differences, potential window objects are evaluated against geometric constraints and conditionally grown, fused and filtered morphologically. The output polygons are vectorized and reintegrated into the previously reconstructed buildings by sparsely ray-tracing their vertices. Finally, the enhanced 3D models are stored as textured geometry for visualization and as semantically annotated "LOD-2.5" CityGML objects for GIS applications.
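
    The local RANSAC-based plane regression mentioned in this abstract can be illustrated with a short sketch. Below is a minimal, self-contained version assuming a NumPy (N, 3) point cloud; the function name and thresholds are illustrative, not taken from the paper.

        import numpy as np

        def fit_plane_ransac(points, n_iters=200, inlier_thresh=0.05, seed=None):
            """Fit a plane n.x + d = 0 to a 3D point set with RANSAC; return
            (normal, d, inlier_mask) for the plane with the most inliers."""
            rng = np.random.default_rng(seed)
            best_inliers, best_plane = None, None
            for _ in range(n_iters):
                # Sample 3 distinct points and derive the candidate plane.
                sample = points[rng.choice(len(points), 3, replace=False)]
                normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
                norm = np.linalg.norm(normal)
                if norm < 1e-12:              # degenerate (collinear) sample
                    continue
                normal /= norm
                d = -normal.dot(sample[0])
                # Point-to-plane distance decides inlier membership.
                inliers = np.abs(points @ normal + d) < inlier_thresh
                if best_inliers is None or inliers.sum() > best_inliers.sum():
                    best_inliers, best_plane = inliers, (normal, d)
            return best_plane[0], best_plane[1], best_inliers

    Repeatedly fitting a plane and removing its inliers segments the cloud into the planar roof and wall patches whose adjacency and semantics the paper's topology analysis then resolves.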

  16. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.

    PubMed

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classification, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach, but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%), with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free-ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequences of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
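
    As a rough illustration of the RF-versus-KNN comparison in this record, the sketch below windows tri-axial accelerometry into summary features and cross-validates both classifiers with scikit-learn. The window length, feature set and placeholder data are assumptions for the sketch, not the authors' pipeline.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.model_selection import cross_val_score

        def window_features(acc, labels, win=140):
            """Summarize raw (N, 3) accelerometry into one feature vector and
            one majority-vote label per non-overlapping window."""
            X, y = [], []
            for s in range(0, len(acc) - win + 1, win):
                w = acc[s:s + win]
                X.append(np.hstack([w.mean(0), w.std(0),
                                    np.abs(np.diff(w, axis=0)).mean(0)]))
                vals, counts = np.unique(labels[s:s + win], return_counts=True)
                y.append(vals[counts.argmax()])
            return np.array(X), np.array(y)

        # Placeholder data: one minute at 140 Hz with video-derived labels.
        rng = np.random.default_rng(0)
        acc = rng.normal(size=(140 * 60, 3))
        lab = rng.choice(["flap", "soar", "sit"], size=140 * 60)
        X, y = window_features(acc, lab)

        for name, clf in [("RF", RandomForestClassifier(n_estimators=200, random_state=0)),
                          ("KNN", KNeighborsClassifier(n_neighbors=5))]:
            print(name, cross_val_score(clf, X, y, cv=5).mean())

    On real labeled flight data the same loop would reproduce the kind of per-model accuracy comparison the authors report; rerunning it with downsampled input probes the sampling-frequency sensitivity they describe.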

  17. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    PubMed Central

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classification, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach, but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%), with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free-ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequences of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159

  18. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    USGS Publications Warehouse

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classification, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach, but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%), with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free-ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequences of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.

  19. Branch classification: A new mechanism for improving branch predictor performance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, P.Y.; Hao, E.; Patt, Y.

    There is wide agreement that one of the most significant impediments to the performance of current and future pipelined superscalar processors is the presence of conditional branches in the instruction stream. Speculative execution is one solution to the branch problem, but speculative work is discarded if a branch is mispredicted. For it to be effective, speculative execution requires a very accurate branch predictor; 95% accuracy is not good enough. This paper proposes branch classification, a methodology for building more accurate branch predictors. Branch classification allows an individual branch instruction to be associated with the branch predictor best suited to predict its direction. Using this approach, a hybrid branch predictor can be constructed such that each component branch predictor predicts those branches for which it is best suited. To demonstrate the usefulness of branch classification, an example classification scheme is given and a new hybrid predictor is built based on this scheme which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.
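
    The core idea can be sketched in a few lines. The toy predictor below (in Python, for consistency with the other sketches here) profiles each branch's taken rate and routes strongly biased branches to a cheap per-branch 2-bit counter, while hard-to-predict branches go to a gshare-style history-indexed table. The table sizes and the bias-based classification rule are illustrative assumptions, not the paper's scheme.

        class TwoBit:
            """Saturating 2-bit counter: 0-1 predict not-taken, 2-3 taken."""
            def __init__(self): self.c = 2
            def predict(self): return self.c >= 2
            def update(self, taken):
                self.c = min(3, self.c + 1) if taken else max(0, self.c - 1)

        class HybridPredictor:
            def __init__(self, bits=12):
                self.mask = (1 << bits) - 1
                self.bimodal = {}             # pc -> TwoBit
                self.gshare = {}              # (pc ^ history) -> TwoBit
                self.history = 0
                self.bias = {}                # pc -> (taken count, total count)

            def classify(self, pc):
                # Strongly biased branches use the cheap bimodal table;
                # the rest use the history-based gshare component.
                taken, total = self.bias.get(pc, (0, 0))
                if total < 16:
                    return "bimodal"          # not enough profile yet
                rate = taken / total
                return "bimodal" if rate < 0.1 or rate > 0.9 else "gshare"

            def predict(self, pc):
                if self.classify(pc) == "bimodal":
                    return self.bimodal.setdefault(pc, TwoBit()).predict()
                idx = (pc ^ self.history) & self.mask
                return self.gshare.setdefault(idx, TwoBit()).predict()

            def update(self, pc, taken):
                t, n = self.bias.get(pc, (0, 0))
                self.bias[pc] = (t + int(taken), n + 1)
                self.bimodal.setdefault(pc, TwoBit()).update(taken)
                idx = (pc ^ self.history) & self.mask
                self.gshare.setdefault(idx, TwoBit()).update(taken)
                self.history = ((self.history << 1) | int(taken)) & self.mask

    Real hybrid predictors may classify statically at compile time or dynamically with selector counters; the runtime profile rule above is only one plausible way to associate each branch with the component best suited to it.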

  20. On feature augmentation for semantic argument classification of the Quran English translation using support vector machine

    NASA Astrophysics Data System (ADS)

    Khaira Batubara, Dina; Arif Bijaksana, Moch; Adiwijaya

    2018-03-01

    Research on semantic argument classification requires a large amount of semantically labeled data, called a corpus. Because building a corpus is costly and time-consuming, many recent studies have used an existing corpus as training data for semantic argument classification in a new domain. However, previous studies have shown a significant drop in performance when the training and testing data come from different domains. The main problem arises when an argument found in the testing data does not appear in the training data. This research performs semantic argument classification on a new domain, the Quran English translation, using the Propbank corpus as training data. To recognize arguments that are absent from the training data, it proposes four new features that extend the argument features of the training data. Using a linear SVM, the experiments show that augmenting the baseline system with combinations of the proposed features improves the performance of semantic argument classification on the Quran data with the Propbank corpus as training data.
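
    The feature-augmentation setup can be made concrete with a small sketch: a linear SVM over dictionary-encoded argument features, where two hypothetical generalizing features (position and head POS tag) stand in for the paper's four proposed features, which the abstract does not spell out. Feature names and data are placeholders.

        from sklearn.feature_extraction import DictVectorizer
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        def argument_features(arg, augment=True):
            """Baseline lexical/syntactic features for one candidate argument,
            optionally augmented with features that generalize beyond words
            seen in the training domain."""
            feats = {
                "head": arg["head"],          # baseline: head word
                "phrase": arg["phrase"],      # baseline: phrase type
                "pred": arg["predicate"],     # baseline: governing predicate
            }
            if augment:
                feats["position"] = arg["position"]  # before/after predicate
                feats["head_pos"] = arg["head_pos"]  # POS tag covers unseen heads
            return feats

        # Placeholder training pairs (argument description, semantic role).
        train = [
            ({"head": "eagle", "phrase": "NP", "predicate": "fly",
              "position": "before", "head_pos": "NN"}, "ARG0"),
            ({"head": "sky", "phrase": "PP", "predicate": "fly",
              "position": "after", "head_pos": "NN"}, "ARGM-LOC"),
        ]
        X = [argument_features(a) for a, _ in train]
        y = [r for _, r in train]
        model = make_pipeline(DictVectorizer(), LinearSVC())
        model.fit(X, y)
        print(model.predict([argument_features(train[0][0])]))

    When a test-domain head word never occurs in training, the augmented features still fire, which is the mechanism by which this kind of augmentation recovers accuracy across domains.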
