Sample records for large scale classifier

  1. The large-scale environment from cosmological simulations - I. The baryonic cosmic web

    NASA Astrophysics Data System (ADS)

    Cui, Weiguang; Knebe, Alexander; Yepes, Gustavo; Yang, Xiaohu; Borgani, Stefano; Kang, Xi; Power, Chris; Staveley-Smith, Lister

    2018-01-01

    Using a series of cosmological simulations that includes one dark-matter-only (DM-only) run, one gas cooling-star formation-supernova feedback (CSF) run and one that additionally includes feedback from active galactic nuclei (AGNs), we classify the large-scale structures with both a velocity-shear-tensor code (VWEB) and a tidal-tensor code (PWEB). We find that the baryonic processes have almost no impact on large-scale structures, at least not when classified with the aforementioned techniques. More importantly, our results confirm that the gas component alone can be used to infer the filamentary structure of the universe practically unbiased, which could be applied to constrain cosmology. In addition, the gas filaments are classified by their velocity (VWEB) and density (PWEB) fields, which can in principle be connected to radio observations, such as H I surveys. This will help us link radio observations to the dark matter distribution at large scales without bias.
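
    Both classification codes reduce, at the final step, to counting how many eigenvalues of a symmetric 3 × 3 tensor (the velocity shear tensor for VWEB, the tidal tensor for PWEB) exceed a threshold. A minimal Python sketch of that step, assuming a precomputed tensor field and an illustrative threshold of 0.44 (the codes' actual tensor construction and calibrated thresholds are not reproduced here):

      import numpy as np

      def classify_web(tensor_field, lam_th=0.44):
          """Label each cell void/sheet/filament/knot by the number of
          eigenvalues of its symmetric 3x3 tensor above lam_th."""
          eigvals = np.linalg.eigvalsh(tensor_field)   # shape (..., 3)
          n_above = (eigvals > lam_th).sum(axis=-1)    # 0, 1, 2 or 3
          labels = np.array(["void", "sheet", "filament", "knot"])
          return labels[n_above]

      # toy example: one random symmetric tensor
      t = np.random.randn(3, 3)
      t = 0.5 * (t + t.T)
      print(classify_web(t[None])[0])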

  2. Hierarchical Learning of Tree Classifiers for Large-Scale Plant Species Identification.

    PubMed

    Fan, Jianping; Zhou, Ning; Peng, Jinye; Gao, Ling

    2015-11-01

    In this paper, a hierarchical multi-task structural learning algorithm is developed to support large-scale plant species identification, where a visual tree is constructed for organizing large numbers of plant species in a coarse-to-fine fashion and determining the inter-related learning tasks automatically. A given parent node on the visual tree contains a set of sibling coarse-grained categories of plant species or sibling fine-grained plant species, and a multi-task structural learning algorithm is developed to train their inter-related classifiers jointly, enhancing their discrimination power. The inter-level relationship constraint (a plant image must first be assigned correctly to a parent node, i.e., a high-level non-leaf node, before it can be assigned to the most relevant child node, i.e., a low-level non-leaf node or leaf node, on the visual tree) is formally defined and leveraged to learn more discriminative tree classifiers over the visual tree. Our experimental results have demonstrated the effectiveness of our hierarchical multi-task structural learning algorithm in training more discriminative tree classifiers for large-scale plant species identification.

  3. Exploring Google Earth Engine platform for big data processing: classification of multi-temporal satellite imagery for crop mapping

    NASA Astrophysics Data System (ADS)

    Shelestov, Andrii; Lavreniuk, Mykola; Kussul, Nataliia; Novikov, Alexei; Skakun, Sergii

    2017-02-01

    Many applied problems arising in agricultural monitoring and food security require reliable crop maps at national or global scale. Large-scale crop mapping requires processing and management of large amounts of heterogeneous satellite imagery acquired by various sensors, which consequently leads to a “Big Data” problem. The main objective of this study is to explore the efficiency of using the Google Earth Engine (GEE) platform when classifying multi-temporal satellite imagery, with the potential to apply the platform at a larger scale (e.g. country level) and with multiple sensors (e.g. Landsat-8 and Sentinel-2). In particular, multiple state-of-the-art classifiers available in the GEE platform are compared to produce a high resolution (30 m) crop classification map for a large territory (~28,100 km2, including ~1.0 M ha of cropland). Though this study does not involve large volumes of data, it does address the efficiency of the GEE platform in executing the complex workflows of satellite data processing required for large-scale applications such as crop mapping. The study discusses strengths and weaknesses of the classifiers, assesses the accuracies that can be achieved with different classifiers for the Ukrainian landscape, and compares them to a benchmark classifier using a neural network approach that was developed in our previous studies. The study is carried out for the Joint Experiment of Crop Assessment and Monitoring (JECAM) test site in Ukraine covering the Kyiv region (North of Ukraine) in 2013. We found that GEE provides very good performance in terms of enabling access to remote sensing products through the cloud platform and providing pre-processing; however, in terms of classification accuracy, the neural network based approach outperformed the support vector machine (SVM), decision tree and random forest classifiers available in GEE.
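
    For orientation, the comparison described here maps onto a few calls in the Earth Engine Python API. A minimal sketch with placeholder asset names, class property and parameters, none of which come from the paper:

      import ee
      ee.Initialize()

      # hypothetical inputs: a multi-temporal composite and labeled crop polygons
      image = ee.Image("users/example/multitemporal_composite")
      samples = ee.FeatureCollection("users/example/crop_labels")
      training = image.sampleRegions(collection=samples,
                                     properties=["crop_class"], scale=30)

      # two of the classifier families available in GEE
      classifiers = [("RF", ee.Classifier.smileRandomForest(100)),
                     ("SVM", ee.Classifier.libsvm(kernelType="RBF",
                                                  gamma=0.5, cost=10))]
      for name, clf in classifiers:
          trained = clf.train(features=training, classProperty="crop_class",
                              inputProperties=image.bandNames())
          matrix = training.classify(trained) \
                           .errorMatrix("crop_class", "classification")
          # resubstitution accuracy; a held-out sample should be used in practice
          print(name, matrix.accuracy().getInfo())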

  4. An algorithm for generating modular hierarchical neural network classifiers: a step toward larger scale applications

    NASA Astrophysics Data System (ADS)

    Roverso, Davide

    2003-08-01

    Many-class learning is the problem of training a classifier to discriminate among a large number of target classes. Together with the problem of dealing with high-dimensional patterns (i.e. a high-dimensional input space), the many-class problem (i.e. a high-dimensional output space) is a major obstacle to be faced when scaling up classifier systems and algorithms from small pilot applications to large full-scale applications. The Autonomous Recursive Task Decomposition (ARTD) algorithm is here proposed as a solution to the problem of many-class learning. Example applications of ARTD to neural classifier training are also presented. In these examples, improvements in training time are shown to range from 4-fold to more than 30-fold in pattern classification tasks of both static and dynamic character.

  5. Multi-view L2-SVM and its multi-view core vector machine.

    PubMed

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier, Multi-view L2-SVM, is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has flexibility like ν-SVC, in the sense that the number of yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views by imposing consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine (GCVM), the proposed Multi-view L2-SVM classifier is extended into its GCVM version, MvCVM, which realizes fast training on large-scale multi-view datasets, with asymptotically linear time complexity in the sample size and space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small-scale multi-view datasets and the proposed MvCVM classifier for large-scale multi-view datasets.

  6. Integrating concept ontology and multitask learning to achieve more effective classifier training for multilevel image annotation.

    PubMed

    Fan, Jianping; Gao, Yuli; Luo, Hangzai

    2008-03-01

    In this paper, we have developed a new scheme for achieving multilevel annotations of large-scale images automatically. To achieve a more complete representation of the various visual properties of the images, both global visual features and local visual features are extracted for image content representation. To tackle the problem of huge intraconcept visual diversity, multiple types of kernels are integrated to characterize the diverse visual similarity relationships between the images more precisely, and a multiple kernel learning algorithm is developed for SVM image classifier training. To address the problem of huge interconcept visual similarity, a novel multitask learning algorithm is developed to learn the correlated classifiers for the sibling image concepts under the same parent concept and enhance their discrimination and adaptation power significantly. To tackle the problem of huge intraconcept visual diversity for the image concepts at the higher levels of the concept ontology, a novel hierarchical boosting algorithm is developed to learn their ensemble classifiers hierarchically. To assist users in selecting more effective hypotheses for image classifier training, we have developed a novel hyperbolic framework for large-scale image visualization and interactive hypothesis assessment. Our experiments on large-scale image collections have obtained very positive results.

  7. Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

    PubMed

    Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

    2016-01-01

    With the increasing power of computers, the amount of data that can be processed in short periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification. Most state-of-the-art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes a simple-to-implement approach based on evolutionary algorithms and the Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures, so knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.
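
    The evolutionary layer of the method is specific to the paper and is not reproduced here, but the Kernel-Adatron iteration it builds on is a classic, compact alternative to QP solvers: a perceptron-like update on the dual variables. A minimal sketch with an RBF kernel and illustrative hyperparameters:

      import numpy as np

      def kernel_adatron(X, y, gamma=0.5, eta=0.1, epochs=200):
          """Classic Kernel-Adatron: additive margin updates on the dual
          variables alpha, clipped at zero (hard-margin form)."""
          sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
          K = np.exp(-gamma * sq)                  # RBF Gram matrix
          alpha = np.zeros(len(y))
          for _ in range(epochs):
              for i in range(len(y)):
                  z = y[i] * (K[i] @ (alpha * y))  # margin of sample i
                  alpha[i] = max(0.0, alpha[i] + eta * (1.0 - z))
          return alpha, K

      # toy two-class problem
      rng = np.random.default_rng(0)
      X = np.vstack([rng.normal(-1, 0.3, (20, 2)), rng.normal(1, 0.3, (20, 2))])
      y = np.r_[-np.ones(20), np.ones(20)]
      alpha, K = kernel_adatron(X, y)
      pred = np.sign(K @ (alpha * y))
      print("training accuracy:", (pred == y).mean())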

  8. Domain-Adapted Convolutional Networks for Satellite Image Classification: A Large-Scale Interactive Learning Workflow

    DOE PAGES

    Lunga, Dalton D.; Yang, Hsiuhan Lexie; Reith, Andrew E.; ...

    2018-02-06

    Satellite imagery often exhibits large spatial extents that encompass object classes with considerable variability. This often limits large-scale model generalization with machine learning algorithms. Notably, acquisition conditions, including dates, sensor position, lighting condition, and sensor types, often translate into class distribution shifts, introducing complex nonlinear factors and hampering the potential impact of machine learning classifiers. Here, this article investigates the challenge of exploiting satellite images using convolutional neural networks (CNN) for settlement classification where the class distribution shifts are significant. We present a large-scale human settlement mapping workflow based on multiple modules to adapt a pretrained CNN to address the negative impact of distribution shift on classification performance. To extend a locally trained classifier onto areas of large spatial extent we introduce several submodules: first, a human-in-the-loop element for relabeling of misclassified target domain samples to generate representative examples for model adaptation; second, an efficient hashing module to minimize redundancy and noisy samples from the mass-selected examples; and third, a novel relevance ranking module to minimize the dominance of source examples on the target domain. The workflow presents a novel and practical approach to achieve large-scale domain adaptation with binary classifiers that are based on CNN features. Experimental evaluations are conducted on areas of interest that encompass various image characteristics, including multisensor, multitemporal, and multiangular conditions. Domain adaptation is assessed on source–target pairs through the transfer loss and transfer ratio metrics to illustrate the utility of the workflow.

  9. SVM and SVM Ensembles in Breast Cancer Prediction.

    PubMed

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making its effective prediction an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct an SVM classifier, it is first necessary to choose a kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performance of SVM with different kernel functions. Moreover, it is unknown whether SVM classifier ensembles, which have been proposed to improve the performance of single classifiers, can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small- and large-scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles using the bagging method and RBF kernel based SVM ensembles using the boosting method can be the better choices for a small-scale dataset, where feature selection should be performed in the data pre-processing stage. For a large-scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.
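
    The two reported winning configurations are straightforward to prototype with scikit-learn. A hedged sketch on the public Wisconsin breast cancer dataset (the paper's own datasets, tuning, and feature selection step are not reproduced):

      from sklearn.datasets import load_breast_cancer
      from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      X, y = load_breast_cancer(return_X_y=True)

      ensembles = {
          # bagging of linear-kernel SVMs vs. boosting of RBF-kernel SVMs
          "bagged linear-SVM": BaggingClassifier(SVC(kernel="linear"),
                                                 n_estimators=10),
          "boosted RBF-SVM": AdaBoostClassifier(SVC(kernel="rbf",
                                                    probability=True),
                                                n_estimators=10),
      }
      for name, clf in ensembles.items():
          pipe = make_pipeline(StandardScaler(), clf)
          print(name, cross_val_score(pipe, X, y, cv=5).mean())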

  10. Multicategory Composite Least Squares Classifiers

    PubMed Central

    Park, Seo Young; Liu, Yufeng; Liu, Dacheng; Scholl, Paul

    2010-01-01

    Classification is a very useful statistical tool for information extraction. In particular, multicategory classification is commonly seen in various applications. Although binary classification problems are heavily studied, extensions to the multicategory case are much less so. In view of the increased complexity and volume of modern statistical problems, it is desirable to have multicategory classifiers that can handle problems with high dimensions and a large number of classes. Moreover, it is necessary for multicategory classifiers to have sound theoretical properties. In the literature, there exist several different versions of simultaneous multicategory Support Vector Machines (SVMs). However, the computation of the SVM can be difficult for large-scale problems, especially for problems with a large number of classes. Furthermore, the SVM cannot produce class probability estimates directly. In this article, we propose a novel, efficient multicategory composite least squares classifier (CLS classifier), which utilizes a new composite squared loss function. The proposed CLS classifier has several important merits: efficient computation for problems with a large number of classes, asymptotic consistency, the ability to handle high dimensional data, and simple conditional class probability estimation. Our simulated and real examples demonstrate the competitive performance of the proposed approach. PMID:21218128

  11. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    PubMed

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. Learning-based classifiers achieve state-of-the-art accuracies, but have been criticized for computational complexity that grows linearly with the number of classes. Nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, the discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree grows only sublinearly with the number of categories, which is much better than recent hierarchical support vector machine based methods. The memory requirement is an order of magnitude less than that of recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves state-of-the-art accuracies with significantly lower computation cost and memory requirements.

  12. HD-MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition.

    PubMed

    Fan, Jianping; Zhao, Tianyi; Kuang, Zhenzhong; Zheng, Yu; Zhang, Ji; Yu, Jun; Peng, Jinye

    2017-02-09

    In this paper, a hierarchical deep multi-task learning (HD-MTL) algorithm is developed to support large-scale visual recognition (e.g., recognizing thousands or even tens of thousands of atomic object classes automatically). First, multiple sets of multi-level deep features are extracted from different layers of deep convolutional neural networks (deep CNNs), and they are used to accomplish the coarse-to-fine tasks of hierarchical visual recognition more effectively. A visual tree is then learned by assigning the visually-similar atomic object classes with similar learning complexities into the same group, which provides a good environment for determining the inter-related learning tasks automatically. By leveraging the inter-task relatedness (inter-class similarities) to learn more discriminative group-specific deep representations, our deep multi-task learning algorithm can train more discriminative node classifiers for distinguishing the visually-similar atomic object classes effectively. Our HD-MTL algorithm integrates two discriminative regularization terms to control inter-level error propagation effectively, and it provides an end-to-end approach for jointly learning more representative deep CNNs (for image representation) and a more discriminative tree classifier (for large-scale visual recognition) and updating them simultaneously. Our incremental deep learning algorithms can effectively adapt both the deep CNNs and the tree classifier to new training images and new object classes. Our experimental results have demonstrated that our HD-MTL algorithm achieves very competitive accuracy rates for large-scale visual recognition.

  13. Prevalence scaling: applications to an intelligent workstation for the diagnosis of breast cancer.

    PubMed

    Horsch, Karla; Giger, Maryellen L; Metz, Charles E

    2008-11-01

    Our goal was to investigate the effects that changes in the prevalence of cancer in a population have on the probability of malignancy (PM) output, and on an optimal combination of the true-positive fraction (TPF) and false-positive fraction (FPF), of a mammographic and sonographic automatic classifier for the diagnosis of breast cancer. We investigate how a prevalence-scaling transformation that is used to change the prevalence inherent in the computer estimates of the PM affects the numerical and histographic output of a previously developed multimodality intelligent workstation. Using Bayes' rule and the binormal model, we study how changes in the prevalence of cancer in the diagnostic breast population affect our computer classifiers' optimal operating points, as defined by maximizing the expected utility. Prevalence scaling affects the threshold at which a particular TPF and FPF pair is achieved. Tables giving the thresholds on the scaled PM estimates that result in particular pairs of TPF and FPF are presented. Histograms of PMs scaled to reflect clinically relevant prevalence values differ greatly from histograms of laboratory-designed PMs. The optimal pair (TPF, FPF) of our lower performing mammographic classifier is more sensitive to changes in clinical prevalence than that of our higher performing sonographic classifier. Prevalence scaling can be used to change computer PM output to reflect clinically more appropriate prevalence. Relatively small changes in clinical prevalence can have large effects on the computer classifier's optimal operating point.
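
    The scaling transformation itself is a one-line application of Bayes' rule: convert the PM estimate to an odds, multiply by the ratio of target to design prior odds, and convert back. A minimal sketch (variable names are illustrative; the workstation's actual pipeline is not shown):

      def rescale_pm(pm, prev_design, prev_target):
          """Rescale a probability-of-malignancy estimate from the prevalence
          it was designed under to a target clinical prevalence (Bayes' rule:
          posterior odds = likelihood ratio x prior odds)."""
          lr = (pm / (1 - pm)) / (prev_design / (1 - prev_design))
          odds = lr * prev_target / (1 - prev_target)
          return odds / (1 + odds)

      # a PM of 0.5 from a 50%-prevalence design set, in a 5%-prevalence clinic
      print(rescale_pm(0.5, prev_design=0.5, prev_target=0.05))  # -> 0.05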

  14. A generalized approach for producing, quantifying, and validating citizen science data from wildlife images.

    PubMed

    Swanson, Alexandra; Kosmala, Margaret; Lintott, Chris; Packer, Craig

    2016-06-01

    Citizen science has the potential to expand the scope and scale of research in ecology and conservation, but many professional researchers remain skeptical of data produced by nonexperts. We devised an approach for producing accurate, reliable data from untrained, nonexpert volunteers. On the citizen science website www.snapshotserengeti.org, more than 28,000 volunteers classified 1.51 million images taken in a large-scale camera-trap survey in Serengeti National Park, Tanzania. Each image was circulated to, on average, 27 volunteers, and their classifications were aggregated using a simple plurality algorithm. We validated the aggregated answers against a data set of 3829 images verified by experts and calculated 3 certainty metrics to measure confidence that an aggregated answer was correct: the level of agreement among classifications (evenness), the fraction of classifications supporting the aggregated answer (fraction support), and the fraction of classifiers who reported "nothing here" for an image that was ultimately classified as containing an animal (fraction blank). Overall, aggregated volunteer answers agreed with the expert-verified data on 98% of images, but accuracy differed by species commonness, such that rare species had higher rates of false positives and false negatives. Easily calculated analysis of variance and post-hoc Tukey tests indicated that the certainty metrics were significant indicators of whether each image was correctly classified or classifiable. Thus, the certainty metrics can be used to identify images for expert review. Bootstrapping analyses further indicated that 90% of images were correctly classified with just 5 volunteers per image. Species classifications based on the plurality vote of multiple citizen scientists can provide a reliable foundation for large-scale monitoring of African wildlife.
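
    The plurality aggregation and the three certainty metrics are simple to compute from raw votes. A minimal sketch, assuming each image's classifications arrive as a list of species labels with "blank" marking a "nothing here" report:

      from collections import Counter
      from math import log

      def aggregate(votes):
          """Plurality answer plus the three certainty metrics."""
          counts = Counter(votes)
          answer, top = counts.most_common(1)[0]
          n, k = len(votes), len(counts)
          # evenness: normalized Shannon entropy of the vote distribution
          h = -sum(c / n * log(c / n) for c in counts.values())
          evenness = h / log(k) if k > 1 else 0.0
          fraction_support = top / n
          fraction_blank = counts.get("blank", 0) / n
          return answer, evenness, fraction_support, fraction_blank

      print(aggregate(["wildebeest"] * 24 + ["buffalo"] * 2 + ["blank"]))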

  15. Using Neural Networks to Classify Digitized Images of Galaxies

    NASA Astrophysics Data System (ADS)

    Goderya, S. N.; McGuire, P. C.

    2000-12-01

    Automated classification of galaxies into Hubble types is of paramount importance for studying the large-scale structure of the Universe, particularly as survey projects like the Sloan Digital Sky Survey complete their data acquisition of one million galaxies. At present it is not possible to find robust and efficient artificial intelligence based galaxy classifiers. In this study we summarize progress made in the development of automated galaxy classifiers using neural networks as machine learning tools. We explore the Bayesian linear algorithm, the higher order probabilistic network, the multilayer perceptron neural network and the Support Vector Machine classifier. The performance of any machine classifier is dependent on the quality of the parameters that characterize the different groups of galaxies. Our effort is to develop geometric and invariant moment based parameters as input to the machine classifiers instead of the raw pixel data. Such an approach reduces the dimensionality of the classifier considerably, removes the effects of scaling and rotation, and makes it easier to solve for the unknown parameters in the galaxy classifier. To judge the quality of training and classification we develop the concept of Matthews coefficients for the galaxy classification community. Matthews coefficients are single numbers that quantify classifier performance even with unequal prior probabilities of the classes.
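
    The Matthews coefficient referred to here is the Matthews correlation coefficient: for a binary confusion matrix, a single number in [-1, 1] that remains informative under unequal class priors. A worked sketch:

      from math import sqrt

      def mcc(tp, fp, fn, tn):
          """Matthews correlation coefficient from confusion-matrix counts."""
          denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
          return (tp * tn - fp * fn) / denom if denom else 0.0

      # e.g. a galaxy-type classifier on an imbalanced sample
      print(mcc(tp=90, fp=10, fn=5, tn=895))   # high accuracy, high MCC
      print(mcc(tp=0,  fp=0,  fn=95, tn=905))  # high accuracy, MCC of 0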

  16. Enhancements for a Dynamic Data Warehousing and Mining System for Large-Scale Human Social Cultural Behavioral (HSBC) Data

    DTIC Science & Technology

    2016-09-26

    Intelligent Automation Incorporated; contract N00014-16-P-3014. Representative Media Gallery View. We apply Scraawl's NER algorithm to the text associated with YouTube posts, which classifies the named entities into ...

  17. Detection and classification of ash dieback on large-scale color aerial photographs

    Treesearch

    Ralph J. Croxton

    1966-01-01

    Aerial color photographs were taken at two scales over ash stands in New York State that were infected with ash dieback. Three photo interpreters then attempted to distinguish ash trees from other hardwoods and classify their disease condition. The scale of 1:7,920 was too small to permit accurate identification, but accuracy at the scale 1:1,584 was fair (60 to 70...

  18. A novel artificial fish swarm algorithm for solving large-scale reliability-redundancy application problem.

    PubMed

    He, Qiang; Hu, Xiangtao; Ren, Hong; Zhang, Hongqi

    2015-11-01

    A novel artificial fish swarm algorithm (NAFSA) is proposed for solving the large-scale reliability-redundancy allocation problem (RAP). In NAFSA, the social behaviors of the fish swarm are classified in three ways: foraging behavior, reproductive behavior, and random behavior. The foraging behavior uses two position-updating strategies, and the selection and crossover operators are applied to define the reproductive ability of an artificial fish. For the random behavior, which is essentially a mutation strategy, the basic cloud generator is used as the mutation operator. Finally, numerical results for four benchmark problems and a large-scale RAP are reported and compared. NAFSA shows good performance in terms of computational accuracy and computational efficiency for the large-scale RAP.

  1. Centralized automated quality assurance for large scale health care systems. A pilot method for some aspects of dental radiography.

    PubMed

    Benn, D K; Minden, N J; Pettigrew, J C; Shim, M

    1994-08-01

    President Clinton's Health Security Act proposes the formation of large-scale health plans with improved quality assurance. Dental radiography consumes 4% ($1.2 billion in 1990) of total dental expenditure, yet regular systematic office quality assurance is not performed. A pilot automated method is described for assessing the density of exposed film and the fogging of unexposed processed film. A workstation and camera were used to input intraoral radiographs. Test images were produced from a phantom jaw with increasing exposure times. Two radiologists subjectively classified the images as too light, acceptable, or too dark. A computer program automatically classified global grey level histograms from the test images as too light, acceptable, or too dark. The program correctly classified 95% of 88 clinical films. Optical density of unexposed film in the range 0.15 to 0.52 measured by computer was reliable to better than 0.01. Further work is needed to see if comprehensive centralized automated radiographic quality assurance systems with feedback to dentists are feasible, are able to improve quality, and are significantly cheaper than conventional clerical methods.

  2. Classification of Large-Scale Remote Sensing Images for Automatic Identification of Health Hazards: Smoke Detection Using an Autologistic Regression Classifier.

    PubMed

    Wolters, Mark A; Dean, C B

    2017-01-01

    Remote sensing images from Earth-orbiting satellites are a potentially rich data source for monitoring and cataloguing atmospheric health hazards that cover large geographic regions. A method is proposed for classifying such images into hazard and nonhazard regions using the autologistic regression model, which may be viewed as a spatial extension of logistic regression. The method includes a novel and simple approach to parameter estimation that makes it well suited to handling the large and high-dimensional datasets arising from satellite-borne instruments. The methodology is demonstrated on both simulated images and a real application to the identification of forest fire smoke.

  3. Building rooftop classification using random forests for large-scale PV deployment

    NASA Astrophysics Data System (ADS)

    Assouline, Dan; Mohajeri, Nahid; Scartezzini, Jean-Louis

    2017-10-01

    Large-scale solar photovoltaic (PV) deployment on existing building rooftops has proven to be one of the most efficient and viable sources of renewable energy in urban areas. As it usually requires a potential analysis over the area of interest, a crucial step is to estimate the geometric characteristics of the building rooftops. In this paper, we introduce a multi-layer machine learning methodology to classify 6 roof types, 9 aspect (azimuth) classes and 5 slope (tilt) classes for all building rooftops in Switzerland, using GIS processing. We train Random Forests (RF), an ensemble learning algorithm, to build the classifiers. We use (2 m × 2 m) LiDAR data (considering buildings and vegetation) to extract several rooftop features, and generalised footprint polygon data to localize buildings. The roof classifier is trained and tested with 1252 labeled roofs from three different urban areas, namely Baden, Luzern, and Winterthur. The results for roof type classification show an average accuracy of 67%. The aspect and slope classifiers are trained and tested with 11449 labeled roofs in the Zurich periphery area. The results for aspect and slope classification show different accuracies depending on the classes: while some classes are well identified, other under-represented classes remain challenging to detect.
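
    Once LiDAR-derived features have been tabulated per rooftop, each of the three classifiers is a standard Random Forest. A minimal sketch with random placeholder features (the GIS feature extraction, which is the substantive step, is not shown, so the placeholder data scores near chance rather than the reported 67%):

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      # hypothetical per-rooftop LiDAR features: mean height, height spread,
      # footprint area, dominant surface orientation, ...
      rng = np.random.default_rng(0)
      X = rng.normal(size=(1252, 8))     # 1252 labeled roofs, 8 features
      y = rng.integers(0, 6, size=1252)  # 6 roof-type classes

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                random_state=0)
      rf = RandomForestClassifier(n_estimators=200, random_state=0)
      rf.fit(X_tr, y_tr)
      print("held-out accuracy:", rf.score(X_te, y_te))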

  4. Monitoring Urbanization Processes from Space: Using Landsat Imagery to Detect Built-Up Areas at Scale

    NASA Astrophysics Data System (ADS)

    Goldblatt, R.; You, W.; Hanson, G.; Khandelwal, A. K.

    2016-12-01

    Urbanization is one of the most fundamental trends of the past two centuries and a key force shaping almost all dimensions of modern society. Monitoring the spatial extent of cities and their dynamics by means of remote sensing methods is crucial for many research domains, as well as for city and regional planning and for policy making. Yet the majority of urban research is done at small scales, due, in part, to computational limitations. With the increasing availability of parallel computing platforms with large storage capacities, such as Google Earth Engine (GEE), researchers can scale up the spatial and temporal units of analysis and investigate urbanization processes over larger areas and longer periods of time. In this study we present a methodology designed to capture temporal changes in the spatial extent of urban areas at the national level. We utilize a large-scale ground-truth dataset containing examples of "built-up" and "not built-up" areas from across India. This dataset, which was collected based on 2016 high-resolution imagery, is used for supervised pixel-based image classification in GEE. We assess different types of classifiers and inputs and demonstrate that with Landsat 8 as the classifier's input, Random Forest achieves a high accuracy rate of around 87%. Although performance with Landsat 8 as the input exceeds that of Landsat 7, with the addition of several per-pixel computed indices to Landsat 7 (NDVI, NDBI, MNDWI and SAVI) the classifier's sensitivity improves by around 10%. We use Landsat 7 to detect temporal changes in the extent of urban areas. The classifier is trained with 2016 imagery as the input (for which ground truth data is available) and is then used to detect urban areas in the historical imagery. We demonstrate that this classification produces high quality maps of urban extent over time. We compare the classification result with numerous datasets of urban areas (e.g. MODIS, DMSP-OLS and WorldPop) and show that our classification captures the fine boundaries between built-up areas and various types of land cover, thus providing an accurate estimation of the extent of urban areas. The study demonstrates the potential of cloud-based platforms, such as GEE, for monitoring long-term and continuous urbanization processes at scale.
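
    The four added indices are standard per-pixel band ratios. A minimal sketch of their formulas using Landsat 7 ETM+ band roles (B2 = green, B3 = red, B4 = NIR, B5 = SWIR1), applicable to scalar reflectances or NumPy arrays alike:

      def indices_l7(green, red, nir, swir1, L=0.5):
          """NDVI, NDBI, MNDWI and SAVI from Landsat 7 ETM+ reflectance
          bands (B2=green, B3=red, B4=NIR, B5=SWIR1)."""
          ndvi = (nir - red) / (nir + red)
          ndbi = (swir1 - nir) / (swir1 + nir)        # built-up index
          mndwi = (green - swir1) / (green + swir1)   # modified water index
          savi = (nir - red) / (nir + red + L) * (1 + L)  # soil-adjusted NDVI
          return ndvi, ndbi, mndwi, savi

      print(indices_l7(green=0.08, red=0.07, nir=0.30, swir1=0.22))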

  5. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy the many demands of resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies, by learning a random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS data and the VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements to the RF classifier are made: a voting-distribution ranked rule for reducing the influence of imbalanced samples on classification accuracy, and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is carried out in Beijing, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful for studying many environmental and social problems.

  6. A Large-scale Distributed Indexed Learning Framework for Data that Cannot Fit into Memory

    DTIC Science & Technology

    2015-03-27

    ... learn a classifier. Integrating three learning techniques (online, semi-supervised and active learning) together with selective sampling, with minimal communication between the server and the clients, solved this problem.

  7. Sloan Digital Sky Survey III photometric quasar clustering: Probing the initial conditions of the Universe

    DOE PAGES

    Ho, Shirley; Agarwal, Nishant; Myers, Adam D.; ...

    2015-05-22

    Here, the Sloan Digital Sky Survey has surveyed 14,555 square degrees of the sky, and delivered over a trillion pixels of imaging data. We present the large-scale clustering of 1.6 million quasars between z=0.5 and z=2.5 that have been classified from this imaging, representing the highest density of quasars ever studied for clustering measurements. This data set spans ~11,000 square degrees and probes a volume of 80 h⁻³ Gpc³. In principle, such a large volume and medium density of tracers should facilitate high-precision cosmological constraints. We measure the angular clustering of photometrically classified quasars using an optimal quadratic estimator in four redshift slices, with an accuracy of ~25% over a bin width of δℓ ~ 10–15, on scales corresponding to matter-radiation equality and larger (ℓ ~ 2–30).

  8. Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements.

    PubMed

    Lan, Hui; Carson, Rachel; Provart, Nicholas J; Bonner, Anthony J

    2007-09-21

    Arabidopsis thaliana is the model species of current plant genomic research, with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Using in-house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. Our method of combining classifiers can improve the accuracy of such predictions (in this case, predictions of genes involved in stress response in plants), and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
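
    The combination step is a weighted vote over the base classifiers' outputs. A minimal sketch of one common form, weighting each classifier by a validation-derived score (the paper's exact weighting scheme is not reproduced):

      import numpy as np

      def weighted_vote(prob_list, weights):
          """Combine per-classifier class-probability arrays (each of shape
          n_samples x n_classes) by a weighted average, then take argmax."""
          w = np.asarray(weights, dtype=float)
          w /= w.sum()
          combined = sum(wi * p for wi, p in zip(w, prob_list))
          return combined.argmax(axis=1)

      # two toy classifiers, three samples, two classes (stress / no stress)
      p1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
      p2 = np.array([[0.6, 0.4], [0.7, 0.3], [0.1, 0.9]])
      print(weighted_vote([p1, p2], weights=[0.8, 0.2]))  # classifier 1 dominates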

  9. HIV-1 genetic diversity and primary drug resistance mutations before large-scale access to antiretroviral therapy, Republic of Congo.

    PubMed

    Niama, Fabien Roch; Vidal, Nicole; Diop-Ndiaye, Halimatou; Nguimbi, Etienne; Ahombo, Gabriel; Diakabana, Philippe; Bayonne Kombo, Édith Sophie; Mayengue, Pembe Issamou; Kobawila, Simon-Charles; Parra, Henri Joseph; Toure-Kane, Coumba

    2017-07-05

    In this work, we investigated the genetic diversity of HIV-1 and the presence of mutations conferring antiretroviral drug resistance in 50 drug-naïve infected persons in the Republic of Congo (RoC). Samples were obtained before large-scale access to HAART, in 2002 and 2004. To assess HIV-1 genetic recombination, sequencing of the pol gene encoding the protease and partial reverse transcriptase was performed and analyzed with updated references, including newly characterized CRFs. The assessment of drug resistance was conducted according to the WHO protocol. Among the 50 samples analyzed for the pol gene, 50% were classified as intersubtype recombinants, carrying complex structures inside the pol fragment. Five samples could not be classified (noted U). The most prevalent subtypes were G, with 10 isolates, and D, with 11 isolates. One isolate each of A, J, H, CRF05, CRF18 and CRF37 was also found. Two samples (4%) harboring the mutations M230L and Y181C, associated with the TAMs M41L and T215Y, respectively, were found. This first study in the RoC, based on the WHO classification, shows that the threshold of transmitted drug resistance before large-scale access to antiretroviral therapy is 4%.

  10. Ensemble candidate classification for the LOTAAS pulsar survey

    NASA Astrophysics Data System (ADS)

    Tan, C. M.; Lyon, R. J.; Stappers, B. W.; Cooper, S.; Hessels, J. W. T.; Kondratiev, V. I.; Michilli, D.; Sanidas, S.

    2018-03-01

    One of the biggest challenges arising from modern large-scale pulsar surveys is the number of candidates generated. Here, we implemented several improvements to the machine learning (ML) classifier previously used by the LOFAR Tied-Array All-Sky Survey (LOTAAS) to look for new pulsars via filtering the candidates obtained during periodicity searches. To assist the ML algorithm, we have introduced new features which capture the frequency and time evolution of the signal and improved the signal-to-noise calculation accounting for broad profiles. We enhanced the ML classifier by including a third class characterizing RFI instances, allowing candidates arising from RFI to be isolated, reducing the false positive return rate. We also introduced a new training data set used by the ML algorithm that includes a large sample of pulsars misclassified by the previous classifier. Lastly, we developed an ensemble classifier comprised of five different Decision Trees. Taken together these updates improve the pulsar recall rate by 2.5 per cent, while also improving the ability to identify pulsars with wide pulse profiles, often misclassified by the previous classifier. The new ensemble classifier is also able to reduce the percentage of false positive candidates identified from each LOTAAS pointing from 2.5 per cent (˜500 candidates) to 1.1 per cent (˜220 candidates).
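
    The final stage described here is a majority vote over five decision trees with three output classes (pulsar, non-pulsar, RFI). A minimal sketch of that arrangement with scikit-learn and random placeholder candidate features (the LOTAAS feature set is not reproduced):

      import numpy as np
      from sklearn.ensemble import VotingClassifier
      from sklearn.tree import DecisionTreeClassifier

      # placeholder features; classes: 0 = non-pulsar, 1 = pulsar, 2 = RFI
      rng = np.random.default_rng(1)
      X, y = rng.normal(size=(1000, 10)), rng.integers(0, 3, size=1000)

      ensemble = VotingClassifier(
          estimators=[(f"tree{i}", DecisionTreeClassifier(max_depth=6,
                                                          random_state=i))
                      for i in range(5)],
          voting="hard")                  # majority vote of five trees
      ensemble.fit(X, y)
      print(ensemble.predict(X[:5]))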

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Torcellini, P.; Pless, S.; Lobato, C.

    Ongoing work at the National Renewable Energy Laboratory indicates that net-zero energy building (NZEB) status is both achievable and repeatable today. This paper presents a definition framework for classifying NZEBs and a real-life example that demonstrates how a large-scale office building can cost-effectively achieve net-zero energy.

  12. Classification and asymptotic scaling of the light-cone wave-function amplitudes of hadrons

    DOE PAGES

    Ji, Xiangdong; Ma, Jian-Ping; Yuan, Feng

    2004-01-29

    Here we classify the hadron light-cone wave-function amplitudes in terms of parton helicity, orbital angular momentum, and quark-flavor and color symmetries. We show in detail how this is done for the pion, ρ meson, nucleon, and delta resonance up to and including three partons. For the pion and nucleon, we also consider four-parton amplitudes. Using the scaling law derived previously, we show how these amplitudes scale in the limit that all parton transverse momenta become large.

  13. Detecting natural occlusion boundaries using local cues

    PubMed Central

    DiMattina, Christopher; Fox, Sean A.; Lewicki, Michael S.

    2012-01-01

    Occlusion boundaries and junctions provide important cues for inferring three-dimensional scene organization from two-dimensional images. Although several investigators in machine vision have developed algorithms for detecting occlusions and other edges in natural images, relatively few psychophysics or neurophysiology studies have investigated what features are used by the visual system to detect natural occlusions. In this study, we addressed this question using a psychophysical experiment where subjects discriminated image patches containing occlusions from patches containing surfaces. Image patches were drawn from a novel occlusion database containing labeled occlusion boundaries and textured surfaces in a variety of natural scenes. Consistent with related previous work, we found that relatively large image patches were needed to attain reliable performance, suggesting that human subjects integrate complex information over a large spatial region to detect natural occlusions. By defining machine observers using a set of previously studied features measured from natural occlusions and surfaces, we demonstrate that simple features defined at the spatial scale of the image patch are insufficient to account for human performance in the task. To define machine observers using a more biologically plausible multiscale feature set, we trained standard linear and neural network classifiers on the rectified outputs of a Gabor filter bank applied to the image patches. We found that simple linear classifiers could not match human performance, while a neural network classifier combining filter information across location and spatial scale compared well. These results demonstrate the importance of combining a variety of cues defined at multiple spatial scales for detecting natural occlusions. PMID:23255731

  14. Advanced Cell Classifier: User-Friendly Machine-Learning-Based Software for Discovering Phenotypes in High-Content Imaging Data.

    PubMed

    Piccinini, Filippo; Balassa, Tamas; Szkalisity, Abel; Molnar, Csaba; Paavolainen, Lassi; Kujala, Kaisa; Buzas, Krisztina; Sarazova, Marie; Pietiainen, Vilja; Kutay, Ulrike; Smith, Kevin; Horvath, Peter

    2017-06-28

    High-content, imaging-based screens now routinely generate data on a scale that precludes manual verification and interrogation. Software applying machine learning has become an essential tool to automate analysis, but these methods require annotated examples to learn from. Efficiently exploring large datasets to find relevant examples remains a challenging bottleneck. Here, we present Advanced Cell Classifier (ACC), a graphical software package for phenotypic analysis that addresses these difficulties. ACC applies machine-learning and image-analysis methods to high-content data generated by large-scale, cell-based experiments. It features methods to mine microscopic image data, discover new phenotypes, and improve recognition performance. We demonstrate that these features substantially expedite the training process, successfully uncover rare phenotypes, and improve the accuracy of the analysis. ACC is extensively documented, designed to be user-friendly for researchers without machine-learning expertise, and distributed as a free open-source tool at www.cellclassifier.org.

  15. Studies of Sub-Synchronous Oscillations in Large-Scale Wind Farm Integrated System

    NASA Astrophysics Data System (ADS)

    Yue, Liu; Hang, Mend

    2018-01-01

    With the rapid development and construction of large-scale wind farms and their grid-connected operation, series-compensated AC transmission of wind power is gradually becoming the main way to deliver wind power and improve its availability and grid stability, but the integration of wind farms changes the SSO (sub-synchronous oscillation) damping characteristics of the synchronous generator system. Regarding the SSO problem caused by the integration of large-scale wind farms, this paper focuses on doubly fed induction generator (DFIG) based wind farms and summarizes the SSO mechanisms in large-scale wind power integrated systems with series compensation, which can be classified into three types: sub-synchronous control interaction (SSCI), sub-synchronous torsional interaction (SSTI), and sub-synchronous resonance (SSR). Then, SSO modelling and analysis methods are categorized and compared by their applicable areas. Furthermore, this paper summarizes the suppression measures of actual SSO projects based on different control objectives. Finally, the research prospects in this field are explored.

  16. The fusion of large scale classified side-scan sonar image mosaics.

    PubMed

    Reed, Scott; Tena Ruiz, Ioseba; Capus, Chris; Petillot, Yvan

    2006-07-01

    This paper presents a unified framework for the creation of classified maps of the seafloor from sonar imagery. Significant challenges in photometric correction, classification, navigation and registration, and image fusion are addressed. The techniques described are directly applicable to a range of remote sensing problems. Recent advances in side-scan data correction are incorporated to compensate for the sonar beam pattern and motion of the acquisition platform. The corrected images are segmented using pixel-based textural features and standard classifiers. In parallel, the navigation of the sonar device is processed using Kalman filtering techniques. A simultaneous localization and mapping framework is adopted to improve the navigation accuracy and produce georeferenced mosaics of the segmented side-scan data. These are fused within a Markovian framework and two fusion models are presented. The first uses a voting scheme regularized by an isotropic Markov random field and is applicable when the reliability of each information source is unknown. The Markov model is also used to inpaint regions where no final classification decision can be reached using pixel level fusion. The second model formally introduces the reliability of each information source into a probabilistic model. Evaluation of the two models using both synthetic images and real data from a large scale survey shows significant quantitative and qualitative improvement using the fusion approach.

  17. Classifying epileptic EEG signals with delay permutation entropy and Multi-Scale K-means.

    PubMed

    Zhu, Guohun; Li, Yan; Wen, Peng Paul; Wang, Shuaifang

    2015-01-01

    Most epileptic EEG classification algorithms are supervised and require large training datasets, which hinders their use in real-time applications. This chapter proposes an unsupervised Multi-Scale K-means (MSK-means) algorithm to distinguish epileptic EEG signals and identify epileptic zones. The random initialization of the K-means algorithm can lead to wrong clusters. Based on the characteristics of EEGs, the MSK-means algorithm initializes the coarse-scale centroid of a cluster with a suitable scale factor. In this chapter, the MSK-means algorithm is proved theoretically superior to the K-means algorithm in efficiency. In addition, three classifiers, K-means, MSK-means and the support vector machine (SVM), are used to identify seizures and localize the epileptogenic zone using delay permutation entropy features. The experimental results demonstrate that identifying seizures with the MSK-means algorithm and delay permutation entropy achieves 4.7% higher accuracy than K-means, and 0.7% higher accuracy than the SVM.
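
    Delay permutation entropy is the Shannon entropy of ordinal patterns in a delay-embedded signal. A minimal sketch of the feature computation (the order and delay values are illustrative), whose output would feed the MSK-means or SVM classifiers:

      import numpy as np
      from math import factorial, log

      def delay_permutation_entropy(x, order=3, delay=2):
          """Shannon entropy of ordinal patterns in a delay-embedded
          signal, normalized to [0, 1] by log(order!)."""
          n = len(x) - (order - 1) * delay
          counts = {}
          for i in range(n):
              window = x[i : i + order * delay : delay]
              pattern = tuple(np.argsort(window))   # ordinal pattern
              counts[pattern] = counts.get(pattern, 0) + 1
          p = np.array(list(counts.values())) / n
          return float(-(p * np.log(p)).sum() / log(factorial(order)))

      rng = np.random.default_rng(0)
      print(delay_permutation_entropy(rng.normal(size=2000)))       # near 1
      print(delay_permutation_entropy(np.sin(np.arange(2000) / 5))) # lower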

  18. Large-scale Activities Associated with the 2005 Sep. 7th Event

    NASA Astrophysics Data System (ADS)

    Zong, Weiguo

    We present a multi-wavelength study of large-scale activities associated with a significant solar event. On 2005 September 7, a flare classified as larger than X17 was observed. Combining Hα 6562.8 Å, He I 10830 Å and soft X-ray observations, three large-scale activities were found to propagate over a long distance on the solar surface. 1) The first large-scale activity emanated from the flare site, propagated westward around the solar equator and appeared as sequential brightenings; with the MDI longitudinal magnetic field map, the activity was found to propagate along the magnetic network. 2) The second large-scale activity could be well identified in both He I 10830 Å images and soft X-ray images and appeared as a diffuse emission enhancement propagating away. This activity started later than the first one and was not centered on the flare site; moreover, a rotation was found along with the bright front propagating away. 3) The third activity was ahead of the second one and was identified as a "winking" filament. The three activities have different origins and were seldom observed together in one event, so this study is useful for understanding the mechanism of large-scale activities on the solar surface.

  19. INVENTORY AND CLASSIFICATION OF GREAT LAKES COASTAL WETLANDS FOR MONITORING AND ASSESSMENT AT LARGE SPATIAL SCALES

    EPA Science Inventory

    Monitoring aquatic resources for regional assessments requires an accurate and comprehensive inventory of the resource and a useful classification of ecosystem similarities. Our research effort to create an electronic database and work with various ways to classify coastal wetlands...

  20. Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

    PubMed Central

    Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu

    2013-01-01

    The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in the classifiers, which cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples needed to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively, based on 3 large-scale cancer microarray datasets provided by the second phase of the MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol is proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction by improving classifier reliability. PMID:23861920

  1. A fast learning method for large scale and multi-class samples of SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class support vector machine (SVM) classification, based on a binary tree, is presented to address the low learning efficiency of SVMs when processing large-scale multi-class samples. The method builds the binary tree hierarchy bottom-up; according to the resulting hierarchy, a sub-classifier learns from the samples of each node. During learning, several class clusters are generated by a first clustering of the training samples. Central points are first extracted from those class clusters that contain only one type of sample. For clusters that contain two types of samples, the cluster numbers of their positive and negative samples are set according to their degree of mixing, a secondary clustering is then undertaken, and central points are extracted from the resulting sub-class clusters. Sub-classifiers are obtained by learning from the reduced sample set formed by the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, guarantees high classification accuracy while greatly reducing the number of samples and effectively improving learning efficiency.
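
    As a rough illustration of the sample-reduction step (not the paper's full binary-tree procedure), each class can be condensed to a handful of cluster centroids before the node's SVM is trained; the cluster counts and synthetic data below are placeholders.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.datasets import make_blobs
        from sklearn.svm import SVC

        X, y = make_blobs(n_samples=5000, centers=2, cluster_std=2.0,
                          random_state=0)

        def reduce_class(X_cls, n_centroids=20):
            # Replace a class by the centroids of its internal clusters.
            km = KMeans(n_clusters=n_centroids, n_init=4, random_state=0)
            return km.fit(X_cls).cluster_centers_

        Xr = np.vstack([reduce_class(X[y == c]) for c in (0, 1)])
        yr = np.repeat([0, 1], 20)
        clf = SVC(kernel="rbf").fit(Xr, yr)   # trained on 40 points, not 5000
        print("accuracy on the full set:", clf.score(X, y))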

  2. Textual and visual content-based anti-phishing: a Bayesian approach.

    PubMed

    Zhang, Haijun; Liu, Gang; Chow, Tommy W S; Liu, Wenyin

    2011-10-01

    A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual contents to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm fusing the results from the classifiers are introduced. An outstanding feature of this paper is the exploration of a Bayesian model to estimate the matching threshold, which the classifier requires to determine the class of the web page, i.e., whether or not it is phishing. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure the visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, Bayes theory is used to synthesize the classification results from textual and visual content. The effectiveness of our proposed approach was examined on a large-scale dataset collected from real phishing cases. Experimental results demonstrated that the text classifier and the image classifier we designed deliver promising results, that the fusion algorithm outperforms either of the individual classifiers, and that our model can be adapted to different phishing cases. © 2011 IEEE
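
    The paper's exact fusion rule is not reproduced here; the sketch below shows one standard Bayesian way to combine two per-modality posteriors under a conditional-independence assumption, offered only as an illustration of the idea.

        def fuse(p_text, p_image, prior=0.5):
            # Combine two posterior phishing probabilities that share the
            # same prior, assuming text and image evidence are conditionally
            # independent: multiply likelihood ratios, then re-normalize.
            prior_odds = prior / (1.0 - prior)
            odds = ((p_text / (1.0 - p_text)) *
                    (p_image / (1.0 - p_image)) / prior_odds)
            return odds / (1.0 + odds)

        # Two moderately confident detectors yield a more confident fusion.
        print(fuse(0.9, 0.7))   # ~0.95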

  3. Application of hierarchical clustering method to classify of space-time rainfall patterns

    NASA Astrophysics Data System (ADS)

    Yu, Hwa-Lung; Chang, Tu-Je

    2010-05-01

    Understanding local precipitation patterns is essential to water resources management and flood mitigation. Precipitation patterns can vary in space and time depending upon factors at different spatial scales, such as local topographic changes and macroscopic atmospheric circulation. The spatiotemporal variation of precipitation in Taiwan is significant due to its complex terrain and its location in the subtropical western Pacific, at the boundary between the Pacific Ocean and the Asian continent, with complex interactions among the climatic processes. This study characterizes local-scale precipitation patterns by classifying the historical space-time precipitation records. We applied the hierarchical ascending clustering method to analyze the precipitation records from 1960 to 2008 at the six rainfall stations located in the Lan-yang catchment in the northeast of the island. Our results identify four primary space-time precipitation types, which may result from distinct driving forces, namely changes in atmospheric variables and topography at different space-time scales. This study also presents an important application of statistical downscaling to combine large-scale upper-air circulation with local space-time precipitation patterns.
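
    Hierarchical ascending (agglomerative) clustering of this kind is available off the shelf; a minimal sketch, with synthetic stand-ins for the six-station rainfall records and an assumed four-cluster cut, might look like this:

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage

        rng = np.random.default_rng(1)
        records = rng.random((200, 6))   # one row per event, six stations

        Z = linkage(records, method="ward")               # build the hierarchy
        pattern = fcluster(Z, t=4, criterion="maxclust")  # cut into 4 types
        print(np.bincount(pattern)[1:])                   # events per type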

  4. A scaling transformation for classifier output based on likelihood ratio: Applications to a CAD workstation for diagnosis of breast cancer

    PubMed Central

    Horsch, Karla; Pesce, Lorenzo L.; Giger, Maryellen L.; Metz, Charles E.; Jiang, Yulei

    2012-01-01

    Purpose: The authors developed scaling methods that monotonically transform the output of one classifier to the “scale” of another. Such transformations affect the distribution of classifier output while leaving the ROC curve unchanged. In particular, they investigated transformations between radiologists and computer classifiers, with the goal of addressing the problem of comparing and interpreting case-specific values of output from two classifiers. Methods: Using both simulated and radiologists’ rating data of breast imaging cases, the authors investigated a likelihood-ratio-scaling transformation, based on “matching” classifier likelihood ratios. For comparison, three other scaling transformations were investigated that were based on matching classifier true positive fraction, false positive fraction, or cumulative distribution function, respectively. The authors explored modifying the computer output to reflect the scale of the radiologist, as well as modifying the radiologist’s ratings to reflect the scale of the computer. They also evaluated how dataset size affects the transformations. Results: When ROC curves of two classifiers differed substantially, the four transformations were found to be quite different. The likelihood-ratio scaling transformation was found to vary widely from radiologist to radiologist. Similar results were found for the other transformations. Our simulations explored the effect of database sizes on the accuracy of the estimation of our scaling transformations. Conclusions: The likelihood-ratio-scaling transformation that the authors have developed and evaluated was shown to be capable of transforming computer and radiologist outputs to a common scale reliably, thereby allowing the comparison of the computer and radiologist outputs on the basis of a clinically relevant statistic. PMID:22559651
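
    Of the four transformations compared, the cumulative-distribution-function matching is the simplest to sketch: map a score from one classifier to the value of the other classifier with the same cumulative probability. Being monotonic, it leaves the ROC curve unchanged; the sample data below are invented, not the study's ratings.

        import numpy as np

        def cdf_match(x, samples_a, samples_b):
            # Empirical quantile mapping from classifier A's scale to B's.
            p = np.searchsorted(np.sort(samples_a), x) / len(samples_a)
            return np.quantile(samples_b, p)

        rng = np.random.default_rng(2)
        computer_scores = rng.beta(2, 5, 1000)         # classifier A outputs
        radiologist_ratings = rng.normal(40, 15, 500)  # classifier B outputs
        print(cdf_match(0.3, computer_scores, radiologist_ratings))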

  5. Detection of Neuron Membranes in Electron Microscopy Images Using Multi-scale Context and Radon-Like Features

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seyedhosseini, Mojtaba; Kumar, Ritwik; Jurrus, Elizabeth R.

    2011-10-01

    Automated neural circuit reconstruction through electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that exploits multi-scale contextual information together with Radon-like features (RLF) to learn a series of discriminative models. The main idea is to build a framework which is capable of extracting information about cell membranes from a large contextual area of an EM image in a computationally efficient way. Toward this goal, we extract RLF that can be computed efficiently from the input image and generate a scale-space representation of the context images that are obtained at the output of each discriminative model in the series. Compared to a single-scale model, the use of a multi-scale representation of the context image gives the subsequent classifiers access to a larger contextual area in an effective way. Our strategy is general and independent of the classifier and has the potential to be used in any context-based framework. We demonstrate that our method outperforms the state-of-the-art algorithms in detection of neuron membranes in EM images.

  6. Dynamic and scalable audio classification by collective network of binary classifiers framework: an evolutionary approach.

    PubMed

    Kiranyaz, Serkan; Mäkinen, Toni; Gabbouj, Moncef

    2012-10-01

    In this paper, we propose a novel framework based on a collective network of evolutionary binary classifiers (CNBC) to address the problems of feature and class scalability. The main goal of the proposed framework is to achieve a high classification performance over dynamic audio and video repositories. The proposed framework adopts a "Divide and Conquer" approach in which an individual network of binary classifiers (NBC) is allocated to discriminate each audio class. An evolutionary search is applied to find the best binary classifier in each NBC with respect to a given criterion. Through the incremental evolution sessions, the CNBC framework can dynamically adapt to each new incoming class or feature set without resorting to full-scale re-training or re-configuration. Therefore, the CNBC framework is particularly designed for dynamically varying databases where no conventional static classifier can adapt to such changes. In short, it is an entirely novel topology and an unprecedented approach for dynamic, content/data-adaptive, and scalable audio classification. A large set of audio features can be effectively used in the framework, where the CNBC makes appropriate selections and combinations so as to achieve the highest discrimination among individual audio classes. Experiments demonstrate a high classification accuracy (above 90%) and efficiency of the proposed framework over large and dynamic audio databases. Copyright © 2012 Elsevier Ltd. All rights reserved.

  7. Revealing the z ~ 2.5 Cosmic Web with 3D Lyα Forest Tomography: a Deformation Tensor Approach

    NASA Astrophysics Data System (ADS)

    Lee, Khee-Gan; White, Martin

    2016-11-01

    Studies of cosmological objects should take into account their positions within the cosmic web of large-scale structure. Unfortunately, the cosmic web has only been extensively mapped at low redshifts (z < 1), using galaxy redshifts as tracers of the underlying density field. At z > 1, the required galaxy densities are inaccessible for the foreseeable future, but 3D reconstructions of Lyα forest absorption in closely separated background QSOs and star-forming galaxies already offer a detailed window into z ~ 2-3 large-scale structure. We quantify the utility of such maps for studying the cosmic web by using realistic z = 2.5 Lyα forest simulations matched to observational properties of upcoming surveys. A deformation tensor-based analysis is used to classify voids, sheets, filaments, and nodes in the flux, which are compared to those determined from the underlying dark matter (DM) field. We find an extremely good correspondence, with 70% of the volume in the flux maps correctly classified relative to the DM web, and 99% classified to within one eigenvalue. This compares favorably to the performance of galaxy-based classifiers with even the highest galaxy densities from low-redshift surveys. We find that narrow survey geometries can degrade the recovery of the cosmic web unless the survey is ≳ 60 h⁻¹ Mpc or ≳ 1 deg on the sky. We also examine halo abundances as a function of the cosmic web, and find a clear dependence as a function of flux overdensity, but little explicit dependence on the cosmic web. These methods will provide a new window on cosmological environments of galaxies at this very special time in galaxy formation, "high noon," and on overall properties of cosmological structures at this epoch.
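
    The deformation-tensor classification mentioned here follows a standard eigenvalue rule: at each point, count the tensor eigenvalues above a collapse threshold, giving void, sheet, filament, or node for 0, 1, 2, or 3 such eigenvalues. A minimal sketch (the threshold value and sample eigenvalues are invented):

        import numpy as np

        def classify_web(eigenvalues, threshold=0.0):
            # eigenvalues: (..., 3) array of deformation-tensor eigenvalues.
            n_collapsing = (eigenvalues > threshold).sum(axis=-1)
            return np.array(["void", "sheet", "filament", "node"])[n_collapsing]

        lam = np.array([[-0.2, -0.1, 0.3],    # one axis collapsing -> sheet
                        [0.1, 0.4, 0.9]])     # all three collapsing -> node
        print(classify_web(lam))              # ['sheet' 'node']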

  8. Quantity Representation in Children and Rhesus Monkeys: Linear Versus Logarithmic Scales

    ERIC Educational Resources Information Center

    Beran, Michael J.; Johnson-Pynn, Julie S.; Ready, Christopher

    2008-01-01

    The performances of 4- and 5-year-olds and rhesus monkeys were compared using a computerized task for quantity assessment. Participants first learned two quantity anchor values and then responded to intermediate values by classifying them as similar to either the large anchor or the small anchor. Of primary interest was an assessment of where the…

  9. AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.

    PubMed

    Yao, Hantao; Zhang, Shiliang; Yan, Chenggang; Zhang, Yongdong; Li, Jintao; Tian, Qi

    Compared with traditional image classification, fine-grained visual categorization is a more challenging task, because it aims to classify objects belonging to the same species, e.g., to classify hundreds of birds or cars. In the past several years, researchers have made many achievements on this topic. However, most of them are heavily dependent on artificial annotations, e.g., bounding boxes, part annotations, and so on. The requirement of artificial annotations largely hinders scalability and application. Motivated to remove such dependence, this paper proposes a robust and discriminative visual description named Automated Bi-level Description (AutoBD). "Bi-level" denotes two complementary part-level and object-level visual descriptions, respectively. AutoBD is "automated" because it only requires the image-level labels of training images and does not need any annotations for testing images. Compared with part annotations labeled by humans, image-level labels can be easily acquired, which thus makes AutoBD suitable for large-scale visual categorization. Specifically, the part-level description is extracted by identifying the local region that saliently represents the visual distinctiveness. The object-level description is extracted from object bounding boxes generated with a co-localization algorithm. Although it uses only the image-level labels, AutoBD outperforms the recent studies on two public benchmarks, with classification accuracies of 81.6% on CUB-200-2011 and 88.9% on Car-196, respectively. On the large-scale Birdsnap data set, AutoBD achieves an accuracy of 68%, which is currently the best performance to the best of our knowledge.

  10. Geomorphic Flood Area (GFA): a DEM-based tool for flood susceptibility mapping at large scales

    NASA Astrophysics Data System (ADS)

    Manfreda, S.; Samela, C.; Albano, R.; Sole, A.

    2017-12-01

    Flood hazard and risk mapping over large areas is a critical issue. Recently, many researchers have been trying to achieve global-scale mapping, encountering several difficulties, above all the lack of data and implementation costs. In data-scarce environments, a preliminary and cost-effective floodplain delineation can be performed using geomorphic methods (e.g., Manfreda et al., 2014). We carried out several years of research on this topic, proposing a morphologic descriptor named the Geomorphic Flood Index (GFI) (Samela et al., 2017) and developing a Digital Elevation Model (DEM)-based procedure able to identify flood-susceptible areas. The procedure exhibited high accuracy in several test sites in Europe, the United States, and Africa (Manfreda et al., 2015; Samela et al., 2016, 2017) and has recently been implemented in a QGIS plugin named the Geomorphic Flood Area (GFA) tool. The tool automatically computes the GFI and turns it into a linear binary classifier capable of detecting flood-prone areas. To train this classifier, an inundation map derived using hydraulic models for a small portion of the basin is required (the minimum is 2% of the river basin's area). In this way, the GFA tool extends the classification of flood-prone areas across the entire basin. We are also defining a simplified procedure for the estimation of the river depth, which may be helpful in large-scale analyses to approximately evaluate the expected flood damages in the surrounding areas. References: Manfreda, S., Nardi, F., Samela, C., Grimaldi, S., Taramasso, A. C., Roth, G., & Sole, A. (2014). Investigation on the use of geomorphic approaches for the delineation of flood prone areas. J. Hydrol., 517, 863-876. Manfreda, S., Samela, C., Gioia, A., Consoli, G., Iacobellis, V., Giuzio, L., & Sole, A. (2016). Flood-prone areas assessment using linear binary classifiers based on flood maps obtained from 1D and 2D hydraulic models. Nat. Hazards, 79(2), 735-754. Samela, C., Manfreda, S., Paola, F. D., Giugni, M., Sole, A., & Fiorentino, M. (2016). DEM-based approaches for the delineation of flood-prone areas in an ungauged basin in Africa. J. Hydrol. Eng., 06015010. Samela, C., Troy, T. J., & Manfreda, S. (2017). Geomorphic classifiers for flood-prone areas delineation for data-scarce environments. Adv. Water Resour., 102, 13-28.
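
    Conceptually, turning a continuous index like the GFI into a linear binary classifier amounts to calibrating a single threshold against the hydraulically modelled training map. One common, illustrative way to pick that threshold is Youden's J on the ROC curve; the criterion and the synthetic data below are assumptions, not necessarily the GFA tool's internals.

        import numpy as np
        from sklearn.metrics import roc_curve

        rng = np.random.default_rng(4)
        # Synthetic GFI values for the calibration subset of the basin.
        gfi_train = np.concatenate([rng.normal(-1, 1, 500),
                                    rng.normal(1, 1, 100)])
        flooded = np.concatenate([np.zeros(500), np.ones(100)])  # hydraulic map

        fpr, tpr, thr = roc_curve(flooded, gfi_train)
        best = thr[np.argmax(tpr - fpr)]          # Youden's J threshold

        def flood_prone(gfi):                     # apply basin-wide
            return gfi >= best

        print(best, flood_prone(np.array([-2.0, 1.5])))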

  11. Supervised Outlier Detection in Large-Scale Mvs Point Clouds for 3d City Modeling Applications

    NASA Astrophysics Data System (ADS)

    Stucker, C.; Richard, A.; Wegner, J. D.; Schindler, K.

    2018-05-01

    We propose to use a discriminative classifier for outlier detection in large-scale point clouds of cities generated via multi-view stereo (MVS) from densely acquired images. What makes outlier removal hard are the varying distributions of inliers and outliers across a scene. Heuristic outlier removal using a specific feature that encodes point distribution often delivers unsatisfying results: although most outliers can be identified correctly (high recall), many inliers are erroneously removed (low precision), too. This aggravates 3D object reconstruction due to missing data. We thus propose to discriminatively learn class-specific distributions directly from the data to achieve high precision. We apply a standard Random Forest classifier that infers a binary label (inlier or outlier) for each 3D point in the raw, unfiltered point cloud, and we test two approaches for training. In the first, non-semantic approach, features are extracted without considering the semantic interpretation of the 3D points; the trained model approximates the average distribution of inliers and outliers across all semantic classes. In the second, semantic interpretation is incorporated into the learning process, i.e., we train separate inlier/outlier classifiers per semantic class (building facades, roofs, ground, vegetation, fields, and water). The performance of learned filtering is evaluated on several large SfM point clouds of cities. The results confirm our underlying assumption that discriminatively learning inlier-outlier distributions improves precision over global heuristics by up to ≈ 12 percentage points. Moreover, semantically informed filtering that models class-specific distributions further improves precision by up to ≈ 10 percentage points, being able to remove very isolated building, roof, and water points while preserving inliers on building facades and vegetation.
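
    The semantically informed variant boils down to one inlier/outlier model per class, with each 3D point routed to the model of its semantic label at prediction time. The sketch below uses made-up features and labels purely to show that structure.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        classes = ["facade", "roof", "ground", "vegetation", "field", "water"]
        rng = np.random.default_rng(5)

        models = {}
        for c in classes:
            X = rng.random((1000, 8))        # per-point geometric features
            y = rng.integers(0, 2, 1000)     # 1 = inlier, 0 = outlier
            models[c] = RandomForestClassifier(n_estimators=100).fit(X, y)

        # Route each point to its class-specific filter.
        point_features, point_class = rng.random((1, 8)), "roof"
        keep = models[point_class].predict(point_features)[0] == 1
        print("keep point:", keep)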

  12. DeepDeath: Learning to predict the underlying cause of death with Big Data.

    PubMed

    Hassanzadeh, Hamid Reza; Ying Sha; Wang, May D

    2017-07-01

    Multiple cause-of-death data provide a valuable source of information that can be used to enhance health standards by predicting health-related trajectories in societies with large populations. These data are often available in large quantities across U.S. states and require Big Data techniques to uncover complex hidden patterns. We design two different classes of models suitable for large-scale analysis of mortality data: a Hadoop-based ensemble of random forests trained over N-grams, and DeepDeath, a deep classifier based on the recurrent neural network (RNN). We apply both classes to the mortality data provided by the National Center for Health Statistics and show that while both perform significantly better than the random classifier, the deep model, which utilizes long short-term memory networks (LSTMs), surpasses the N-gram-based models and is capable of learning the temporal aspect of the data without a need for building ad hoc, expert-driven features.
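
    A skeletal version of such an RNN classifier — sequences of condition codes embedded and fed through an LSTM whose final state predicts the underlying cause — might look like the following Keras sketch; all sizes and the commented training call are placeholders, not the paper's configuration.

        import tensorflow as tf

        n_codes, n_causes = 5000, 50   # condition vocabulary, cause labels
        model = tf.keras.Sequential([
            tf.keras.layers.Embedding(n_codes, 64),
            tf.keras.layers.LSTM(128),     # captures the temporal structure
            tf.keras.layers.Dense(n_causes, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        # model.fit(code_sequences, underlying_cause_ids, epochs=5)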

  13. An ensemble heterogeneous classification methodology for discovering health-related knowledge in social media messages.

    PubMed

    Tuarob, Suppawong; Tucker, Conrad S; Salathe, Marcel; Ram, Nilam

    2014-06-01

    The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies have illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data, where sparsity and noise are the norm. This paper aims to address the limitations posed by the traditional bag-of-words based methods and proposes the use of heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large-scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data. Social media data are characterized by an abundance of short, social-oriented messages that do not conform to standard languages, both grammatically and syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods, which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine-learning-based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported. Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross-validation of the classification models in the small-scale experiment and for training the classifiers in the real-world large-scale experiment. The second data set is a random sample of real-world Twitter data in the US. The third data set is a random sample of real-world Facebook Timeline posts. Two sets of evaluations are conducted to investigate the proposed model's ability to discover health-related information in the social media domain: small-scale and large-scale evaluations. The small-scale evaluation employs 10-fold cross-validation on the labeled data, and aims to tune parameters of the proposed models and to compare with the state-of-the-art method. The large-scale evaluation tests the trained classification models on the native, real-world data sets, and is needed to verify the ability of the proposed model to handle the massive heterogeneity in real-world social media. The small-scale experiment reveals that the proposed method is able to mitigate the limitations of the well-established techniques in the literature, resulting in a performance improvement of 18.61% (F-measure). The large-scale experiment further reveals that the baseline fails to perform well on larger data with higher degrees of heterogeneity, while the proposed method is able to yield reasonably good performance and outperforms the baseline by 46.62% (F-measure) on average. Copyright © 2014 Elsevier Inc. All rights reserved.
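
    The collaboration of classifiers trained on semantically different views of a message can be sketched with a soft-voting ensemble; the two views, the tiny training set, and the model choices below are illustrative assumptions rather than the paper's exact configuration.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        texts = ["got my flu shot today", "new phone arrived",
                 "migraine again", "great game last night"]
        labels = np.array([1, 0, 1, 0])          # 1 = health-related

        # Each member classifier sees a different "aspect" of the text.
        views = [
            make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB()),
            make_pipeline(CountVectorizer(analyzer="char_wb",
                                          ngram_range=(2, 4)),
                          LogisticRegression(max_iter=1000)),
        ]
        for v in views:
            v.fit(texts, labels)

        # Soft-vote: average the class probabilities of the member models.
        new = ["flu symptoms all week"]
        proba = np.mean([v.predict_proba(new) for v in views], axis=0)
        print(proba.argmax(axis=1))   # 1 -> health-related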

  14. Evaluating data mining algorithms using molecular dynamics trajectories.

    PubMed

    Tatsis, Vasileios A; Tjortjis, Christos; Tzirakis, Panagiotis

    2013-01-01

    Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the μs time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations of a potential enzyme mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using 'classification' errors to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups, when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.

  15. Hierarchical ensemble of global and local classifiers for face recognition.

    PubMed

    Su, Yu; Shan, Shiguang; Chen, Xilin; Gao, Wen

    2009-08-01

    In the literature of psychophysics and neurophysiology, many studies have shown that both global and local features are crucial for face representation and recognition. This paper proposes a novel face recognition method which exploits both global and local discriminative features. In this method, global features are extracted from the whole face images by keeping the low-frequency coefficients of Fourier transform, which we believe encodes the holistic facial information, such as facial contour. For local feature extraction, Gabor wavelets are exploited considering their biological relevance. After that, Fisher's linear discriminant (FLD) is separately applied to the global Fourier features and each local patch of Gabor features. Thus, multiple FLD classifiers are obtained, each embodying different facial evidences for face recognition. Finally, all these classifiers are combined to form a hierarchical ensemble classifier. We evaluate the proposed method using two large-scale face databases: FERET and FRGC version 2.0. Experiments show that the results of our method are impressively better than the best known results with the same evaluation protocol.

  16. Fast and sensitive taxonomic classification for metagenomics with Kaiju

    PubMed Central

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-01-01

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows–Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk. PMID:27071849

  17. Fast and sensitive taxonomic classification for metagenomics with Kaiju.

    PubMed

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-04-13

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows-Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.

  18. The epistemic culture in an online citizen science project: Programs, antiprograms and epistemic subjects.

    PubMed

    Kasperowski, Dick; Hillman, Thomas

    2018-05-01

    In the past decade, some areas of science have begun turning to masses of online volunteers through open calls for generating and classifying very large sets of data. The purpose of this study is to investigate the epistemic culture of a large-scale online citizen science project, the Galaxy Zoo, that turns to volunteers for the classification of images of galaxies. For this task, we chose to apply the concepts of programs and antiprograms to examine the 'essential tensions' that arise in relation to the mobilizing values of a citizen science project and the epistemic subjects and cultures that are enacted by its volunteers. Our premise is that these tensions reveal central features of the epistemic subjects and distributed cognition of epistemic cultures in these large-scale citizen science projects.

  19. Sensing Urban Land-Use Patterns by Integrating Google Tensorflow and Scene-Classification Models

    NASA Astrophysics Data System (ADS)

    Yao, Y.; Liang, H.; Li, X.; Zhang, J.; He, J.

    2017-09-01

    With the rapid progress of China's urbanization, research on the automatic detection of land-use patterns in Chinese cities is of substantial importance. Deep learning is an effective method to extract image features. To take advantage of deep learning in detecting urban land-use patterns, we applied a transfer-learning-based remote-sensing image approach to extract and classify features. Using the Google Tensorflow framework, a powerful convolutional neural network (CNN) library was created. First, the transferred model, previously trained on ImageNet, one of the largest object-image data sets, was used to generate feature vectors for standard remote-sensing land-cover data sets (UC Merced and WHU-SIRI). Then, a random-forest-based classifier was constructed and trained on these generated vectors to classify the actual urban land-use pattern at the scale of traffic analysis zones (TAZs). To avoid the multi-scale effect of remote-sensing imagery, a large random patch (LRP) method was used. The proposed method efficiently obtained acceptable accuracy (OA = 0.794, Kappa = 0.737) for the study area. In addition, the results show that the proposed method can effectively overcome the multi-scale effect that occurs in urban land-use classification at the irregular land-parcel level. The proposed method can help planners monitor dynamic urban land use and evaluate the impact of urban-planning schemes.
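
    The two-stage design — a pretrained CNN as a fixed feature extractor followed by a random forest — can be sketched as below; MobileNetV2 merely stands in for whichever ImageNet model the authors transferred, and the patches and labels are random placeholders.

        import numpy as np
        import tensorflow as tf
        from sklearn.ensemble import RandomForestClassifier

        # ImageNet-pretrained CNN, used only to produce feature vectors.
        base = tf.keras.applications.MobileNetV2(include_top=False,
                                                 pooling="avg",
                                                 input_shape=(224, 224, 3))
        patches = np.random.rand(16, 224, 224, 3).astype("float32") * 255.0
        feats = base.predict(
            tf.keras.applications.mobilenet_v2.preprocess_input(patches))

        labels = np.random.randint(0, 5, 16)   # placeholder land-use classes
        clf = RandomForestClassifier(n_estimators=200).fit(feats, labels)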

  20. Medical image classification based on multi-scale non-negative sparse coding.

    PubMed

    Zhang, Ruijie; Shen, Jian; Wei, Fushan; Li, Xiong; Sangaiah, Arun Kumar

    2017-11-01

    With the rapid development of modern medical imaging technology, medical image classification has become more and more important in medical diagnosis and clinical practice. Conventional medical image classification algorithms usually neglect the semantic gap between low-level features and high-level image semantics, which largely degrades classification performance. To solve this problem, we propose a multi-scale non-negative sparse coding based medical image classification algorithm. First, medical images are decomposed into multiple scale layers, so that diverse visual details can be extracted from different scale layers. Second, for each scale layer, a non-negative sparse coding model with Fisher discriminative analysis is constructed to obtain the discriminative sparse representation of medical images. Then, the obtained multi-scale non-negative sparse coding features are combined to form a multi-scale feature histogram as the final representation of a medical image. Finally, an SVM classifier is used to conduct medical image classification. The experimental results demonstrate that our proposed algorithm can effectively utilize multi-scale and contextual spatial information of medical images, reduce the semantic gap to a large degree, and improve medical image classification performance. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Classification of mechanisms, climatic context, areal scaling, and synchronization of floods: the hydroclimatology of floods in the Upper Paraná River basin, Brazil

    NASA Astrophysics Data System (ADS)

    Lima, Carlos H. R.; AghaKouchak, Amir; Lall, Upmanu

    2017-12-01

    Floods are the main natural disaster in Brazil, causing substantial economic damage and loss of life. Studies suggest that some extreme floods result from a causal climate chain. Exceptional rain and floods are determined by large-scale anomalies and persistent patterns in the atmospheric and oceanic circulations, which influence the magnitude, extent, and duration of these extremes. Moreover, floods can result from different generating mechanisms. These factors contradict the assumptions of homogeneity, and often stationarity, in flood frequency analysis. Here we outline a methodological framework based on clustering using self-organizing maps (SOMs) that allows the linkage of large-scale processes to local-scale observations. The methodology is applied to flood data from several sites in the flood-prone Upper Paraná River basin (UPRB) in southern Brazil. The SOM clustering approach is employed to classify the 6-day rainfall field over the UPRB into four categories, which are then used to classify floods into four types based on the spatiotemporal dynamics of the rainfall field prior to the observed flood events. An analysis of the vertically integrated moisture fluxes, vorticity, and high-level atmospheric circulation revealed that these four clusters are related to known tropical and extratropical processes, including the South American low-level jet (SALLJ); extratropical cyclones; and the South Atlantic Convergence Zone (SACZ). Persistent anomalies in the sea surface temperature fields in the Pacific and Atlantic oceans are also found to be associated with these processes. Floods associated with each cluster present different patterns in terms of frequency, magnitude, spatial variability, scaling, and synchronization of events across the sites and subbasins. These insights suggest new directions for flood risk assessment, forecasting, and management.
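
    As a hedged illustration of the SOM clustering step (not the authors' code), a small self-organizing map can be trained on 6-day rainfall fields and each event assigned to the map node that wins it; the MiniSom library, the 2×2 map giving four categories, and the random data are assumptions for the sketch.

        import numpy as np
        from minisom import MiniSom   # pip install minisom

        rng = np.random.default_rng(7)
        rain_fields = rng.random((300, 48))   # flattened 6-day rainfall grids

        som = MiniSom(2, 2, input_len=48, sigma=0.8, learning_rate=0.5,
                      random_seed=0)
        som.train_random(rain_fields, 2000)

        # Each event's cluster is the (row, col) of its winning node.
        clusters = [som.winner(x) for x in rain_fields]
        print(clusters[:5])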

  2. Characterizing the Severe Turbulence Environments Associated With Commercial Aviation Accidents. Part 1; 44 Case Study Synoptic Observational Analyses

    NASA Technical Reports Server (NTRS)

    Kaplan, Michael L.; Huffman, Allan W.; Lux, Kevin M.; Charney, Joseph J.; Riordan, Allan J.; Lin, Yuh-Lang; Proctor, Fred H. (Technical Monitor)

    2002-01-01

    A 44 case study analysis of the large-scale atmospheric structure associated with development of accident-producing aircraft turbulence is described. Categorization is a function of the accident location, altitude, time of year, time of day, and the turbulence category, which classifies disturbances. National Centers for Environmental Prediction Reanalyses data sets and satellite imagery are employed to diagnose synoptic scale predictor fields associated with the large-scale environment preceding severe turbulence. These analyses indicate a predominance of severe accident-producing turbulence within the entrance region of a jet stream at the synoptic scale. Typically, a flow curvature region is just upstream within the jet entrance region, convection is within 100 km of the accident, vertical motion is upward, absolute vorticity is low, vertical wind shear is increasing, and horizontal cold advection is substantial. The most consistent predictor is upstream flow curvature and nearby convection is the second most frequent predictor.

  3. Synoptic climatology of the long-distance dispersal of white pine blister rust I. Development of an upper level synoptic classification

    Treesearch

    K. L. Frank; L. S. Kalkstein; B. W. Geils; H. W. Thistle

    2008-01-01

    This study developed a methodology to temporally classify large scale, upper level atmospheric conditions over North America, utilizing a newly-developed upper level synoptic classification (ULSC). Four meteorological variables: geopotential height, specific humidity, and u- and v-wind components, at the 500 hPa level over North America were obtained from the NCEP/NCAR...

  4. Strengthening Policies and Practices for the Initial Classification of English Learners: Insights from a National Working Session

    ERIC Educational Resources Information Center

    Cook, H. Gary; Linquanti, Robert

    2015-01-01

    This report summarizes and further develops ideas discussed at a national working session held on May 23, 2014, to examine issues and options associated with initially classifying English learners (ELs). It is the third in a series of guidance papers intended to support states in large-scale assessment consortia that are expected to move toward a…

  5. J plots: a new method for characterizing structures in the interstellar medium

    NASA Astrophysics Data System (ADS)

    Jaffa, S. E.; Whitworth, A. P.; Clarke, S. D.; Howard, A. D. P.

    2018-06-01

    Large-scale surveys have brought about a revolution in astronomy. To analyse the resulting wealth of data, we need automated tools to identify, classify, and quantify the important underlying structures. We present here a method for classifying and quantifying a pixelated structure, based on its principal moments of inertia. The method enables us to automatically detect, and objectively compare, centrally condensed cores, elongated filaments, and hollow rings. We illustrate the method by applying it to (i) observations of surface density from Hi-GAL, and (ii) simulations of filament growth in a turbulent medium. We limit the discussion here to 2D data; in a future paper, we will extend the method to 3D data.

  6. The HI Content of Galaxies as a Function of Local Density and Large-Scale Environment

    NASA Astrophysics Data System (ADS)

    Thoreen, Henry; Cantwell, Kelly; Maloney, Erin; Cane, Thomas; Brough Morris, Theodore; Flory, Oscar; Raskin, Mark; Crone-Odekon, Mary; ALFALFA Team

    2017-01-01

    We examine the HI content of galaxies as a function of environment, based on a catalogue of 41,527 galaxies that are part of the 70% complete Arecibo Legacy Fast-ALFA (ALFALFA) survey. We use nearest-neighbor methods to characterize local environment, and a modified version of the algorithm developed for the Galaxy and Mass Assembly (GAMA) survey to classify large-scale environment as group, filament, tendril, or void. We compare the HI content in these environments using statistics that include both HI detections and the upper limits on detections from ALFALFA. The large size of the sample allows us to statistically compare the HI content in different environments for early-type galaxies as well as late-type galaxies. This work is supported by NSF grants AST-1211005 and AST-1637339, the Skidmore Faculty-Student Summer Research program, and the Schupf Scholars program.

  7. Depression assessment after traumatic brain injury: an empirically based classification method.

    PubMed

    Seel, Ronald T; Kreutzer, Jeffrey S

    2003-11-01

    To describe the patterns of depression in patients with traumatic brain injury (TBI), to evaluate the psychometric properties of the Neurobehavioral Functioning Inventory (NFI) Depression Scale, and to classify empirically NFI Depression Scale scores. Depressive symptoms were characterized by using the NFI Depression Scale, the Beck Depression Inventory (BDI), and the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) Depression Scale. An outpatient clinic within a Traumatic Brain Injury Model Systems center. A demographically diverse sample of 172 outpatients with TBI, evaluated between 1996 and 2000. Not applicable. The NFI, BDI, and MMPI-2 Depression Scale. The Cronbach alpha, analysis of variance, Pearson correlations, and canonical discriminant function analysis were used to examine the psychometric properties of the NFI Depression Scale. Patients with TBI most frequently reported problems with frustration (81%), restlessness (73%), rumination (69%), boredom (66%), and sadness (66%) with the NFI Depression Scale. The percentages of patients classified as depressed with the BDI and the NFI Depression Scale were 37% and 30%, respectively. The Cronbach alpha for the NFI Depression Scale was .93, indicating a high degree of internal consistency. As hypothesized, NFI Depression Scale scores correlated highly with BDI (r=.765) and MMPI-2 Depression Scale T scores (r=.752). The NFI Depression Scale did not correlate significantly with the MMPI-2 Hypomania Scale, thus showing discriminant validity. Normal and clinically depressed BDI scores were most likely to be accurately predicted by the NFI Depression Scale, with 81% and 87% of grouped cases, respectively, correctly classified. Normal and depressed MMPI-2 Depression Scale scores were accurately predicted by the NFI Depression Scale, with 75% and 83% of grouped cases correctly classified, respectively. Patients' NFI Depression Scale scores were mapped to the corresponding BDI categories, and 3 NFI score classifications emerged: minimally depressed (13-28), borderline depressed (29-42), and clinically depressed (43-65). Our study provided further evidence that screening for depression should be a standard component of TBI assessment protocols. Between 30% and 38% of patients with TBI were classified as depressed with the NFI Depression Scale and the BDI, respectively. Our findings also provided empirical evidence that the NFI Depression Scale is a useful tool for classifying postinjury depression.
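
    The empirically derived score bands lend themselves to a trivial classification helper; the function below simply encodes the cutoffs reported in the abstract.

        def classify_nfi_depression(score):
            # Cutoffs from the study's mapping of NFI scores to BDI categories.
            if 13 <= score <= 28:
                return "minimally depressed"
            if 29 <= score <= 42:
                return "borderline depressed"
            if 43 <= score <= 65:
                return "clinically depressed"
            raise ValueError("NFI Depression Scale scores range from 13 to 65")

        print(classify_nfi_depression(35))   # borderline depressed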

  8. Classification of JERS-1 Image Mosaic of Central Africa Using A Supervised Multiscale Classifier of Texture Features

    NASA Technical Reports Server (NTRS)

    Saatchi, Sassan; DeGrandi, Franco; Simard, Marc; Podest, Erika

    1999-01-01

    In this paper, a multiscale approach is introduced to classify the Japanese Earth Resources Satellite-1 (JERS-1) mosaic image over the Central African rainforest. A series of texture maps is generated from the 100 m mosaic image at various scales. Using a quadtree model and relating classes at each scale by a Markovian relationship, the multiscale images are classified from coarse to finer scales. The results are verified at various scales, and the evolution of the classification is monitored by calculating the error at each stage.

  9. How well can regional fluxes be derived from smaller-scale estimates?

    NASA Technical Reports Server (NTRS)

    Moore, Kathleen E.; Fitzjarrald, David R.; Ritter, John A.

    1992-01-01

    Regional surface fluxes are essential lower boundary conditions for large-scale numerical weather and climate models and are the elements of global budgets of important trace gases. Surface properties affecting the exchange of heat, moisture, momentum, and trace gases vary over length scales from one meter to hundreds of km. A classical difficulty is that fluxes have been measured directly only at points or along lines. The process of scaling up observations limited in space and/or time to represent larger areas has been done by assigning properties to surface classes and combining estimated or calculated fluxes using an area-weighted average. It is not clear that a simple area-weighted average is sufficient to produce the large scale from the small scale, chiefly due to the effect of internal boundary layers, nor is it known how important the uncertainty is to large-scale model outcomes. Simultaneous aircraft and tower data obtained in the relatively simple terrain of the western Alaska tundra were used to determine the extent to which surface type variation can be related to fluxes of heat, moisture, and other properties. Surface type was classified as lake or land with an aircraft-borne infrared thermometer, and flight-level heat and moisture fluxes were related to surface type.

  10. TomoMiner and TomoMinerCloud: A software platform for large-scale subtomogram structural analysis

    PubMed Central

    Frazier, Zachary; Xu, Min; Alber, Frank

    2017-01-01

    Cryo-electron tomography (cryoET) captures the 3D electron density distribution of macromolecular complexes in a close-to-native state. With the rapid advance of cryoET acquisition technologies, it is possible to generate large numbers (>100,000) of subtomograms, each containing a macromolecular complex. Often, these subtomograms represent a heterogeneous sample, due to variations in the structure and composition of a complex in its in situ form or because the particles are a mixture of different complexes. In this case subtomograms must be classified. However, classification of large numbers of subtomograms is a time-intensive task and often a limiting bottleneck. This paper introduces an open source software platform, TomoMiner, for large-scale subtomogram classification, template matching, subtomogram averaging, and alignment. Its scalable and robust parallel processing allows efficient classification of tens to hundreds of thousands of subtomograms. Additionally, TomoMiner provides a pre-configured TomoMinerCloud computing service permitting users without sufficient computing resources instant access to TomoMiner's high-performance features. PMID:28552576

  11. Informational and emotional elements in online support groups: a Bayesian approach to large-scale content analysis.

    PubMed

    Deetjen, Ulrike; Powell, John A

    2016-05-01

    This research examines the extent to which informational and emotional elements are employed in online support forums for 14 purposively sampled chronic medical conditions and the factors that influence whether posts are of a more informational or emotional nature. Large-scale qualitative data were obtained from Dailystrength.org. Based on a hand-coded training dataset, all posts were classified into informational or emotional using a Bayesian classification algorithm to generalize the findings. Posts that could not be classified with a probability of at least 75% were excluded. The overall tendency toward emotional posts differs by condition: mental health (depression, schizophrenia) and Alzheimer's disease consist of more emotional posts, while informational posts relate more to nonterminal physical conditions (irritable bowel syndrome, diabetes, asthma). There is no gender difference across conditions, although prostate cancer forums are oriented toward informational support, whereas breast cancer forums rather feature emotional support. Across diseases, the best predictors for emotional content are lower age and a higher number of overall posts by the support group member. The results are in line with previous empirical research and unify empirical findings from single/2-condition research. Limitations include the analytical restriction to predefined categories (informational, emotional) through the chosen machine-learning approach. Our findings provide an empirical foundation for building theory on informational versus emotional support across conditions, give insights for practitioners to better understand the role of online support groups for different patients, and show the usefulness of machine-learning approaches to analyze large-scale qualitative health data from online settings. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
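
    The classify-or-exclude step described here — a hand-coded training set feeding a Bayesian classifier, with posts left unclassified below a 75% posterior probability — can be sketched as follows; the example posts and labels are invented.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB

        train_posts = ["feeling so alone today",
                       "what dose of metformin works for you",
                       "scared about my scan tomorrow",
                       "insulin pump settings question"]
        train_labels = ["emotional", "informational",
                        "emotional", "informational"]

        vec = CountVectorizer()
        clf = MultinomialNB().fit(vec.fit_transform(train_posts), train_labels)

        # Posts below the 75% posterior threshold are excluded, as in the study.
        new_posts = ["any tips on chemo side effects", "ugh"]
        proba = clf.predict_proba(vec.transform(new_posts))
        labels = np.where(proba.max(axis=1) >= 0.75,
                          clf.classes_[proba.argmax(axis=1)], "excluded")
        print(labels)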

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Torcellini, P.; Pless, S.; Lobato, C.

    Until recently, large-scale, cost-effective net-zero energy buildings (NZEBs) were thought to lie decades in the future. However, ongoing work at the National Renewable Energy Laboratory (NREL) indicates that NZEB status is both achievable and repeatable today. This paper presents a definition framework for classifying NZEBs and a real-life example that demonstrates how a large-scale office building can cost-effectively achieve net-zero energy. The vision of NZEBs is compelling. In theory, these highly energy-efficient buildings will produce, during a typical year, enough renewable energy to offset the energy they consume from the grid. The NREL NZEB definition framework classifies NZEBs according to the criteria being used to judge net-zero status and the way renewable energy is supplied to achieve that status. We use the new U.S. Department of Energy/NREL 220,000-ft² Research Support Facilities (RSF) building to illustrate why a clear picture of NZEB definitions is important and how the framework provides a methodology for creating a cost-effective NZEB. The RSF, scheduled to open in June 2010, includes contractual commitments to deliver a Leadership in Energy and Environmental Design (LEED) Platinum rating, an energy use intensity of 25 kBtu/ft² (half that of a typical LEED Platinum office building), and net-zero energy status. We discuss the analysis method and cost tradeoffs that were performed throughout the design and build phases to meet these commitments and maintain construction costs at $259/ft². We also discuss ways to achieve large-scale, replicable NZEB performance. Many passive and renewable energy strategies are utilized, including full daylighting, high-performance lighting, natural ventilation through operable windows, thermal mass, transpired solar collectors, radiant heating and cooling, and workstation configurations that allow for maximum daylighting.

  13. Investigating a link between large and small-scale chaos features on Europa

    NASA Astrophysics Data System (ADS)

    Tognetti, L.; Rhoden, A.; Nelson, D. M.

    2017-12-01

    Chaos is one of the most recognizable, and studied, features on Europa's surface. Most models of chaos formation invoke liquid water at shallow depths within the ice shell; the liquid destabilizes the overlying ice layer, breaking it into mobile rafts and destroying pre-existing terrain. This class of model has been applied to both large-scale chaos like Conamara and small-scale features (i.e. microchaos), which are typically <10 km in diameter. Currently unknown, however, is whether both large-scale and small-scale features are produced together, e.g. through a network of smaller sills linked to a larger liquid water pocket. If microchaos features do form as satellites of large-scale chaos features, we would expect a drop-off in the number density of microchaos with increasing distance from the large chaos feature; the trend should not be observed in regions without large-scale chaos features. Here, we test the hypothesis that large chaos features create "satellite" systems of smaller chaos features. Either outcome will help us better understand the relationship between large-scale chaos and microchaos. We focus first on regions surrounding the large chaos features Conamara and Murias (e.g. the Mitten). We map all chaos features within 90,000 sq km of the main chaos feature and assign each one a ranking (High Confidence, Probable, or Low Confidence) based on the observed characteristics of each feature. In particular, we look for a distinct boundary, loss of preexisting terrain, the existence of rafts or blocks, and the overall smoothness of the feature. We also note features that are chaos-like but lack sufficient characteristics to be classified as chaos. We then apply the same criteria to map microchaos features in regions of similar area (~90,000 sq km) that lack large chaos features. By plotting the distribution of microchaos with distance from the center point of the large chaos feature or the mapping region (for the cases without a large feature), we determine whether there is a distinct signature linking large-scale chaos features with nearby microchaos. We discuss the implications of these results on the process of chaos formation and the extent of liquid water within Europa's ice shell.

  14. Subjective and Objective Measures of Dryness Symptoms in Primary Sjögren's Syndrome: Capturing the Discrepancy.

    PubMed

    Bezzina, Oriana M; Gallagher, Peter; Mitchell, Sheryl; Bowman, Simon J; Griffiths, Bridget; Hindmarsh, Victoria; Hargreaves, Ben; Price, Elizabeth J; Pease, Colin T; Emery, Paul; Lanyon, Peter; Bombardieri, Michele; Sutcliffe, Nurhan; Pitzalis, Costantino; Hunter, John; Gupta, Monica; McLaren, John; Cooper, Anne M; Regan, Marian; Giles, Ian P; Isenberg, David A; Saravanan, Vadivelu; Coady, David; Dasgupta, Bhaskar; McHugh, Neil J; Young-Min, Steven A; Moots, Robert J; Gendi, Nagui; Akil, Mohammed; MacKay, Kirsten; Ng, W Fai; Robinson, Lucy J

    2017-11-01

    To develop a novel method for capturing the discrepancy between objective tests and subjective dryness symptoms (a sensitivity scale) and to explore predictors of dryness sensitivity. Archive data from the UK Primary Sjögren's Syndrome Registry (n = 688) were used. Patients were classified on a scale from -5 (stoical) to +5 (sensitive) depending on the degree of discrepancy between their objective and subjective symptom classes. Sensitivity scores were correlated with demographic variables, disease-related factors, and symptoms of pain, fatigue, anxiety, and depression. Patients were on average relatively stoical for both types of dryness symptoms (mean ± SD: ocular dryness -0.42 ± 2.2; oral dryness -1.24 ± 1.6). Twenty-seven percent of patients were classified as sensitive to ocular dryness and 9% to oral dryness. Hierarchical regression analyses identified the strongest predictor of ocular dryness sensitivity to be self-reported pain and that of oral dryness sensitivity to be self-reported fatigue. Ocular and oral dryness sensitivity can be classified on a continuous scale. The 2 symptom types are predicted by different variables. A large number of factors remain to be explored that may impact symptom sensitivity in primary Sjögren's syndrome, and the proposed method could be used to identify relatively sensitive and stoical patients for future studies. © 2016, The Authors. Arthritis Care & Research published by Wiley Periodicals, Inc. on behalf of American College of Rheumatology.
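
    The scale itself is just a signed discrepancy between symptom classes. A minimal reconstruction, assuming subjective and objective severities are each first binned into six ordered classes (0-5) — an assumption about the registry's exact binning — is:

        def dryness_sensitivity(objective_class, subjective_class):
            # Classes 0 (none) .. 5 (severe); positive scores mark patients
            # reporting more dryness than objective tests show (sensitive),
            # negative scores mark stoical patients. Range is -5 .. +5.
            return subjective_class - objective_class

        print(dryness_sensitivity(objective_class=4, subjective_class=1))  # -3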

  15. Consolidation of glycosyl hydrolase family 30: a dual domain 4/7 hydrolase family consisting of two structurally distinct groups

    Treesearch

    Franz J. St John; Javier M. Gonzalez; Edwin Pozharski

    2010-01-01

    In this work glycosyl hydrolase (GH) family 30 (GH30) is analyzed and shown to consist of its currently classified member sequences as well as several homologous sequence groups currently assigned within family GH5. A large scale amino acid sequence alignment and a phylogenetic tree were generated and GH30 groups and subgroups were designated. A partial rearrangement...

  16. Prediction of drug indications based on chemical interactions and chemical similarities.

    PubMed

    Huang, Guohua; Lu, Yin; Lu, Changhong; Zheng, Mingyue; Cai, Yu-Dong

    2015-01-01

    Discovering potential indications of novel or approved drugs is a key step in drug development. Previous computational approaches can be categorized as disease-centric or drug-centric according to their starting point, and as small-scale or large-scale according to the diversity of their datasets. Here, a classifier has been constructed to predict the indications of a drug, using a large drug indication dataset, based on the assumption that interactive/associated drugs, or drugs with similar structures, are more likely to target the same diseases. To examine the classifier, it was run five times on a dataset of 1,573 drugs retrieved from the Comprehensive Medicinal Chemistry database and evaluated by 5-fold cross-validation, yielding 1st-order prediction accuracies of approximately 51.48% in each run. Meanwhile, the model yielded an accuracy of 50.00% for the 1st-order prediction in an independent test on a dataset of 32 other drugs for which drug repositioning has been confirmed. Interestingly, some clinically repurposed drug indications that were not included in the datasets were successfully identified by our method. These results suggest that our method may become a useful tool for associating novel molecules with new indications, or alternative indications with existing drugs.
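    The following hedged sketch illustrates the paper's core assumption (interacting or structurally similar drugs tend to share indications) rather than the published implementation; the similarity function and indication sets are hypothetical.

```python
# Score candidate indications for a query drug by summing the similarity of
# neighbor drugs that carry each indication.
from collections import defaultdict

def score_indications(query, neighbor_drugs, similarity, indications):
    """Rank candidate indications of `query` by summed neighbor similarity."""
    scores = defaultdict(float)
    for drug in neighbor_drugs:
        for ind in indications.get(drug, ()):   # indications of the neighbor
            scores[ind] += similarity(query, drug)
    return sorted(scores.items(), key=lambda kv: -kv[1])  # best first

indications = {"drugA": {"hypertension"}, "drugB": {"hypertension", "angina"}}
sim = lambda a, b: 0.8 if b == "drugB" else 0.5           # hypothetical scores
print(score_indications("drugX", ["drugA", "drugB"], sim, indications))
```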

  17. Prediction of Drug Indications Based on Chemical Interactions and Chemical Similarities

    PubMed Central

    Huang, Guohua; Lu, Yin; Lu, Changhong; Cai, Yu-Dong

    2015-01-01

    Discovering potential indications of novel or approved drugs is a key step in drug development. Previous computational approaches can be categorized as disease-centric or drug-centric according to their starting point, and as small-scale or large-scale according to the diversity of their datasets. Here, a classifier has been constructed to predict the indications of a drug, using a large drug indication dataset, based on the assumption that interactive/associated drugs, or drugs with similar structures, are more likely to target the same diseases. To examine the classifier, it was run five times on a dataset of 1,573 drugs retrieved from the Comprehensive Medicinal Chemistry database and evaluated by 5-fold cross-validation, yielding 1st-order prediction accuracies of approximately 51.48% in each run. Meanwhile, the model yielded an accuracy of 50.00% for the 1st-order prediction in an independent test on a dataset of 32 other drugs for which drug repositioning has been confirmed. Interestingly, some clinically repurposed drug indications that were not included in the datasets were successfully identified by our method. These results suggest that our method may become a useful tool for associating novel molecules with new indications, or alternative indications with existing drugs. PMID:25821813

  18. [Object-oriented segmentation and classification of forest gap based on QuickBird remote sensing image].

    PubMed

    Mao, Xue Gang; Du, Zi Han; Liu, Jia Qian; Chen, Shu Xin; Hou, Ji Yu

    2018-01-01

    Traditional field investigation and artificial interpretation could not satisfy the need for forest gap extraction at regional scale. High-spatial-resolution remote sensing imagery makes regional forest gap extraction possible. In this study, we used an object-oriented classification method to segment and classify forest gaps based on a QuickBird high-resolution optical remote sensing image of the Jiangle National Forestry Farm in Fujian Province. In the object-oriented classification, 10 scales (10-100, with a step length of 10) were adopted to segment the QuickBird image, and the intersection area of the reference object (RA_or) and the intersection area of the segmented object (RA_os) were adopted to evaluate the segmentation result at each scale. For the segmentation result at each scale, 16 spectral characteristics and a support vector machine (SVM) classifier were further used to classify forest gaps, non-forest gaps and others. The results showed that the optimal segmentation scale was 40, where RA_or was equal to RA_os. The difference between the maximum and minimum accuracy across segmentation scales was 22%. At the optimal scale, the overall classification accuracy was 88% (Kappa=0.82) based on the SVM classifier. Combining high-resolution remote sensing image data with an object-oriented classification method could replace traditional field investigation and artificial interpretation to identify and classify forest gaps at regional scale.
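    The scale-selection rule described above can be illustrated with a short sketch: the optimal scale is taken where the RA_or and RA_os curves cross. The values below are illustrative, not the study's measurements.

```python
# Pick the segmentation scale at which RA_or and RA_os are closest to equal.
import numpy as np

scales = np.arange(10, 110, 10)                    # the 10 tested scales
ra_or = np.array([0.95, 0.90, 0.85, 0.78, 0.70, 0.62, 0.55, 0.48, 0.41, 0.35])
ra_os = np.array([0.40, 0.52, 0.66, 0.78, 0.84, 0.88, 0.90, 0.92, 0.93, 0.94])

best = scales[np.argmin(np.abs(ra_or - ra_os))]    # closest to RA_or == RA_os
print(f"optimal segmentation scale: {best}")       # -> 40 with these numbers
```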

  19. Genome-scale approaches to the epigenetics of common human disease

    PubMed Central

    2011-01-01

    Traditionally, the pathology of human disease has been focused on microscopic examination of affected tissues, chemical and biochemical analysis of biopsy samples, other available samples of convenience, such as blood, and noninvasive or invasive imaging of varying complexity, in order to classify disease and illuminate its mechanistic basis. The molecular age has complemented this armamentarium with gene expression arrays and selective analysis of individual genes. However, we are entering a new era of epigenomic profiling, i.e., genome-scale analysis of cell-heritable nonsequence genetic change, such as DNA methylation. The epigenome offers access to stable measurements of cellular state and to biobanked material for large-scale epidemiological studies. Some of these genome-scale technologies are beginning to be applied to create the new field of epigenetic epidemiology. PMID:19844740

  20. Temporal stability and rates of post-depositional change in geochemical signatures of brown trout Salmo trutta scales.

    PubMed

    Ryan, D; Shephard, S; Kelly, F L

    2016-09-01

    This study investigates temporal stability in the scale microchemistry of brown trout Salmo trutta in feeder streams of a large heterogeneous lake catchment and rates of change after migration into the lake. Laser-ablation inductively coupled plasma mass spectrometry was used to quantify the elemental concentrations of Na, Mg, Mn, Cu, Zn, Ba and Sr in archived (1997-2002) scales of juvenile S. trutta collected from six major feeder streams of Lough Mask, County Mayo, Ireland. Water-element Ca ratios within these streams were determined for the fish sampling period and for a later period (2013-2015). Salmo trutta scale Sr and Ba concentrations were significantly (P < 0·05) correlated with stream water sample Sr:Ca and Ba:Ca ratios respectively from both periods, indicating multi-annual stability in scale and water-elemental signatures. Discriminant analysis of scale chemistries correctly classified 91% of sampled juvenile S. trutta to their stream of origin using a cross-validated classification model. This model was used to test whether assumed post-depositional change in scale element concentrations reduced correct natal stream classification of S. trutta in successive years after migration into Lough Mask. Fish residing in the lake for 1-3 years could be reliably classified to their most likely natal stream, but the probability of correct classification diminished strongly with longer lake residence. Use of scale chemistry to identify natal streams of lake S. trutta should focus on recent migrants, but may not require contemporary water chemistry data. © 2016 The Fisheries Society of the British Isles.
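    A hedged scikit-learn sketch of the discriminant-analysis step, assuming a matrix of scale element concentrations per fish and a natal-stream label per row; the data below are simulated, not the study's measurements.

```python
# Cross-validated discriminant analysis for natal-stream classification.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 7))                # 120 fish x 7 elements
X[:60, 5:] += 1.5                            # streams differ mainly in Ba, Sr
y = np.repeat(["stream_A", "stream_B"], 60)  # natal-stream labels

lda = LinearDiscriminantAnalysis()
acc = cross_val_score(lda, X, y, cv=5)       # cross-validated classification
print(f"mean correct classification: {acc.mean():.0%}")
```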

  1. Anomaly detection for medical images based on a one-class classification

    NASA Astrophysics Data System (ADS)

    Wei, Qi; Ren, Yinhao; Hou, Rui; Shi, Bibo; Lo, Joseph Y.; Carin, Lawrence

    2018-02-01

    Detecting an anomaly such as a malignant tumor or a nodule from medical images including mammogram, CT or PET images is still an ongoing research problem drawing a lot of attention, with applications in medical diagnosis. A conventional way to address this is to learn a discriminative model using training datasets of negative and positive samples. The learned model can then be used to classify a testing sample into a positive or negative class. However, in medical applications, the high imbalance between negative and positive samples poses a difficulty for learning algorithms, as they will be biased towards the majority group, i.e., the negative one. To address this imbalanced-data issue as well as leverage the huge amount of negative samples, i.e., normal medical images, we propose to learn an unsupervised model to characterize the negative class. To make the learned model more flexible and extendable for medical images of different scales, we have designed an autoencoder based on a deep neural network to characterize the negative patches decomposed from large medical images. A testing image is decomposed into patches and then fed into the learned autoencoder to reconstruct these patches themselves. The reconstruction error of one patch is used to classify this patch into a binary class, i.e., a positive or a negative one, leading to a one-class classifier. The positive patches highlight the suspicious areas containing anomalies in a large medical image. The proposed method has been tested on the INbreast dataset and achieves an AUC of 0.84. The main contributions of our work can be summarized as follows. 1) The proposed one-class learning requires only data from one class, i.e., the negative data; 2) the patch-based learning makes the proposed method scalable to images of different sizes and helps avoid the large-scale problem for medical images; 3) the training of the proposed deep convolutional neural network (DCNN) based autoencoder is fast and stable.
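    A minimal PyTorch sketch of the one-class idea (train an autoencoder on negative patches only, then flag patches with high reconstruction error); the patch size, architecture and threshold are placeholders, not the paper's DCNN.

```python
# One-class anomaly scoring via autoencoder reconstruction error.
import torch
import torch.nn as nn

patch = 32 * 32                                  # flattened patch length
model = nn.Sequential(                           # tiny dense autoencoder
    nn.Linear(patch, 128), nn.ReLU(),
    nn.Linear(128, patch), nn.Sigmoid())
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

negatives = torch.rand(256, patch)               # stand-in for normal patches
for _ in range(100):                             # train on negatives only
    opt.zero_grad()
    loss = loss_fn(model(negatives), negatives)
    loss.backward()
    opt.step()

def is_positive(patch_batch, threshold=0.05):
    """One-class rule: large reconstruction error => suspicious patch."""
    with torch.no_grad():
        err = ((model(patch_batch) - patch_batch) ** 2).mean(dim=1)
    return err > threshold

print(is_positive(torch.rand(4, patch)))
```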

  2. A survey on routing protocols for large-scale wireless sensor networks.

    PubMed

    Li, Changle; Zhang, Hanxiao; Hao, Binbin; Li, Jiandong

    2011-01-01

    With the advances in micro-electronics, wireless sensor devices have been made much smaller and more integrated, and large-scale wireless sensor networks (WSNs) based on cooperation among a significant number of nodes have become a hot topic. "Large-scale" mainly means large area or high density of a network. Accordingly, routing protocols must scale well with network scope extension and increases in node density. A sensor node is normally energy-limited and cannot be recharged, so its energy consumption has a quite significant effect on the scalability of the protocol. To the best of our knowledge, the mainstream methods currently used to solve the energy problem in large-scale WSNs are hierarchical routing protocols. In a hierarchical routing protocol, all the nodes are divided into several groups with different assignment levels. The nodes at the high level are responsible for data aggregation and management work, and the low-level nodes for sensing their surroundings and collecting information. Hierarchical routing protocols have proved to be more energy-efficient than flat ones, in which all the nodes play the same role, especially in terms of data aggregation and the flooding of control packets. With a focus on the hierarchical structure, in this paper we provide an insight into routing protocols designed specifically for large-scale WSNs. According to their different objectives, the protocols are classified based on criteria such as control overhead reduction, energy consumption mitigation and energy balance. In order to give a comprehensive understanding of each protocol, we highlight their innovative ideas, describe the underlying principles in detail and analyze their advantages and disadvantages. Moreover, a comparison of the routing protocols is conducted to demonstrate the differences between them in terms of message complexity, memory requirements, localization, data aggregation, clustering manner and other metrics. Finally, some open issues in routing protocol design for large-scale wireless sensor networks are discussed and conclusions are drawn.

  3. An efficient approach for surveillance of childhood diabetes by type derived from electronic health record data: the SEARCH for Diabetes in Youth Study

    PubMed Central

    Zhong, Victor W; Obeid, Jihad S; Craig, Jean B; Pfaff, Emily R; Thomas, Joan; Jaacks, Lindsay M; Beavers, Daniel P; Carey, Timothy S; Lawrence, Jean M; Dabelea, Dana; Hamman, Richard F; Bowlby, Deborah A; Pihoker, Catherine; Saydah, Sharon H

    2016-01-01

    Objective: To develop an efficient surveillance approach for childhood diabetes by type across 2 large US health care systems, using phenotyping algorithms derived from electronic health record (EHR) data. Materials and Methods: Presumptive diabetes cases <20 years of age from 2 large independent health care systems were identified as those having ≥1 of 5 indicators in the past 3.5 years: elevated HbA1c, elevated blood glucose, diabetes-related billing codes, patient problem list entries, and outpatient anti-diabetic medications. EHRs of all presumptive cases were manually reviewed, and true diabetes status and diabetes type were determined. Algorithms for identifying diabetes cases overall and classifying diabetes type were either prespecified or derived from classification and regression tree analysis. A surveillance approach was developed based on the best algorithms identified. Results: We developed a stepwise surveillance approach using billing code-based prespecified algorithms and targeted manual EHR review, which efficiently and accurately ascertained and classified diabetes cases by type in both health care systems. The sensitivity and positive predictive values in both systems were approximately ≥90% for ascertaining diabetes cases overall and classifying cases with type 1 or type 2 diabetes. About 80% of the cases with "other" type were also correctly classified. This stepwise surveillance approach resulted in a >70% reduction in the number of cases requiring manual validation compared to traditional surveillance methods. Conclusion: EHR data may be used to establish an efficient approach for large-scale surveillance for childhood diabetes by type, although some manual effort is still needed. PMID:27107449
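    A hedged pandas sketch of the stepwise logic described above, not the validated SEARCH algorithms: flag presumptive cases with at least one of the five indicators, pre-classify type where billing codes allow, and route the remainder to manual review. All column names are hypothetical.

```python
# Stepwise surveillance: indicator screen, billing-code rule, manual fallback.
import numpy as np
import pandas as pd

def classify_presumptive(ehr: pd.DataFrame) -> pd.DataFrame:
    indicators = ["elevated_hba1c", "elevated_glucose", "dm_billing_code",
                  "dm_problem_list", "antidiabetic_rx"]
    cases = ehr[ehr[indicators].any(axis=1)].copy()     # >= 1 of 5 indicators
    total = cases["t1d_code_count"] + cases["t2d_code_count"]
    ratio = cases["t1d_code_count"] / total.clip(lower=1)
    cases["type"] = np.where(total == 0, "manual_review",  # no codes: review
                             np.where(ratio >= 0.5, "type 1", "type 2"))
    return cases
```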

  4. Computational Short-cutting the Big Data Classification Bottleneck: Using the MODIS Land Cover Product to Derive a Consistent 30 m Landsat Land Cover Product of the Conterminous United States

    NASA Astrophysics Data System (ADS)

    Zhang, H.; Roy, D. P.

    2016-12-01

    Classification is a fundamental process in remote sensing used to relate pixel values to land cover classes present on the surface. The state of the practice for large area land cover classification is to classify satellite time series metrics with a supervised (i.e., training data dependent) non-parametric classifier. Classification accuracy generally increases with training set size. However, training data collection is expensive and the optimal training distribution over large areas is unknown. The MODIS 500 m land cover product is available globally on an annual basis and so provides a potentially very large source of land cover training data. A novel methodology to classify large volumes of Landsat data using high quality training data derived automatically from the MODIS land cover product is demonstrated for all of the Conterminous United States (CONUS). The known misclassification rate of the MODIS land cover product and the scale difference between the 500 m MODIS and 30 m Landsat data are accommodated by a novel MODIS product filtering, Landsat pixel selection, and iterative training approach that balances the proportion of local and CONUS training data used. Three years of global Web-enabled Landsat data (WELD) for all of the CONUS are classified using a random forest classifier and the results assessed using random forest 'out-of-bag' training samples. The global WELD data are corrected to surface nadir BRDF-adjusted reflectance and are defined in 158 × 158 km tiles in the same projection as, and nested to, the MODIS land cover products. This reduces the need to pre-process the considerable Landsat data volume (more than 14,000 Landsat 5 and 7 scenes per year over the CONUS, covering 11,000 million 30 m pixels). The methodology is implemented in a parallel manner on a WELD tile-by-tile basis but provides a wall-to-wall seamless 30 m land cover product. Detailed tile and CONUS results are presented and the potential for global production using the recently available global WELD products is discussed.
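    A hedged scikit-learn analogue of the classification step (the paper itself runs at CONUS scale in a parallel tile-based system): train a random forest on per-pixel time-series metrics with MODIS-derived labels and read the 'out-of-bag' accuracy. The arrays are simulated placeholders.

```python
# Random forest with out-of-bag accuracy, standing in for the WELD workflow.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 12))     # per-pixel Landsat time-series metrics
y = rng.integers(0, 5, size=5000)   # labels filtered from MODIS land cover

rf = RandomForestClassifier(n_estimators=200, oob_score=True, n_jobs=-1,
                            random_state=0)
rf.fit(X, y)
print(f"out-of-bag accuracy: {rf.oob_score_:.3f}")
```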

  5. Weakly Supervised Segmentation-Aided Classification of Urban Scenes from 3d LIDAR Point Clouds

    NASA Astrophysics Data System (ADS)

    Guinard, S.; Landrieu, L.

    2017-05-01

    We consider the problem of the semantic classification of 3D LiDAR point clouds obtained from urban scenes when the training set is limited. We propose a non-parametric segmentation model for urban scenes composed of anthropic objects of simple shapes, partitioning the scene into geometrically-homogeneous segments whose size is determined by the local complexity. This segmentation can be integrated into a conditional random field (CRF) classifier in order to capture the high-level structure of the scene. For each cluster, this allows us to aggregate the noisy predictions of a weakly-supervised classifier to produce a higher confidence data term. We demonstrate the improvement provided by our method over two publicly-available large-scale data sets.
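    A minimal numpy sketch of the aggregation idea: average the weak classifier's per-point class probabilities within each segment to obtain one higher-confidence prediction per segment. Inputs are simulated.

```python
# Segment-level aggregation of noisy per-point predictions.
import numpy as np

def aggregate_by_segment(point_probs, segment_ids):
    """point_probs: (n_points, n_classes); segment_ids: (n_points,) ints."""
    n_seg = segment_ids.max() + 1
    sums = np.zeros((n_seg, point_probs.shape[1]))
    np.add.at(sums, segment_ids, point_probs)        # per-segment prob sums
    counts = np.bincount(segment_ids, minlength=n_seg)[:, None]
    return (sums / counts).argmax(axis=1)            # one label per segment

probs = np.array([[0.6, 0.4], [0.4, 0.6], [0.9, 0.1], [0.2, 0.8]])
print(aggregate_by_segment(probs, np.array([0, 0, 0, 1])))  # -> [0 1]
```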

  6. Unsupervised Pattern Classifier for Abnormality-Scaling of Vibration Features for Helicopter Gearbox Fault Diagnosis

    NASA Technical Reports Server (NTRS)

    Jammu, Vinay B.; Danai, Kourosh; Lewicki, David G.

    1996-01-01

    A new unsupervised pattern classifier is introduced for on-line detection of abnormality in features of vibration that are used for fault diagnosis of helicopter gearboxes. This classifier compares vibration features with their respective normal values and assigns them a value in (0, 1) to reflect their degree of abnormality. Therefore, the salient feature of this classifier is that it does not require feature values associated with faulty cases to identify abnormality. In order to cope with noise and changes in the operating conditions, an adaptation algorithm is incorporated that continually updates the normal values of the features. The proposed classifier is tested using experimental vibration features obtained from an OH-58A main rotor gearbox. The overall performance of this classifier is then evaluated by integrating the abnormality-scaled features for detection of faults. The fault detection results indicate that the performance of this classifier is comparable to the leading unsupervised neural networks: Kohonen's Feature Mapping and Adaptive Resonance Theory (ART2). This is significant considering that the independence of this classifier from fault-related features makes it uniquely suited to abnormality-scaling of vibration features for fault diagnosis.
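    A hedged sketch of the two ingredients described above, not the published classifier: map each feature's deviation from its 'normal' value into (0, 1), and adapt the normal value slowly so that drifting operating conditions are tracked. The constants are illustrative.

```python
# Abnormality scaling with an adaptive normal value.
import math

class AbnormalityScaler:
    def __init__(self, normal, spread, alpha=0.01):
        self.normal, self.spread, self.alpha = normal, spread, alpha

    def score(self, x):
        """Degree of abnormality in (0, 1); 0.5 at one spread from normal."""
        z = abs(x - self.normal) / self.spread
        return 1.0 / (1.0 + math.exp(-(z - 1.0)))    # logistic squashing

    def adapt(self, x):
        """Update the normal value only while the feature looks normal."""
        if self.score(x) < 0.5:
            self.normal += self.alpha * (x - self.normal)

s = AbnormalityScaler(normal=1.0, spread=0.2)
print(round(s.score(1.05), 2), round(s.score(1.8), 2))  # low vs high abnormality
```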

  7. An Object-Based Machine Learning Classification Procedure for Mapping Impoundments in Brazil's Amazon-Cerrado Agricultural Frontier

    NASA Astrophysics Data System (ADS)

    Solvik, K.; Macedo, M.; Graesser, J.; Lathuilliere, M. J.

    2017-12-01

    Large-scale agriculture and cattle ranching in Brazil have driven the creation of tens of thousands of small stream impoundments to provide water for crops and livestock. These impoundments are a source of methane emissions and have significant impacts on stream temperature, connectivity, and water use over a large region. Due to their large numbers and small size, they are difficult to map using conventional methods. Here, we present a two-stage object-based supervised classification methodology for identifying man-made impoundments in Brazil. First, in Google Earth Engine, pixels are classified as water or non-water using satellite data and HydroSHEDS products as predictors. Second, using Python's scikit-learn and scikit-image modules, the water objects are classified as man-made or natural based on a variety of shape and spectral properties. Both classifications are performed by a random forest classifier. Training data are acquired by visually identifying impoundments and natural water bodies using high resolution satellite imagery from Google Earth. This methodology was applied to the state of Mato Grosso using a cloud-free mosaic of Sentinel-1 (10 m resolution) radar and Sentinel-2 (10-20 m) multispectral data acquired during the 2016 dry season. Independent test accuracy was estimated at 95% for the first stage and 93% for the second. We identified 54,294 man-made impoundments in Mato Grosso in 2016. The methodology is generalizable to other high resolution satellite data and has been tested on Landsat 5 and 8 imagery. Applying the same approach to Landsat 8 images (30 m), we identified 35,707 impoundments in the 2015 dry season. The difference in number is likely because the coarser-scale imagery fails to detect small (<900 m²) objects. On-going work will apply this approach to satellite time series for the entire Amazon-Cerrado frontier, allowing us to track changes in the number, size, and distribution of man-made impoundments. Automated impoundment mapping over large areas may help with management of streams in agricultural landscapes in Brazil and other tropical regions.

  8. New software methods in radar ornithology using WSR-88D weather data and potential application to monitoring effects of climate change on bird migration

    USGS Publications Warehouse

    Mead, Reginald; Paxton, John; Sojda, Richard S.; Swayne, David A.; Yang, Wanhong; Voinov, A.A.; Rizzoli, A.; Filatova, T.

    2010-01-01

    Radar ornithology has provided tools for studying the movement of birds, especially related to migration. Researchers have presented qualitative evidence suggesting that birds, or at least migration events, can be identified using large broad-scale radars such as the WSR-88D used in the NEXRAD weather surveillance system. This is potentially a boon for ornithologists because such data cover a large portion of the United States, are constantly being produced, are freely available, and have been archived since the early 1990s. A major obstacle to this research, however, has been that identifying birds in NEXRAD data has required a trained technician to manually inspect a graphically rendered radar sweep. A single site completes one volume scan every five to ten minutes, producing over 52,000 volume scans in one year. This is an immense amount of data, and manual classification is infeasible. We have developed a system that identifies biological echoes using machine learning techniques. This approach begins with training data, either scans that have been classified by experts or bird data collected in the field. The data are preprocessed to ensure quality and to emphasize relevant features. A classifier is then trained using these data, and cross-validation is used to measure performance. We compared neural network, naive Bayes, and k-nearest neighbor classifiers. Empirical evidence is provided showing that this system can achieve classification accuracies in the 80th to 90th percentile. We propose to apply these methods to studying bird migration phenology and how it is affected by climate variability and change over multiple temporal scales.

  9. Sloan Digital Sky Survey III photometric quasar clustering: probing the initial conditions of the Universe

    NASA Astrophysics Data System (ADS)

    Ho, Shirley; Agarwal, Nishant; Myers, Adam D.; Lyons, Richard; Disbrow, Ashley; Seo, Hee-Jong; Ross, Ashley; Hirata, Christopher; Padmanabhan, Nikhil; O'Connell, Ross; Huff, Eric; Schlegel, David; Slosar, Anže; Weinberg, David; Strauss, Michael; Ross, Nicholas P.; Schneider, Donald P.; Bahcall, Neta; Brinkmann, J.; Palanque-Delabrouille, Nathalie; Yèche, Christophe

    2015-05-01

    The Sloan Digital Sky Survey has surveyed 14,555 square degrees of the sky, and delivered over a trillion pixels of imaging data. We present the large-scale clustering of 1.6 million quasars between z = 0.5 and z = 2.5 that have been classified from this imaging, representing the highest density of quasars ever studied for clustering measurements. This data set spans ~11,000 square degrees and probes a volume of 80 h⁻³ Gpc³. In principle, such a large volume and medium density of tracers should facilitate high-precision cosmological constraints. We measure the angular clustering of photometrically classified quasars using an optimal quadratic estimator in four redshift slices, with an accuracy of ~25% over a bin width of δℓ ~ 10-15, on scales corresponding to matter-radiation equality and larger (ℓ ~ 2-30). Observational systematics can strongly bias clustering measurements on large scales, which can mimic cosmologically relevant signals such as deviations from Gaussianity in the spectrum of primordial perturbations. We account for systematics by applying a new method recently proposed by Agarwal et al. (2014) to the clustering of photometrically classified quasars. We carefully apply our methodology to mitigate known observational systematics and further remove angular bins that are contaminated by unknown systematics. Combining quasar data with the photometric luminous red galaxy (LRG) sample of Ross et al. (2011) and Ho et al. (2012), and marginalizing over all bias and shot noise-like parameters, we obtain a constraint on local primordial non-Gaussianity of fNL = -113 ± 154 (1σ error). We next assume that the bias of quasar and galaxy distributions can be obtained independently from quasar/galaxy-CMB lensing cross-correlation measurements (such as those in Sherwin et al. (2013)). This can be facilitated by spectroscopic observations of the sources, enabling the redshift distribution to be completely determined, and allowing precise estimates of the bias parameters. In this paper, if the bias and shot noise parameters are fixed to their known values (which we model by fixing them to their best-fit Gaussian values), we find that the error bar reduces to 1σ ≃ 65. We expect this error bar to reduce further by at least another factor of five if the data are free of any observational systematics. We therefore emphasize that in order to make the best use of large-scale structure data we need accurate modeling of known systematics, a method to mitigate unknown systematics, and additionally independent theoretical models or observations to probe the bias of dark matter halos.

  10. Lunar terrain mapping and relative-roughness analysis

    NASA Technical Reports Server (NTRS)

    Rowan, L. C.; Mccauley, J. F.; Holm, E. A.

    1971-01-01

    Terrain maps of the equatorial zone were prepared at scales of 1:2,000,000 and 1:1,000,000 to classify lunar terrain with respect to roughness and to provide a basis for selecting sites for Surveyor and Apollo landings, as well as for Ranger and Lunar Orbiter photographs. Lunar terrain was described by qualitative and quantitative methods and divided into four fundamental classes: maria, terrae, craters, and linear features. Some 35 subdivisions were defined and mapped throughout the equatorial zone, and, in addition, most of the map units were illustrated by photographs. The terrain types were analyzed quantitatively to characterize and order their relative roughness characteristics. For some morphologically homogeneous mare areas, relative roughness can be extrapolated to the large scales from measurements at small scales.

  11. Analysis on the restriction factors of the green building scale promotion based on DEMATEL

    NASA Astrophysics Data System (ADS)

    Wenxia, Hong; Zhenyao, Jiang; Zhao, Yang

    2017-03-01

    In order to promote the large-scale development of green building in China, the DEMATEL method was used to classify the factors influencing green building development into three parts: the green building market, green technology and the macro economy. Through the DEMATEL model, the interaction mechanism of each part was analyzed. The mutual influence of each barrier factor affecting green building promotion was quantitatively analyzed, and the key factors for the development of green building in China were determined. In addition, some implementation strategies for promoting the large-scale development of green building were put forward. This research has important reference and practical value for policy making on green building promotion.
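    For reference, a minimal numpy sketch of the standard DEMATEL computation the study applies: from a direct-influence matrix among factors, form the total-relation matrix T = D(I - D)^(-1) and read off the prominence (R + C) and relation (R - C) indices. The 3×3 matrix is illustrative, not the study's data.

```python
# DEMATEL: normalize the direct-influence matrix, compute total relations.
import numpy as np

A = np.array([[0, 3, 2],   # expert-rated direct influence among the factors
              [1, 0, 3],   # (market, technology, macro economy)
              [2, 1, 0]], dtype=float)

D = A / A.sum(axis=1).max()                # normalize by the largest row sum
T = D @ np.linalg.inv(np.eye(len(A)) - D)  # total relations T = D(I - D)^-1
R, C = T.sum(axis=1), T.sum(axis=0)
print("prominence (R+C):", np.round(R + C, 2))   # overall importance
print("relation   (R-C):", np.round(R - C, 2))   # >0: net cause factor
```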

  12. Vulnerability of ecosystems to climate change moderated by habitat intactness.

    PubMed

    Eigenbrod, Felix; Gonzalez, Patrick; Dash, Jadunandan; Steyl, Ilse

    2015-01-01

    The combined effects of climate change and habitat loss represent a major threat to species and ecosystems around the world. Here, we analyse the vulnerability of ecosystems to climate change based on current levels of habitat intactness and vulnerability to biome shifts, using multiple measures of habitat intactness at two spatial scales. We show that the global extent of refugia depends strongly on the definition of habitat intactness and the spatial scale of the analysis. Globally, 28% of terrestrial vegetated area can be considered refugia if all natural vegetated land cover is considered. This, however, drops to 17% if only areas that are at least 50% wilderness at a scale of 48×48 km are considered, and to 10% if only areas that are at least 50% wilderness at a scale of 4.8×4.8 km are considered. Our results suggest that, in regions where relatively large, intact wilderness areas remain (e.g. Africa, Australia, boreal regions, South America), conservation of the remaining large-scale refugia is the priority. In human-dominated landscapes (e.g. most of Europe, much of North America and Southeast Asia), focusing on finer scale refugia is a priority because large-scale wilderness refugia simply no longer exist. Action to conserve such refugia is particularly urgent since only 1 to 2% of global terrestrial vegetated area is classified as refugia and also at least 50% covered by the global protected area network. © 2014 John Wiley & Sons Ltd.

  13. Rating scales for dystonia in cerebral palsy: reliability and validity.

    PubMed

    Monbaliu, E; Ortibus, E; Roelens, F; Desloovere, K; Deklerck, J; Prinzie, P; de Cock, P; Feys, H

    2010-06-01

    This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). Three raters independently scored videotapes of 10 patients (five males, five females; mean age 13 y 3 mo, SD 5 y 2 mo, range 5-22 y). One patient each was classified at levels I-IV in the Gross Motor Function Classification System and six patients were classified at level V. Reliability was measured by (1) intraclass correlation coefficient (ICC) for interrater reliability, (2) standard error of measurement (SEM) and smallest detectable difference (SDD), and (3) Cronbach's alpha for internal consistency. Validity was assessed by Pearson's correlations among the three scales used and by content analysis. Moderate to good interrater reliability was found for total scores of the three scales (ICC: BADS=0.87; BFMMS=0.86; UDRS=0.79). However, many subitems showed low reliability, in particular for the UDRS. SEM and SDD were respectively 6.36% and 17.72% for the BADS, 9.88% and 27.39% for the BFMMS, and 8.89% and 24.63% for the UDRS. High internal consistency was found. Pearson's correlations were high. Content validity showed insufficient accordance with the new CP definition and classification. Our results support the internal consistency and concurrent validity of the scales; however, taking into consideration the limitations in reliability, including the large SDD values and the content validity, further research on methods of assessment of dystonia is warranted.
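    The reliability statistics reported above follow from standard formulas (SEM = SD·sqrt(1 - ICC); SDD = 1.96·sqrt(2)·SEM), sketched below with illustrative inputs rather than the study's raw scores.

```python
# Standard error of measurement and smallest detectable difference.
import math

def sem_sdd(sd, icc):
    sem = sd * math.sqrt(1.0 - icc)       # standard error of measurement
    sdd = 1.96 * math.sqrt(2.0) * sem     # smallest detectable difference
    return sem, sdd

sem, sdd = sem_sdd(sd=17.6, icc=0.87)     # hypothetical total-score SD and ICC
print(f"SEM = {sem:.2f}, SDD = {sdd:.2f} (same units as the scale)")
```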

  14. On-Site Classification of Pansteatitis in Mozambique Tilapia (Oreochromis mossambicus) using a Portable Lipid-Based Analyzer

    PubMed Central

    Somerville, Stephen E.; Cantu, Theresa M.; Guillette, Matthew P.; Botha, Hannes; Boggs, Ashley S. P.; Luus-Powell, Wilmien; Guillette, Louis J.

    2017-01-01

    While no pansteatitis-related large-scale mortality events have occurred since 2008, the current status of pansteatitis (presence and pervasiveness) in the Olifants River system and other regions of South Africa remains largely unknown. In part, this is due to both a lack of known biological markers of pansteatitis and a lack of suitable non-invasive assays capable of rapidly classifying the disease. Here, we propose the application of a point-of-care (POC) device using lipid-based test strips (total cholesterol (TC) and total triglyceride (TG)) for classifying pansteatitis status in the whole blood of pre-spawning Mozambique tilapia (Oreochromis mossambicus). Using the TC strips, the POC device was able to non-lethally classify the tilapia as either healthy or pansteatitis-affected; the sexes were examined independently because sexual dimorphism was observed for TC (males p = 0.0364; females p = 0.0007). No significant difference between healthy and pansteatitis-affected tilapia was observed using the TG strips. This is one of the first described applications of POC devices for on-site environmental disease-state testing. A discussion of the merits of using portable lipid-based analyzers as an in-field disease-state diagnostic tool is provided. PMID:28729886

  15. Classification of large-scale fundus image data sets: a cloud-computing framework.

    PubMed

    Roychowdhury, Sohini

    2016-08-01

    Large medical image data sets with high dimensionality require a substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of the highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and of hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from the STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performance of automated screening systems.
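    A hedged scikit-learn analogue of this pipeline (the paper uses the Microsoft Azure Machine Learning Studio platform): rank features, keep the 40 highest-ranked, and classify with a boosted decision tree. The data are simulated stand-ins for the fundus features.

```python
# Feature ranking plus boosted-tree classification of DR lesion types.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 120))          # 120 candidate image-based features
y = rng.integers(0, 4, size=800)         # four DR lesion types

clf = make_pipeline(SelectKBest(f_classif, k=40),   # keep the top-40 features
                    GradientBoostingClassifier())
print(cross_val_score(clf, X, y, cv=5).mean())
```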

  16. Face recognition system using multiple face model of hybrid Fourier feature under uncontrolled illumination variation.

    PubMed

    Hwang, Wonjun; Wang, Haitao; Kim, Hyunwoo; Kee, Seok-Cheol; Kim, Junmo

    2011-04-01

    The authors present a robust face recognition system for large-scale data sets taken under uncontrolled illumination variations. The proposed face recognition system consists of a novel illumination-insensitive preprocessing method, a hybrid Fourier-based facial feature extraction, and a score fusion scheme. First, in the preprocessing stage, a face image is transformed into an illumination-insensitive image, called an "integral normalized gradient image," by normalizing and integrating the smoothed gradients of a facial image. Then, for feature extraction of complementary classifiers, multiple face models based upon hybrid Fourier features are applied. The hybrid Fourier features are extracted from different Fourier domains in different frequency bandwidths, and each feature is then individually classified by linear discriminant analysis. In addition, multiple face models are generated from plural normalized face images that have different eye distances. Finally, to combine scores from multiple complementary classifiers, a log-likelihood-ratio-based score fusion scheme is applied. The proposed system is evaluated using the face recognition grand challenge (FRGC) experimental protocols; FRGC is a large publicly available data set. Experimental results on the FRGC version 2.0 data sets show an average verification rate of 81.49% on 2-D face images under various environmental variations such as illumination changes, expression changes, and elapsed time.

  17. Natural fracture systems on planetary surfaces: Genetic classification and pattern randomness

    NASA Technical Reports Server (NTRS)

    Rossbacher, Lisa A.

    1987-01-01

    One method for classifying natural fracture systems is by fracture genesis. This approach involves the physics of the formation process, and it has been used most frequently in attempts to predict subsurface fractures and petroleum reservoir productivity. This classification system can also be applied to larger fracture systems on any planetary surface. One problem in applying this classification system to planetary surfaces is that it was developed for relatively small-scale fractures that would influence porosity, particularly as observed in a core sample. Planetary studies also require consideration of large-scale fractures. Nevertheless, this system offers some valuable perspectives on fracture systems of any size.

  18. A Survey on Routing Protocols for Large-Scale Wireless Sensor Networks

    PubMed Central

    Li, Changle; Zhang, Hanxiao; Hao, Binbin; Li, Jiandong

    2011-01-01

    With the advances in micro-electronics, wireless sensor devices have been made much smaller and more integrated, and large-scale wireless sensor networks (WSNs) based on cooperation among a significant number of nodes have become a hot topic. “Large-scale” mainly means large area or high density of a network. Accordingly, routing protocols must scale well with network scope extension and increases in node density. A sensor node is normally energy-limited and cannot be recharged, so its energy consumption has a quite significant effect on the scalability of the protocol. To the best of our knowledge, the mainstream methods currently used to solve the energy problem in large-scale WSNs are hierarchical routing protocols. In a hierarchical routing protocol, all the nodes are divided into several groups with different assignment levels. The nodes at the high level are responsible for data aggregation and management work, and the low-level nodes for sensing their surroundings and collecting information. Hierarchical routing protocols have proved to be more energy-efficient than flat ones, in which all the nodes play the same role, especially in terms of data aggregation and the flooding of control packets. With a focus on the hierarchical structure, in this paper we provide an insight into routing protocols designed specifically for large-scale WSNs. According to their different objectives, the protocols are classified based on criteria such as control overhead reduction, energy consumption mitigation and energy balance. In order to give a comprehensive understanding of each protocol, we highlight their innovative ideas, describe the underlying principles in detail and analyze their advantages and disadvantages. Moreover, a comparison of the routing protocols is conducted to demonstrate the differences between them in terms of message complexity, memory requirements, localization, data aggregation, clustering manner and other metrics. Finally, some open issues in routing protocol design for large-scale wireless sensor networks are discussed and conclusions are drawn. PMID:22163808

  19. An unbalanced spectra classification method based on entropy

    NASA Astrophysics Data System (ADS)

    Liu, Zhong-bao; Zhao, Wen-juan

    2017-05-01

    How to distinguish the minority spectra from the majority of the spectra is an important problem in astronomy. In view of this, an unbalanced spectra classification method based on entropy (USCM) is proposed in this paper to deal with the unbalanced spectra classification problem. USCM greatly improves the performance of traditional classifiers in distinguishing the minority spectra, as it takes the data distribution into consideration in the process of classification. However, its time complexity is exponential in the training size, and therefore it can only deal with small- and medium-scale classification problems. How to solve the large-scale classification problem is thus crucial for USCM. It can be obtained by straightforward mathematical computation that the dual form of USCM is equivalent to the minimum enclosing ball (MEB) problem, so the core vector machine (CVM) is introduced, and USCM based on CVM is proposed to deal with the large-scale classification problem. Several comparative experiments on the 4 subclasses of K-type spectra, 3 subclasses of F-type spectra and 3 subclasses of G-type spectra from the Sloan Digital Sky Survey (SDSS) verify that USCM and USCM based on CVM perform better than kNN (k-nearest neighbor) and SVM (support vector machine) in the problem of rare spectra mining on the small- and medium-scale datasets and the large-scale datasets, respectively.

  20. The Role of Forests in Regulating the River Flow Regime of Large Basins of the World

    NASA Astrophysics Data System (ADS)

    Salazar, J. F.; Villegas, J. C.; Mercado-Bettin, D. A.; Rodríguez, E.

    2016-12-01

    Many natural and social phenomena depend on river flow regimes that are being altered by global change. Understanding the mechanisms behind such alterations is crucial for predicting river flow regimes in a changing environment. Here we explore potential linkages between the presence of forests and the capacity of river basins for regulating river flows. Regulation is defined here as the capacity of river basins to attenuate the amplitude of the river flow regime, that is, to reduce the difference between high and low flows. We first use scaling theory to show how scaling properties of observed river flows can be used to classify river basins as regulated or unregulated. This parsimonious classification is based on a physical interpretation of the scaling properties (particularly the scaling exponents) that is novel (most previous studies have focused on the interpretation of the scaling exponents for floods only) and widely applicable to different basins (the only assumption is that river flows in a given basin exhibit scaling properties through well-known power laws). Then we show how this scaling framework can be used to explore global-change-induced temporal variations in the regulation capacity of river basins. Finally, we propose a conceptual hypothesis (the "Forest reservoir concept") to explain how large-scale forests can exert important effects on the long-term water balance partitioning and regulation capacity of large basins of the world. Our quantitative results are based on data analysis (river flows and land cover features) from 22 large basins of the world, with emphasis on the Amazon River and its main tributaries. Collectively, our findings support the hypothesis that forest cover enhances the capacity of large river basins to maintain relatively high mean river flows, as well as to regulate (ameliorate) extreme river flows. Advancing towards this quantitative understanding of the relation between forest cover and river flow regimes is crucial for water management- and land cover-related decisions.
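    A minimal sketch of the scaling step described above: fit the power law Q = c·A^θ for a flow quantile by linear regression in log-log space, then compare the high- and low-flow exponents. The data are synthetic.

```python
# Scaling exponents of flow quantiles versus drainage area.
import numpy as np

def scaling_exponent(area_km2, flow):
    theta, log_c = np.polyfit(np.log(area_km2), np.log(flow), 1)
    return theta

rng = np.random.default_rng(2)
A = np.logspace(2, 6, 30)                           # basin areas, 1e2-1e6 km^2
q_low = 0.01 * A**1.1 * rng.lognormal(0, 0.1, 30)   # low-flow quantile
q_high = 5.0 * A**0.8 * rng.lognormal(0, 0.1, 30)   # high-flow quantile
print(scaling_exponent(A, q_low), scaling_exponent(A, q_high))
# A low-flow exponent above the high-flow exponent implies the gap between
# high and low flows narrows with basin size, i.e. a more regulated regime.
```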

  1. The Role of Forests in Regulating the River Flow Regime of Large Basins of the World

    NASA Astrophysics Data System (ADS)

    Salazar, J. F.; Villegas, J. C.; Mercado-Bettin, D. A.; Rodríguez, E.

    2017-12-01

    Many natural and social phenomena depend on river flow regimes that are being altered by global change. Understanding the mechanisms behind such alterations is crucial for predicting river flow regimes in a changing environment. Here we explore potential linkages between the presence of forests and the capacity of river basins for regulating river flows. Regulation is defined here as the capacity of river basins to attenuate the amplitude of the river flow regime, that is, to reduce the difference between high and low flows. We first use scaling theory to show how scaling properties of observed river flows can be used to classify river basins as regulated or unregulated. This parsimonious classification is based on a physical interpretation of the scaling properties (particularly the scaling exponents) that is novel (most previous studies have focused on the interpretation of the scaling exponents for floods only) and widely applicable to different basins (the only assumption is that river flows in a given basin exhibit scaling properties through well-known power laws). Then we show how this scaling framework can be used to explore global-change-induced temporal variations in the regulation capacity of river basins. Finally, we propose a conceptual hypothesis (the "Forest reservoir concept") to explain how large-scale forests can exert important effects on the long-term water balance partitioning and regulation capacity of large basins of the world. Our quantitative results are based on data analysis (river flows and land cover features) from 22 large basins of the world, with emphasis on the Amazon River and its main tributaries. Collectively, our findings support the hypothesis that forest cover enhances the capacity of large river basins to maintain relatively high mean river flows, as well as to regulate (ameliorate) extreme river flows. Advancing towards this quantitative understanding of the relation between forest cover and river flow regimes is crucial for water management- and land cover-related decisions.

  2. Classifying a Smoker Scale in Adult Daily and Nondaily Smokers

    PubMed Central

    2014-01-01

    Introduction: Smoker identity, or the strength of beliefs about oneself as a smoker, is a robust marker of smoking behavior. However, many nondaily smokers do not identify as smokers, underestimating their risk for tobacco-related disease and resulting in missed intervention opportunities. Assessing underlying beliefs about characteristics used to classify smokers may help explain the discrepancy between smoking behavior and smoker identity. This study examines the factor structure, reliability, and validity of the Classifying a Smoker scale among a racially diverse sample of adult smokers. Methods: A cross-sectional survey was administered through an online panel survey service to 2,376 current smokers who were at least 25 years of age. The sample was stratified to obtain equal numbers of 3 racial/ethnic groups (African American, Latino, and White) across smoking level (nondaily and daily smoking). Results: The Classifying a Smoker scale displayed a single factor structure and excellent internal consistency (α = .91). Classifying a Smoker scores significantly increased at each level of smoking, F(3,2375) = 23.68, p < .0001. Those with higher scores had a stronger smoker identity, stronger dependence on cigarettes, greater health risk perceptions, more smoking friends, and were more likely to carry cigarettes. Classifying a Smoker scores explained unique variance in smoking variables above and beyond that explained by smoker identity. Conclusions: The present study supports the use of the Classifying a Smoker scale among diverse, experienced smokers. Stronger endorsement of characteristics used to classify a smoker (i.e., stricter criteria) was positively associated with heavier smoking and related characteristics. Prospective studies are needed to inform prevention and treatment efforts. PMID:24297807
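    For reference, the internal-consistency statistic reported above (Cronbach's alpha) can be computed from an item-response matrix as sketched below; the simulated responses stand in for the scale's items.

```python
# Cronbach's alpha from an (n_respondents, n_items) score matrix.
import numpy as np

def cronbach_alpha(items):
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()     # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of total score
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(5)
latent = rng.normal(size=(300, 1))                  # one underlying factor
items = latent + 0.6 * rng.normal(size=(300, 8))    # 8 correlated items
print(f"alpha = {cronbach_alpha(items):.2f}")       # high for one-factor data
```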

  3. Mono-isotope Prediction for Mass Spectra Using Bayes Network.

    PubMed

    Li, Hui; Liu, Chunmei; Rwebangira, Mugizi Robert; Burge, Legand

    2014-12-01

    Mass spectrometry is one of the most widely utilized methods for studying protein functions and components. Mono-isotope pattern recognition from large-scale protein mass spectral data requires computational algorithms and tools to speed up the analysis and improve the analytic results. We utilized a naïve Bayes network as the classifier, with the assumption that the selected features are independent, to predict mono-isotope patterns from mass spectrometry data. Mono-isotopes detected from validated theoretical spectra were used as prior information in the Bayes method. Three main features extracted from the dataset were employed as independent variables in our model. The application of the proposed algorithm to the publicMo dataset demonstrates that our naïve Bayes classifier is advantageous over existing methods in both accuracy and sensitivity.
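    A hedged sketch of the classification step, assuming three numeric features per candidate peak and class priors supplied from the validated theoretical spectra; the values are simulated placeholders.

```python
# Naive Bayes over three features with externally supplied class priors.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 3))           # three features per candidate peak
y = rng.integers(0, 2, size=1000)        # 1 = mono-isotopic pattern

model = GaussianNB(priors=[0.7, 0.3])    # prior information from theory
model.fit(X, y)
print(model.predict_proba(X[:3]))
```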

  4. Multi-discipline resource inventory of soils, vegetation and geology

    NASA Technical Reports Server (NTRS)

    Simonson, G. H. (Principal Investigator); Paine, D. P.; Lawrence, R. D.; Norgren, J. A.; Pyott, W. Y.; Herzog, J. H.; Murray, R. J.; Rogers, R.

    1973-01-01

    The author has identified the following significant results. Computer classification of natural vegetation in the vicinity of Big Summit Prairie, Crook County, Oregon, was carried out using MSS digital data. Impure training sets, representing eleven vegetation types plus water, were selected from within the area to be classified. Close correlations were visually observed between vegetation types mapped from the large scale photographs and the computer classification of the ERTS data (Frame 1021-18151, 13 August 1972).

  5. Semantic Concept Discovery for Large Scale Zero Shot Event Detection

    DTIC Science & Technology

    2015-07-25

    sources and can be shared among many different events, including unseen ones. Based on this idea, events can be detected by inspecting the individual...2013]. Partial success along this vein has also been achieved in the zero-shot setting, e.g. [Habibian et al., 2014; Wu et al., 2014], but the...candle", "birthday cake" and "applauding". Since concepts are shared among many different classes (events) and each concept classifier can be trained

  6. The behavior of limestone under explosive load

    NASA Astrophysics Data System (ADS)

    Orlov, M. Yu; Orlova, Yu N.; Bogomolov, G. N.

    2016-11-01

    Limestone behavior under explosive loading was investigated. The response of limestone to three types of explosives, including granular, ammonite and emulsion explosives, was studied in detail. The shape and diameter of the explosion craters were obtained. Fragments observed after the blast were classified as large, medium or small. Three full-scale experiments were carried out. The research results can be used as a qualitative test for the approbation of numerical methods.

  7. Non-stationary Drainage Flows and Cold Pools in Gentle Terrain

    NASA Astrophysics Data System (ADS)

    Mahrt, L.

    2015-12-01

    Previous studies have concentrated on organized topography with well-defined slopes or valleys in an effort to understand the flow dynamics. However, most of the Earth's land surface consists of gentle terrain that is quasi-three-dimensional. Different scenarios are briefly classified. A network of measurements is analyzed to examine shallow cold pools and the drainage flow down the valley that develops under weak ambient wind and relatively clear skies. However, transient modes constantly modulate or intermittently eliminate the cold pool, which makes extraction and analysis of the horizontal structure of the cold pool difficult with traditional analysis methods. Singular value decomposition successfully isolates the effects of the large-scale flow from local down-valley cold air drainage within the cold pool, in spite of the intermittent nature of this local flow. The traditional concept of a cold pool must be generalized to include cold pool intermittency, complex variation of temperature related to some three-dimensionality, and a diffuse cold pool top. Different types of cold pools are classified in terms of the stratification and the gradient of potential temperature along the slope. The strength of the cold pool is related to a forcing temperature scale proportional to the net radiative cooling divided by the wind speed above the valley. The scatter is large, partly due to the nonstationarity of the marginal cold pool in this shallow valley.
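    A hedged numpy sketch of the decomposition used above: apply SVD to a (stations × time) matrix of temperature anomalies, where the leading mode tends to capture the coherent large-scale signal and the residual retains local drainage structure. The matrix is synthetic.

```python
# SVD separation of a leading (large-scale) mode from local structure.
import numpy as np

rng = np.random.default_rng(4)
T = rng.normal(size=(12, 500))                       # 12 stations, 500 records
anom = T - T.mean(axis=1, keepdims=True)             # remove station means
U, s, Vt = np.linalg.svd(anom, full_matrices=False)

mode1 = s[0] * np.outer(U[:, 0], Vt[0])              # large-scale component
local = anom - mode1                                 # residual local structure
print(f"variance in mode 1: {s[0]**2 / (s**2).sum():.0%}")
```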

  8. Geometry-based ensembles: toward a structural characterization of the classification boundary.

    PubMed

    Pujol, Oriol; Masip, David

    2009-06-01

    This paper introduces a novel binary discriminative learning technique based on the approximation of the nonlinear decision boundary by a piecewise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points, i.e., points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov-regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and nonlinear behavior is obtained. The simplicity of the method allows its extension to cope with some of today's machine learning challenges, such as online learning, large-scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database, comparing with several state-of-the-art classification techniques. Finally, we apply our technique in online and large-scale scenarios and in six real-life computer vision and pattern recognition problems: gender recognition based on face images, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease myocardial damage severity detection, old musical scores clef classification, and action recognition using 3D accelerometer data from a wearable device. The results are promising and this paper opens a line of research that deserves further attention.

  9. Using electronic medical records to enable large-scale studies in psychiatry: treatment resistant depression as a model

    PubMed Central

    Perlis, R. H.; Iosifescu, D. V.; Castro, V. M.; Murphy, S. N.; Gainer, V. S.; Minnier, J.; Cai, T.; Goryachev, S.; Zeng, Q.; Gallagher, P. J.; Fava, M.; Weilburg, J. B.; Churchill, S. E.; Kohane, I. S.; Smoller, J. W.

    2013-01-01

    Background: Electronic medical records (EMR) provide a unique opportunity for efficient, large-scale clinical investigation in psychiatry. However, such studies will require development of tools to define treatment outcome. Method: Natural language processing (NLP) was applied to classify notes from 127,504 patients with a billing diagnosis of major depressive disorder, drawn from out-patient psychiatry practices affiliated with multiple, large New England hospitals. Classifications were compared with results using billing data (ICD-9 codes) alone and to a clinical gold standard based on chart review by a panel of senior clinicians. These cross-sectional classifications were then used to define longitudinal treatment outcomes, which were compared with a clinician-rated gold standard. Results: Models incorporating NLP were superior to those relying on billing data alone for classifying current mood state (area under receiver operating characteristic curve of 0.85–0.88 v. 0.54–0.55). When these cross-sectional visits were integrated to define longitudinal outcomes and incorporate treatment data, 15% of the cohort remitted with a single antidepressant treatment, while 13% were identified as failing to remit despite at least two antidepressant trials. Non-remitting patients were more likely to be non-Caucasian (p<0.001). Conclusions: The application of bioinformatics tools such as NLP should enable accurate and efficient determination of longitudinal outcomes, enabling existing EMR data to be applied to clinical research, including biomarker investigations. Continued development will be required to better address moderators of outcome such as adherence and co-morbidity. PMID:21682950

  10. Unmasking the masked Universe: the 2M++ catalogue through Bayesian eyes

    NASA Astrophysics Data System (ADS)

    Lavaux, Guilhem; Jasche, Jens

    2016-01-01

    This work describes a full Bayesian analysis of the Nearby Universe as traced by galaxies of the 2M++ survey. The analysis is run in two sequential steps. The first step self-consistently derives the luminosity-dependent galaxy biases, the power spectrum of matter fluctuations and the matter density fields within a Gaussian statistics approximation. The second step makes a detailed analysis of the three-dimensional large-scale structures, assuming a fixed bias model and a fixed cosmology, and allows for the reconstruction of both the final density field and the initial conditions at z = 1000. From these, we derive fields that self-consistently extrapolate the observed large-scale structures. We give two examples of these extrapolations and their utility for the detection of structures: the visibility of the Sloan Great Wall, and the detection and characterization of the Local Void using DIVA, a Lagrangian-based technique to classify structures.

  11. Evaluating the NOAA Coastal and Marine Ecological Classification Standard in estuarine systems: A Columbia River Estuary case study

    NASA Astrophysics Data System (ADS)

    Keefer, Matthew L.; Peery, Christopher A.; Wright, Nancy; Daigle, William R.; Caudill, Christopher C.; Clabough, Tami S.; Griffith, David W.; Zacharias, Mark A.

    2008-06-01

    A common first step in conservation planning and resource management is to identify and classify habitat types, and this has led to a proliferation of habitat classification systems. Ideally, classifications should be scientifically and conceptually rigorous, with broad applicability across spatial and temporal scales. Successful systems will also be flexible and adaptable, with a framework and supporting lexicon accessible to users from a variety of disciplines and locations. A new, continental-scale classification system for coastal and marine habitats—the Coastal and Marine Ecological Classification Standard (CMECS)—is currently being developed for North America by NatureServe and the National Oceanic and Atmospheric Administration (NOAA). CMECS is a nested, hierarchical framework that applies a uniform set of rules and terminology across multiple habitat scales using a combination of oceanographic (e.g. salinity, temperature), physiographic (e.g. depth, substratum), and biological (e.g. community type) criteria. Estuaries are arguably the most difficult marine environments to classify due to large spatio-temporal variability resulting in rapidly shifting benthic and water column conditions. We simultaneously collected data at eleven subtidal sites in the Columbia River Estuary (CRE) in fall 2004 to evaluate whether the estuarine component of CMECS could adequately classify habitats across several scales for representative sites within the estuary spanning a range of conditions. Using outputs from an acoustic Doppler current profiler (ADCP), CTD (conductivity, temperature, depth) sensor, and PONAR (benthic dredge), we concluded that the CMECS hierarchy provided a spatially explicit framework in which to integrate multiple parameters to define macro-habitats at the 100 m² to >1000 m² scales, or across several tiers of the CMECS system. The classification's strengths lie in its nested, hierarchical structure and in the development of a standardized, yet flexible classification lexicon. The application of the CMECS to other estuaries in North America should therefore identify similar habitat types at similar scales as we identified in the CRE. We also suggest that the CMECS could be improved by refining classification thresholds to better reflect ecological processes, by direct integration of temporal variability, and by more explicitly linking physical and biological processes with habitat patterns.

  12. The Development and Validation of a Scale Assessing Individual Schemas Used in Classifying a Smoker: Implications for Research and Practice

    PubMed Central

    Nehl, Eric; Sterling, Kymberle; Buchanan, Taneisha; Narula, Shana; Sutfin, Erin; Ahluwalia, Jasjit S.

    2011-01-01

    Introduction: Half of college students who have smoked in the past month do not consider themselves smokers. Understanding one’s schema of smokers is important, as it might relate to smoking behavior. Thus, we aimed to develop a scale assessing how young adults classify smokers and establish reliability and validity of the scale. Methods: Of 24,055 students at six Southeast colleges recruited to complete an online survey, 4,840 (20.1%) responded, with complete smoking and scale development data from 3,863. Results: The Classifying a Smoker Scale consisted of 10 items derived from prior research. Factor analysis extracted a single factor accounting for 40.00% of score variance (eigenvalue = 5.52). Higher scores (range 10–70) indicate stricter criteria in classifying a smoker. The scale yielded a Cronbach’s alpha of .91. Current smoking (past 30-day) prevalence was 22.8%. Higher Classifying a Smoker Scale scores (p = .001) were significant predictors of current smoking, controlling for sociodemographics. Higher scores were related to being nondaily versus daily smokers (p = .009), readiness to quit in the next month (p = .04), greater perceived smoking prevalence (p = .007), not identifying as smokers (p < .001), less perceived harm of smoking (p < .001), greater concern about smoking health risks (p = .01), and less favorable attitudes toward smoking restrictions (p < .001). Among current smokers, higher scores were related to greater smoking frequency (p = .02), not identifying as smokers (p < .001), and less perceived harm of smoking (p < .001), controlling for sociodemographics. Conclusion: This scale, demonstrating good psychometric properties, highlights potential intervention targets for prevention and cessation, as it relates to smoking, risk perception, and interest in quitting. PMID:21994337
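
    The reliability figure reported above (Cronbach's alpha of .91) follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A minimal computation, on hypothetical response data:

```python
# Cronbach's alpha for a k-item scale; `items` is a hypothetical
# (n_respondents x k_items) array standing in for survey responses.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)      # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
items = rng.integers(1, 8, size=(200, 10))         # 10 items scored 1-7
print(round(cronbach_alpha(items), 3))
```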

  13. The development and validation of a scale assessing individual schemas used in classifying a smoker: implications for research and practice.

    PubMed

    Berg, Carla J; Nehl, Eric; Sterling, Kymberle; Buchanan, Taneisha; Narula, Shana; Sutfin, Erin; Ahluwalia, Jasjit S

    2011-12-01

    Half of college students who have smoked in the past month do not consider themselves smokers. Understanding one's schema of smokers is important, as it might relate to smoking behavior. Thus, we aimed to develop a scale assessing how young adults classify smokers and establish reliability and validity of the scale. Of 24,055 students at six Southeast colleges recruited to complete an online survey, 4,840 (20.1%) responded, with complete smoking and scale development data from 3,863. The Classifying a Smoker Scale consisted of 10 items derived from prior research. Factor analysis extracted a single factor accounting for 40.00% of score variance (eigenvalue = 5.52). Higher scores (range 10-70) indicate stricter criteria in classifying a smoker. The scale yielded a Cronbach's alpha of .91. Current smoking (past 30-day) prevalence was 22.8%. Higher Classifying a Smoker Scale scores (p = .001) were significant predictors of current smoking, controlling for sociodemographics. Higher scores were related to being nondaily versus daily smokers (p = .009), readiness to quit in the next month (p = .04), greater perceived smoking prevalence (p = .007), not identifying as smokers (p < .001), less perceived harm of smoking (p < .001), greater concern about smoking health risks (p = .01), and less favorable attitudes toward smoking restrictions (p < .001). Among current smokers, higher scores were related to greater smoking frequency (p = .02), not identifying as smokers (p < .001), and less perceived harm of smoking (p < .001), controlling for sociodemographics. This scale, demonstrating good psychometric properties, highlights potential intervention targets for prevention and cessation, as it relates to smoking, risk perception, and interest in quitting.

  14. Effect of boric acid on the properties of Li2MnO3·LiNi0.5Mn0.5O2 composite cathode powders prepared by large-scale spray pyrolysis with droplet classifier

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, Young Jun; Choi, Seung Ho; Sim, Chul Min

    2012-12-15

    Highlights: Spherical Li2MnO3·LiNi0.5Mn0.5O2 composite cathode powders are prepared by large-scale spray pyrolysis with a droplet classifier. Boric acid improves the morphological and electrochemical properties of the composite cathode powders. The discharge capacity of the composite cathode powders decreases from 217 to 196 mAh g⁻¹ by the 30th cycle. -- Abstract: Spherically shaped 0.3Li2MnO3·0.7LiNi0.5Mn0.5O2 composite cathode powders with filled morphology and narrow size distribution are prepared by large-scale spray pyrolysis. Droplet classification reduces the standard deviation of the size distribution of the composite cathode powders. Addition of boric acid improves the morphological properties of the product powders by forming a lithium borate glass material with a low melting temperature. The optimum amount of boric acid dissolved in the spray solution is 0.8 wt% of the composite powders. The powders prepared from the spray solution with 0.8 wt% boric acid have a mixed layered crystal structure comprising Li2MnO3 and LiNi0.5Mn0.5O2 phases, thus forming a composite compound. The initial charge and discharge capacities of the composite cathode powders prepared from the 0.8 wt% boric acid spray solution are 297 and 217 mAh g⁻¹, respectively. The discharge capacity of the powders decreases from 217 to 196 mAh g⁻¹ by the 30th cycle, corresponding to a capacity retention of 90%.

  15. Salience network-based classification and prediction of symptom severity in children with autism.

    PubMed

    Uddin, Lucina Q; Supekar, Kaustubh; Lynch, Charles J; Khouzam, Amirah; Phillips, Jennifer; Feinstein, Carl; Ryali, Srikanth; Menon, Vinod

    2013-08-01

    Autism spectrum disorder (ASD) affects 1 in 88 children and is characterized by a complex phenotype, including social, communicative, and sensorimotor deficits. Autism spectrum disorder has been linked with atypical connectivity across multiple brain systems, yet the nature of these differences in young children with the disorder is not well understood. To examine connectivity of large-scale brain networks and determine whether specific networks can distinguish children with ASD from typically developing (TD) children and predict symptom severity in children with ASD. Case-control study performed at Stanford University School of Medicine of 20 children 7 to 12 years old with ASD and 20 age-, sex-, and IQ-matched TD children. Between-group differences in intrinsic functional connectivity of large-scale brain networks, performance of a classifier built to discriminate children with ASD from TD children based on specific brain networks, and correlations between brain networks and core symptoms of ASD. We observed stronger functional connectivity within several large-scale brain networks in children with ASD compared with TD children. This hyperconnectivity in ASD encompassed salience, default mode, frontotemporal, motor, and visual networks. This hyperconnectivity result was replicated in an independent cohort obtained from publicly available databases. Using maps of each individual's salience network, children with ASD could be discriminated from TD children with a classification accuracy of 78%, with 75% sensitivity and 80% specificity. The salience network showed the highest classification accuracy among all networks examined, and the blood oxygen-level dependent signal in this network predicted restricted and repetitive behavior scores. The classifier discriminated ASD from TD in the independent sample with 83% accuracy, 67% sensitivity, and 100% specificity. Salience network hyperconnectivity may be a distinguishing feature in children with ASD. Quantification of brain network connectivity is a step toward developing biomarkers for objectively identifying children with ASD.

  16. Salience Network–Based Classification and Prediction of Symptom Severity in Children With Autism

    PubMed Central

    Uddin, Lucina Q.; Supekar, Kaustubh; Lynch, Charles J.; Khouzam, Amirah; Phillips, Jennifer; Feinstein, Carl; Ryali, Srikanth; Menon, Vinod

    2014-01-01

    IMPORTANCE Autism spectrum disorder (ASD) affects 1 in 88 children and is characterized by a complex phenotype, including social, communicative, and sensorimotor deficits. Autism spectrum disorder has been linked with atypical connectivity across multiple brain systems, yet the nature of these differences in young children with the disorder is not well understood. OBJECTIVES To examine connectivity of large-scale brain networks and determine whether specific networks can distinguish children with ASD from typically developing (TD) children and predict symptom severity in children with ASD. DESIGN, SETTING, AND PARTICIPANTS Case-control study performed at Stanford University School of Medicine of 20 children 7 to 12 years old with ASD and 20 age-, sex-, and IQ-matched TD children. MAIN OUTCOMES AND MEASURES Between-group differences in intrinsic functional connectivity of large-scale brain networks, performance of a classifier built to discriminate children with ASD from TD children based on specific brain networks, and correlations between brain networks and core symptoms of ASD. RESULTS We observed stronger functional connectivity within several large-scale brain networks in children with ASD compared with TD children. This hyperconnectivity in ASD encompassed salience, default mode, frontotemporal, motor, and visual networks. This hyperconnectivity result was replicated in an independent cohort obtained from publicly available databases. Using maps of each individual’s salience network, children with ASD could be discriminated from TD children with a classification accuracy of 78%, with 75% sensitivity and 80% specificity. The salience network showed the highest classification accuracy among all networks examined, and the blood oxygen–level dependent signal in this network predicted restricted and repetitive behavior scores. The classifier discriminated ASD from TD in the independent sample with 83% accuracy, 67% sensitivity, and 100% specificity. CONCLUSIONS AND RELEVANCE Salience network hyperconnectivity may be a distinguishing feature in children with ASD. Quantification of brain network connectivity is a step toward developing biomarkers for objectively identifying children with ASD. PMID:23803651

  17. 40 CFR Table 4 to Subpart Zzzzz of... - Compliance Certifications for New and Existing Affected Sources Classified as Large Iron and...

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Existing Affected Sources Classified as Large Iron and Steel Foundries 4 Table 4 to Subpart ZZZZZ of Part... Emission Standards for Hazardous Air Pollutants for Iron and Steel Foundries Area Sources Pt. 63, Subpt... Affected Sources Classified as Large Iron and Steel Foundries As required by § 63.10900(b), your...

  18. 40 CFR Table 4 to Subpart Zzzzz of... - Compliance Certifications for New and Existing Affected Sources Classified as Large Iron and...

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Existing Affected Sources Classified as Large Iron and Steel Foundries 4 Table 4 to Subpart ZZZZZ of Part... Emission Standards for Hazardous Air Pollutants for Iron and Steel Foundries Area Sources Pt. 63, Subpt... Affected Sources Classified as Large Iron and Steel Foundries As required by § 63.10900(b), your...

  19. 40 CFR Table 4 to Subpart Zzzzz of... - Compliance Certifications for New and Existing Affected Sources Classified as Large Iron and...

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Existing Affected Sources Classified as Large Iron and Steel Foundries 4 Table 4 to Subpart ZZZZZ of Part... Emission Standards for Hazardous Air Pollutants for Iron and Steel Foundries Area Sources Pt. 63, Subpt... Affected Sources Classified as Large Iron and Steel Foundries As required by § 63.10900(b), your...

  20. 40 CFR Table 4 to Subpart Zzzzz of... - Compliance Certifications for New and Existing Affected Sources Classified as Large Iron and...

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Existing Affected Sources Classified as Large Iron and Steel Foundries 4 Table 4 to Subpart ZZZZZ of Part... Emission Standards for Hazardous Air Pollutants for Iron and Steel Foundries Area Sources Pt. 63, Subpt... Affected Sources Classified as Large Iron and Steel Foundries As required by § 63.10900(b), your...

  1. 40 CFR Table 4 to Subpart Zzzzz of... - Compliance Certifications for New and Existing Affected Sources Classified as Large Iron and...

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Existing Affected Sources Classified as Large Iron and Steel Foundries 4 Table 4 to Subpart ZZZZZ of Part... Emission Standards for Hazardous Air Pollutants for Iron and Steel Foundries Area Sources Pt. 63, Subpt... Affected Sources Classified as Large Iron and Steel Foundries As required by § 63.10900(b), your...

  2. Reionization Models Classifier using 21cm Map Deep Learning

    NASA Astrophysics Data System (ADS)

    Hassan, Sultan; Liu, Adrian; Kohn, Saul; Aguirre, James E.; La Plante, Paul; Lidz, Adam

    2018-05-01

    Next-generation 21cm observations will enable imaging of reionization on very large scales. These images will contain more astrophysical and cosmological information than the power spectrum, and hence provide an alternative way to constrain the contribution of different reionizing source populations to cosmic reionization. Using convolutional neural networks, we present a simple network architecture that is sufficient to discriminate between galaxy-dominated and AGN-dominated models, even in the presence of simulated noise from different experiments such as HERA and SKA.

  3. Department of the Navy Supporting Data for Fiscal Year 1983 Budget Estimates Descriptive Summaries Submitted to Congress February 1982. Research, Development, Test and Evaluation, Navy. Book 2 of 3. Tactical Programs.

    DTIC Science & Technology

    1982-02-01

    control unit will detect and classify submerged submarines transiting within PJ The EnCAPsulated TORpedo augments air, surface and submarine anti...vidicon (data link video enhancement). Conduct Operational Test and Evaluation. Complete Large Scale Integration Receiver-Decoder improvement. Continue...analysis, and data link video enhancement focusing on application of a new silicon vidicon was continued; data link improvements such as adaptive null

  4. On Evaluating Brain Tissue Classifiers without a Ground Truth

    PubMed Central

    Martin-Fernandez, Marcos; Ungar, Lida; Nakamura, Motoaki; Koo, Min-Seong; McCarley, Robert W.; Shenton, Martha E.

    2009-01-01

    In this paper, we present a set of techniques for the evaluation of brain tissue classifiers on a large data set of MR images of the head. Due to the difficulty of establishing a gold standard for this type of data, we focus our attention on methods which do not require a ground truth, but instead rely on a common-agreement principle. Three different techniques are presented: the Williams' index, a measure of common agreement; STAPLE, an expectation-maximization algorithm which simultaneously estimates performance parameters and constructs an estimated reference standard; and multidimensional scaling, a visualization technique for exploring similarity data. We apply these evaluation methodologies to a set of eleven different segmentation algorithms on forty MR images. We then validate our evaluation pipeline by building a ground truth based on human expert tracings. The evaluations with and without a ground truth are compared. Our findings show that comparing classifiers without a gold standard can provide a great deal of useful information. In particular, outliers can be easily detected, strongly consistent or highly variable techniques can be readily discriminated, and the overall similarity between different techniques can be assessed. On the other hand, we also find that some information present in the expert segmentations is not captured by the automatic classifiers, suggesting that common agreement alone may not be sufficient for a precise performance evaluation of brain tissue classifiers. PMID:17532646
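
    For concreteness, a sketch of the Williams' index in one common form: the mean agreement of a candidate classifier with a group of classifiers, divided by the mean pairwise agreement within the group (values near or above 1 suggest the candidate agrees with the group as well as its members agree with one another). The voxelwise agreement measure and toy label maps below are illustrative assumptions.

```python
# A hedged sketch of one common form of Williams' index; the agreement
# measure and the synthetic tissue label maps are assumptions.
import numpy as np
from itertools import combinations

def agreement(a, b):
    """Fraction of voxels on which two label maps agree."""
    return np.mean(a == b)

def williams_index(candidate, group):
    n = len(group)
    num = sum(agreement(candidate, g) for g in group) / n
    den = 2 * sum(agreement(g1, g2)
                  for g1, g2 in combinations(group, 2)) / (n * (n - 1))
    return num / den

rng = np.random.default_rng(1)
base = rng.integers(0, 3, size=1000)           # 3 tissue classes, toy data
group = [np.where(rng.random(1000) < 0.1,      # each rater flips ~10% of voxels
                  rng.integers(0, 3, size=1000), base) for _ in range(5)]
print(round(williams_index(group[0], group[1:]), 3))
```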

  5. Tooth labeling in cone-beam CT using deep convolutional neural network for forensic identification

    NASA Astrophysics Data System (ADS)

    Miki, Yuma; Muramatsu, Chisako; Hayashi, Tatsuro; Zhou, Xiangrong; Hara, Takeshi; Katsumata, Akitoshi; Fujita, Hiroshi

    2017-03-01

    In large disasters, dental records play an important role in forensic identification. However, filing dental charts for corpses is not an easy task for general dentists, and it is laborious and time-consuming work in large-scale disasters. We have been investigating a tooth-labeling method on dental cone-beam CT images for the purpose of automatic filing of dental charts. In our method, individual teeth in CT images are detected and classified into seven tooth types using a deep convolutional neural network. We employed a fully convolutional network using the AlexNet architecture for detecting each tooth and applied our previous method using regular AlexNet for classifying the detected teeth into the seven tooth types. From 52 CT volumes obtained by two imaging systems, five images each were randomly selected as test data, and the remaining 42 cases were used as training data. The result showed a tooth detection accuracy of 77.4% with an average of 5.8 false detections per image. This indicates the potential utility of the proposed method for automatic recording of dental information.

  6. Cloud fraction at the ARM SGP site: Reducing uncertainty with self-organizing maps

    DOE PAGES

    Kennedy, Aaron D.; Dong, Xiquan; Xi, Baike

    2015-02-15

    Instrument downtime leads to uncertainty in the monthly and annual record of cloud fraction (CF), making it difficult to perform time series analyses of cloud properties and perform detailed evaluations of model simulations. As cloud occurrence is partially controlled by the large-scale atmospheric environment, this knowledge is used to reduce uncertainties in the instrument record. Synoptic patterns diagnosed from the North American Regional Reanalysis (NARR) during the period 1997–2010 are classified using a competitive neural network known as the self-organizing map (SOM). The classified synoptic states are then compared to the Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) instrument record to determine the expected CF. A number of SOMs are tested to understand how the number of classes and the period of classifications impact the relationship between classified states and CFs. Bootstrapping is utilized to quantify the uncertainty of the instrument record when statistical information from the SOM is included. Although all SOMs significantly reduce the uncertainty of the CF record calculated in Kennedy et al. (Theor Appl Climatol 115:91–105, 2014), SOMs with a large number of classes, separated by month, are required to produce the lowest uncertainty and best agreement with the annual cycle of CF. This result may be due to seasonally dependent biases in NARR.
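
    A self-organizing map of the kind used here is straightforward to sketch: nodes on a 2D grid are pulled competitively toward training patterns, and each pattern is then assigned to its best-matching node (its synoptic class). The following minimal NumPy version uses random vectors as a stand-in for NARR fields; the grid size and decay schedules are illustrative assumptions.

```python
# Minimal SOM sketch in plain NumPy; `patterns` is a hypothetical stand-in
# for flattened gridded reanalysis fields.
import numpy as np

def train_som(patterns, rows=4, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    nodes = rng.normal(size=(rows * cols, patterns.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    n_steps = epochs * len(patterns)
    for t in range(n_steps):
        x = patterns[rng.integers(len(patterns))]
        bmu = np.argmin(((nodes - x) ** 2).sum(axis=1))     # best-matching unit
        frac = t / n_steps
        lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 0.5
        h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
        nodes += lr * h[:, None] * (x - nodes)              # pull nodes toward x
    return nodes

def classify(nodes, patterns):
    """Assign each pattern to its best-matching SOM node (synoptic class)."""
    d = ((patterns[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

patterns = np.random.default_rng(1).normal(size=(500, 64))  # toy fields
nodes = train_som(patterns)
print(np.bincount(classify(nodes, patterns), minlength=len(nodes)))
```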

  7. Convolutional neural networks for transient candidate vetting in large-scale surveys

    NASA Astrophysics Data System (ADS)

    Gieseke, Fabian; Bloemen, Steven; van den Bogaard, Cas; Heskes, Tom; Kindler, Jonas; Scalzo, Richard A.; Ribeiro, Valério A. R. M.; van Roestel, Jan; Groot, Paul J.; Yuan, Fang; Möller, Anais; Tucker, Brad E.

    2017-12-01

    Current synoptic sky surveys monitor large areas of the sky to find variable and transient astronomical sources. As the number of detections per night at a single telescope easily exceeds several thousand, current detection pipelines make intensive use of machine learning algorithms to classify the detected objects and to filter out the most interesting candidates. A number of upcoming surveys will produce up to three orders of magnitude more data, which renders high-precision classification systems essential to reduce the manual and, hence, expensive vetting by human experts. We present an approach based on convolutional neural networks to discriminate between true astrophysical sources and artefacts in reference-subtracted optical images. We show that relatively simple networks are already competitive with state-of-the-art systems and that their quality can be further improved via slightly deeper networks and additional pre-processing steps, eventually yielding models that outperform state-of-the-art systems. In particular, our best model correctly classifies about 97.3 per cent of all 'real' and 99.7 per cent of all 'bogus' instances on a test set containing 1942 'bogus' and 227 'real' instances in total. Furthermore, the networks considered in this work can also successfully classify these objects without relying on difference images, which might pave the way for future detection pipelines that do not contain image subtraction steps at all.
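
    A generic real/bogus network in this spirit might look like the following Keras sketch (a plausible small architecture, not the authors' exact model), taking single-channel image cutouts and emitting a probability that the detection is real; the 30x30 cutout size and layer sizes are assumptions.

```python
# A hedged sketch of a small real/bogus CNN, assuming 30x30 single-channel
# cutouts with binary labels; not the architecture from the paper.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 30, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(detection is real)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(cutouts, labels, validation_split=0.2, epochs=10)  # hypothetical data
```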

  8. Federated learning of predictive models from federated Electronic Health Records.

    PubMed

    Brisimi, Theodora S; Chen, Ruidi; Mela, Theofanie; Olshevsky, Alex; Paschalidis, Ioannis Ch; Shi, Wei

    2018-04-01

    In an era of "big data," computationally efficient and privacy-aware solutions for large-scale machine learning problems become crucial, especially in the healthcare domain, where large amounts of data are stored in different locations and owned by different entities. Past research has focused on centralized algorithms, which assume the existence of a central data repository (database) that stores and can process the data from all participants. Such an architecture, however, can be impractical when data are not centrally located; it does not scale well to very large datasets, and it introduces single-point-of-failure risks which could compromise the integrity and privacy of the data. Given data widely spread across hospitals and individuals, a decentralized, computationally scalable methodology is very much in need. We aim at solving a binary supervised classification problem to predict hospitalizations for cardiac events using a distributed algorithm. We seek to develop a general decentralized optimization framework enabling multiple data holders to collaborate and converge to a common predictive model, without explicitly exchanging raw data. We focus on the soft-margin l1-regularized sparse Support Vector Machine (sSVM) classifier. We develop an iterative cluster Primal Dual Splitting (cPDS) algorithm for solving the large-scale sSVM problem in a decentralized fashion. Such a distributed learning scheme is relevant for multi-institutional collaborations or peer-to-peer applications, allowing the data holders to collaborate while keeping every participant's data private. We test cPDS on the problem of predicting hospitalizations due to heart diseases within a calendar year based on information in the patients' Electronic Health Records prior to that year. cPDS converges faster than centralized methods at the cost of some communication between agents. It also converges faster and with less communication overhead compared to an alternative distributed algorithm. In both cases, it achieves similar prediction accuracy measured by the Area Under the Receiver Operating Characteristic Curve (AUC) of the classifier. We extract important features discovered by the algorithm that are predictive of future hospitalizations, thus providing a way to interpret the classification results and inform prevention efforts. Copyright © 2018 Elsevier B.V. All rights reserved.
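
    The setting can be sketched with a deliberately simplified scheme: each site takes local subgradient steps on the l1-regularized hinge loss and the sites average their parameters each round. This federated-averaging-style toy is not the cPDS algorithm itself, only an illustration of learning a common sSVM without pooling raw data; all data below are synthetic.

```python
# A much-simplified decentralized l1-regularized soft-margin SVM sketch;
# parameter averaging stands in for the paper's cPDS updates.
import numpy as np

def local_step(w, X, y, lam=0.01, lr=0.01):
    """One subgradient step of hinge loss + l1 penalty on one site's data."""
    margins = y * (X @ w)
    mask = margins < 1
    grad = -(y[mask, None] * X[mask]).mean(axis=0) if mask.any() else 0.0
    return w - lr * (grad + lam * np.sign(w))

rng = np.random.default_rng(0)
w_true = rng.normal(size=20)
sites = []                                   # 5 hospitals, private local data
for _ in range(5):
    X = rng.normal(size=(200, 20))
    y = np.sign(X @ w_true + 0.1 * rng.normal(size=200))
    sites.append((X, y))

w = np.zeros(20)
for _ in range(100):                         # communication rounds
    w = np.mean([local_step(w.copy(), X, y) for X, y in sites], axis=0)

acc = np.mean([np.mean(np.sign(X @ w) == y) for X, y in sites])
print(f"mean training accuracy across sites: {acc:.3f}")
```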

  9. Using NLP to identify cancer cases in imaging reports drawn from radiology information systems.

    PubMed

    Patrick, Jon; Asgari, Pooyan; Li, Min; Nguyen, Dung

    2013-01-01

    A natural language processing (NLP) classifier has been developed for the Victorian and NSW Cancer Registries with the purpose of automatically identifying cancer reports from imaging services, transmitting them to the registries and then extracting pertinent cancer information. Large-scale trials conducted on over 40,000 reports show the sensitivity for identifying reportable cancer reports is above 98%, with a specificity above 96%. Detection of tumour stream, report purpose, and a variety of extracted content is generally above 90% specificity. Differences in report layout and authoring strategies across imaging services appear to require different classifiers to retain this high level of accuracy. Linkage of the imaging data with existing registry records (hospital and pathology reports) to derive stage and recurrence of cancer has commenced and shown very promising results.

  10. A semi-automated image analysis procedure for in situ plankton imaging systems.

    PubMed

    Bi, Hongsheng; Guo, Zhenhua; Benfield, Mark C; Fan, Chunlei; Ford, Michael; Shahrestani, Suzan; Sieracki, Jeffery M

    2015-01-01

    Plankton imaging systems are capable of providing fine-scale observations that enhance our understanding of key physical and biological processes. However, processing the large volumes of data collected by imaging systems remains a major obstacle to their deployment, and existing approaches are designed either for images acquired under laboratory-controlled conditions or in clear waters. In the present study, we developed a semi-automated approach to analyze plankton taxa from images acquired by the ZOOplankton VISualization (ZOOVIS) system within turbid estuarine waters in Chesapeake Bay. Compared to images acquired under laboratory-controlled conditions or in clear waters, images from highly turbid waters are often of lower quality and more variable, due to the large number of objects and the nonlinear illumination within each image. We first customized a segmentation procedure to locate objects within each image and extract them for classification. A maximally stable extremal regions algorithm was applied to segment large gelatinous zooplankton, and an adaptive threshold approach was developed to segment small organisms such as copepods. Unlike in existing approaches for images acquired under laboratory-controlled conditions or in clear waters, the non-target objects are often the majority class, and the classification can be treated as a multi-class classification problem. We customized a two-level hierarchical classification procedure using support vector machines to classify the target objects (<5%) and remove the non-target objects (>95%). First, histogram of oriented gradients feature descriptors were constructed for the segmented objects. In the first step, all non-target and target objects were classified into different groups: arrow-like, copepod-like, and gelatinous zooplankton. Each object was then passed to a group-specific classifier to remove most non-target objects. After classification, an expert or non-expert manually removed the non-target objects that could not be removed by the procedure. The procedure was tested on 89,419 images collected in Chesapeake Bay, and results were consistent with visual counts, with >80% accuracy for all three groups.
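
    The overall pipeline shape (MSER for large gelatinous objects, adaptive thresholding for small ones, HOG features, and SVM classification) can be sketched with OpenCV and scikit-learn as below; the window sizes, threshold parameters and synthetic training patches are illustrative assumptions, not the authors' settings.

```python
# A simplified sketch of the pipeline shape, assuming hypothetical data:
# MSER for large blob-like objects, adaptive thresholding for small ones,
# HOG features, and a first-level SVM routing objects to coarse groups
# (a per-group target/non-target SVM would follow in the full pipeline).
import cv2
import numpy as np
from sklearn.svm import SVC

def segment(gray):
    """Return MSER bounding boxes (large objects) and a binary mask (small ones)."""
    mser = cv2.MSER_create()
    _, boxes = mser.detectRegions(gray)
    small = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                  cv2.THRESH_BINARY_INV, 31, 5)
    return boxes, small

hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def features(patch):
    """HOG descriptor of a patch resized to the HOG window size."""
    return hog.compute(cv2.resize(patch, (64, 64))).ravel()

# Level-1 group classifier trained on synthetic stand-in patches.
rng = np.random.default_rng(0)
patches = [rng.integers(0, 256, (80, 60), dtype=np.uint8) for _ in range(20)]
groups = rng.integers(0, 3, 20)     # 0=arrow-like, 1=copepod-like, 2=gelatinous
group_clf = SVC(kernel="linear").fit([features(p) for p in patches], groups)
print(group_clf.predict([features(patches[0])]))
```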

  11. A Semi-Automated Image Analysis Procedure for In Situ Plankton Imaging Systems

    PubMed Central

    Bi, Hongsheng; Guo, Zhenhua; Benfield, Mark C.; Fan, Chunlei; Ford, Michael; Shahrestani, Suzan; Sieracki, Jeffery M.

    2015-01-01

    Plankton imaging systems are capable of providing fine-scale observations that enhance our understanding of key physical and biological processes. However, processing the large volumes of data collected by imaging systems remains a major obstacle to their deployment, and existing approaches are designed either for images acquired under laboratory-controlled conditions or in clear waters. In the present study, we developed a semi-automated approach to analyze plankton taxa from images acquired by the ZOOplankton VISualization (ZOOVIS) system within turbid estuarine waters in Chesapeake Bay. Compared to images acquired under laboratory-controlled conditions or in clear waters, images from highly turbid waters are often of lower quality and more variable, due to the large number of objects and the nonlinear illumination within each image. We first customized a segmentation procedure to locate objects within each image and extract them for classification. A maximally stable extremal regions algorithm was applied to segment large gelatinous zooplankton, and an adaptive threshold approach was developed to segment small organisms such as copepods. Unlike in existing approaches for images acquired under laboratory-controlled conditions or in clear waters, the non-target objects are often the majority class, and the classification can be treated as a multi-class classification problem. We customized a two-level hierarchical classification procedure using support vector machines to classify the target objects (<5%) and remove the non-target objects (>95%). First, histogram of oriented gradients feature descriptors were constructed for the segmented objects. In the first step, all non-target and target objects were classified into different groups: arrow-like, copepod-like, and gelatinous zooplankton. Each object was then passed to a group-specific classifier to remove most non-target objects. After classification, an expert or non-expert manually removed the non-target objects that could not be removed by the procedure. The procedure was tested on 89,419 images collected in Chesapeake Bay, and results were consistent with visual counts, with >80% accuracy for all three groups. PMID:26010260

  12. How do we diagnose and treat epilepsy with myoclonic-atonic seizures (Doose syndrome)? Results of the Pediatric Epilepsy Research Consortium survey.

    PubMed

    Nickels, Katherine; Thibert, Ronald; Rau, Stephanie; Demarest, Scott; Wirrell, Elaine; Kossoff, Eric H; Joshi, Charuta; Nangia, Srishti; Shellhaas, Renee

    2018-04-25

    To obtain and assess opinions on diagnostic criteria, recommended investigations, and therapeutic options for epilepsy with myoclonic-atonic seizures (EMAS) from a large group of physicians who care for children with EMAS. The EMAS focus group of PERC created a survey to assess the opinions of pediatric neurologists who care for children with EMAS regarding diagnosis and treatment of this condition; it was sent to members of PERC, the American Epilepsy Society (AES), and the Child Neurology Society (CNS). A Likert scale was used to assess the respondents' opinions on the importance of diagnostic and exclusion criteria (five-point scale), investigations (four-point scale), and treatment (six-point scale) for EMAS. Inclusion/exclusion criteria were then classified as critical, strong, or modest; investigations as essential, recommended, or possible; and therapies as first line, beneficial, of indeterminate benefit, or contraindicated. Survey results from the 76 participants determined the following. EMAS inclusion criteria: history suggestive of myoclonic-atonic seizures (MAS) (critical); recorded or home video suggestive of MAS, generalized discharges on inter-ictal EEG, normal neuroimaging, and normal development prior to seizure onset (strong). EMAS exclusionary criteria: epileptic spasms, abnormal neuroimaging, focal abnormal exam, seizure onset six years (strong). Investigations: EEG and MRI (essential); amino acids, organic acids, fatty acid/acylcarnitine profile, microarray, genetic panel, lactate/pyruvate, and CSF and serum glucose/lactate (strong). Treatments: valproic acid (first line); topiramate, zonisamide, levetiracetam, benzodiazepines, and dietary therapies (beneficial). To date, no similar surveys have been published, even though early syndrome identification and initiation of effective treatment have been associated with improved outcome in EMAS. Medications that exacerbate seizures in EMAS have also been identified. This survey identified critical and preferred diagnostic electroclinical features, investigations, and treatments for EMAS. It will guide future research and is a crucial first step in defining specific diagnostic criteria, recommended evaluation, and the most effective therapies for EMAS. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Transferring genomics to the clinic: distinguishing Burkitt and diffuse large B cell lymphomas.

    PubMed

    Sha, Chulin; Barrans, Sharon; Care, Matthew A; Cunningham, David; Tooze, Reuben M; Jack, Andrew; Westhead, David R

    2015-01-01

    Classifiers based on molecular criteria such as gene expression signatures have been developed to distinguish Burkitt lymphoma and diffuse large B cell lymphoma, which helps to explore the intermediate cases where traditional diagnosis is difficult. Transfer of these research classifiers into a clinical setting is challenging because there are competing classifiers in the literature based on different methodologies and gene sets, with no clear best choice; classifiers based on one expression measurement platform may not transfer effectively to another; and classifiers developed using fresh-frozen samples may not work effectively with the commonly used, and more convenient, formalin-fixed paraffin-embedded samples used in routine diagnosis. Here we thoroughly compared two published high-profile classifiers developed on data from different Affymetrix array platforms and fresh-frozen tissue, examining their transferability and concordance. Based on this analysis, a new Burkitt and diffuse large B cell lymphoma classifier (BDC) was developed and applied to Illumina DASL data from our own paraffin-embedded samples, allowing comparison with the diagnosis made in a central haematopathology laboratory and evaluation of clinical relevance. We show that both previous classifiers can be recapitulated using much smaller gene sets than originally employed, and that the classification result depends closely on the Burkitt lymphoma criteria applied in the training set. The BDC classification on our data exhibits high agreement (~95%) with the original diagnosis. A simple outcome comparison in the patients presenting intermediate features on conventional criteria suggests that the cases classified as Burkitt lymphoma by BDC have a worse response to standard diffuse large B cell lymphoma treatment than those classified as diffuse large B cell lymphoma. In this study, we comprehensively investigate two previous Burkitt lymphoma molecular classifiers and implement a new gene expression classifier, BDC, that works effectively on paraffin-embedded samples and provides useful information for treatment decisions. The classifier is available as a free software package under the GNU public licence within the R statistical software environment through the link http://www.bioinformatics.leeds.ac.uk/labpages/softwares/ or on github https://github.com/Sharlene/BDC.

  14. Functional Interaction Network Construction and Analysis for Disease Discovery.

    PubMed

    Wu, Guanming; Haw, Robin

    2017-01-01

    Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of the data by means of network modules, and increasing statistical power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of human genes, and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures by which this functional interaction network is constructed: integrating multiple external data sources, extracting functional interactions from human-curated pathway databases, building a machine learning classifier (a naïve Bayes classifier), predicting interactions with the trained classifier, and finally constructing the functional interaction database. We also provide an example of how to use ReactomeFIViz to perform network-based data analysis for a list of genes.
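
    The prediction step reduces to a standard naive Bayes computation over binary evidence features for each protein pair. A minimal sketch with hypothetical feature names and synthetic data:

```python
# A hedged sketch of naive Bayes interaction prediction; the evidence
# features and the data are hypothetical, not Reactome's actual inputs.
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.default_rng(0)
# rows = protein pairs; columns = binary evidence sources
X = rng.integers(0, 2, size=(1000, 4))  # [coexpressed, domain_int, shared_GO, ppi_in_fly]
y = (X.sum(axis=1) + rng.random(1000) > 2.5).astype(int)  # 1 = functional interaction

clf = BernoulliNB().fit(X, y)
print(clf.predict_proba([[1, 1, 0, 1]])[:, 1])  # P(interaction | evidence)
```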

  15. Using Satellite Imagery to Assess Large-Scale Habitat Characteristics of Adirondack Park, New York, USA

    NASA Astrophysics Data System (ADS)

    McClain, Bobbi J.; Porter, William F.

    2000-11-01

    Satellite imagery is a useful tool for large-scale habitat analysis; however, its limitations need to be tested. We tested these limitations by varying the methods of a habitat evaluation for white-tailed deer (Odocoileus virginianus) in the Adirondack Park, New York, USA, utilizing harvest data to create and validate the assessment models. We used two classified images, one with a large minimum mapping unit but high accuracy and one with no minimum mapping unit but slightly lower accuracy, to test the sensitivity of the evaluation to these differences. We tested the utility of two methods of assessment: habitat suitability index modeling and pattern recognition modeling. We varied the scale at which the models were applied by using five separate sizes of analysis windows. Results showed that the presence of a large minimum mapping unit eliminates important details of the habitat. Window size is relatively unimportant if the data are averaged to a large resolution (i.e., township), but if the data are used at the smaller resolution, then the window size is an important consideration. In the Adirondacks, the proportion of hardwood and softwood in an area is most important to the spatial dynamics of deer populations. The low occurrence of open area in all parts of the park either limits the effect of this cover type on the population or limits our ability to detect the effect. The arrangement and interspersion of cover types were not significant to deer populations.

  16. Semi-automatic ground truth generation using unsupervised clustering and limited manual labeling: Application to handwritten character recognition

    PubMed Central

    Vajda, Szilárd; Rangoni, Yves; Cecotti, Hubert

    2015-01-01

    For training supervised classifiers to recognize different patterns, large data collections with accurate labels are necessary. In this paper, we propose a generic, semi-automatic labeling technique for large handwritten character collections. In order to speed up the creation of a large-scale ground truth, the method combines unsupervised clustering with minimal expert knowledge. To exploit potential discriminant complementarities across features, each character is projected into five different feature spaces. After clustering the images in each feature space, the human expert labels only the cluster centers; each data point then inherits the label of its cluster's center. A majority (or unanimity) vote decides the label of each character image. The amount of human involvement (labeling) is strictly controlled by the number of clusters produced by the chosen clustering approach. To test the efficiency of the proposed approach, we compared and evaluated three state-of-the-art clustering methods (k-means, self-organizing maps, and growing neural gas) on the MNIST digit data set and a Lampung Indonesian character data set, respectively. Considering a k-NN classifier, we show that manually labeling only 1.3% (MNIST) and 3.2% (Lampung) of the training data provides the same range of performance as a completely labeled data set would. PMID:25870463
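
    The core labeling trick is easy to sketch for a single feature space: cluster the unlabeled images, let an expert label only the point nearest each cluster center, and propagate that label to all cluster members (the paper then votes across five such feature spaces). In the sketch below, the expert is simulated with the known digit labels; the cluster count is an illustrative assumption.

```python
# A hedged sketch of cluster-center labeling on the sklearn digits data;
# the "expert" is simulated with the true labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits

X, y_true = load_digits(return_X_y=True)
km = KMeans(n_clusters=50, n_init=10, random_state=0).fit(X)

# "Expert" labels the image nearest each center (simulated with true labels).
center_labels = np.empty(50, dtype=int)
for c in range(50):
    members = np.where(km.labels_ == c)[0]
    d = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
    center_labels[c] = y_true[members[d.argmin()]]

y_semi = center_labels[km.labels_]      # every point inherits its center's label
print(f"labels provided: 50/{len(X)}; agreement with truth: "
      f"{(y_semi == y_true).mean():.2%}")
```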

  17. The Achievement Motivation-Performance Relationship as Moderated by Sex-Role Attitudes

    ERIC Educational Resources Information Center

    Thurber, Steven

    1976-01-01

    The moderating effect of sex-role attitudes in relation to the predictive validity of Mehrabian's achievement tendency scale for females is examined. The scale predicts better for academic achievement with females classified as non-traditional in sex-role orientation, and in social achievement for females classified as traditional. (Author/JKS)

  18. Large-scale gene function analysis with the PANTHER classification system.

    PubMed

    Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

    2013-08-01

    The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.

  19. Automatic scoring of dicentric chromosomes as a tool in large scale radiation accidents.

    PubMed

    Romm, H; Ainsbury, E; Barnard, S; Barrios, L; Barquinero, J F; Beinke, C; Deperas, M; Gregoire, E; Koivistoinen, A; Lindholm, C; Moquet, J; Oestreicher, U; Puig, R; Rothkamm, K; Sommer, S; Thierens, H; Vandersickel, V; Vral, A; Wojcik, A

    2013-08-30

    Mass casualty scenarios of radiation exposure require high-throughput biological dosimetry techniques for population triage in order to rapidly identify individuals who require clinical treatment. The manual dicentric assay is a highly suitable technique, but it is also very time consuming and requires well-trained scorers. In the framework of the MULTIBIODOSE EU FP7 project, semi-automated dicentric scoring has been established in six European biodosimetry laboratories. Whole blood was irradiated with a Co-60 gamma source, resulting in 8 different doses between 0 and 4.5 Gy, and then shipped to the six participating laboratories. To investigate two different scoring strategies, cell cultures were set up with short-term (2-3 h) or long-term (24 h) colcemid treatment. Three classifiers for automatic dicentric detection were applied, two of which were developed specifically for these two different culture techniques. The automation procedure included metaphase finding, capture of cells at high resolution and detection of dicentric candidates. The automatically detected dicentric candidates were then evaluated by a trained human scorer, which led to the term 'semi-automated' being applied to the analysis. The six participating laboratories each established at least one semi-automated calibration curve, using the appropriate classifier for their colcemid treatment time. There was no significant difference between the calibration curves established, regardless of the classifier used. The ratio of false positive to true positive dicentric candidates was dose dependent. The total staff effort required for analysing 150 metaphases using the semi-automated approach was 2 min, as opposed to 60 min for manual scoring of 50 metaphases. Semi-automated dicentric scoring is a useful tool in a large-scale radiation accident as it enables high-throughput screening of samples for fast triage of potentially exposed individuals. Furthermore, the results from the participating laboratories were comparable, which supports networking between laboratories for this assay. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. Depression and fatigue in patients with multiple sclerosis.

    PubMed

    Greeke, Emily E; Chua, Alicia S; Healy, Brian C; Rintell, David J; Chitnis, Tanuja; Glanz, Bonnie I

    2017-09-15

    Previous research has examined the components of depression and fatigue in multiple sclerosis (MS), but the findings have been inconsistent. The aim of this study was to explore the associations between overall and subscale scores of the Center for Epidemiologic Studies-Depression Scale (CES-D) and the Modified Fatigue Impact Scale (MFIS), as well as the longitudinal changes in scores, in a large cohort of MS patients. MS subjects who completed a battery of patient-reported outcome (PRO) measures including the CES-D and MFIS (N=435) were included in our analysis. At the first available MFIS measurement, Pearson's correlation coefficient was used to estimate the association between the CES-D and MFIS in terms of both total scores and subscale scores. In addition, the longitudinal change in each total score and subscale score was estimated using a linear mixed model, and the association between the measures in terms of longitudinal change was estimated using Pearson's correlation coefficient and linear mixed models. At baseline, 15% of subjects were classified as high on both depression and fatigue scales, 16% were classified as high on the fatigue scale only, and 9% were classified as high on the depression scale only. There was a high correlation between CES-D and MFIS total scores (r=0.62). High correlations were also observed between the somatic and retarded activity subscales of the CES-D and each of the MFIS subscales (r≥0.60). In terms of longitudinal change, the change over the first year between the CES-D and MFIS total scores showed a moderate correlation (r=0.49). Subjects with high fatigue scores but low depression scores at baseline were more likely than subjects with low baseline fatigue and depression scores to develop high depression scores at follow-up. Our study demonstrated that depression and fatigue in MS share several features and have a similar longitudinal course. However, when depression and fatigue were defined using cut-off scores, our study also found that non-depressed subjects with high fatigue may be at greater risk for developing depression. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Coal resources, reserves and peak coal production in the United States

    USGS Publications Warehouse

    Milici, Robert C.; Flores, Romeo M.; Stricker, Gary D.

    2013-01-01

    In spite of the country's large endowment of coal resources, recent studies have indicated that United States coal production is destined to reach a maximum and begin an irreversible decline sometime during the middle of the current century. However, studies and assessments of the coal reserve data essential for making accurate forecasts of United States coal production have not been compiled on a national basis, so there is a great deal of uncertainty in the accuracy of the production forecasts. A very large percentage of the coal mined in the United States comes from a few large-scale mines (mega-mines) in the Powder River Basin of Wyoming and Montana. Reported reserves at these mines do not account for future potential reserves or for future developments in technology that may turn coal currently classified as resources into reserves. In order to maintain United States coal production at or near current levels for an extended period of time, existing mines will eventually have to increase their recoverable reserves and/or new large-scale mines will have to be opened elsewhere. Accordingly, in order to facilitate energy planning for the United States, this paper suggests that probabilistic assessments of the remaining coal reserves in the country would improve long-range forecasts of coal production. As in the United States coal assessment projects currently being conducted, a major priority of probabilistic assessments would be to identify the numbers and sizes of remaining large blocks of coal capable of supporting large-scale mining operations for extended periods of time, and to conduct economic evaluations of those resources.

  2. Detection of right-to-left shunts: comparison between the International Consensus and Spencer Logarithmic Scale criteria.

    PubMed

    Lao, Annabelle Y; Sharma, Vijay K; Tsivgoulis, Georgios; Frey, James L; Malkoff, Marc D; Navarro, Jose C; Alexandrov, Andrei V

    2008-10-01

    International Consensus Criteria (ICC) consider a right-to-left shunt (RLS) present when Transcranial Doppler (TCD) detects even one microbubble (microB). The Spencer Logarithmic Scale (SLS) offers more grades of RLS, with detection of >30 microB corresponding to a large shunt. We compared the yield of the ICC and SLS in the detection and quantification of a large RLS. We prospectively evaluated paradoxical embolism in consecutive patients with ischemic stroke or transient ischemic attack (TIA) using injections of 9 cc saline agitated with 1 cc of air. Results were classified according to ICC [negative (no microB), grade I (1–20 microB), grade II (>20 microB or "shower" appearance of microB), and grade III ("curtain" appearance of microB)] and SLS criteria [negative (no microB), grade I (1–10 microB), grade II (11–30 microB), grade III (31–100 microB), grade IV (101–300 microB), grade V (>300 microB)]. The RLS size was defined as large (>4 mm) using diameter measurements of the septal defects on transesophageal echocardiography (TEE). TCD comparison to TEE showed 24 true positive, 48 true negative, 4 false positive, and 2 false negative cases (sensitivity 92.3%, specificity 92.3%, positive predictive value (PPV) 85.7%, negative predictive value (NPV) 96%, and accuracy 92.3%) for any RLS presence. Both the ICC and SLS were 100% sensitive for detection of a large RLS. The ICC and SLS criteria yielded false positive rates of 24.4% and 7.7%, respectively, when compared to TEE. Although both grading scales provide agreement as to any shunt presence, using Spencer Scale grade III or higher can halve the number of false positive TCD diagnoses in predicting a large RLS on TEE.

  3. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    PubMed Central

    Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) is combined in a supervised pairwise ortholog detection approach to improve effectiveness, taking into account the low ratio of orthologs to the possible pairwise comparisons between two genomes. In this scenario, big data supervised classifiers that manage the imbalance between ortholog and non-ortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with the RBH, RSD, and OMA algorithms using the following yeast genome pairs as benchmark datasets: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe. Because of the large amount of imbalanced data, building and testing the supervised model was only possible with big data supervised classifiers that manage imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337
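
    A single-machine analogue of the imbalance handling is random oversampling of the minority (ortholog) class before fitting a linear SVM; the synthetic gene-pair features below stand in for the alignment-based features, and this sketch omits the MapReduce/Spark distribution entirely.

```python
# A hedged, single-machine sketch of oversampling + linear SVM on synthetic
# imbalanced data; not the paper's MapReduce/Spark implementation.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 6))               # gene-pair similarity features
y = (X[:, 0] + X[:, 1] > 2.5).astype(int)     # rare "ortholog" class (a few %)

minority = np.where(y == 1)[0]                # randomly duplicate minority rows
extra = rng.choice(minority, size=(y == 0).sum() - len(minority))
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

clf = LinearSVC().fit(X_bal, y_bal)
print("F1 on original (imbalanced) data:", round(f1_score(y, clf.predict(X)), 3))
```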

  4. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    NASA Astrophysics Data System (ADS)

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
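
    A serial miniature of the Adaboost-BP idea can be written down directly. The sketch below boosts 15 small MLPs by weighted resampling (binary labels assumed in {-1, +1}); the paper's MapReduce parallelisation, feature extraction, and exact network configuration are omitted:

      import numpy as np
      from sklearn.neural_network import MLPClassifier

      def adaboost_bp(X, y, rounds=15, seed=0):
          rng = np.random.default_rng(seed)
          n = len(y)
          w = np.full(n, 1.0 / n)
          learners, alphas = [], []
          for _ in range(rounds):
              # MLPClassifier has no sample_weight, so emulate the Adaboost
              # weighting by resampling the training set according to w.
              idx = rng.choice(n, size=n, p=w)
              net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300)
              net.fit(X[idx], y[idx])
              pred = net.predict(X)
              err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
              alpha = 0.5 * np.log((1 - err) / err)
              w *= np.exp(-alpha * np.where(pred == y, 1.0, -1.0))
              w /= w.sum()
              learners.append(net)
              alphas.append(alpha)
          return learners, alphas

      def strong_predict(learners, alphas, X):
          # Weighted vote of the weak networks.
          scores = sum(a * l.predict(X) for l, a in zip(learners, alphas))
          return np.sign(scores)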

  5. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  6. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    PubMed Central

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-01-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520

  7. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

    PubMed

    Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.

  8. Breast tissue classification in digital tomosynthesis images based on global gradient minimization and texture features

    NASA Astrophysics Data System (ADS)

    Qin, Xulei; Lu, Guolan; Sechopoulos, Ioannis; Fei, Baowei

    2014-03-01

    Digital breast tomosynthesis (DBT) is a pseudo-three-dimensional x-ray imaging modality proposed to decrease the effect of tissue superposition present in mammography, potentially resulting in an increase in clinical performance for the detection and diagnosis of breast cancer. Tissue classification in DBT images can be useful in risk assessment, computer-aided detection and radiation dosimetry, among other aspects. However, classifying breast tissue in DBT is a challenging problem because DBT images include complicated structures, image noise, and out-of-plane artifacts due to limited angular tomographic sampling. In this project, we propose an automatic method to classify fatty and glandular tissue in DBT images. First, the DBT images are pre-processed to enhance the tissue structures and to decrease image noise and artifacts. Second, a global smooth filter based on L0 gradient minimization is applied to eliminate detailed structures and enhance large-scale ones. Third, the similar structure regions are extracted and labeled by fuzzy C-means (FCM) classification. At the same time, the texture features are also calculated. Finally, each region is classified into different tissue types based on both intensity and texture features. The proposed method is validated using five patient DBT images using manual segmentation as the gold standard. The Dice scores and the confusion matrix are utilized to evaluate the classified results. The evaluation results demonstrated the feasibility of the proposed method for classifying breast glandular and fat tissue on DBT images.
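
    The fuzzy C-means step used above to group similar-structure regions is compact enough to state in full. A minimal NumPy sketch, where X is an (n_pixels, n_features) array of intensity and texture features (our naming, not the paper's):

      import numpy as np

      def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
          rng = np.random.default_rng(seed)
          u = rng.random((c, len(X)))
          u /= u.sum(axis=0)                       # memberships sum to 1 per pixel
          for _ in range(iters):
              um = u ** m
              centers = um @ X / um.sum(axis=1, keepdims=True)
              d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
              d = np.fmax(d, 1e-10)                # avoid division by zero
              u = 1.0 / (d ** (2 / (m - 1)))
              u /= u.sum(axis=0)                   # renormalise memberships
          return centers, u                        # hard labels: u.argmax(axis=0)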

  9. A Framework of Simple Event Detection in Surveillance Video

    NASA Astrophysics Data System (ADS)

    Xu, Weiguang; Zhang, Yafei; Lu, Jianjiang; Tian, Yulong; Wang, Jiabao

    Video surveillance is playing an increasingly important role in people's social life. Real-time alerting of threatening events and searching for interesting content in large volumes of stored video footage require a human operator to pay full attention to a monitor for long periods. This labor-intensive mode limits the effectiveness and efficiency of such systems. A framework for simple event detection is presented to advance the automation of video surveillance. An improved inner-key-point matching approach compensates for background motion in real time; frame differencing detects the foreground; HOG-based classifiers classify foreground objects into people and cars; and mean-shift tracks the recognized objects. Events are detected based on predefined rules. The maturity of the algorithms guarantees the robustness of the framework, and the improved approach and the easily checked rules enable the framework to work in real time. Future work is also discussed.
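
    For the HOG-based people classifier, OpenCV ships a stock HOG descriptor with a pretrained linear-SVM pedestrian detector; a sketch of that off-the-shelf component (the input file name is hypothetical):

      import cv2

      hog = cv2.HOGDescriptor()
      hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

      cap = cv2.VideoCapture("surveillance.avi")    # hypothetical input
      while True:
          ok, frame = cap.read()
          if not ok:
              break
          boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8))
          for (x, y, w, h) in boxes:
              cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
          cv2.imshow("people", frame)
          if cv2.waitKey(1) == 27:                  # Esc quits
              break
      cap.release()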

  10. Influenza antiviral therapeutics.

    PubMed

    Mayburd, Anatoly L

    2010-01-01

    In this review we conducted a landscaping study of the therapeutic anti-influenza agents, limiting the scope by exclusion of vaccines. The resulting 2800 patent publications were classified into 23 distinct technological sectors. The mechanism of action, the promise and drawbacks of the corresponding technological sectors were explored on comparative basis. A set of quantitative parameters was defined based on landscaping procedure that appears to correlate with the practical success of a given class of therapeutics. Thus, the sectors not considered promising from the mechanistic side were also displaying low value of the classifying parameters. The parameters were combined into a probabilistic Marketing Prediction Score, assessing a likelihood of a given sector to produce a marketable product. The proposed analytical methodology may be useful for automatic search and assessment of technologies for the goals of acquisition, investment and competitive bidding. While not being a substitute for an expert evaluation, it provides an initial assessment suitable for implementation with large-scale automated landscaping.

  11. Early sinkhole detection using a drone-based thermal camera and image processing

    NASA Astrophysics Data System (ADS)

    Lee, Eun Ju; Shin, Sang Young; Ko, Byoung Chul; Chang, Chunho

    2016-09-01

    Accurate advance detection of the sinkholes that are now occurring more frequently is an important way of preventing human fatalities and property damage. Unlike naturally occurring sinkholes, human-induced ones in urban areas are typically due to groundwater disturbances and leaks of water and sewage caused by large-scale construction. Although many sinkhole detection methods have been developed, it is still difficult to predict sinkholes that develop at depth. In addition, conventional methods are inappropriate for scanning a large area because of their high cost. Therefore, this paper uses a drone combined with a thermal far-infrared (FIR) camera to detect potential sinkholes over a large area based on computer vision and pattern classification techniques. To make a standard dataset, we dug eight holes of depths 0.5-2 m in increments of 0.5 m and with a maximum width of 1 m. We filmed these using the drone-based FIR camera at a height of 50 m. We first detect candidate regions by analysing cold spots in the thermal images, based on the fact that a sinkhole typically has a lower thermal energy than its background. Then, these regions are classified into sinkhole and non-sinkhole classes using a pattern classifier. In this study, we ensemble the classification results of a light convolutional neural network (CNN) and those of a Boosted Random Forest (BRF) with handcrafted features. We apply the proposed ensemble method successfully to sinkhole data of various sizes and depths in different environments, and show that the CNN ensemble and the BRF one with handcrafted features detect sinkholes better than other classifiers or a standalone CNN.

  12. Three-dimensional displays for natural hazards analysis, using classified Landsat Thematic Mapper digital data and large-scale digital elevation models

    NASA Technical Reports Server (NTRS)

    Butler, David R.; Walsh, Stephen J.; Brown, Daniel G.

    1991-01-01

    Methods are described for using Landsat Thematic Mapper digital data and digital elevation models for the display of natural hazard sites in a mountainous region of northwestern Montana, USA. Hazard zones can be easily identified on the three-dimensional images. Proximity of facilities such as highways and building locations to hazard sites can also be easily displayed. A temporal sequence of Landsat TM (or similar) satellite data sets could also be used to display landscape changes associated with dynamic natural hazard processes.

  13. Structure identification methods for atomistic simulations of crystalline materials

    DOE PAGES

    Stukowski, Alexander

    2012-05-28

    Here, we discuss existing and new computational analysis techniques to classify local atomic arrangements in large-scale atomistic computer simulations of crystalline solids. This article includes a performance comparison of typical analysis algorithms such as common neighbor analysis (CNA), centrosymmetry analysis, bond angle analysis, bond order analysis and Voronoi analysis. In addition we propose a simple extension to the CNA method that makes it suitable for multi-phase systems. Finally, we introduce a new structure identification algorithm, the neighbor distance analysis, which is designed to identify atomic structure units in grain boundaries.

  14. Analysis of hydrological features of portions of the Lake Ontario basin using Skylab and aircraft data

    NASA Technical Reports Server (NTRS)

    Polcyn, F. C. (Principal Investigator); Rebel, D. L.; Colwell, J. E.

    1976-01-01

    The author has identified the following significant results. S190A and S190B photography proved to be useful for mapping large scale geomorphological features, and for assessing water depth and water quality. Available S192 data were affected by low frequency noise caused by diode light. Hydrological features were classified, and upland green herbaceous vegetation was separated into several classes based on percent vegetation cover. A model for estimating surface soil moisture based on red and near infrared reflectance data was developed and subsequently implemented.

  15. Deep convolutional neural network based antenna selection in multiple-input multiple-output system

    NASA Astrophysics Data System (ADS)

    Cai, Jiaxin; Li, Yan; Hu, Ying

    2018-03-01

    Antenna selection in wireless communication systems has attracted increasing attention due to the challenge of balancing communication performance against computational complexity in large-scale Multiple-Input Multiple-Output antenna systems. Recently, deep learning based methods have achieved promising performance for large-scale data processing and analysis in many application fields. This paper is the first attempt to introduce the deep learning technique into the field of Multiple-Input Multiple-Output antenna selection in wireless communications. First, the labels of the attenuation-coefficient channel matrices are generated by minimizing the key performance indicator of the training antenna systems. Then, a deep convolutional neural network that explicitly exploits the massive latent cues in the attenuation coefficients is learned on the training antenna systems. Finally, we use the trained deep convolutional neural network to classify the channel matrix labels of test antennas and select the optimal antenna subset. Simulation results demonstrate that our method achieves better performance than the state-of-the-art baselines for data-driven wireless antenna selection.

  16. An integration of minimum local feature representation methods to recognize large variation of foods

    NASA Astrophysics Data System (ADS)

    Razali, Mohd Norhisham bin; Manshor, Noridayu; Halin, Alfian Abdul; Mustapha, Norwati; Yaakob, Razali

    2017-10-01

    Local invariant features have shown to be successful in describing object appearances for image classification tasks. Such features are robust towards occlusion and clutter and are also invariant against scale and orientation changes. This makes them suitable for classification tasks with little inter-class similarity and large intra-class difference. In this paper, we propose an integrated representation of the Speeded-Up Robust Feature (SURF) and Scale Invariant Feature Transform (SIFT) descriptors, using late fusion strategy. The proposed representation is used for food recognition from a dataset of food images with complex appearance variations. The Bag of Features (BOF) approach is employed to enhance the discriminative ability of the local features. Firstly, the individual local features are extracted to construct two kinds of visual vocabularies, representing SURF and SIFT. The visual vocabularies are then concatenated and fed into a Linear Support Vector Machine (SVM) to classify the respective food categories. Experimental results demonstrate impressive overall recognition at 82.38% classification accuracy based on the challenging UEC-Food100 dataset.
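
    A bag-of-features pipeline of the kind described can be sketched with OpenCV and scikit-learn. The snippet shows only the SIFT half of the fusion (SURF lives in opencv-contrib's non-free module); the vocabulary size and data names are chosen for illustration:

      import cv2
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.svm import LinearSVC

      sift = cv2.SIFT_create()

      def sift_descriptors(path):
          img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
          _, desc = sift.detectAndCompute(img, None)
          return desc if desc is not None else np.empty((0, 128), np.float32)

      def bof_histogram(desc, vocab):
          # Quantise descriptors to visual words, then L1-normalise counts.
          words = vocab.predict(desc)
          hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
          return hist / max(hist.sum(), 1.0)

      # paths, labels: food-image files and their class ids (assumed given)
      # vocab = KMeans(n_clusters=500).fit(np.vstack([sift_descriptors(p) for p in paths]))
      # X = np.array([bof_histogram(sift_descriptors(p), vocab) for p in paths])
      # clf = LinearSVC().fit(X, labels)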

  17. Large-scale network integration in the human brain tracks temporal fluctuations in memory encoding performance.

    PubMed

    Keerativittayayut, Ruedeerat; Aoki, Ryuta; Sarabi, Mitra Taghizadeh; Jimura, Koji; Nakahara, Kiyoshi

    2018-06-18

    Although activation/deactivation of specific brain regions have been shown to be predictive of successful memory encoding, the relationship between time-varying large-scale brain networks and fluctuations of memory encoding performance remains unclear. Here we investigated time-varying functional connectivity patterns across the human brain in periods of 30-40 s, which have recently been implicated in various cognitive functions. During functional magnetic resonance imaging, participants performed a memory encoding task, and their performance was assessed with a subsequent surprise memory test. A graph analysis of functional connectivity patterns revealed that increased integration of the subcortical, default-mode, salience, and visual subnetworks with other subnetworks is a hallmark of successful memory encoding. Moreover, multivariate analysis using the graph metrics of integration reliably classified the brain network states into the period of high (vs. low) memory encoding performance. Our findings suggest that a diverse set of brain systems dynamically interact to support successful memory encoding. © 2018, Keerativittayayut et al.

  18. ELUCIDATING BRAIN CONNECTIVITY NETWORKS IN MAJOR DEPRESSIVE DISORDER USING CLASSIFICATION-BASED SCORING.

    PubMed

    Sacchet, Matthew D; Prasad, Gautam; Foland-Ross, Lara C; Thompson, Paul M; Gotlib, Ian H

    2014-04-01

    Graph theory is increasingly used in the field of neuroscience to understand the large-scale network structure of the human brain. There is also considerable interest in applying machine learning techniques in clinical settings, for example, to make diagnoses or predict treatment outcomes. Here we used support-vector machines (SVMs), in conjunction with whole-brain tractography, to identify graph metrics that best differentiate individuals with Major Depressive Disorder (MDD) from nondepressed controls. To do this, we applied a novel feature-scoring procedure that incorporates iterative classifier performance to assess feature robustness. We found that small-worldness, a measure of the balance between global integration and local specialization, most reliably differentiated MDD from nondepressed individuals. Post-hoc regional analyses suggested that heightened connectivity of the subcallosal cingulate gyrus (SCG) in MDDs contributes to these differences. The current study provides a novel way to assess the robustness of classification features and reveals anomalies in large-scale neural networks in MDD.

  19. Cloud fraction at the ARM SGP site: reducing uncertainty with self-organizing maps

    NASA Astrophysics Data System (ADS)

    Kennedy, Aaron D.; Dong, Xiquan; Xi, Baike

    2016-04-01

    Instrument downtime leads to uncertainty in the monthly and annual record of cloud fraction (CF), making it difficult to perform time series analyses of cloud properties and perform detailed evaluations of model simulations. As cloud occurrence is partially controlled by the large-scale atmospheric environment, this knowledge is used to reduce uncertainties in the instrument record. Synoptic patterns diagnosed from the North American Regional Reanalysis (NARR) during the period 1997-2010 are classified using a competitive neural network known as the self-organizing map (SOM). The classified synoptic states are then compared to the Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) instrument record to determine the expected CF. A number of SOMs are tested to understand how the number of classes and the period of classifications impact the relationship between classified states and CFs. Bootstrapping is utilized to quantify the uncertainty of the instrument record when statistical information from the SOM is included. Although all SOMs significantly reduce the uncertainty of the CF record calculated in Kennedy et al. (Theor Appl Climatol 115:91-105, 2014), SOMs with a large number of classes and separated by month are required to produce the lowest uncertainty and best agreement with the annual cycle of CF. This result may be due to a manifestation of seasonally dependent biases in NARR. With use of the SOMs, the average uncertainty in monthly CF is reduced in half from the values calculated in Kennedy et al. (Theor Appl Climatol 115:91-105, 2014).
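
    A self-organizing map of this kind can be sketched with the MiniSom library as a stand-in for the authors' implementation; the input file, grid size, and preprocessing are assumptions:

      import numpy as np
      from minisom import MiniSom

      # patterns: (n_days, n_gridpoints) array of standardised NARR fields
      patterns = np.load("narr_patterns.npy")        # hypothetical file

      som = MiniSom(4, 5, patterns.shape[1], sigma=1.0,
                    learning_rate=0.5, random_seed=0)
      som.train_random(patterns, num_iteration=10000)

      # Map each day to its best-matching node; the node ids act as synoptic
      # classes that can then be paired with observed ARM SGP cloud fractions.
      classes = np.array([som.winner(p) for p in patterns])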

  20. FLARE (Facility for Laboratory Reconnection Experiments): A Major Next-Step for Laboratory Studies of Magnetic Reconnection

    NASA Astrophysics Data System (ADS)

    Ji, H.; Bhattacharjee, A.; Prager, S.; Daughton, W. S.; Bale, S. D.; Carter, T. A.; Crocker, N.; Drake, J. F.; Egedal, J.; Sarff, J.; Wallace, J.; Belova, E.; Ellis, R.; Fox, W. R., II; Heitzenroeder, P.; Kalish, M.; Jara-Almonte, J.; Myers, C. E.; Que, W.; Ren, Y.; Titus, P.; Yamada, M.; Yoo, J.

    2014-12-01

    A new intermediate-scale plasma experiment, called the Facility for Laboratory Reconnection Experiments or FLARE, is under construction at Princeton as a joint project by five universities and two national labs to study magnetic reconnection in regimes directly relevant to space, solar and astrophysical plasmas. The currently existing small-scale experiments have been focusing on the single X-line reconnection process in plasmas either with small effective sizes or at low Lundquist numbers, both of which are typically very large in natural plasmas. These new regimes involve multiple X-lines as guided by a reconnection "phase diagram", in which different coupling mechanisms from the global system scale to the local dissipation scale are classified into different reconnection phases [H. Ji & W. Daughton, Phys. Plasmas 18, 111207 (2011)]. The design of the FLARE device is based on the existing Magnetic Reconnection Experiment (MRX) at Princeton (http://mrx.pppl.gov) and is to provide experimental access to the new phases involving multiple X-lines at large effective sizes and high Lundquist numbers, directly relevant to space and solar plasmas. The motivating major physics questions, the construction status, and the planned collaborative research especially with space and solar research communities will be discussed.

  1. FLARE (Facility for Laboratory Reconnection Experiments): A Major Next-Step for Laboratory Studies of Magnetic Reconnection

    NASA Astrophysics Data System (ADS)

    Ji, Hantao; Bhattacharjee, A.; Prager, S.; Daughton, W.; Bale, Stuart D.; Carter, T.; Crocker, N.; Drake, J.; Egedal, J.; Sarff, J.; Fox, W.; Jara-Almonte, J.; Myers, C.; Ren, Y.; Yamada, M.; Yoo, J.

    2015-04-01

    A new intermediate-scale plasma experiment, called the Facility for Laboratory Reconnection Experiments or FLARE (flare.pppl.gov), is under construction at Princeton as a joint project by five universities and two national labs to study magnetic reconnection in regimes directly relevant to heliophysical and astrophysical plasmas. The currently existing small-scale experiments have been focusing on the single X-line reconnection process in plasmas either with small effective sizes or at low Lundquist numbers, both of which are typically very large in natural plasmas. These new regimes involve multiple X-lines as guided by a reconnection "phase diagram", in which different coupling mechanisms from the global system scale to the local dissipation scale are classified into different reconnection phases [H. Ji & W. Daughton, Phys. Plasmas 18, 111207 (2011)]. The design of the FLARE device is based on the existing Magnetic Reconnection Experiment (MRX) (mrx.pppl.gov) and is to provide experimental access to the new phases involving multiple X-lines at large effective sizes and high Lundquist numbers, directly relevant to magnetospheric, solar wind, and solar coronal plasmas. After a brief summary of recent laboratory results on the topic of magnetic reconnection, the motivating major physics questions, the construction status, and the planned collaborative research especially with heliophysics communities will be discussed.

  2. Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings.

    PubMed

    Nelson, Kerrie P; Mitani, Aya A; Edwards, Don

    2017-09-10

    Widespread inconsistencies are commonly observed between physicians' ordinal classifications in screening test results such as mammography. These discrepancies have motivated large-scale agreement studies where many raters contribute ratings. The primary goal of these studies is to identify factors related to physicians and patients' test results, which may lead to stronger consistency between raters' classifications. While ordered categorical scales are frequently used to classify screening test results, very few statistical approaches exist to model agreement between multiple raters. Here we develop a flexible and comprehensive approach to assess the influence of rater and subject characteristics on agreement between multiple raters' ordinal classifications in large-scale agreement studies. Our approach is based upon the class of generalized linear mixed models. Novel summary model-based measures are proposed to assess agreement between all, or a subgroup of raters, such as experienced physicians. Hypothesis tests are described to formally identify factors such as physicians' level of experience that play an important role in improving consistency of ratings between raters. We demonstrate how unique characteristics of individual raters can be assessed via conditional modes generated during the modeling process. Simulation studies are presented to demonstrate the performance of the proposed methods and summary measure of agreement. The methods are applied to a large-scale mammography agreement study to investigate the effects of rater and patient characteristics on the strength of agreement between radiologists. Copyright © 2017 John Wiley & Sons, Ltd.

  3. Abnormal ranges of vital signs in children in Japanese prehospital settings.

    PubMed

    Nosaka, Nobuyuki; Muguruma, Takashi; Knaup, Emily; Tsukahara, Kohei; Enomoto, Yuki; Kaku, Noriyuki

    2015-10-01

    The revised Fire Service Law obliges each prefectural government in Japan to establish a prehospital acuity scale. The Foundation for Ambulance Service Development (FASD) created an acuity scale for use as a reference. Our preliminary survey revealed that 32 of 47 prefectures directly applied the FASD scale for children. This scale shows abnormal ranges of heart rate and respiratory rate in young children. This study aimed to evaluate the validity of the abnormal ranges on the FASD scale to assess its overall performance for triage purposes in paediatric patients. We evaluated the validity of the ranges by comparing published centile charts for these vital signs with records of 1,296 ambulance patients. A large portion of the abnormal ranges on the scale substantially overlapped with the normal centile charts. Triage decisions using the FASD scale of vital signs properly classified 22% (n = 287) of children. The sensitivity and specificity for high urgency were as high as 91% (95% confidence interval, 82-96%) and as low as 18% (95% confidence interval, 16-20%). We found there is room for improvement of the abnormal ranges on the FASD scale.

  4. Structural Controllability and Controlling Centrality of Temporal Networks

    PubMed Central

    Pan, Yujian; Li, Xiang

    2014-01-01

    Temporal networks are networks where nodes and interactions may appear and disappear at various time scales. With the evidence of the ubiquity of temporal networks in our economy, nature and society, it is urgent and significant to study their structural controllability and its characteristics, which until now has remained an untouched topic. We develop graphic tools to study structural controllability and its characteristics, identifying the intrinsic mechanism of the ability of individuals to control a dynamic and large-scale temporal network. Classifying the temporal trees of a temporal network into different types, we give (both upper and lower) analytical bounds on the controlling centrality, which are verified by numerical simulations of both artificial and empirical temporal networks. We find that the positive relationship between aggregated degree and controlling centrality, as well as the scale-free distribution of a node's controlling centrality, are virtually independent of the time scale and the type of dataset, indicating the inherent robustness and heterogeneity of the controlling centrality of nodes within temporal networks. PMID:24747676

  5. Comparison Analysis among Large Amount of SNS Sites

    NASA Astrophysics Data System (ADS)

    Toriumi, Fujio; Yamamoto, Hitoshi; Suwa, Hirohiko; Okada, Isamu; Izumi, Kiyoshi; Hashimoto, Yasuhiro

    In recent years, Social Networking Services (SNS) and blogs have grown as new communication tools on the Internet. Several large-scale SNS sites are prospering; meanwhile, many sites of relatively small scale are offering services. Such small-scale SNSs support small-group, isolated communication, which neither mixi nor MySpace can offer. However, most studies of SNS concern particular large-scale sites and cannot establish whether their findings reflect general features of SNSs or peculiarities of those sites. From the standpoint of comparative analysis, examining only a handful of SNSs cannot reach a statistically significant conclusion. We therefore analyze many SNS sites with the aim of classifying them. Our paper classifies 50,000 small-scale SNS sites and characterizes them in terms of network structure, patterns of communication, and growth rate. The analysis of network structure shows that many SNS sites have the small-world attribute of short path lengths and high clustering coefficients. The degree distributions of the sites are close to a power law. This result indicates that small-scale SNS sites have a higher percentage of users with many friends than mixi does. The coefficients of assortativity are negative, meaning that users with high degree tend to connect to users with small degree. Next, we analyze patterns of user communication; a concise network-statistics sketch follows this abstract. A friend network in an SNS is explicit, while users' communication behaviors define an implicit network. What kind of relationship do these two networks have? To address this question, we derive characteristics of users' communication structure and activation patterns on the SNS sites. Using two new indexes, the friend aggregation rate and the friend coverage rate, we show that SNS sites with a high friend coverage rate have active diary posting and commenting. Sites with high friend aggregation and high friend coverage become activated when hub users (users with high degree) are not active, whereas sites with low friend aggregation and high friend coverage become activated when hub users are active. Finally, we observe the SNS sites whose user numbers are growing considerably, from the viewpoint of network structure, and extract the characteristics of high-growth sites. Discrimination based on decision tree analysis recognizes high-growth SNS sites with a high degree of accuracy, and suggests that mixi and the small-scale SNS sites have different character traits.
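
    The structural statistics reported above (clustering coefficients, path lengths, degree assortativity) are one-liners in networkx; a sketch for a single SNS friendship graph (the input file is hypothetical):

      import networkx as nx

      G = nx.read_edgelist("sns_friends.edgelist")   # hypothetical input

      avg_clustering = nx.average_clustering(G)
      assortativity = nx.degree_assortativity_coefficient(G)  # negative here
      # Average shortest path length is defined on a connected graph only,
      # so restrict to the giant component.
      giant = G.subgraph(max(nx.connected_components(G), key=len))
      avg_path_len = nx.average_shortest_path_length(giant)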

  6. Assessment of the effects and limitations of the 1998 to 2008 Abbreviated Injury Scale map using a large population-based dataset.

    PubMed

    Palmer, Cameron S; Franklyn, Melanie

    2011-01-07

    Trauma systems should consistently monitor a given trauma population over a period of time. The Abbreviated Injury Scale (AIS) and derived scores such as the Injury Severity Score (ISS) are commonly used to quantify injury severities in trauma registries. To reflect contemporary trauma management and treatment, the most recent version of the AIS (AIS08) contains many codes which differ in severity from their equivalents in the earlier 1998 version (AIS98). Consequently, the adoption of AIS08 may impede comparisons between data coded using different AIS versions. It may also affect the number of patients classified as major trauma. The entire AIS98-coded injury dataset of a large population-based trauma registry was retrieved and mapped to AIS08 using the currently available AIS98-AIS08 dictionary map. The percentage of codes which had increased or decreased in severity, or could not be mapped, was examined in conjunction with the effect of these changes to the calculated ISS. The potential for free text information accompanying AIS coding to improve the quality of AIS mapping was explored. A total of 128280 AIS98-coded injuries were evaluated in 32134 patients, of whom 15471 were classified as major trauma. Although only 4.5% of dictionary codes decreased in severity from AIS98 to AIS08, this represented almost 13% of injuries in the registry. In 4.9% of patients, no injuries could be mapped. ISS was potentially unreliable in one-third of patients, as they had at least one AIS98 code which could not be mapped. Using AIS08, the number of patients classified as major trauma decreased by between 17.3% and 30.3%. Evaluation of free text descriptions for some injuries demonstrated the potential to improve mapping between AIS versions. Converting AIS98-coded data to AIS08 results in a significant decrease in the number of patients classified as major trauma. Many AIS98 codes are missing from the existing AIS map, and across a trauma population the AIS08 dataset estimates which it produces are of insufficient quality to be used in practice. However, it may be possible to improve AIS98 to AIS08 mapping to the point where it is useful to established registries.

  7. Assessment of the effects and limitations of the 1998 to 2008 Abbreviated Injury Scale map using a large population-based dataset

    PubMed Central

    2011-01-01

    Background Trauma systems should consistently monitor a given trauma population over a period of time. The Abbreviated Injury Scale (AIS) and derived scores such as the Injury Severity Score (ISS) are commonly used to quantify injury severities in trauma registries. To reflect contemporary trauma management and treatment, the most recent version of the AIS (AIS08) contains many codes which differ in severity from their equivalents in the earlier 1998 version (AIS98). Consequently, the adoption of AIS08 may impede comparisons between data coded using different AIS versions. It may also affect the number of patients classified as major trauma. Methods The entire AIS98-coded injury dataset of a large population-based trauma registry was retrieved and mapped to AIS08 using the currently available AIS98-AIS08 dictionary map. The percentage of codes which had increased or decreased in severity, or could not be mapped, was examined in conjunction with the effect of these changes to the calculated ISS. The potential for free text information accompanying AIS coding to improve the quality of AIS mapping was explored. Results A total of 128280 AIS98-coded injuries were evaluated in 32134 patients, of whom 15471 were classified as major trauma. Although only 4.5% of dictionary codes decreased in severity from AIS98 to AIS08, this represented almost 13% of injuries in the registry. In 4.9% of patients, no injuries could be mapped. ISS was potentially unreliable in one-third of patients, as they had at least one AIS98 code which could not be mapped. Using AIS08, the number of patients classified as major trauma decreased by between 17.3% and 30.3%. Evaluation of free text descriptions for some injuries demonstrated the potential to improve mapping between AIS versions. Conclusions Converting AIS98-coded data to AIS08 results in a significant decrease in the number of patients classified as major trauma. Many AIS98 codes are missing from the existing AIS map, and across a trauma population the AIS08 dataset estimates which it produces are of insufficient quality to be used in practice. However, it may be possible to improve AIS98 to AIS08 mapping to the point where it is useful to established registries. PMID:21214906

  8. Characterisation of mental health conditions in social media using Informed Deep Learning.

    PubMed

    Gkotsis, George; Oellrich, Anika; Velupillai, Sumithra; Liakata, Maria; Hubbard, Tim J P; Dobson, Richard J B; Dutta, Rina

    2017-03-22

    The number of people affected by mental illness is on the increase and with it the burden on health and social care use, as well as the loss of both productivity and quality-adjusted life-years. Natural language processing of electronic health records is increasingly used to study mental health conditions and risk behaviours on a large scale. However, narrative notes written by clinicians do not capture first-hand the patients' own experiences, and only record cross-sectional, professional impressions at the point of care. Social media platforms have become a source of 'in the moment' daily exchange, with topics including well-being and mental health. In this study, we analysed posts from the social media platform Reddit and developed classifiers to recognise and classify posts related to mental illness according to 11 disorder themes. Using a neural network and deep learning approach, we could automatically recognise mental illness-related posts in our balanced dataset with an accuracy of 91.08% and select the correct theme with a weighted average accuracy of 71.37%. We believe that these results are a first step in developing methods to characterise large amounts of user-generated content that could support content curation and targeted interventions.

  9. Long-term variability of wind patterns at hub-height over Texas

    NASA Astrophysics Data System (ADS)

    Jung, J.; Jeon, W.; Choi, Y.; Souri, A.

    2017-12-01

    Wind energy is receiving more attention because of its environmentally friendly attributes. Texas is a state with a significant installed capacity and number of wind turbines. Wind power generation is strongly affected by wind patterns, so understanding their seasonal and decadal variability is important for long-term power generation from wind turbines. This study focused on trends in wind patterns and their strength at two hub-heights (80 m and 110 m) over 30 years (1986 to 2015). We analyzed only summer data (June to September) because of the concentrated electricity usage in Texas. We extracted hub-height wind data (U and V components) from the three-hourly National Centers for Environmental Prediction-North American Regional Reanalysis (NCEP-NARR) and classified the wind patterns using the nonhierarchical K-means method. Hub-height wind patterns in the summer seasons of 1986 to 2015 were classified into six classes during the day and seven classes at night. Mean wind speed was 4.6 ms-1 during the day and 5.4 ms-1 at night, but showed large variability in time and space. We combined each cluster's frequencies and wind speed tendencies with large-scale atmospheric circulation features and quantified the amount of wind power generation.
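
    The nonhierarchical K-means classification described above can be sketched with scikit-learn; here uv stacks the U and V components of each three-hourly NARR snapshot into one row (the file name and shapes are our assumptions):

      import numpy as np
      from sklearn.cluster import KMeans

      uv = np.load("narr_uv_80m.npy")          # hypothetical (n_times, 2*n_gridpoints)
      km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(uv)

      labels = km.labels_                       # wind-pattern class per time step
      freq = np.bincount(labels) / len(labels)  # class frequencies
      # Mean wind speed per class: split each class-mean row back into its
      # U and V halves and take the gridpoint-wise speed.
      mean_speed = [np.hypot(*np.split(uv[labels == k].mean(axis=0), 2)).mean()
                    for k in range(km.n_clusters)]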

  10. Characterisation of mental health conditions in social media using Informed Deep Learning

    NASA Astrophysics Data System (ADS)

    Gkotsis, George; Oellrich, Anika; Velupillai, Sumithra; Liakata, Maria; Hubbard, Tim J. P.; Dobson, Richard J. B.; Dutta, Rina

    2017-03-01

    The number of people affected by mental illness is on the increase and with it the burden on health and social care use, as well as the loss of both productivity and quality-adjusted life-years. Natural language processing of electronic health records is increasingly used to study mental health conditions and risk behaviours on a large scale. However, narrative notes written by clinicians do not capture first-hand the patients’ own experiences, and only record cross-sectional, professional impressions at the point of care. Social media platforms have become a source of ‘in the moment’ daily exchange, with topics including well-being and mental health. In this study, we analysed posts from the social media platform Reddit and developed classifiers to recognise and classify posts related to mental illness according to 11 disorder themes. Using a neural network and deep learning approach, we could automatically recognise mental illness-related posts in our balanced dataset with an accuracy of 91.08% and select the correct theme with a weighted average accuracy of 71.37%. We believe that these results are a first step in developing methods to characterise large amounts of user-generated content that could support content curation and targeted interventions.

  11. Analysis of transcriptome in hickory (Carya cathayensis), and uncover the dynamics in the hormonal signaling pathway during graft process.

    PubMed

    Qiu, Lingling; Jiang, Bo; Fang, Jia; Shen, Yike; Fang, Zhongxiang; Rm, Saravana Kumar; Yi, Keke; Shen, Chenjia; Yan, Daoliang; Zheng, Bingsong

    2016-11-17

    Hickory (Carya cathayensis), a woody plant with high nutritional and economic value, is widely planted in China. Due to its long juvenile phase, grafting is a useful technique for large-scale cultivation of hickory. To reveal the molecular mechanism during the graft process, we sequenced the transcriptomes of the graft union in hickory. In our study, six RNA-seq libraries yielded a total of 83,676,860 clean short reads comprising 4.19 Gb of sequence data. A large number of differentially expressed genes (DEGs) at three time points during the graft process were identified. In detail, 777 DEGs in the 7 d vs 0 d (day after grafting) comparison were classified into 11 enriched Gene Ontology (GO) categories, and 262 DEGs in the 14 d vs 0 d comparison were classified into 15 enriched GO categories. Furthermore, an overview of the PPI network was constructed by these DEGs. In addition, 20 genes related to the auxin- and cytokinin-signaling pathways were identified, and some were validated by qRT-PCR analysis. Our comprehensive analysis provides basic information on the candidate genes and hormone signaling pathways involved in the graft process in hickory and other woody plants.

  12. Global detection approach for clustered microcalcifications in mammograms using a deep learning network.

    PubMed

    Wang, Juan; Nishikawa, Robert M; Yang, Yongyi

    2017-04-01

    In computerized detection of clustered microcalcifications (MCs) from mammograms, the traditional approach is to apply a pattern detector to locate the presence of individual MCs, which are subsequently grouped into clusters. Such an approach is often susceptible to the occurrence of false positives (FPs) caused by local image patterns that resemble MCs. We investigate the feasibility of a direct detection approach to determining whether an image region contains clustered MCs or not. Toward this goal, we develop a deep convolutional neural network (CNN) as the classifier model to which the input consists of a large image window ([Formula: see text] in size). The multiple layers in the CNN classifier are trained to automatically extract image features relevant to MCs at different spatial scales. In the experiments, we demonstrated this approach on a dataset consisting of both screen-film mammograms and full-field digital mammograms. We evaluated the detection performance both on classifying image regions of clustered MCs using a receiver operating characteristic (ROC) analysis and on detecting clustered MCs from full mammograms by a free-response receiver operating characteristic analysis. For comparison, we also considered a recently developed MC detector with FP suppression. In classifying image regions of clustered MCs, the CNN classifier achieved 0.971 in the area under the ROC curve, compared to 0.944 for the MC detector. In detecting clustered MCs from full mammograms, at 90% sensitivity, the CNN classifier obtained an FP rate of 0.69 clusters/image, compared to 1.17 clusters/image by the MC detector. These results indicate that using global image features can be more effective in discriminating clustered MCs from FPs caused by various sources, such as linear structures, thereby providing a more accurate detection of clustered MCs on mammograms.
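
    A CNN classifier of the flavour described, a convolutional stack reducing a large mammogram window to a single probability of containing clustered MCs, can be sketched in PyTorch; the layer sizes and names below are ours, not the paper's:

      import torch
      import torch.nn as nn

      class MCWindowNet(nn.Module):
          def __init__(self):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv2d(1, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
                  nn.Conv2d(16, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
                  nn.Conv2d(32, 64, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
              )
              self.classifier = nn.Linear(64, 1)

          def forward(self, x):                 # x: (batch, 1, H, W) window
              h = self.features(x).flatten(1)
              return torch.sigmoid(self.classifier(h))

      net = MCWindowNet()
      prob = net(torch.randn(4, 1, 99, 99))     # 4 dummy windows -> 4 probabilities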

  13. Quantum ensembles of quantum classifiers.

    PubMed

    Schuld, Maria; Petruccione, Francesco

    2018-02-09

    Quantum machine learning witnesses an increasing number of quantum algorithms for data-driven decision making, a problem with potential applications ranging from automated image recognition to medical diagnosis. Many of those algorithms are implementations of quantum classifiers, or models for the classification of data inputs with a quantum computer. Following the success of collective decision making with ensembles in classical machine learning, this paper introduces the concept of quantum ensembles of quantum classifiers. Creating the ensemble corresponds to a state preparation routine, after which the quantum classifiers are evaluated in parallel and their combined decision is accessed by a single-qubit measurement. This framework naturally allows for exponentially large ensembles in which - similar to Bayesian learning - the individual classifiers do not have to be trained. As an example, we analyse an exponentially large quantum ensemble in which each classifier is weighted according to its performance in classifying the training data, leading to new results for quantum as well as classical machine learning.

  14. Thermodynamic model of social influence on two-dimensional square lattice: Case for two features

    NASA Astrophysics Data System (ADS)

    Genzor, Jozef; Bužek, Vladimír; Gendiar, Andrej

    2015-02-01

    We propose a thermodynamic multi-state spin model in order to describe the equilibrium behavior of a society. Our model is inspired by the Axelrod model used in social network studies. In the framework of the statistical mechanics language, we analyze phase transitions of our model, in which the spin interaction J is interpreted as a mutual communication among individuals forming a society. The thermal fluctuations introduce a noise T into the communication, which suppresses long-range correlations. Below a certain phase transition point Tt, large-scale clusters of the individuals, who share a specific dominant property, are formed. The measure of the cluster sizes is an order parameter after spontaneous symmetry breaking. By means of the Corner transfer matrix renormalization group algorithm, we treat our model in the thermodynamic limit and classify the phase transitions with respect to inherent degrees of freedom. Each individual is chosen to possess two independent features f = 2 and each feature can assume one of q traits (e.g. interests). Hence, each individual is described by q2 degrees of freedom. A single first-order phase transition is detected in our model if q > 2, whereas two distinct continuous phase transitions are found if q = 2 only. Evaluating the free energy, order parameters, specific heat, and the entanglement von Neumann entropy, we classify the phase transitions Tt(q) in detail. The permanent existence of the ordered phase (the large-scale cluster formation with a non-zero order parameter) is conjectured below a non-zero transition point Tt(q) ≈ 0.5 in the asymptotic regime q → ∞.

  15. An Effective Antifreeze Protein Predictor with Ensemble Classifiers and Comprehensive Sequence Descriptors.

    PubMed

    Yang, Runtao; Zhang, Chengjin; Gao, Rui; Zhang, Lina

    2015-09-07

    Antifreeze proteins (AFPs) play a pivotal role in the antifreeze effect of overwintering organisms. They have a wide range of applications in numerous fields, such as improving the production of crops and the quality of frozen foods. Accurate identification of AFPs may provide important clues to decipher the underlying mechanisms of AFPs in ice-binding and to facilitate the selection of the most appropriate AFPs for several applications. Based on an ensemble learning technique, this study proposes an AFP identification system called AFP-Ensemble. In this system, random forest classifiers are trained by different training subsets and then aggregated into a consensus classifier by majority voting. The resulting predictor yields a sensitivity of 0.892, a specificity of 0.940, an accuracy of 0.938 and a balanced accuracy of 0.916 on an independent dataset, which are far better than the results obtained by previous methods. These results reveal that AFP-Ensemble is an effective and promising predictor for large-scale determination of AFPs. The detailed feature analysis in this study may give useful insights into the molecular mechanisms of AFP-ice interactions and provide guidance for the related experimental validation. A web server has been designed to implement the proposed method.
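
    The ensemble scheme described, random forests trained on different training subsets and aggregated by majority voting, can be sketched with scikit-learn (subset construction and sequence feature extraction are assumed to happen elsewhere):

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      def train_ensemble(subsets, seed=0):
          # subsets: list of (X, y) training subsets, one forest per subset.
          return [RandomForestClassifier(random_state=seed + i).fit(X, y)
                  for i, (X, y) in enumerate(subsets)]

      def majority_vote(forests, X):
          # Each forest casts one 0/1 vote per sample; ties go to class 1.
          votes = np.stack([f.predict(X) for f in forests])
          return (votes.mean(axis=0) >= 0.5).astype(int)   # 1 = AFP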

  16. Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes

    PubMed Central

    Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.

    2012-01-01

    Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
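
    A naive Bayesian sequence classifier in miniature: overlapping 8-base words (the word size used by RDP-style classifiers) counted and fed to multinomial naive Bayes. The training sequences below are toy placeholders for the curated LSU set:

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.pipeline import make_pipeline

      def kmers(seq, k=8):
          # Represent a sequence as its overlapping k-base words.
          return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

      # Toy stand-ins for the hand-curated LSU training fragments and taxa.
      train_seqs = ["ACGTACGTAGCTAGGTCCAT", "TTGACCGGTTAACCGGATTC"]
      train_taxa = ["Ascomycota", "Basidiomycota"]

      clf = make_pipeline(CountVectorizer(), MultinomialNB())
      clf.fit([kmers(s) for s in train_seqs], train_taxa)
      # clf.predict([kmers(query_seq)]) -> most probable taxon for a read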

  17. A procedure for classifying textural facies in gravel-bed rivers

    Treesearch

    John M. Buffington; David R. Montgomery

    1999-01-01

    Textural patches (i.e., grain-size facies) are commonly observed in gravel-bed channels and are of significance for both physical and biological processes at subreach scales. We present a general framework for classifying textural patches that allows modification for particular study goals, while maintaining a basic degree of standardization. Textures are classified...

  18. Adjacent-Categories Mokken Models for Rater-Mediated Assessments

    PubMed Central

    Wind, Stefanie A.

    2016-01-01

    Molenaar extended Mokken’s original probabilistic-nonparametric scaling models for use with polytomous data. These polytomous extensions of Mokken’s original scaling procedure have facilitated the use of Mokken scale analysis as an approach to exploring fundamental measurement properties across a variety of domains in which polytomous ratings are used, including rater-mediated educational assessments. Because their underlying item step response functions (i.e., category response functions) are defined using cumulative probabilities, polytomous Mokken models can be classified as cumulative models based on the classifications of polytomous item response theory models proposed by several scholars. In order to permit a closer conceptual alignment with educational performance assessments, this study presents an adjacent-categories variation on the polytomous monotone homogeneity and double monotonicity models. Data from a large-scale rater-mediated writing assessment are used to illustrate the adjacent-categories approach, and results are compared with the original formulations. Major findings suggest that the adjacent-categories models provide additional diagnostic information related to individual raters’ use of rating scale categories that is not observed under the original formulation. Implications are discussed in terms of methods for evaluating rating quality. PMID:29795916

  19. Introducing the MCHF/OVRP/SDMP: Multicapacitated/Heterogeneous Fleet/Open Vehicle Routing Problems with Split Deliveries and Multiproducts

    PubMed Central

    Yilmaz Eroglu, Duygu; Caglar Gencosman, Burcu; Cavdur, Fatih; Ozmutlu, H. Cenk

    2014-01-01

    In this paper, we analyze a real-world OVRP problem for a production company. Considering real-world constraints, we classify our problem as a multicapacitated/heterogeneous fleet/open vehicle routing problem with split deliveries and multiproducts (MCHF/OVRP/SDMP), which is a novel classification of an OVRP. We have developed a mixed integer programming (MIP) model for the problem and generated test problems of different sizes (10–90 customers) considering real-world parameters. Although MIP is able to find optimal solutions for small problems (10 customers), when the number of customers increases, the problem becomes harder to solve, and thus MIP could not find optimal solutions for problems that contain more than 10 customers. Moreover, MIP fails to find any feasible solution of large-scale problems (50–90 customers) within the time limit (7200 seconds). Therefore, we have developed a genetic algorithm (GA) based solution approach for large-scale problems. The experimental results show that the GA based approach reaches successful solutions with a 9.66% gap in 392.8 s on average instead of 7200 s for the problems that contain 10–50 customers. For large-scale problems (50–90 customers), GA reaches feasible solutions within the time limit. In conclusion, for real-world applications, GA is preferable to MIP for reaching feasible solutions in short time periods. PMID:25045735

  20. An automated, high-throughput plant phenotyping system using machine learning-based plant segmentation and image analysis.

    PubMed

    Lee, Unseok; Chang, Sungyul; Putra, Gian Anantrio; Kim, Hyoungseok; Kim, Dong Hwan

    2018-01-01

    A high-throughput plant phenotyping system automatically observes and grows many plant samples. Many plant sample images are acquired by the system to determine the characteristics of the plants (populations). Stable image acquisition and processing are very important for accurately determining those characteristics. However, hardware for acquiring plant images rapidly and stably, while minimizing plant stress, is lacking. Moreover, most software cannot adequately handle large-scale plant imaging. To address these problems, we developed a new, automated, high-throughput plant phenotyping system using simple and robust hardware and an automated plant-image-analysis pipeline based on machine-learning plant segmentation. Our hardware acquires images reliably and quickly while minimizing plant stress, and the images are processed automatically. In particular, large-scale plant-image datasets can be segmented precisely using a classifier built with a superpixel-based machine-learning algorithm (Random Forest), and variations in plant parameters (such as area) over time can be assessed from the segmented images. We performed comparative evaluations of three robust learning algorithms to identify the one most appropriate for our system. We developed not only an automatic analysis pipeline but also a convenient plant-growth analysis tool that provides a learning-data interface and visualization of plant growth trends. Thus, our system allows end-users such as plant biologists to easily analyze plant growth from large-scale plant image data.
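
    The abstract names superpixels and Random Forest but gives no implementation details. The following is a minimal sketch of that general pattern, assuming SLIC superpixels (the specific superpixel algorithm is not stated in the abstract), scikit-image, scikit-learn, and a synthetic "plant on soil" image:

    ```python
    import numpy as np
    from skimage.segmentation import slic
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic image: a green blob (plant) on a brown background (soil).
    rng = np.random.default_rng(0)
    img = np.full((120, 120, 3), (0.4, 0.25, 0.1)) + rng.normal(0, 0.02, (120, 120, 3))
    yy, xx = np.mgrid[:120, :120]
    plant = (yy - 60) ** 2 + (xx - 60) ** 2 < 30 ** 2
    img[plant] = (0.1, 0.6, 0.2)
    img = np.clip(img, 0, 1)

    # Over-segment into superpixels; one mean-colour feature vector per segment.
    segs = slic(img, n_segments=200, compactness=10, start_label=0)
    n = segs.max() + 1
    feats = np.array([img[segs == s].mean(axis=0) for s in range(n)])

    # Label each superpixel by majority overlap with the known plant mask
    # (in practice these labels would come from hand-annotated training images).
    labels = np.array([plant[segs == s].mean() > 0.5 for s in range(n)]).astype(int)

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(feats, labels)
    pred = clf.predict(feats)   # per-superpixel plant/background decision
    mask = pred[segs]           # project decisions back to pixel space
    print("predicted plant area (px):", mask.sum())  # the growth parameter
    ```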

  1. Deep Adaptive Log-Demons: Diffeomorphic Image Registration with Very Large Deformations

    PubMed Central

    Jia, Kebin

    2015-01-01

    This paper proposes a new framework for capturing large and complex deformations in image registration. Traditionally, this challenging problem relies first on a preregistration, usually an affine matrix containing rotation, scale, and translation, and afterwards on a nonrigid transformation. In the preregistration, the directly calculated affine matrix, obtained from limited pixel information, may misregister when large biases exist, severely misleading the subsequent registration. To address this problem, for two-dimensional (2D) images, the two-layer deep adaptive registration framework proposed in this paper first accurately classifies the rotation parameter through multilayer convolutional neural networks (CNNs) and then identifies the scale and translation parameters separately. For three-dimensional (3D) images, the affine matrix is located through feature correspondences by triplanar 2D CNNs. Deformation removal is then performed iteratively through preregistration and demons registration. Compared with state-of-the-art registration frameworks, our method achieves more accurate registration results on both synthetic and real datasets. In addition, principal component analysis (PCA) is combined with correlation measures such as Pearson and Spearman to form new similarity standards in 2D and 3D registration. Experimental results also show faster convergence. PMID:26120356

  2. Deep Adaptive Log-Demons: Diffeomorphic Image Registration with Very Large Deformations.

    PubMed

    Zhao, Liya; Jia, Kebin

    2015-01-01

    This paper proposes a new framework for capturing large and complex deformations in image registration. Traditionally, this challenging problem relies first on a preregistration, usually an affine matrix containing rotation, scale, and translation, and afterwards on a nonrigid transformation. In the preregistration, the directly calculated affine matrix, obtained from limited pixel information, may misregister when large biases exist, severely misleading the subsequent registration. To address this problem, for two-dimensional (2D) images, the two-layer deep adaptive registration framework proposed in this paper first accurately classifies the rotation parameter through multilayer convolutional neural networks (CNNs) and then identifies the scale and translation parameters separately. For three-dimensional (3D) images, the affine matrix is located through feature correspondences by triplanar 2D CNNs. Deformation removal is then performed iteratively through preregistration and demons registration. Compared with state-of-the-art registration frameworks, our method achieves more accurate registration results on both synthetic and real datasets. In addition, principal component analysis (PCA) is combined with correlation measures such as Pearson and Spearman to form new similarity standards in 2D and 3D registration. Experimental results also show faster convergence.
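
    As a hedged sketch of the core 2D idea (treating the rotation parameter as a classification target for a small CNN), the following PyTorch fragment uses an assumed input pairing of fixed and moving images stacked as two channels, assumed layer sizes, and assumed 10-degree rotation bins; it is not the authors' architecture:

    ```python
    import torch
    import torch.nn as nn

    N_BINS = 36  # discretise rotation into 10-degree bins (assumed granularity)

    class RotationNet(nn.Module):
        """Small CNN that classifies the rotation angle between an image pair."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.head = nn.Linear(32 * 16 * 16, N_BINS)

        def forward(self, pair):   # pair: (B, 2, 64, 64) fixed/moving stacked
            h = self.features(pair)
            return self.head(h.flatten(1))

    # One synthetic training step: random image pairs with random angle labels.
    model = RotationNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(8, 2, 64, 64)
    y = torch.randint(0, N_BINS, (8,))
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
    print("step loss:", float(loss))
    ```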

  3. A CRITICAL ASSESSMENT OF BIODOSIMETRY METHODS FOR LARGE-SCALE INCIDENTS

    PubMed Central

    Swartz, Harold M.; Flood, Ann Barry; Gougelet, Robert M.; Rea, Michael E.; Nicolalde, Roberto J.; Williams, Benjamin B.

    2014-01-01

    Recognition is growing regarding the possibility that terrorism or large-scale accidents could result in potential radiation exposure of hundreds of thousands of people and that the present guidelines for evaluation after such an event are seriously deficient. Therefore, there is a great and urgent need for after-the-fact biodosimetric methods to estimate radiation dose. To accomplish this goal, the dose estimates must be at the individual level, timely, accurate, and plausibly obtained in large-scale disasters. This paper evaluates current biodosimetry methods, focusing on their strengths and weaknesses in estimating human radiation exposure in large-scale disasters at three stages. First, the authors evaluate biodosimetry's ability to determine which individuals did not receive a significant exposure so they can be removed from the acute response system. Second, biodosimetry's capacity to classify those initially assessed as needing further evaluation into treatment-level categories is assessed. Third, biodosimetry's ability to guide treatment, both short- and long-term, is reviewed. The authors compare biodosimetric methods that are based on physical vs. biological parameters and evaluate the features of current dosimeters (capacity, speed and ease of getting information, and accuracy) to determine which are most useful in meeting patients' needs at each of the different stages. Results indicate that the biodosimetry methods differ in their applicability to the three different stages, and that combining physical and biological techniques may sometimes be most effective. In conclusion, biodosimetry techniques have different properties, and knowledge of their properties for meeting the different needs at different stages will result in their most effective use in a nuclear disaster mass-casualty event. PMID:20065671

  4. A 3D Active Learning Application for NeMO-Net, the NASA Neural Multi-Modal Observation and Training Network for Global Coral Reef Assessment

    NASA Technical Reports Server (NTRS)

    van den Bergh, Jarrett; Schutz, Joey; Li, Alan; Chirayath, Ved

    2017-01-01

    NeMO-Net, the NASA neural multi-modal observation and training network for global coral reef assessment, is an open-source deep convolutional neural network and interactive active learning training software aiming to accurately assess the present and past dynamics of coral reef ecosystems through determination of percent living cover and morphology as well as mapping of spatial distribution. We present an interactive video game prototype for tablet and mobile devices where users interactively label morphology classifications over mm-scale 3D coral reef imagery captured using fluid lensing to create a dataset that will be used to train NeMO-Net's convolutional neural network. The application currently allows for users to classify preselected regions of coral in the Pacific and will be expanded to include additional regions captured using our NASA FluidCam instrument, presently the highest-resolution remote sensing benthic imaging technology capable of removing ocean wave distortion, as well as lower-resolution airborne remote sensing data from the ongoing NASA CORAL campaign. Active learning applications present a novel methodology for efficiently training large-scale Neural Networks wherein variances in identification can be rapidly mitigated against control data. NeMO-Net periodically checks users' input against pre-classified coral imagery to gauge their accuracy and utilizes in-game mechanics to provide classification training. Users actively communicate with a server and are requested to classify areas of coral for which other users had conflicting classifications and contribute their input to a larger database for ranking. In partnering with Mission Blue and IUCN, NeMO-Net leverages an international consortium of subject matter experts to classify areas of confusion identified by NeMO-Net and generate additional labels crucial for identifying decision boundary locations in coral reef assessment.

  5. A 3D Active Learning Application for NeMO-Net, the NASA Neural Multi-Modal Observation and Training Network for Global Coral Reef Assessment

    NASA Astrophysics Data System (ADS)

    van den Bergh, J.; Schutz, J.; Chirayath, V.; Li, A.

    2017-12-01

    NeMO-Net, the NASA neural multi-modal observation and training network for global coral reef assessment, is an open-source deep convolutional neural network and interactive active learning training software aiming to accurately assess the present and past dynamics of coral reef ecosystems through determination of percent living cover and morphology as well as mapping of spatial distribution. We present an interactive video game prototype for tablet and mobile devices where users interactively label morphology classifications over mm-scale 3D coral reef imagery captured using fluid lensing to create a dataset that will be used to train NeMO-Net's convolutional neural network. The application currently allows for users to classify preselected regions of coral in the Pacific and will be expanded to include additional regions captured using our NASA FluidCam instrument, presently the highest-resolution remote sensing benthic imaging technology capable of removing ocean wave distortion, as well as lower-resolution airborne remote sensing data from the ongoing NASA CORAL campaign. Active learning applications present a novel methodology for efficiently training large-scale Neural Networks wherein variances in identification can be rapidly mitigated against control data. NeMO-Net periodically checks users' input against pre-classified coral imagery to gauge their accuracy and utilizes in-game mechanics to provide classification training. Users actively communicate with a server and are requested to classify areas of coral for which other users had conflicting classifications and contribute their input to a larger database for ranking. In partnering with Mission Blue and IUCN, NeMO-Net leverages an international consortium of subject matter experts to classify areas of confusion identified by NeMO-Net and generate additional labels crucial for identifying decision boundary locations in coral reef assessment.

  6. Mental Representation and Cognitive Consequences of Chinese Individual Classifiers

    ERIC Educational Resources Information Center

    Gao, Ming Y.; Malt, Barbara C.

    2009-01-01

    Classifier languages are spoken by a large portion of the world's population, but psychologists have only recently begun to investigate the psychological reality of classifier categories and their potential for influencing non-linguistic thought. The current work evaluates both the mental representation of classifiers and potential cognitive…

  7. Bayley-III Cognitive and Language Scales in Preterm Children.

    PubMed

    Spencer-Smith, Megan M; Spittle, Alicia J; Lee, Katherine J; Doyle, Lex W; Anderson, Peter J

    2015-05-01

    This study aimed to assess the sensitivity and specificity of the Bayley Scales of Infant and Toddler Development, Third Edition (Bayley-III), Cognitive and Language scales at 24 months for predicting cognitive impairments in preterm children at 4 years. Children born <30 weeks' gestation completed the Bayley-III at 24 months and the Differential Ability Scale, Second Edition (DAS-II), at 4 years to assess cognitive functioning. Test norms and local term-born reference data were used to classify delay on the Bayley-III Cognitive and Language scales. Impairment on the DAS-II Global Conceptual Ability, Verbal, and Nonverbal Reasoning indices was classified relative to test norms. Scores < -1 SD relative to the mean were classified as mild/moderate delay or impairment, and scores < -2 SDs were classified as moderate delay or impairment. A total of 105 children completed the Bayley-III and DAS-II. The sensitivity of mild/moderate cognitive delay on the Bayley-III for predicting impairment on DAS-II indices ranged from 29.4% to 38.5%, and specificity ranged from 92.3% to 95.5%. The sensitivity of mild/moderate language delay on the Bayley-III for predicting impairment on DAS-II indices ranged from 40% to 46.7%, and specificity ranged from 81.1% to 85.7%. The use of local reference data at 24 months to classify delay increased sensitivity but reduced specificity. Receiver operating characteristic (ROC) curve analysis identified optimum cut-point scores for the Bayley-III that were more consistent with using local reference data than Bayley-III normative data. In our cohort of very preterm children, delay on the Bayley-III Cognitive and Language scales was not strongly predictive of future impairments. More children destined for later cognitive impairment were identified by using cut-points based on local reference data than Bayley-III norms. Copyright © 2015 by the American Academy of Pediatrics.
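
    The sensitivity/specificity trade-off the authors describe (local cut-points raise sensitivity but lower specificity) is easy to reproduce numerically. A small sketch with simulated scores, using the Youden index as one common way of choosing an optimal cut-point (the numbers below are invented, not the study's data):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    # Simulated 24-month screening scores: impaired children tend to score lower.
    impaired = rng.normal(78, 12, 30)    # children with later DAS-II impairment
    typical = rng.normal(100, 14, 75)    # children without later impairment
    scores = np.concatenate([impaired, typical])
    truth = np.concatenate([np.ones(30), np.zeros(75)])  # 1 = impaired

    def sens_spec(cut):
        pred = scores < cut              # classified "delayed" below the cut-point
        tp = np.sum(pred & (truth == 1)); fn = np.sum(~pred & (truth == 1))
        tn = np.sum(~pred & (truth == 0)); fp = np.sum(pred & (truth == 0))
        return tp / (tp + fn), tn / (tn + fp)

    for cut, name in [(85, "norm-based (-1 SD)"), (90, "local reference")]:
        se, sp = sens_spec(cut)
        print(f"{name}: sensitivity={se:.2f} specificity={sp:.2f}")

    # Youden index J = sensitivity + specificity - 1, maximised over cut-points.
    cuts = np.arange(60, 110)
    j = [sum(sens_spec(c)) - 1 for c in cuts]
    print("optimal cut-point by Youden index:", cuts[int(np.argmax(j))])
    ```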

  8. Relationship between chronotype and temperament/character among university students.

    PubMed

    Lee, Kounseok; Lee, Hye-Kyung; Jhung, Kyungun; Park, Jin Young

    2017-05-01

    Chronotype is largely classified as morning or evening type according to preference for daily activity and preferred bedtime. This study examined the relationship between chronotype and temperament/character dimensions among university students. A total of 2857 participants completed the 140-item Temperament and Character Inventory-Revised Short version (TCI-RS), rated on a 5-point scale, as well as the 13-item composite scale for morningness-eveningness (CSM). In this study, we classified chronotype as "morning," "neither," or "evening" type according to CSM scores and compared the scores on 4 temperament dimensions and 3 character dimensions. The evening type showed high values for novelty seeking and harm avoidance, whereas the morning type had high scores for persistence, self-directedness, and cooperativeness. A logistic regression analysis controlling for age and gender showed that chronotype was significantly associated with persistence and novelty seeking. The results of this study suggest that chronotype differs according to gender and age and, in addition, correlates closely with temperament and character. Among these dimensions, eveningness was associated with high novelty seeking, whereas morningness was associated with high persistence. Further studies are required to investigate the relationship between chronotype and temperament/character dimensions across a wider age bracket. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  9. Geographic scale matters in detecting the relationship between neighbourhood food environments and obesity risk: an analysis of driver license records in Salt Lake County, Utah.

    PubMed

    Fan, Jessie X; Hanson, Heidi A; Zick, Cathleen D; Brown, Barbara B; Kowaleski-Jones, Lori; Smith, Ken R

    2014-08-19

    Empirical studies of the association between neighbourhood food environments and individual obesity risk have found mixed results. One possible cause of these mixed findings is variation in the neighbourhood geographic scale used. The purpose of this paper was to examine how various neighbourhood geographic scales affect the estimated relationship between food environments and obesity risk. Cross-sectional secondary data analysis. Salt Lake County, Utah, USA. 403,305 Salt Lake County adults aged 25-64 in the Utah driver license database between 1995 and 2008. Utah driver license data were geo-linked to 2000 US Census data and Dun & Bradstreet business data. Food outlets were classified into the categories of large grocery stores, convenience stores, limited-service restaurants and full-service restaurants, and measured at four neighbourhood geographic scales: Census block group, Census tract, ZIP code and a 1 km buffer around the resident's house. These measures were regressed on individual obesity status using multilevel random-intercept regressions. Obesity. Food environment was important for obesity, but the scale of the relevant neighbourhood differed by outlet type: large grocery stores were not significant at any of the four geographic scales, limited-service restaurants were significant at medium-to-large scales (Census tract or larger), and convenience stores and full-service restaurants at the smallest scales (Census tract or smaller). The choice of neighbourhood geographic scale can affect the estimated significance of the association between neighbourhood food environments and individual obesity risk. However, variations in geographic scale alone do not explain the mixed findings in the literature. If researchers are constrained to use one geographic scale with multiple categories of food outlets, using the Census tract or a 1 km buffer as the neighbourhood geographic unit is likely to allow researchers to detect the most significant relationships. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  10. Bayesian Redshift Classification of Emission-line Galaxies with Photometric Equivalent Widths

    NASA Astrophysics Data System (ADS)

    Leung, Andrew S.; Acquaviva, Viviana; Gawiser, Eric; Ciardullo, Robin; Komatsu, Eiichiro; Malz, A. I.; Zeimann, Gregory R.; Bridge, Joanna S.; Drory, Niv; Feldmeier, John J.; Finkelstein, Steven L.; Gebhardt, Karl; Gronwall, Caryl; Hagen, Alex; Hill, Gary J.; Schneider, Donald P.

    2017-07-01

    We present a Bayesian approach to the redshift classification of emission-line galaxies when only a single emission line is detected spectroscopically. We consider the case of surveys for high-redshift Lyα-emitting galaxies (LAEs), which have traditionally been classified via an inferred rest-frame equivalent width W_Lyα greater than 20 Å. Our Bayesian method relies on known prior probabilities in measured emission-line luminosity functions and EW distributions for the galaxy populations, and returns the probability that an object in question is an LAE given the characteristics observed. This approach will be directly relevant for the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), which seeks to classify ~10^6 emission-line galaxies into LAEs and low-redshift [O II] emitters. For a simulated HETDEX catalog with realistic measurement noise, our Bayesian method recovers 86% of LAEs missed by the traditional W_Lyα > 20 Å cutoff over 2 < z < 3, outperforming the EW cut in both contamination and incompleteness. This is due to the method's ability to trade off between the two types of binary classification error by adjusting the stringency of the probability requirement for classifying an observed object as an LAE. In our simulations of HETDEX, this method reduces the uncertainty in cosmological distance measurements by 14% with respect to the EW cut, equivalent to recovering 29% more cosmological information. Rather than using binary object labels, this method enables the use of classification probabilities in large-scale structure analyses. It can be applied to narrowband emission-line surveys as well as upcoming large spectroscopic surveys including Euclid and WFIRST.
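
    Schematically, the posterior the method thresholds can be written as follows (notation assumed; d stands for the observed line properties such as flux, wavelength, and equivalent width):

    ```latex
    % Two-way posterior for a detected emission line: LAE vs. [O II] emitter.
    \[
      P(\mathrm{LAE} \mid d) =
      \frac{P(d \mid \mathrm{LAE})\, P(\mathrm{LAE})}
           {P(d \mid \mathrm{LAE})\, P(\mathrm{LAE})
            + P(d \mid [\mathrm{O\,II}])\, P([\mathrm{O\,II}])}
    \]
    % The priors P(LAE) and P([O II]) come from the measured luminosity
    % functions and EW distributions; an object is classified as an LAE when
    % P(LAE | d) exceeds a chosen threshold, which replaces the hard
    % W_Lya > 20 Angstrom equivalent-width cut and lets the survey trade
    % contamination against incompleteness.
    ```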

  11. MRIQC: Advancing the automatic prediction of image quality in MRI from unseen sites

    PubMed Central

    2017-01-01

    Quality control of MRI is essential for excluding problematic acquisitions and avoiding bias in subsequent image processing and analysis. Visual inspection is subjective and impractical for large-scale datasets. Although automated quality assessments have been demonstrated on single-site datasets, it is unclear whether such solutions can generalize to unseen data acquired at new sites. Here, we introduce the MRI Quality Control tool (MRIQC), a tool for extracting quality measures and fitting a binary (accept/exclude) classifier. Our tool can be run both locally and as a free online service via the OpenNeuro.org portal. The classifier is trained on a publicly available, multi-site dataset (17 sites, N = 1102). We perform model selection evaluating different normalization and feature exclusion approaches aimed at maximizing across-site generalization, and estimate an accuracy of 76% ± 13% on new sites using leave-one-site-out cross-validation. We confirm that result on a held-out dataset (2 sites, N = 265), also obtaining 76% accuracy. Even though the performance of the trained classifier is statistically above chance, we show that it is susceptible to site effects and unable to account for artifacts specific to new sites. MRIQC performs with high accuracy in intra-site prediction, but performance on unseen sites leaves room for improvement, which might require more labeled data and new approaches to between-site variability. Overcoming these limitations is crucial for more objective quality assessment of neuroimaging data and for enabling the analysis of extremely large and multi-site samples. PMID:28945803
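
    Leave-one-site-out cross-validation is the standard way to estimate this kind of cross-site generalization. A minimal sketch with scikit-learn's LeaveOneGroupOut on synthetic stand-ins for image quality metrics (this is not MRIQC's actual feature set or model):

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

    rng = np.random.default_rng(0)
    n = 300
    X = rng.normal(size=(n, 20))        # stand-ins for image quality metrics
    site = rng.integers(0, 6, size=n)   # acquisition site per scan
    X += site[:, None] * 0.3            # inject a site effect into the features
    y = (X[:, 0] + rng.normal(0, 1, n) > 0).astype(int)  # accept/exclude label

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    # Each fold holds out every scan from one site, mimicking an unseen site.
    scores = cross_val_score(clf, X, y, groups=site, cv=LeaveOneGroupOut())
    print("per-site accuracy:", np.round(scores, 2), "mean:", scores.mean().round(2))
    ```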

  12. Automatic classification for mammogram backgrounds based on BI-RADS complexity definition and on a multi-content analysis framework

    NASA Astrophysics Data System (ADS)

    Wu, Jie; Besnehard, Quentin; Marchessoux, Cédric

    2011-03-01

    Clinical studies for the validation of new medical imaging devices require hundreds of images. An important step in creating and tuning the study protocol is the classification of images into "difficult" and "easy" cases. This consists of classifying each image based on features such as the complexity of the background and the visibility of the disease (lesions). An automatic background classification tool for mammograms would therefore help in such clinical studies. This classification tool is based on a multi-content analysis (MCA) framework, which was first developed to recognize the image content of computer screenshots. With the implementation of new texture features and a defined breast density scale, the MCA framework is able to automatically classify digital mammograms with satisfactory accuracy. The BI-RADS (Breast Imaging Reporting and Data System) density scale is used for grouping the mammograms; it standardizes the mammography reporting terminology and the assessment and recommendation categories. Selected features are input into a decision-tree classification scheme in the MCA framework, the so-called "weak classifier" (any classifier with a global error rate below 50%). With the AdaBoost iteration algorithm, these "weak classifiers" are combined into a "strong classifier" (a classifier with a low global error rate) for classifying one category. The classification results for one "strong classifier" show good accuracy with high true-positive rates. Across the four categories the results are: TP = 90.38%, TN = 67.88%, FP = 32.12%, and FN = 9.62%.
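
    The weak-to-strong combination described here is the standard AdaBoost construction. A minimal scikit-learn sketch with depth-1 decision trees as weak classifiers on synthetic stand-in features (assuming a recent scikit-learn; this is not the MCA framework itself):

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for texture/density features extracted from mammograms.
    X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                               random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

    # A depth-1 decision tree is a classic "weak classifier" (error below 50%);
    # AdaBoost reweights the training set each round and combines the weak
    # learners into one "strong classifier" with a low global error rate.
    weak = DecisionTreeClassifier(max_depth=1)
    strong = AdaBoostClassifier(estimator=weak, n_estimators=100, random_state=0)
    strong.fit(Xtr, ytr)

    pred = strong.predict(Xte)
    tp = np.sum((pred == 1) & (yte == 1)) / np.sum(yte == 1)  # true-positive rate
    tn = np.sum((pred == 0) & (yte == 0)) / np.sum(yte == 0)  # true-negative rate
    print(f"TP rate: {tp:.2%}  TN rate: {tn:.2%}")
    ```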

  13. Regional modeling of large wildfires under current and potential future climates in Colorado and Wyoming, USA

    USGS Publications Warehouse

    West, Amanda; Kumar, Sunil; Jarnevich, Catherine S.

    2016-01-01

    Regional analysis of large wildfire potential given climate change scenarios is crucial to understanding areas most at risk in the future, yet wildfire models are not often developed and tested at this spatial scale. We fit three historical climate suitability models for large wildfires (i.e., ≥ 400 ha) in Colorado and Wyoming using topography and decadal climate averages corresponding to wildfire occurrence at the same temporal scale. The historical models classified points of known large wildfire occurrence with high accuracies. Using a novel approach in wildfire modeling, we applied the historical models to independent climate and wildfire datasets, and the resulting sensitivities were 0.75, 0.81, and 0.83 for Maxent, Generalized Linear, and Multivariate Adaptive Regression Splines models, respectively. We projected the historical models into future climate space using data from 15 global circulation models and two representative concentration pathway scenarios. Maps from these geospatial analyses can be used to evaluate the changing spatial distribution of climate suitability of large wildfires in these states. April relative humidity was the most important covariate in all models, providing insight into the climate space of large wildfires in this region. These methods incorporate monthly and seasonal climate averages at a spatial resolution relevant to land management (i.e., 1 km²) and provide a tool that can be modified for other regions of North America, or adapted for other parts of the world.

  14. Fractal multi-level organisation of human groups in a virtual world.

    PubMed

    Fuchs, Benedikt; Sornette, Didier; Thurner, Stefan

    2014-10-06

    Humans are fundamentally social. They form societies consisting of hierarchically layered, nested groups of various quality, size, and structure. The anthropologic literature has classified these groups as support cliques, sympathy groups, bands, cognitive groups, tribes, linguistic groups, and so on. Anthropologic data show that, on average, each group consists of approximately three subgroups. However, a general understanding of the structural dependence of groups at different layers is largely missing. We extend these early findings to a very large, high-precision, internet-based social network dataset. We analyse the organisational structure of a complete, multi-relational, large social multiplex network of a human society consisting of roughly 400,000 players of an open-ended massive multiplayer online game for which all group memberships at different layers are known. Remarkably, the online players' society exhibits the same type of structured hierarchical layers as found in hunter-gatherer societies. Our findings suggest that the hierarchical organisation of human society is deeply nested in human psychology.

  15. Fractal multi-level organisation of human groups in a virtual world

    PubMed Central

    Fuchs, Benedikt; Sornette, Didier; Thurner, Stefan

    2014-01-01

    Humans are fundamentally social. They form societies consisting of hierarchically layered, nested groups of various quality, size, and structure. The anthropologic literature has classified these groups as support cliques, sympathy groups, bands, cognitive groups, tribes, linguistic groups, and so on. Anthropologic data show that, on average, each group consists of approximately three subgroups. However, a general understanding of the structural dependence of groups at different layers is largely missing. We extend these early findings to a very large, high-precision, internet-based social network dataset. We analyse the organisational structure of a complete, multi-relational, large social multiplex network of a human society consisting of roughly 400,000 players of an open-ended massive multiplayer online game for which all group memberships at different layers are known. Remarkably, the online players' society exhibits the same type of structured hierarchical layers as found in hunter-gatherer societies. Our findings suggest that the hierarchical organisation of human society is deeply nested in human psychology. PMID:25283998

  16. Fractal multi-level organisation of human groups in a virtual world

    NASA Astrophysics Data System (ADS)

    Fuchs, Benedikt; Sornette, Didier; Thurner, Stefan

    2014-10-01

    Humans are fundamentally social. They form societies consisting of hierarchically layered, nested groups of various quality, size, and structure. The anthropologic literature has classified these groups as support cliques, sympathy groups, bands, cognitive groups, tribes, linguistic groups, and so on. Anthropologic data show that, on average, each group consists of approximately three subgroups. However, a general understanding of the structural dependence of groups at different layers is largely missing. We extend these early findings to a very large, high-precision, internet-based social network dataset. We analyse the organisational structure of a complete, multi-relational, large social multiplex network of a human society consisting of roughly 400,000 players of an open-ended massive multiplayer online game for which all group memberships at different layers are known. Remarkably, the online players' society exhibits the same type of structured hierarchical layers as found in hunter-gatherer societies. Our findings suggest that the hierarchical organisation of human society is deeply nested in human psychology.

  17. Landscape object-based analysis of wetland plant functional types: the effects of spatial scale, vegetation classes and classifier methods

    NASA Astrophysics Data System (ADS)

    Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.

    2011-12-01

    Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and a Ramsar wetland conservation site, from a 30-m Landsat TM scene at the peak of the spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in the system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes 900-9000 m²); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines); and c) two hierarchical levels of vegetation classification, a generalized 3-class set and a more detailed 6-class set. We found that classification benefited from the object-based approach, which allowed including object shape, texture and context descriptors in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of the Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy. The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because individual classes differed in the scales at which they were best discriminated from others. Main classification challenges included a) the presence of C3 grasses in C4-grass areas, particularly following harvesting of C4 reeds, and b) mixtures of emergent, floating and submerged aquatic plants at sub-object and sub-pixel scales. We conclude that OBIA with advanced statistical classifiers offers useful instruments for landscape vegetation analyses, and that spatial scale considerations are critical in mapping PFTs, while multi-scale comparisons can be used to guide class selection. Future work will further apply fuzzy classification and field-collected spectral data for PFT analysis and compare results with MODIS PFT products.

  18. Large-Scale Phylogenomics of the Lactobacillus casei Group Highlights Taxonomic Inconsistencies and Reveals Novel Clade-Associated Features

    PubMed Central

    Wuyts, Sander; Wittouck, Stijn; De Boeck, Ilke; Allonsius, Camille N.; Pasolli, Edoardo

    2017-01-01

    Although the genotypic and phenotypic properties of the Lactobacillus casei group have been studied extensively, the taxonomic structure has been the subject of debate for a long time. Here, we performed a large-scale comparative analysis by using 183 publicly available genomes supplemented with a Lactobacillus strain isolated from the human upper respiratory tract. On the basis of this analysis, we identified inconsistencies in the taxonomy and reclassified all of the genomes according to their most closely related type strains. This led to the identification of a catalase-encoding gene in all 10 L. casei sensu stricto strains, making it the first described catalase-positive species in the Lactobacillus genus. Moreover, we found that 6 of 10 L. casei genomes contained a SecA2/SecY2 gene cluster with two putative glycosylated surface adhesin proteins. Altogether, our results highlight current inconsistencies in the taxonomy of the L. casei group and reveal new clade-associated functional features. IMPORTANCE The closely related species of the Lactobacillus casei group are extensively studied because of their applications in food fermentations and as probiotics. Our results show that many strains in this group are incorrectly classified and that reclassifying them to their most closely related species type strain improves the functional predictive power of their taxonomy. In addition, our findings may spark increased interest in the L. casei species. We find that after reclassification, only 10 genomes remain classified as L. casei. These strains show some interesting properties. First, they all appear to be catalase positive. This suggests that they have increased oxidative stress resistance. Second, we isolated an L. casei strain from the human upper respiratory tract and discovered that it and multiple other L. casei strains harbor one or even two large, glycosylated putative surface adhesins. This might inspire further exploration of this species as a potential probiotic organism. PMID:28845461

  19. Diversity in the representation of large-scale circulation associated with ENSO-Indian summer monsoon teleconnections in CMIP5 models

    NASA Astrophysics Data System (ADS)

    Ramu, Dandi A.; Chowdary, Jasti S.; Ramakrishna, S. S. V. S.; Kumar, O. S. R. U. B.

    2018-04-01

    Realistic simulation of large-scale circulation patterns associated with the El Niño-Southern Oscillation (ENSO) is vital in coupled models in order to represent teleconnections to different regions of the globe. The diversity in representing large-scale circulation patterns associated with ENSO-Indian summer monsoon (ISM) teleconnections in 23 Coupled Model Intercomparison Project Phase 5 (CMIP5) models is examined. CMIP5 models have been classified into three groups based on the correlation between the Niño3.4 sea surface temperature (SST) index and ISM rainfall anomalies: models in group 1 (G1) overestimated El Niño-ISM teleconnections, group 3 (G3) models underestimated them, and these teleconnections are better represented in group 2 (G2) models. Results show that in G1 models, El Niño-induced Tropical Indian Ocean (TIO) SST anomalies are not well represented. Anomalous low-level anticyclonic circulation anomalies over the southeastern TIO and the western subtropical northwest Pacific (WSNP) cyclonic circulation are shifted too far west, to 60° E and 120° E, respectively. This bias in circulation patterns implies dry wind advection from the extratropics/midlatitudes to the Indian subcontinent. In addition, large-scale upper-level convergence together with lower-level divergence over the ISM region corresponding to El Niño is stronger in G1 models than in observations. Thus, an unrealistic shift in low-level circulation centers, corroborated by upper-level circulation changes, is responsible for the overestimation of ENSO-ISM teleconnections in G1 models. Warm Pacific SST anomalies associated with El Niño are shifted too far west in many G3 models, unlike in the observations. Further, large-scale circulation anomalies over the Pacific and ISM region are misrepresented during El Niño years in G3 models. Too-strong upper-level convergence away from the Indian subcontinent and too-weak WSNP cyclonic circulation are prominent in most of the G3 models, in which ENSO-ISM teleconnections are underestimated. On the other hand, many G2 models are able to represent most of the large-scale circulation over the Indo-Pacific region associated with El Niño and hence provide more realistic ENSO-ISM teleconnections. Therefore, this study advocates the importance of simulating large-scale circulation patterns during El Niño years in coupled models in order to capture El Niño-monsoon teleconnections well.

  20. High-dimensional land cover inference using remotely sensed MODIS data

    NASA Astrophysics Data System (ADS)

    Glanz, Hunter S.

    Image segmentation persists as a major statistical problem, with the volume and complexity of data expanding alongside new technologies. Land cover classification, one of the most studied problems in Remote Sensing, provides an important example of image segmentation whose needs transcend the choice of a particular classification method. That is, the challenges associated with land cover classification pervade the analysis process from data pre-processing to estimation of a final land cover map. Many of the same challenges also plague the task of land cover change detection. Multispectral, multitemporal data with inherent spatial relationships have hardly received adequate treatment due to the large size of the data and the presence of missing values. In this work we propose a novel, concerted application of methods which provide a unified way to estimate model parameters, impute missing data, reduce dimensionality, classify land cover, and detect land cover changes. This comprehensive analysis adopts a Bayesian approach which incorporates prior knowledge to improve the interpretability, efficiency, and versatility of land cover classification and change detection. We explore a parsimonious, parametric model that allows for a natural application of principal components analysis to isolate important spectral characteristics while preserving temporal information. Moreover, it allows us to impute missing data and estimate parameters via expectation-maximization (EM). A significant byproduct of our framework includes a suite of training data assessment tools. To classify land cover, we employ a spanning tree approximation to a lattice Potts prior to incorporate spatial relationships in a judicious way and more efficiently access the posterior distribution of pixel labels. We then achieve exact inference of the labels via the centroid estimator. To detect land cover changes, we develop a new EM algorithm based on the same parametric model. We perform simulation studies to validate our models and methods, and conduct an extensive continental scale case study using MODIS data. The results show that we successfully classify land cover and recover the spatial patterns present in large scale data. Application of our change point method to an area in the Amazon successfully identifies the progression of deforestation through portions of the region.

  1. Brain-Computer Interface Based on Generation of Visual Images

    PubMed Central

    Bobrov, Pavel; Frolov, Alexander; Cantor, Charles; Fedulova, Irina; Bakhnyan, Mikhail; Zhavoronkov, Alexander

    2011-01-01

    This paper examines the task of recognizing EEG patterns that correspond to performing three mental tasks: relaxation and imagining two types of pictures, faces and houses. The experiments were performed using two EEG headsets: BrainProducts ActiCap and Emotiv EPOC. The Emotiv headset is becoming widely used in consumer BCI applications, allowing for large-scale EEG experiments in the future. Since classification accuracy significantly exceeded the level of random classification during the first three days of the experiment with the EPOC headset, a control experiment was performed on the fourth day using the ActiCap. The control experiment showed that utilization of high-quality research equipment can enhance classification accuracy (up to 68% in some subjects) and that the accuracy is independent of the presence of EEG artifacts related to blinking and eye movement. This study also shows that a computationally inexpensive Bayesian classifier based on covariance matrix analysis yields classification accuracy similar to that of a more sophisticated Multi-class Common Spatial Patterns (MCSP) classifier in this problem. PMID:21695206
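
    A covariance-based Bayesian classifier of the general kind mentioned can be sketched as follows: model each mental state as a zero-mean multivariate Gaussian with its own channel covariance, and assign a trial to the class with the highest log-likelihood. The formulation and dimensions are assumptions for illustration, not necessarily the authors' exact method:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_ch, n_t = 14, 128  # e.g. Emotiv EPOC has 14 channels (epoch length assumed)

    def random_spd():
        a = rng.normal(size=(n_ch, n_ch))
        return a @ a.T / n_ch + np.eye(n_ch)

    # Three mental states, each with its own "true" spatial covariance.
    true_cov = [random_spd() for _ in range(3)]

    def sample_trial(c):
        L = np.linalg.cholesky(true_cov[c])
        return L @ rng.normal(size=(n_ch, n_t))

    # Train: estimate one channel covariance matrix per class from trials.
    train = {c: np.hstack([sample_trial(c) for _ in range(20)]) for c in range(3)}
    est_cov = {c: np.cov(train[c]) for c in range(3)}

    def log_lik(trial, cov):
        # Zero-mean Gaussian log-likelihood of all samples in the trial.
        _, logdet = np.linalg.slogdet(cov)
        inv = np.linalg.inv(cov)
        return -0.5 * (n_t * logdet + np.einsum('it,ij,jt->', trial, inv, trial))

    def classify(trial):
        return max(range(3), key=lambda c: log_lik(trial, est_cov[c]))

    tests = [(c, sample_trial(c)) for c in range(3) for _ in range(20)]
    print("accuracy:", np.mean([classify(x) == c for c, x in tests]))
    ```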

  2. Microplastic pollution in the Northeast Atlantic Ocean: validated and opportunistic sampling.

    PubMed

    Lusher, Amy L; Burke, Ann; O'Connor, Ian; Officer, Rick

    2014-11-15

    Levels of marine debris, including microplastics, are largely undocumented in the Northeast Atlantic Ocean. Broad-scale monitoring efforts are required to understand the distribution, abundance and ecological implications of microplastic pollution. A method of continuous sampling was developed to be conducted in conjunction with a wide range of vessel operations to maximise vessel time. Transects covering a total of 12,700 km were sampled through continuous monitoring of open-ocean sub-surface water, resulting in 470 samples. Items classified as potential plastics were identified in 94% of samples. A total of 2315 particles were identified, 89% of which were less than 5 mm in length, classifying them as microplastics. Average plastic abundance in the Northeast Atlantic was calculated as 2.46 particles m⁻³. This is the first report to demonstrate the ubiquitous nature of microplastic pollution in the Northeast Atlantic Ocean and to present a potential method for standardised monitoring of microplastic pollution. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Lagrangian methods of cosmic web classification

    NASA Astrophysics Data System (ADS)

    Fisher, J. D.; Faltenbacher, A.; Johnson, M. S. T.

    2016-05-01

    The cosmic web defines the large-scale distribution of matter we see in the Universe today. Classifying the cosmic web into voids, sheets, filaments and nodes allows one to explore structure formation and the role environmental factors have on halo and galaxy properties. While existing studies of cosmic web classification concentrate on grid-based methods, this work explores a Lagrangian approach where the V-web algorithm proposed by Hoffman et al. is implemented with techniques borrowed from smoothed particle hydrodynamics. The Lagrangian approach allows one to classify individual objects (e.g. particles or haloes) based on properties of their nearest neighbours in an adaptive manner. It can be applied directly to a halo sample which dramatically reduces computational cost and potentially allows an application of this classification scheme to observed galaxy samples. Finally, the Lagrangian nature admits a straightforward inclusion of the Hubble flow negating the necessity of a visually defined threshold value which is commonly employed by grid-based classification methods.
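
    The V-web rule itself is compact: count how many eigenvalues of the (symmetrised, Hubble-normalised) velocity shear tensor exceed a threshold. A hedged numpy sketch of just the classification step, with the tensor assumed to be already estimated at each particle or halo:

    ```python
    import numpy as np

    LABELS = ['void', 'sheet', 'filament', 'knot']

    def vweb_classify(shear, lam_th=0.44):
        """Classify one point from its 3x3 velocity shear tensor.

        shear : symmetric 3x3 tensor, e.g. -(dv_a/dr_b + dv_b/dr_a)/(2 H0)
        lam_th: eigenvalue threshold; 0.44 is a commonly quoted value, but
                the exact choice is tuned (often visually) per implementation.
        """
        eigvals = np.linalg.eigvalsh(shear)           # real, ascending
        n_collapsing = int(np.sum(eigvals > lam_th))  # number of collapsing axes
        return LABELS[n_collapsing]

    # Example: two eigenvalues above threshold -> two collapsing axes -> filament.
    sigma = np.diag([0.9, 0.6, -0.2])
    print(vweb_classify(sigma))   # 'filament'
    ```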

  4. Classification of epilepsy types through global network analysis of scalp electroencephalograms

    NASA Astrophysics Data System (ADS)

    Lee, Uncheol; Kim, Seunghwan; Jung, Ki-Young

    2006-04-01

    Epilepsy is a dynamic disease in which self-organization and emergent structures occur dynamically at multiple levels of neuronal integration. Therefore, the transient relationship within multichannel electroencephalograms (EEGs) is crucial for understanding epileptic processes. In this paper, we show that the global relationship within multichannel EEGs provides us with more useful information in classifying two different epilepsy types than pairwise relationships such as cross correlation. To demonstrate this, we determine the global network structure within channels of the scalp EEG based on the minimum spanning tree method. The topological dissimilarity of the network structures from different types of temporal lobe epilepsy is described in the form of the divergence rate and is computed for 11 patients with left (LTLE) and right temporal lobe epilepsy (RTLE). We find that patients with LTLE and RTLE exhibit different large-scale network structures, which emerge at the epoch immediately before the seizure onset, not in the preceding epochs. Our results suggest that patients with the two different epilepsy types display distinct large-scale dynamical networks with characteristic epileptic network structures.
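
    A minimal sketch of the minimum-spanning-tree construction on multichannel EEG, assuming a correlation-derived channel distance (the paper's exact distance measure is not given in the abstract):

    ```python
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree

    rng = np.random.default_rng(0)
    eeg = rng.normal(size=(19, 1000))   # 19 scalp channels, one epoch (toy data)

    # Distance between channels: 1 - |correlation|, so strongly coupled
    # channels are "close" and tend to end up connected in the tree.
    corr = np.corrcoef(eeg)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)

    mst = minimum_spanning_tree(dist)   # sparse matrix holding the n-1 tree edges
    edges = np.transpose(mst.nonzero())
    print("MST edges (channel pairs):")
    print(edges[:5], "... total:", len(edges))
    # Tree-topology statistics (degree distribution, leaf count, diameter)
    # computed per epoch can then be compared between patient groups.
    ```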

  5. Automatic segmentation and classification of Mycobacterium tuberculosis with conventional light microscopy

    NASA Astrophysics Data System (ADS)

    Xu, Chao; Zhou, Dongxiang; Zhai, Yongping; Liu, Yunhui

    2015-12-01

    This paper presents the automatic segmentation and classification of Mycobacterium tuberculosis in conventional light microscopy images. First, the candidate bacillus objects are segmented by the marker-based watershed transform. The markers are obtained by an adaptive threshold segmentation based on an adaptive-scale Gaussian filter, whose scale is determined according to the color model of the bacillus objects. Then the candidate objects are extracted integrally after region merging and contamination elimination. Second, the shapes of the bacillus objects are characterized by the Hu moments, compactness, eccentricity, and roughness, which are used to classify single, touching and non-bacillus objects. We evaluated logistic regression, random forest, and intersection-kernel support vector machine classifiers for this task. Experimental results demonstrate that the proposed method yields high robustness and accuracy. The logistic regression classifier performs best, with an accuracy of 91.68%.
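
    A minimal scikit-image sketch of the marker-based watershed plus shape-feature pipeline, on a synthetic two-rod image; the parameter values and the synthetic data are assumptions for illustration, not the paper's settings:

    ```python
    import numpy as np
    from skimage.filters import gaussian, threshold_otsu
    from skimage.segmentation import watershed
    from skimage.measure import (label, regionprops, moments_central,
                                 moments_normalized, moments_hu)

    # Toy "smear" image: two bright rod-like objects on a dark background.
    img = np.zeros((100, 100))
    img[20:24, 10:40] = 1.0
    img[60:64, 50:85] = 1.0
    img = gaussian(img, sigma=1.5)

    # Gaussian smoothing then thresholding yields the watershed markers.
    smooth = gaussian(img, sigma=2.0)
    markers = label(smooth > threshold_otsu(smooth))

    # Marker-based watershed on inverted intensity separates candidate objects.
    segs = watershed(-img, markers, mask=img > 0.05)

    for region in regionprops(segs):
        mu = moments_central(region.image.astype(float))
        hu = moments_hu(moments_normalized(mu))
        print(f"object {region.label}: area={region.area}, "
              f"eccentricity={region.eccentricity:.2f}, hu1={hu[0]:.3f}")
    # These shape features would then feed a logistic regression / random
    # forest classifier separating single, touching and non-bacillus objects.
    ```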

  6. MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence.

    PubMed

    Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng

    2015-06-15

    Medical Subject Headings (MeSHs) are used by the National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple sources of evidence for MeSH annotation. We propose a novel framework, MeSHLabeler, to integrate multiple sources of evidence for accurate MeSH annotation by using 'learning to rank'. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. MeSHLabeler won the first place in Task 2A of the 2014 BioASQ challenge, achieving a Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than the 0.5724 obtained by MTI. The software is available upon request. © The Author 2015. Published by Oxford University Press.
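
    The key point, that raw scores from independently trained per-label classifiers live on incomparable scales and must be normalized before combination, can be illustrated with a simple rank-based normalization (one common choice; the paper's actual procedure may differ):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_docs = 1000

    # Two per-MeSH-term classifiers trained independently: their raw score
    # scales differ wildly, so comparing raw scores across terms is meaningless.
    scores_term_a = rng.normal(0.0, 1.0, n_docs)   # e.g. SVM margins
    scores_term_b = rng.beta(2, 5, n_docs)         # e.g. probabilities

    def rank_normalise(s):
        # Map each score to its empirical quantile in [0, 1].
        order = np.argsort(np.argsort(s))
        return order / (len(s) - 1)

    norm_a = rank_normalise(scores_term_a)
    norm_b = rank_normalise(scores_term_b)

    # After normalisation, scores for one citation sit on a common scale and
    # can be combined with other evidence (KNN, MTI, pattern matching) by a
    # learning-to-rank model.
    doc = 42
    print(f"doc {doc}: raw A={scores_term_a[doc]:+.2f}, raw B={scores_term_b[doc]:.2f}")
    print(f"         normalised A={norm_a[doc]:.3f}, normalised B={norm_b[doc]:.3f}")
    ```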

  7. MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence

    PubMed Central

    Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng

    2015-01-01

    Motivation: Medical Subject Headings (MeSHs) are used by the National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple sources of evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple sources of evidence for accurate MeSH annotation by using 'learning to rank'. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won the first place in Task 2A of the 2014 BioASQ challenge, achieving a Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than the 0.5724 obtained by MTI. Availability and implementation: The software is available upon request. Contact: zhusf@fudan.edu.cn PMID:26072501

  8. Downscaling large-scale circulation to local winter climate using neural network techniques

    NASA Astrophysics Data System (ADS)

    Cavazos Perez, Maria Tereza

    1998-12-01

    The severe impacts of climate variability on society reveal the increasing need for improving regional-scale climate diagnosis. A new downscaling approach for climate diagnosis is developed here. It is based on neural network techniques that derive transfer functions from the large-scale atmospheric controls to the local winter climate in northeastern Mexico and southeastern Texas during the 1985-93 period. A first neural network (NN) model employs time-lagged component scores from a rotated principal component analysis of SLP, 500-hPa heights, and 1000-500 hPa thickness as predictors of daily precipitation. The model is able to reproduce the phase and, to some degree, the amplitude of large rainfall events, reflecting the influence of the large-scale circulation. Large errors are found over the Sierra Madre, over the Gulf of Mexico, and during El Nino events, suggesting an increase in the importance of mesoscale rainfall processes. However, errors are also due to the lack of randomization of the input data and the absence of local atmospheric predictors such as moisture. Thus, a second NN model uses time-lagged specific humidity at the Earth's surface and at the 700 hPa level, SLP tendency, and 700-500 hPa thickness as input to a self-organizing map (SOM) that pre-classifies the atmospheric fields into different patterns. The results from the SOM classification document that negative (positive) anomalies of winter precipitation over the region are associated with: (1) a weaker (stronger) Aleutian low; (2) a stronger (weaker) North Pacific high; (3) the negative (positive) phase of the Pacific North American pattern; and (4) La Nina (El Nino) events. The SOM atmospheric patterns are then used as input to a feed-forward NN that captures over 60% of the daily rainfall variance and 94% of the daily minimum temperature variance over the region. This demonstrates the ability of artificial neural network models to simulate realistic relationships on daily time scales. The results of this research also reveal that the SOM pre-classification of days with similar atmospheric conditions succeeded in emphasizing the differences in atmospheric variance conducive to extreme events. This resulted in a downscaling NN model that is highly sensitive to local-scale weather anomalies associated with El Nino and extreme cold events.
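
    A toy numpy sketch of the SOM pre-classification step, mapping daily atmospheric-predictor vectors onto a small grid of weather patterns, with an assumed grid size and training schedule (not the study's configuration):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    # Daily predictor vectors: e.g. [q_surface, q_700, SLP tendency, thickness].
    days = rng.normal(size=(500, 4))

    gx, gy, dim = 4, 3, 4                    # 4x3 map of atmospheric patterns
    w = rng.normal(size=(gx, gy, dim))       # codebook vectors, one per node
    coords = np.dstack(np.mgrid[:gx, :gy])   # (gx, gy, 2) node grid coordinates

    for t in range(2000):                    # online SOM training
        x = days[rng.integers(len(days))]
        # Best-matching unit: node whose codebook vector is closest to x.
        d = np.linalg.norm(w - x, axis=2)
        bmu = np.unravel_index(np.argmin(d), d.shape)
        # Shrinking learning rate and neighbourhood radius over training.
        lr = 0.5 * (1 - t / 2000)
        radius = 2.0 * (1 - t / 2000) + 0.5
        g = np.exp(-np.sum((coords - bmu) ** 2, axis=2) / (2 * radius ** 2))
        w += lr * g[..., None] * (x - w)

    # Each day is assigned the pattern (node) of its best-matching unit; these
    # pattern labels become the input to the feed-forward downscaling NN.
    d = np.linalg.norm(w - days[0], axis=2)
    print("day 0 maps to pattern node", np.unravel_index(np.argmin(d), d.shape))
    ```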

  9. Coccolithophorid blooms in the global ocean

    NASA Technical Reports Server (NTRS)

    Brown, Christopher W.; Yoder, James A.

    1994-01-01

    The global distribution pattern of coccolithophorid blooms was mapped in order to ascertain the prevalence of these blooms in the world's oceans and to estimate their worldwide production of CaCO3 and dimethyl sulfide (DMS). Mapping was accomplished by classifying pixels of 5-day global composites of Coastal Zone Color Scanner imagery into bloom and nonbloom classes using a supervised, multispectral classification scheme. Surface waters with the spectral signature of coccolithophorid blooms annually covered an average of 1.4 x 10(exp 6) sq km in the world's oceans from 1979 to 1985, with the subpolar latitudes accounting for 71% of this surface area. Classified blooms were most extensive in the Subarctic North Atlantic. Large expanses of the bloom signal were also detected in the North Pacific, on the Argentine shelf and slope, and in numerous lower latitude marginal seas and shelf regions. The greatest spatial extent of classified blooms in subpolar oceanic regions occurred in the months from summer to early autumn, while those in lower latitude marginal seas occurred in midwinter to early spring. Though the classification scheme was efficient in separating bloom and nonbloom classes during test simulations, and the biogeographical literature generally confirms the resulting distribution pattern of blooms in the subpolar regions, the cause of the bloom signal is equivocal in some geographic areas, particularly in shelf regions at lower latitudes. Standing stock estimates suggest that the presumed Emiliania huxleyi blooms act as a significant source of calcite carbon and DMS sulfur on a regional scale. On a global scale, however, the satellite-detected coccolithophorid blooms are estimated to play only a minor role in the annual production of these two compounds and their flux from the surface mixed layer.

  10. Floating Forests: Validation of a Citizen Science Effort to Answer Global Ecological Questions

    NASA Astrophysics Data System (ADS)

    Rosenthal, I.; Byrnes, J.; Cavanaugh, K. C.; Haupt, A. J.; Trouille, L.; Bell, T. W.; Rassweiler, A.; Pérez-Matus, A.; Assis, J.

    2017-12-01

    Researchers undertaking long term, large-scale ecological analyses face significant challenges for data collection and processing. Crowdsourcing via citizen science can provide an efficient method for analyzing large data sets. However, many scientists have raised questions about the quality of data collected by citizen scientists. Here we use Floating-Forests (http://floatingforests.org), a citizen science platform for creating a global time series of giant kelp abundance, to show that ensemble classifications of satellite data can ensure data quality. Citizen scientists view satellite images of coastlines and classify kelp forests by tracing all visible patches of kelp. Each image is classified by fifteen citizen scientists before being retired. To validate citizen science results, all fifteen classifications are converted to a raster and overlaid on a calibration dataset generated from previous studies. Results show that ensemble classifications from citizen scientists are consistently accurate when compared to calibration data. Given that all source images were acquired by Landsat satellites, we expect this consistency to hold across all regions. At present, we have over 6000 web-based citizen scientists' classifications of almost 2.5 million images of kelp forests in California and Tasmania. These results are not only useful for remote sensing of kelp forests, but also for a wide array of applications that combine citizen science with remote sensing.

  11. Evaluation of ERTS-1 data for inventory of forest and rangeland and detection of forest stress. [Atlanta, Georgia, Manitou, Colorado, and Black Hills

    NASA Technical Reports Server (NTRS)

    Heller, R. C. (Principal Investigator); Aldrich, R. C.; Driscoll, R. S.; Francis, R. E.; Weber, F. P.

    1974-01-01

    The author has identified the following significant results. Results of photointerpretation indicated that ERTS is a good classifier of forest and nonforest lands (90 to 95 percent accurate). Photointerpreters could make this separation as accurately as signature analysis of the computer compatible tapes. Further breakdowns of cover types at each site could not be accurately classified by interpreters (60 percent) or computer analysts (74 percent). Exceptions were water, wet meadow, and coniferous stands. At no time could the large bark beetle infestations (many over 300 meters in size) be detected on ERTS images. The ERTS wavebands are too broad to distinguish the yellow, yellow-red, and red colors of the dying pine foliage from healthy green-yellow foliage. Forest disturbances could be detected on ERTS color composites about 90 percent of the time when compared with six-year-old photo index mosaics. ERTS enlargements (1:125,000 scale, preferably color prints) would be useful to forest managers of large ownerships over 5,000 hectares (12,500 acres) for broad area planning. Black-and-white enlargements can be used effectively as aerial navigation aids for precision aerial photography where maps are old or not available.

  12. Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity.

    PubMed

    Schneider, Nadine; Lowe, Daniel M; Sayle, Roger A; Landrum, Gregory A

    2015-01-26

    Fingerprint methods applied to molecules have proven to be useful for similarity determination and as inputs to machine-learning models. Here, we present the development of a new fingerprint for chemical reactions and validate its usefulness in building machine-learning models and in similarity assessment. Our final fingerprint is constructed as the difference of the atom-pair fingerprints of products and reactants and includes agents via calculated physicochemical properties. We validated the fingerprints on a large data set of reactions text-mined from granted United States patents from the last 40 years that have been classified using a substructure-based expert system. We applied machine learning to build a 50-class predictive model for reaction-type classification that correctly predicts 97% of the reactions in an external test set. Impressive accuracies were also observed when applying the classifier to reactions from an in-house electronic laboratory notebook. The performance of the novel fingerprint for assessing reaction similarity was evaluated by a cluster analysis that recovered 48 of the 50 reaction classes with a median F-score of 0.63 for the clusters. The data sets used for training and primary validation as well as all Python scripts required to reproduce the analysis are provided in the Supporting Information.
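
    The core construction, a reaction fingerprint taken as the difference between product and reactant atom-pair fingerprints, can be sketched with RDKit roughly as follows. The Fischer esterification SMILES are an illustrative example rather than data from the paper, and the handling of agents via physicochemical properties is omitted.

        from collections import Counter
        from rdkit import Chem
        from rdkit.Chem import rdMolDescriptors

        def atom_pair_counts(smiles, n_bits=2048):
            # hashed atom-pair count fingerprint of a single molecule
            mol = Chem.MolFromSmiles(smiles)
            fp = rdMolDescriptors.GetHashedAtomPairFingerprint(mol, nBits=n_bits)
            return Counter(fp.GetNonzeroElements())

        def reaction_difference_fp(reactant_smiles, product_smiles):
            # difference fingerprint: sum over products minus sum over reactants
            diff = Counter()
            for smi in product_smiles:
                diff.update(atom_pair_counts(smi))
            for smi in reactant_smiles:
                diff.subtract(atom_pair_counts(smi))
            return {bit: count for bit, count in diff.items() if count != 0}

        # ethanol + acetic acid -> ethyl acetate
        fp = reaction_difference_fp(["CCO", "CC(=O)O"], ["CC(=O)OCC"])
        print(len(fp), "nonzero difference-fingerprint entries")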

  13. Characterisation of mental health conditions in social media using Informed Deep Learning

    PubMed Central

    Gkotsis, George; Oellrich, Anika; Velupillai, Sumithra; Liakata, Maria; Hubbard, Tim J. P.; Dobson, Richard J. B.; Dutta, Rina

    2017-01-01

    The number of people affected by mental illness is on the increase and with it the burden on health and social care use, as well as the loss of both productivity and quality-adjusted life-years. Natural language processing of electronic health records is increasingly used to study mental health conditions and risk behaviours on a large scale. However, narrative notes written by clinicians do not capture first-hand the patients’ own experiences, and only record cross-sectional, professional impressions at the point of care. Social media platforms have become a source of ‘in the moment’ daily exchange, with topics including well-being and mental health. In this study, we analysed posts from the social media platform Reddit and developed classifiers to recognise and classify posts related to mental illness according to 11 disorder themes. Using a neural network and deep learning approach, we could automatically recognise mental illness-related posts in our balanced dataset with an accuracy of 91.08% and select the correct theme with a weighted average accuracy of 71.37%. We believe that these results are a first step in developing methods to characterise large amounts of user-generated content that could support content curation and targeted interventions. PMID:28327593

  14. Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion.

    PubMed

    Wang, Jian-Gang; Sung, Eric; Yau, Wei-Yun

    2011-07-01

    Facial age classification is an approach to classify face images into one of several predefined age groups. One of the difficulties in applying learning techniques to the age classification problem is the large amount of labeled training data required. Acquiring such training data is very costly in terms of age progress, privacy, human time, and effort. Although unlabeled face images can be obtained easily, it would be expensive to manually label them on a large scale and to obtain the ground truth. The frugal selection of unlabeled data for labeling, so as to quickly reach high classification performance with minimal labeling effort, is a challenging problem. In this paper, we present an active learning approach based on an online incremental bilateral two-dimensional linear discriminant analysis (IB2DLDA) which initially learns from a small pool of labeled data and then iteratively selects the most informative samples from the unlabeled set to incrementally improve the classifier. Specifically, we propose a novel data selection criterion called the furthest nearest-neighbor (FNN) that generalizes margin-based uncertainty to the multiclass case and is easy to compute, so that the proposed active learning algorithm can handle a large number of classes and large data sizes efficiently. Empirical experiments on the FG-NET and Morph databases, together with a large unlabeled data set for age categorization problems, show that the proposed approach can achieve results comparable to, or even better than, a conventionally trained classifier that requires much more labeling effort. Our IB2DLDA-FNN algorithm achieves similar results much faster than random selection and with fewer samples for age categorization. It also achieves results comparable to active SVM but is much faster to train because kernel methods are not needed. Results on a face recognition database and a palmprint/palm vein database show that our approach can handle problems with a large number of classes. Our contributions in this paper are twofold. First, we propose IB2DLDA-FNN, the FNN criterion being our novel idea, as a generic online active learning paradigm. Second, we show that it can be a viable tool for active learning of facial age range classification.
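
    One reading of the FNN criterion admits a very small sketch: for every unlabeled sample, compute the distance to its nearest labeled sample, then query the samples whose nearest labeled neighbors are furthest away. The data, feature space, and batch size below are synthetic placeholders; the IB2DLDA subspace is not modeled.

        import numpy as np
        from scipy.spatial.distance import cdist

        def fnn_select(X_labeled, X_unlabeled, batch_size=5):
            # distance from each unlabeled point to its nearest labeled point
            nearest = cdist(X_unlabeled, X_labeled).min(axis=1)
            # query the points whose nearest labeled neighbor is furthest away
            return np.argsort(nearest)[-batch_size:]

        rng = np.random.default_rng(2)
        X_lab = rng.random((20, 10))     # small initial labeled pool
        X_unl = rng.random((500, 10))    # large unlabeled set
        print("indices to send for labeling:", fnn_select(X_lab, X_unl))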

  15. Text Classification for Assisting Moderators in Online Health Communities

    PubMed Central

    Huh, Jina; Yetisgen-Yildiz, Meliha; Pratt, Wanda

    2013-01-01

    Objectives Patients increasingly visit online health communities to get help on managing health. The large scale of these online communities makes it impossible for the moderators to engage in all conversations; yet, some conversations need their expertise. Our work explores low-cost text classification methods for this new domain of determining whether a thread in an online health forum needs moderators’ help. Methods We employed a binary classifier on WebMD’s online diabetes community data. To train the classifier, we considered three feature types: (1) word unigrams, (2) sentiment analysis features, and (3) thread length. We applied feature selection methods based on χ2 statistics and undersampling to account for unbalanced data. We then performed a qualitative error analysis to investigate the appropriateness of the gold standard. Results Using sentiment analysis features, feature selection methods, and balanced training data increased the AUC value up to 0.75 and the F1-score up to 0.54, compared to the baseline of using word unigrams with no feature selection methods on unbalanced data (0.65 AUC and 0.40 F1-score). The error analysis uncovered additional reasons why moderators respond to patients’ posts. Discussion We showed how feature selection methods and balanced training data can improve the overall classification performance. We present implications of weighing precision versus recall for assisting moderators of online health communities. Our error analysis uncovered social, legal, and ethical issues around addressing community members’ needs. We also note challenges in producing a gold standard, and discuss potential solutions for addressing these challenges. Conclusion Social media environments provide popular venues in which patients gain health-related information. Our work contributes to understanding scalable solutions for providing moderators’ expertise in these large-scale, social media environments. PMID:24025513
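
    A minimal sketch of this pipeline, word unigrams with chi-squared feature selection and random undersampling of the majority class, might look as follows with scikit-learn. The toy posts, labels, and the logistic regression classifier are assumptions; the WebMD data and the paper's exact classifier are not reproduced.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.feature_selection import SelectKBest, chi2
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        posts = ["my sugar is high and i feel dizzy", "thanks for the support everyone",
                 "which meter do you use", "glad the new recipe worked for you"] * 25
        labels = np.array([1, 0, 0, 0] * 25)   # 1 = needs moderator attention

        # Randomly undersample the majority class to balance the training data.
        rng = np.random.default_rng(3)
        pos, neg = np.where(labels == 1)[0], np.where(labels == 0)[0]
        keep = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])

        clf = make_pipeline(CountVectorizer(ngram_range=(1, 1)),   # word unigrams
                            SelectKBest(chi2, k=20),               # chi-squared selection
                            LogisticRegression(max_iter=1000))
        clf.fit([posts[i] for i in keep], labels[keep])
        print(clf.predict(["i feel dizzy, is this normal?"]))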

  16. Enhancing the performance of regional land cover mapping

    NASA Astrophysics Data System (ADS)

    Wu, Weicheng; Zucca, Claudio; Karam, Fadi; Liu, Guangping

    2016-10-01

    Different pixel-based, object-based and subpixel-based methods such as time-series analysis, decision trees, and various supervised approaches have been proposed to conduct land use/cover classification. However, despite their proven advantages in small dataset tests, their performance is variable and less satisfactory when dealing with large datasets, particularly for regional-scale mapping with high resolution data, due to the complexity and diversity of landscapes and land cover patterns and the unacceptably long processing time. The objective of this paper is to demonstrate the comparatively high performance of an operational approach based on the integration of multisource information, ensuring high mapping accuracy in large areas with acceptable processing time. The information used includes phenologically contrasted multiseasonal and multispectral bands, vegetation index, land surface temperature, and topographic features. The performance of different conventional and machine learning classifiers, namely Mahalanobis Distance (MD), Maximum Likelihood (ML), Artificial Neural Networks (ANNs), Support Vector Machines (SVMs) and Random Forests (RFs), was compared using the same datasets in the same IDL (Interactive Data Language) environment. An Eastern Mediterranean area with complex landscape and steep climate gradients was selected to test and develop the operational approach. The results showed that the SVM and RF classifiers produced the most accurate mapping at local scale (up to 96.85% in Overall Accuracy), but were very time-consuming in whole-scene classification (more than five days per scene), whereas ML fulfilled the task rapidly (about 10 min per scene) with satisfactory accuracy (94.2-96.4%). Thus, the approach composed of the integration of seasonally contrasted multisource data and sampling at subclass level, followed by an ML classification, is a suitable candidate to become an operational and effective regional land cover mapping method.

  17. The phase-space structure of nearby dark matter as constrained by the SDSS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leclercq, Florent; Percival, Will; Jasche, Jens

    Previous studies using numerical simulations have demonstrated that the shape of the cosmic web can be described by studying the Lagrangian displacement field. We extend these analyses, showing that it is now possible to perform a Lagrangian description of cosmic structure in the nearby Universe based on large-scale structure observations. Building upon recent Bayesian large-scale inference of initial conditions, we present a cosmographic analysis of the dark matter distribution and its evolution, referred to as the dark matter phase-space sheet, in the nearby universe as probed by the Sloan Digital Sky Survey main galaxy sample. We consider its stretchings and foldings using a tetrahedral tessellation of the Lagrangian lattice. The method provides extremely accurate estimates of nearby density and velocity fields, even in regions of low galaxy density. It also measures the number of matter streams, and the deformation and parity reversals of fluid elements, which were previously thought inaccessible using observations. We illustrate the approach by showing the phase-space structure of known objects of the nearby Universe such as the Sloan Great Wall, the Coma cluster and the Boötes void. We dissect cosmic structures into four distinct components (voids, sheets, filaments, and clusters), using the Lagrangian classifiers DIVA, ORIGAMI, and a new scheme which we introduce and call LICH. Because these classifiers use information other than the sheer local density, identified structures explicitly carry physical information about their formation history. Accessing the phase-space structure of dark matter in galaxy surveys opens the way for new confrontations of observational data and theoretical models. We have made our data products publicly available.

  18. The phase-space structure of nearby dark matter as constrained by the SDSS

    NASA Astrophysics Data System (ADS)

    Leclercq, Florent; Jasche, Jens; Lavaux, Guilhem; Wandelt, Benjamin; Percival, Will

    2017-06-01

    Previous studies using numerical simulations have demonstrated that the shape of the cosmic web can be described by studying the Lagrangian displacement field. We extend these analyses, showing that it is now possible to perform a Lagrangian description of cosmic structure in the nearby Universe based on large-scale structure observations. Building upon recent Bayesian large-scale inference of initial conditions, we present a cosmographic analysis of the dark matter distribution and its evolution, referred to as the dark matter phase-space sheet, in the nearby universe as probed by the Sloan Digital Sky Survey main galaxy sample. We consider its stretchings and foldings using a tetrahedral tessellation of the Lagrangian lattice. The method provides extremely accurate estimates of nearby density and velocity fields, even in regions of low galaxy density. It also measures the number of matter streams, and the deformation and parity reversals of fluid elements, which were previously thought inaccessible using observations. We illustrate the approach by showing the phase-space structure of known objects of the nearby Universe such as the Sloan Great Wall, the Coma cluster and the Boötes void. We dissect cosmic structures into four distinct components (voids, sheets, filaments, and clusters), using the Lagrangian classifiers DIVA, ORIGAMI, and a new scheme which we introduce and call LICH. Because these classifiers use information other than the sheer local density, identified structures explicitly carry physical information about their formation history. Accessing the phase-space structure of dark matter in galaxy surveys opens the way for new confrontations of observational data and theoretical models. We have made our data products publicly available.

  19. A Fast SVD-Hidden-nodes based Extreme Learning Machine for Large-Scale Data Analytics.

    PubMed

    Deng, Wan-Yu; Bai, Zuo; Huang, Guang-Bin; Zheng, Qing-Hua

    2016-05-01

    Big dimensional data is a growing trend that is emerging in many real world contexts, extending from web mining, gene expression analysis, and protein-protein interaction to high-frequency financial data. Nowadays, there is a growing consensus that increasing dimensionality imposes impeding effects on the performance of classifiers, which is termed the "peaking phenomenon" in the field of machine intelligence. To address the issue, dimensionality reduction is commonly employed as a preprocessing step on Big dimensional data before building classifiers. In this paper, we propose an Extreme Learning Machine (ELM) approach for large-scale data analytics. In contrast to existing approaches, we embed hidden nodes that are designed using singular value decomposition (SVD) into the classical ELM. These SVD nodes in the hidden layer are shown to capture the underlying characteristics of Big dimensional data well, exhibiting excellent generalization performance. The drawback of using SVD on the entire dataset, however, is the high computational complexity involved. To address this, a fast divide-and-conquer approximation scheme is introduced to maintain computational tractability on high volume data. The resultant algorithm is labeled here as Fast Singular Value Decomposition-Hidden-nodes based Extreme Learning Machine, or FSVD-H-ELM in short. In FSVD-H-ELM, instead of identifying the SVD hidden nodes directly from the entire dataset, SVD hidden nodes are derived from multiple random subsets of data sampled from the original dataset. Comprehensive experiments and comparisons are conducted to assess FSVD-H-ELM against other state-of-the-art algorithms. The results obtained demonstrate the superior generalization performance and efficiency of FSVD-H-ELM. Copyright © 2016 Elsevier Ltd. All rights reserved.
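
    A condensed sketch of the FSVD-H-ELM idea, under stated assumptions: hidden-node weights are taken from the SVD of random data subsets rather than drawn at random, and the output weights are solved in closed form. The subset size, node count, sigmoid activation, and ridge term are illustrative choices, not the paper's settings.

        import numpy as np

        def fsvd_h_elm_fit(X, Y, n_hidden=50, n_subsets=5, reg=1e-3, seed=0):
            rng = np.random.default_rng(seed)
            per_subset = n_hidden // n_subsets
            W = []
            for _ in range(n_subsets):
                idx = rng.choice(len(X), size=min(200, len(X)), replace=False)
                # top right-singular vectors of the subset become hidden weights
                _, _, Vt = np.linalg.svd(X[idx], full_matrices=False)
                W.append(Vt[:per_subset])
            W = np.vstack(W)                        # (n_hidden, n_features)
            H = 1.0 / (1.0 + np.exp(-X @ W.T))      # sigmoid hidden layer
            # ridge-regularized least squares for the output weights
            beta = np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ Y)
            return W, beta

        rng = np.random.default_rng(4)
        X = rng.random((1000, 30))
        Y = (X[:, :2].sum(axis=1) > 1).astype(float).reshape(-1, 1)
        W, beta = fsvd_h_elm_fit(X, Y)
        pred = (1.0 / (1.0 + np.exp(-X @ W.T))) @ beta
        print("training accuracy:", float(((pred > 0.5) == (Y > 0.5)).mean()))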

  20. Automated Detection of Microaneurysms Using Scale-Adapted Blob Analysis and Semi-Supervised Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adal, Kedir M.; Sidebe, Desire; Ali, Sharib

    2014-01-07

    Despite several attempts, automated detection of microaneurysms (MAs) from digital fundus images still remains an open issue. This is due to the subtle nature of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are then introduced to characterize these blob regions. A semi-supervised learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier to detect true MAs. The developed system is built using only a few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques as well as the applicability of the proposed features to analyze fundus images.
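
    The interest-region stage, multiscale blob detection with automatic scale selection, can be approximated with scikit-image's Laplacian-of-Gaussian detector, as sketched below on a synthetic image. This stands in for, rather than reproduces, the paper's local-scale selection, and the semi-supervised classifier stage is omitted.

        import numpy as np
        from skimage.feature import blob_log

        # Synthetic image with two bright Gaussian blobs on a dark background.
        yy, xx = np.mgrid[:128, :128]
        img = np.zeros((128, 128))
        for cy, cx, r in [(40, 40, 3), (90, 70, 6)]:
            img += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * r ** 2))

        # blob_log returns (row, col, sigma); sigma is the selected local scale.
        blobs = blob_log(img, min_sigma=1, max_sigma=10, num_sigma=10, threshold=0.1)
        for row, col, sigma in blobs:
            print(f"candidate at ({row:.0f}, {col:.0f}), selected scale sigma={sigma:.1f}")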

  1. Comparison of Pixel-Based and Object-Based Classification Using Parameters and Non-Parameters Approach for the Pattern Consistency of Multi Scale Landcover

    NASA Astrophysics Data System (ADS)

    Juniati, E.; Arrofiqoh, E. N.

    2017-09-01

    Information extraction from remote sensing data, especially land cover, can be obtained by digital classification. In practice, some people are more comfortable using visual interpretation to retrieve land cover information. However, visual interpretation is highly influenced by the subjectivity and knowledge of the interpreter, and it also takes time. Digital classification can be done in several ways, depending on the defined mapping approach and the assumptions made about the data distribution. This study compared several classification methods applied to several data types at the same location. The data used were Landsat 8 satellite imagery, SPOT 6 imagery, and orthophotos. In practice, these data are used to produce land cover maps at 1:50,000 scale from Landsat, 1:25,000 scale from SPOT, and 1:5,000 scale from orthophotos, but using visual interpretation to retrieve the information. Maximum Likelihood Classification (MLC), a pixel-based parametric approach, and Artificial Neural Network classification, a pixel-based non-parametric approach, were applied to these data; in addition, object-based classification was applied. The classification system implemented is the land cover classification of the Indonesian topographic map. The classification was applied to each data source, with the aim of recognizing the pattern and assessing the consistency of the land cover maps produced from each data set. Furthermore, the study analyses the benefits and limitations of each method.

  2. The future of primordial features with large-scale structure surveys

    NASA Astrophysics Data System (ADS)

    Chen, Xingang; Dvorkin, Cora; Huang, Zhiqi; Namjoo, Mohammad Hossein; Verde, Licia

    2016-11-01

    Primordial features are one of the most important extensions of the Standard Model of cosmology, providing a wealth of information on the primordial Universe, ranging from discrimination between inflation and alternative scenarios, to new particle detection, to fine structures in the inflationary potential. We study the prospects of future large-scale structure (LSS) surveys for the detection and constraint of these features. We classify primordial feature models into several classes, and for each class we present a simple power spectrum template that encodes the essential physics. We study how well the most ambitious LSS surveys proposed to date, including both spectroscopic and photometric surveys, will be able to improve the constraints with respect to the current Planck data. We find that these LSS surveys will significantly improve the experimental sensitivity to feature signals that are oscillatory in scale, due to the 3D information. For a broad range of models, these surveys will be able to reduce the errors on the amplitudes of the features by a factor of 5 or more, including several interesting candidates identified in the recent Planck data. LSS surveys therefore offer an impressive opportunity for primordial feature discovery in the next decade or two. We also compare the advantages of both types of surveys.

  3. The future of primordial features with large-scale structure surveys

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Xingang; Namjoo, Mohammad Hossein; Dvorkin, Cora

    2016-11-01

    Primordial features are one of the most important extensions of the Standard Model of cosmology, providing a wealth of information on the primordial Universe, ranging from discrimination between inflation and alternative scenarios, to new particle detection, to fine structures in the inflationary potential. We study the prospects of future large-scale structure (LSS) surveys for the detection and constraint of these features. We classify primordial feature models into several classes, and for each class we present a simple power spectrum template that encodes the essential physics. We study how well the most ambitious LSS surveys proposed to date, including both spectroscopic and photometric surveys, will be able to improve the constraints with respect to the current Planck data. We find that these LSS surveys will significantly improve the experimental sensitivity to feature signals that are oscillatory in scale, due to the 3D information. For a broad range of models, these surveys will be able to reduce the errors on the amplitudes of the features by a factor of 5 or more, including several interesting candidates identified in the recent Planck data. LSS surveys therefore offer an impressive opportunity for primordial feature discovery in the next decade or two. We also compare the advantages of both types of surveys.

  4. Attention to local and global levels of hierarchical Navon figures affects rapid scene categorization.

    PubMed

    Brand, John; Johnson, Aaron P

    2014-01-01

    In four experiments, we investigated how attention to local and global levels of hierarchical Navon figures affected the selection of diagnostic spatial scale information used in scene categorization. We explored this issue by asking observers to classify hybrid images (i.e., images that contain the low spatial frequency (LSF) content of one image and the high spatial frequency (HSF) content of a second image) immediately following global and local Navon tasks. Hybrid images can be classified according to either their LSF or HSF content, making them ideal for investigating diagnostic spatial scale preference. Although observers were sensitive to both spatial scales (Experiment 1), they overwhelmingly preferred to classify hybrids based on LSF content (Experiment 2). In Experiment 3, we demonstrated that LSF-based hybrid categorization was faster following global Navon tasks, suggesting that the LSF processing associated with global Navon tasks primed the selection of LSFs in hybrid images. Experiment 4 examined this hypothesis by replicating Experiment 3 while suppressing the LSF information in the Navon letters through contrast balancing of the stimuli. As in Experiment 3, observers preferred to classify hybrids based on LSF content; in contrast, however, LSF-based hybrid categorization was slower following global than local Navon tasks.

  5. Attention to local and global levels of hierarchical Navon figures affects rapid scene categorization

    PubMed Central

    Brand, John; Johnson, Aaron P.

    2014-01-01

    In four experiments, we investigated how attention to local and global levels of hierarchical Navon figures affected the selection of diagnostic spatial scale information used in scene categorization. We explored this issue by asking observers to classify hybrid images (i.e., images that contain the low spatial frequency (LSF) content of one image and the high spatial frequency (HSF) content of a second image) immediately following global and local Navon tasks. Hybrid images can be classified according to either their LSF or HSF content, making them ideal for investigating diagnostic spatial scale preference. Although observers were sensitive to both spatial scales (Experiment 1), they overwhelmingly preferred to classify hybrids based on LSF content (Experiment 2). In Experiment 3, we demonstrated that LSF-based hybrid categorization was faster following global Navon tasks, suggesting that the LSF processing associated with global Navon tasks primed the selection of LSFs in hybrid images. Experiment 4 examined this hypothesis by replicating Experiment 3 while suppressing the LSF information in the Navon letters through contrast balancing of the stimuli. As in Experiment 3, observers preferred to classify hybrids based on LSF content; in contrast, however, LSF-based hybrid categorization was slower following global than local Navon tasks. PMID:25520675

  6. Revised Kuppuswamy's Socioeconomic Status Scale: Explained and Updated.

    PubMed

    Sharma, Rahul

    2017-10-15

    Some facets of Kuppuswamy's socioeconomic status scale can create confusion and require explanation of how to classify, and the scale needs minor updates to remain current. This article provides a revised scale that allows for real-time updating.

  7. Revised Kuppuswamy's Socioeconomic Status Scale: Explained and Updated.

    PubMed

    Sharma, Rahul

    2017-08-26

    Some facets of Kuppuswamy's socioeconomic status scale can create confusion and require explanation of how to classify, and the scale needs minor updates to remain current. This article provides a revised scale that allows for real-time updating.

  8. Transfer Learning with Convolutional Neural Networks for Classification of Abdominal Ultrasound Images.

    PubMed

    Cheng, Phillip M; Malhi, Harshawn S

    2017-04-01

    The purpose of this study is to evaluate transfer learning with deep convolutional neural networks for the classification of abdominal ultrasound images. Grayscale images from 185 consecutive clinical abdominal ultrasound studies were categorized into 11 categories based on the text annotation specified by the technologist for the image. Cropped images were rescaled to 256 × 256 resolution and randomized, with 4094 images from 136 studies constituting the training set, and 1423 images from 49 studies constituting the test set. The fully connected layers of two convolutional neural networks based on CaffeNet and VGGNet, previously trained on the 2012 Large Scale Visual Recognition Challenge data set, were retrained on the training set. Weights in the convolutional layers of each network were frozen to serve as fixed feature extractors. Accuracy on the test set was evaluated for each network. A radiologist experienced in abdominal ultrasound also independently classified the images in the test set into the same 11 categories. The CaffeNet network classified 77.3% of the test set images accurately (1100/1423 images), with a top-2 accuracy of 90.4% (1287/1423 images). The larger VGGNet network classified 77.9% of the test set accurately (1109/1423 images), with a top-2 accuracy of 89.7% (1276/1423 images). The radiologist classified 71.7% of the test set images correctly (1020/1423 images). The differences in classification accuracies between both neural networks and the radiologist were statistically significant (p < 0.001). The results demonstrate that transfer learning with convolutional neural networks may be used to construct effective classifiers for abdominal ultrasound images.
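
    The frozen-feature-extractor setup generalizes readily; a minimal PyTorch sketch with torchvision's VGG16 standing in for the paper's Caffe-based networks might look as follows. The 11-class head follows the abstract, while the input size, optimizer, and random batch are placeholder assumptions.

        import torch
        import torch.nn as nn
        from torchvision import models

        model = models.vgg16(weights="IMAGENET1K_V1")   # ImageNet-pretrained
        for param in model.features.parameters():
            param.requires_grad = False                 # freeze convolutional layers

        model.classifier[6] = nn.Linear(4096, 11)       # new 11-class output layer

        optimizer = torch.optim.SGD(
            (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9)
        criterion = nn.CrossEntropyLoss()

        # One illustrative training step on a random batch (placeholder for real data).
        x = torch.randn(4, 3, 224, 224)
        y = torch.randint(0, 11, (4,))
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        print("loss:", loss.item())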

  9. Measurement instruments to assess posture, gait, and balance in Parkinson's disease: Critique and recommendations.

    PubMed

    Bloem, Bastiaan R; Marinus, Johan; Almeida, Quincy; Dibble, Lee; Nieuwboer, Alice; Post, Bart; Ruzicka, Evzen; Goetz, Christopher; Stebbins, Glenn; Martinez-Martin, Pablo; Schrag, Anette

    2016-09-01

    Disorders of posture, gait, and balance in Parkinson's disease (PD) are common and debilitating. This MDS-commissioned task force assessed the clinimetric properties of existing rating scales, questionnaires, and timed tests that assess these features in PD. A literature review was conducted. Identified instruments were evaluated systematically and classified as "recommended," "suggested," or "listed." Inclusion of rating scales was restricted to those that could be used readily in clinical research and practice. One rating scale was classified as "recommended" (UPDRS-derived Postural Instability and Gait Difficulty score) and 2 as "suggested" (Tinetti Balance Scale, Rating Scale for Gait Evaluation). Three scales requiring equipment (Berg Balance Scale, Mini-BESTest, Dynamic Gait Index) also fulfilled criteria for "recommended" and 2 for "suggested" (FOG score, Gait and Balance Scale). Four questionnaires were "recommended" (Freezing of Gait Questionnaire, Activities-specific Balance Confidence Scale, Falls Efficacy Scale, Survey of Activities and Fear of Falling in the Elderly-Modified). Four tests were classified as "recommended" (6-minute and 10-m walk tests, Timed Up-and-Go, Functional Reach). We identified several questionnaires that adequately assess freezing of gait and balance confidence in PD and a number of useful clinical tests. However, most clinical rating scales for gait, balance, and posture perform suboptimally or have been evaluated insufficiently. No instrument comprehensively and separately evaluates all relevant PD-specific gait characteristics with good clinimetric properties, and none provides separate balance and gait scores with adequate content validity for PD. We therefore recommend the development of such a PD-specific, easily administered, comprehensive gait and balance scale that separately assesses all relevant constructs. © 2016 International Parkinson and Movement Disorder Society.

  10. High-frequency electroencephalographic activity in left temporal area is associated with pleasant emotion induced by video clips.

    PubMed

    Kortelainen, Jukka; Väyrynen, Eero; Seppänen, Tapio

    2015-01-01

    Recent findings suggest that specific neural correlates for the key elements of basic emotions do exist and can be identified by neuroimaging techniques. In this paper, electroencephalogram (EEG) is used to explore the markers of video-induced emotions. The problem is approached from a classifier perspective: the features that perform best in classifying a person's valence and arousal while watching video clips with audiovisual emotional content are sought from a large feature set constructed from the EEG spectral powers of single channels as well as power differences between specific channel pairs. The feature selection is carried out using a sequential forward floating search method and is done separately for the classification of valence and arousal, both derived from the emotional keyword that the subject had chosen after seeing the clips. The proposed classifier-based approach reveals a clear association between increased high-frequency (15-32 Hz) activity in the left temporal area and the clips described as "pleasant" on the valence scale and "medium arousal" on the arousal scale. These clips represent the emotional keywords amusement and joy/happiness. The finding suggests the occurrence of a specific neural activation during video-induced pleasant emotion and the possibility of detecting this from the left temporal area using EEG.
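
    Sequential forward floating selection of the kind used here is available in mlxtend; the sketch below applies it to synthetic feature vectors standing in for the EEG spectral-power features. The k-NN wrapper, feature counts, and data are assumptions, not the study's configuration.

        import numpy as np
        from mlxtend.feature_selection import SequentialFeatureSelector as SFS
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(5)
        X = rng.random((120, 20))                 # 120 trials x 20 candidate features
        y = (X[:, 3] + X[:, 7] > 1).astype(int)   # two informative features by design

        sffs = SFS(KNeighborsClassifier(n_neighbors=3),
                   k_features=4, forward=True, floating=True,
                   scoring="accuracy", cv=5)
        sffs = sffs.fit(X, y)
        print("selected feature indices:", sffs.k_feature_idx_)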

  11. A neural network for noise correlation classification

    NASA Astrophysics Data System (ADS)

    Paitz, Patrick; Gokhberg, Alexey; Fichtner, Andreas

    2018-02-01

    We present an artificial neural network (ANN) for the classification of ambient seismic noise correlations into two categories, suitable and unsuitable for noise tomography. By using only a small manually classified data subset for network training, the ANN allows us to classify large data volumes with low human effort and to encode the valuable subjective experience of data analysts that cannot be captured by a deterministic algorithm. Based on a new feature extraction procedure that exploits the wavelet-like nature of seismic time-series, we efficiently reduce the dimensionality of noise correlation data while keeping the relevant features needed for automated classification. Using global- and regional-scale data sets, we show that classification errors of 20 per cent or less can be achieved when the network training is performed with as little as 3.5 per cent and 16 per cent of the data sets, respectively. Furthermore, the ANN trained on the regional data can be applied to the global data, and vice versa, without a significant increase of the classification error. An experiment in which four students manually classified the data revealed that the classification error they would assign to each other is substantially larger than the classification error of the ANN (>35 per cent). This indicates that reproducibility would be hampered more by human subjectivity than by imperfections of the ANN.
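
    The two-stage design, compressing each correlation trace into a low-dimensional wavelet-based feature vector and then classifying with a small feed-forward network, can be sketched as below. The db4 wavelet, the log-energy features, and the synthetic traces are assumptions; the paper's actual feature extraction differs in detail.

        import numpy as np
        import pywt
        from sklearn.neural_network import MLPClassifier

        def wavelet_features(trace, wavelet="db4", level=4):
            coeffs = pywt.wavedec(trace, wavelet, level=level)
            # log-energy of each subband as a compact descriptor
            return np.array([np.log1p(np.sum(c ** 2)) for c in coeffs])

        rng = np.random.default_rng(6)
        good = [np.sin(np.linspace(0, 20, 512)) + 0.1 * rng.standard_normal(512)
                for _ in range(60)]                           # coherent, signal-like traces
        bad = [rng.standard_normal(512) for _ in range(60)]   # noise-dominated traces

        X = np.array([wavelet_features(t) for t in good + bad])
        y = np.array([1] * 60 + [0] * 60)   # 1 = suitable for tomography

        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        clf.fit(X, y)
        print("training accuracy:", clf.score(X, y))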

  12. Systematic Analysis and Prediction of In Situ Cross Talk of O-GlcNAcylation and Phosphorylation

    PubMed Central

    Li, Ao; Wang, Minghui

    2015-01-01

    Reversible posttranslational modification (PTM) plays a very important role in biological processes by changing the properties of proteins. As many proteins are modified by multiple PTMs, cross talk between PTMs has become an intriguing topic and has drawn much attention. Currently, much evidence suggests that PTMs work together to accomplish specific biological functions. However, both the general principles and the underlying mechanisms of PTM cross talk remain elusive. In this study, using large-scale datasets, we performed evolutionary conservation analysis, gene ontology enrichment, and motif extraction for proteins with cross talk between O-GlcNAcylation and phosphorylation co-occurring on the same residue. We found that proteins with in situ O-GlcNAc/Phos cross talk were significantly enriched in specific gene ontology terms, and no obvious evolutionary pressure was observed. Moreover, 3 functional motifs associated with O-GlcNAc/Phos sites were extracted. We further used sequence features and GO features to predict O-GlcNAc/Phos cross talk sites, based on phosphorylated and O-GlcNAcylated sites separately, using an SVM model. The AUC of the classifier based on phosphorylated sites is 0.896, and that of the classifier based on O-GlcNAcylated sites is 0.843. Both classifiers achieved better performance than other existing methods. PMID:26601103

  13. Mercury⊕: An evidential reasoning image classifier

    NASA Astrophysics Data System (ADS)

    Peddle, Derek R.

    1995-12-01

    MERCURY⊕ is a multisource evidential reasoning classification software system based on the Dempster-Shafer theory of evidence. The design and implementation of this software package are described for improving the classification and analysis of multisource digital image data necessary for addressing advanced environmental and geoscience applications. In the remote-sensing context, the approach provides a more appropriate framework for classifying modern, multisource, and ancillary data sets which may contain a large number of disparate variables with different statistical properties, scales of measurement, and levels of error which cannot be handled using conventional Bayesian approaches. The software uses a nonparametric, supervised approach to classification, and provides a more objective and flexible interface to the evidential reasoning framework using a frequency-based method for computing support values from training data. The MERCURY⊕ software package has been implemented efficiently in the C programming language, with extensive use made of dynamic memory allocation procedures and compound linked-list and hash-table data structures to optimize the storage and retrieval of evidence in a Knowledge Look-up Table. The software is complete with a full user interface and runs under the Unix, Ultrix, VAX/VMS, MS-DOS, and Apple Macintosh operating systems. An example of classifying alpine land cover and permafrost active layer depth in northern Canada is presented to illustrate the use and application of these ideas.
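
    The Dempster-Shafer combination at the heart of such a system admits a short worked example: two evidence sources assign mass to subsets of the frame of discernment, and the combined mass is renormalized by the conflict. The land-cover labels and mass values below are purely illustrative, not MERCURY⊕'s internals.

        def combine(m1, m2):
            # Dempster's rule: m(C) = sum over A∩B=C of m1(A)m2(B), divided by 1-K
            combined, conflict = {}, 0.0
            for a, ma in m1.items():
                for b, mb in m2.items():
                    c = a & b
                    if c:
                        combined[c] = combined.get(c, 0.0) + ma * mb
                    else:
                        conflict += ma * mb   # K: mass assigned to the empty set
            return {c: v / (1.0 - conflict) for c, v in combined.items()}

        water, forest = frozenset({"water"}), frozenset({"forest"})
        theta = water | forest   # the full frame, representing ignorance

        m_spectral = {water: 0.6, forest: 0.1, theta: 0.3}    # one data source
        m_ancillary = {water: 0.4, forest: 0.2, theta: 0.4}   # another source
        print(combine(m_spectral, m_ancillary))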

  14. Systematic Analysis and Prediction of In Situ Cross Talk of O-GlcNAcylation and Phosphorylation.

    PubMed

    Yao, Heming; Li, Ao; Wang, Minghui

    2015-01-01

    Reversible posttranslational modification (PTM) plays a very important role in biological process by changing properties of proteins. As many proteins are multiply modified by PTMs, cross talk of PTMs is becoming an intriguing topic and draws much attention. Currently, lots of evidences suggest that the PTMs work together to accomplish a specific biological function. However, both the general principles and underlying mechanism of PTM crosstalk are elusive. In this study, by using large-scale datasets we performed evolutionary conservation analysis, gene ontology enrichment, motif extraction of proteins with cross talk of O-GlcNAcylation and phosphorylation cooccurring on the same residue. We found that proteins with in situ O-GlcNAc/Phos cross talk were significantly enriched in some specific gene ontology terms and no obvious evolutionary pressure was observed. Moreover, 3 functional motifs associated with O-GlcNAc/Phos sites were extracted. We further used sequence features and GO features to predict O-GlcNAc/Phos cross talk sites based on phosphorylated sites and O-GlcNAcylated sites separately by the use of SVM model. The AUC of classifier based on phosphorylated sites is 0.896 and the other classifier based on GlcNAcylated sites is 0.843. Both classifiers achieved a relatively better performance compared with other existing methods.

  15. a Novel Ship Detection Method for Large-Scale Optical Satellite Images Based on Visual Lbp Feature and Visual Attention Model

    NASA Astrophysics Data System (ADS)

    Haigang, Sui; Zhina, Song

    2016-06-01

    Reliable ship detection in optical satellite images has wide application in both military and civil fields. However, the problem is very difficult in complex backgrounds, such as waves, clouds, and small islands. Aiming at these issues, this paper explores an automatic and robust model for ship detection in large-scale optical satellite images, which relies on detecting statistical signatures of ship targets in terms of biologically-inspired visual features. This model first selects salient candidate regions across large-scale images by using a mechanism based on biologically-inspired visual features, combining a visual attention model with local binary patterns (CVLBP). Different from traditional studies, the proposed algorithm is fast and helps focus on suspected ship areas, avoiding a separate land-sea segmentation step. Large-area images are cut into small image chips and analyzed in two complementary ways: sparse saliency using the visual attention model and detail signatures using LBP features, consistent with the sparseness of ship distribution in the images. These features are then employed to classify each chip as containing ship targets or not, using a support vector machine (SVM). After the suspicious areas are identified, some false alarms remain, such as microwaves and small ribbon clouds, so simple shape and texture analysis is adopted to distinguish between ships and nonships in the suspicious areas. Experimental results show the proposed method is insensitive to waves, clouds, illumination and ship size.
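
    The chip-level texture stage can be sketched with a uniform LBP histogram per chip fed to an SVM, standing in for the paper's CVLBP pipeline. The synthetic sea/ship chips and SVM settings are assumptions; the visual attention model is not reproduced.

        import numpy as np
        from skimage.feature import local_binary_pattern
        from sklearn.svm import SVC

        def lbp_histogram(chip, P=8, R=1.0):
            lbp = local_binary_pattern(chip, P, R, method="uniform")
            hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
            return hist

        rng = np.random.default_rng(7)

        def make_chip(mean, std):
            return np.clip(rng.normal(mean, std, (32, 32)), 0, 255).astype(np.uint8)

        sea = [make_chip(80, 3) for _ in range(40)]     # smooth open-water chips
        ship = [make_chip(80, 40) for _ in range(40)]   # strongly textured chips

        X = np.array([lbp_histogram(c) for c in sea + ship])
        y = np.array([0] * 40 + [1] * 40)
        clf = SVC(kernel="rbf").fit(X, y)
        print("chips flagged as ship:", int(clf.predict(X).sum()))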

  16. Distributed multimodal data fusion for large scale wireless sensor networks

    NASA Astrophysics Data System (ADS)

    Ertin, Emre

    2006-05-01

    Sensor network technology has enabled new surveillance systems in which sensor nodes equipped with processing and communication capabilities can collaboratively detect, classify and track targets of interest over a large surveillance area. In this paper we study distributed fusion of multimodal sensor data for extracting target information from a large-scale sensor network. Optimal tracking, classification, and reporting of threat events require joint consideration of multiple sensor modalities. Multiple sensor modalities improve tracking by reducing the uncertainty in the track estimates as well as resolving track-sensor data association problems. Our approach to solving the fusion problem with a large number of multimodal sensors is the construction of likelihood maps. The likelihood maps provide summary data for the solution of the detection, tracking and classification problem. The likelihood map presents the sensory information in a format that is easy for decision makers to interpret and is suitable for fusion with spatial prior information such as maps and imaging data from stand-off imaging sensors. We follow a statistical approach to combine sensor data at different levels of uncertainty and resolution. The likelihood map transforms each sensor data stream into a spatio-temporal likelihood map ideally suited for fusion with imaging sensor outputs and prior geographic information about the scene. We also discuss distributed computation of the likelihood map using a gossip-based algorithm and present simulation results.
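
    As a minimal sketch of likelihood-map fusion, each sensor below contributes a spatial log-likelihood over a common grid, and the fused map is their sum. The Gaussian range likelihoods, sensor positions, and measurements are illustrative assumptions; the gossip-based distributed computation is not shown.

        import numpy as np

        grid = np.stack(np.meshgrid(np.linspace(0, 100, 200),
                                    np.linspace(0, 100, 200)), axis=-1)

        def range_log_likelihood(sensor_xy, measured_range, sigma=3.0):
            # Gaussian log-likelihood of a target at each grid cell given one range reading
            dist = np.linalg.norm(grid - np.asarray(sensor_xy), axis=-1)
            return -0.5 * ((dist - measured_range) / sigma) ** 2

        # Three sensors (possibly of different modalities) observing a target near (60, 40).
        log_map = (range_log_likelihood((10, 10), 58.3)
                   + range_log_likelihood((90, 20), 36.1)
                   + range_log_likelihood((50, 90), 51.0))

        iy, ix = np.unravel_index(np.argmax(log_map), log_map.shape)
        print("fused position estimate:", grid[iy, ix])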

  17. Space-Time Controls on Carbon Sequestration Over Large-Scale Amazon Basin

    NASA Technical Reports Server (NTRS)

    Smith, Eric A.; Cooper, Harry J.; Gu, Jiujing; Grose, Andrew; Norman, John; daRocha, Humberto R.; Starr, David O. (Technical Monitor)

    2002-01-01

    A major research focus of the LBA Ecology Program is an assessment of the carbon budget and the carbon sequestering capacity of the large scale forest-pasture system that dominates the Amazonia landscape, and its time-space heterogeneity manifest in carbon fluxes across the large scale Amazon basin ecosystem. Quantification of these processes requires a combination of in situ measurements, remotely sensed measurements from space, and a realistically forced hydrometeorological model coupled to a carbon assimilation model, capable of simulating details within the surface energy and water budgets along with the principal modes of photosynthesis and respiration. Here we describe the results of an investigation concerning the space-time controls of carbon sources and sinks distributed over the large scale Amazon basin. The results are derived from a carbon-water-energy budget retrieval system for the large scale Amazon basin, which uses a coupled carbon assimilation-hydrometeorological model as an integrating system, forced by both in situ meteorological measurements and remotely sensed radiation fluxes and precipitation retrieved from a combination of GOES, SSM/I, TOMS, and TRMM satellite measurements. We briefly discuss validation of (a) retrieved surface radiation fluxes and precipitation, based on 30-min averaged surface measurements taken at Ji-Parana in Rondonia and Manaus in Amazonas, and (b) modeled carbon fluxes, based on tower CO2 flux measurements taken at Reserva Jaru, Manaus, and Fazenda Nossa Senhora. The space-time controls on carbon sequestration are partitioned into sets of factors classified by: (1) above-canopy meteorology, (2) incoming surface radiation, (3) precipitation interception, and (4) indigenous stomatal processes varied over the different land covers of pristine rainforest, partially and fully logged rainforests, and pasture lands. These are the principal meteorological, thermodynamical, hydrological, and biophysical control paths that perturb net carbon fluxes and sequestration, produce time-space switching of carbon sources and sinks, undergo modulation through atmospheric boundary layer feedbacks, and respond to any discontinuous intervention on the landscape itself, such as conversion of rainforest to pasture or selective/clearcut logging operations.

  18. Estimating local scaling properties for the classification of interstitial lung disease patterns

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel

    2011-03-01

    Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing', which are considered indicative of the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images was acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with the Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare accuracy distributions, including the Bonferroni correction. The best classification results were obtained with the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers, with the highest accuracy (94.1% and 93.7% for the k-NN and RBFN classifiers, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
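
    The GLCM baseline against which SIM was compared can be sketched as follows: co-occurrence properties per region of interest, classified with k-NN. The synthetic ROIs are placeholders; note that scikit-image >= 0.19 spells the functions graycomatrix/graycoprops (older releases use greycomatrix/greycoprops).

        import numpy as np
        from skimage.feature import graycomatrix, graycoprops
        from sklearn.neighbors import KNeighborsClassifier

        PROPS = ["contrast", "homogeneity", "energy", "correlation"]

        def glcm_features(roi):
            glcm = graycomatrix(roi, distances=[1], angles=[0, np.pi / 2],
                                levels=64, symmetric=True, normed=True)
            return np.array([graycoprops(glcm, p).mean() for p in PROPS])

        rng = np.random.default_rng(8)
        healthy = [rng.integers(20, 40, (32, 32), dtype=np.uint8) for _ in range(30)]
        honeycomb = [(rng.integers(0, 2, (32, 32)) * 60).astype(np.uint8)
                     for _ in range(30)]   # coarse bright/dark texture

        X = np.array([glcm_features(r) for r in healthy + honeycomb])
        y = np.array([0] * 30 + [1] * 30)
        knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
        print("training accuracy:", knn.score(X, y))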

  19. Comparison of Single and Multi-Scale Method for Leaf and Wood Points Classification from Terrestrial Laser Scanning Data

    NASA Astrophysics Data System (ADS)

    Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie

    2018-04-01

    The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification methods. In the geometry-based method, it is common practice to extract salient features at a single scale before the features are used for classification. It remains unclear how the scale(s) used affect classification accuracy and efficiency. To assess the scale effect on classification accuracy and efficiency, we extracted single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted leaf and wood classification. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10% for both trees. The average speed-up ratio of the single-scale classifiers over the multi-scale classifier is higher than 30 for each tree.

  20. Use of Large-Scale Multi-Configuration EMI Measurements to Characterize Subsurface Structures of the Vadose Zone.

    NASA Astrophysics Data System (ADS)

    Huisman, J. A.; Brogi, C.; Pätzold, S.; Weihermueller, L.; von Hebel, C.; Van Der Kruk, J.; Vereecken, H.

    2017-12-01

    Subsurface structures of the vadose zone can play a key role in crop yield potential, especially during water stress periods. Geophysical techniques like electromagnetic induction (EMI) can provide information about dominant shallow subsurface features. However, previous studies with EMI have typically not reached beyond the field scale. We used high-resolution large-scale multi-configuration EMI measurements to characterize patterns of soil structural organization (layering and texture) and their impact on crop productivity at the km2 scale. We collected EMI data on an agricultural area of 1 km2 (102 ha) near Selhausen (NRW, Germany). The area consists of 51 agricultural fields cropped in rotation. Therefore, measurements were collected between April and December 2016, preferably within a few days after harvest. EMI data were automatically filtered, temperature corrected, and interpolated onto a common grid of 1 m resolution. Inspecting the ECa maps, we identified three main sub-areas with different subsurface heterogeneity. We also identified small-scale geomorphological structures as well as anthropogenic features such as soil management effects and buried drainage networks. To identify areas with similar subsurface structures, we applied image classification techniques. We fused ECa maps obtained with different coil distances into a multiband image and applied supervised and unsupervised classification methodologies. Both showed good results in reconstructing observed patterns in plant productivity and the subsurface structures associated with them. However, the supervised methodology proved more efficient in classifying the whole study area. In a second step, we selected one hundred locations within the study area and obtained a soil profile description with the type, depth, and thickness of the soil horizons. Using these ground truth data, it was possible to assign a typical soil profile to each of the main classes obtained from the classification. The proposed methodology was effective in producing a high resolution subsurface model in a large and complex study area that extends well beyond the field scale.

  1. Large-scale serum protein biomarker discovery in Duchenne muscular dystrophy.

    PubMed

    Hathout, Yetrib; Brody, Edward; Clemens, Paula R; Cripe, Linda; DeLisle, Robert Kirk; Furlong, Pat; Gordish-Dressman, Heather; Hache, Lauren; Henricson, Erik; Hoffman, Eric P; Kobayashi, Yvonne Monique; Lorts, Angela; Mah, Jean K; McDonald, Craig; Mehler, Bob; Nelson, Sally; Nikrad, Malti; Singer, Britta; Steele, Fintan; Sterling, David; Sweeney, H Lee; Williams, Steve; Gold, Larry

    2015-06-09

    Serum biomarkers in Duchenne muscular dystrophy (DMD) may provide deeper insights into disease pathogenesis, suggest new therapeutic approaches, serve as acute read-outs of drug effects, and be useful as surrogate outcome measures to predict later clinical benefit. In this study a large-scale biomarker discovery was performed on serum samples from patients with DMD and age-matched healthy volunteers using a modified aptamer-based proteomics technology. Levels of 1,125 proteins were quantified in serum samples from two independent DMD cohorts: cohort 1 (The Parent Project Muscular Dystrophy-Cincinnati Children's Hospital Medical Center), 42 patients with DMD and 28 age-matched normal volunteers; and cohort 2 (The Cooperative International Neuromuscular Research Group, Duchenne Natural History Study), 51 patients with DMD and 17 age-matched normal volunteers. Forty-four proteins showed significant differences that were consistent in both cohorts when comparing DMD patients and healthy volunteers at a 1% false-discovery rate, a large number of significant protein changes for such a small study. These biomarkers can be classified by known cellular processes and by age-dependent changes in protein concentration. Our findings demonstrate both the utility of this unbiased biomarker discovery approach and suggest potential new diagnostic and therapeutic avenues for ameliorating the burden of DMD and, we hope, other rare and devastating diseases.
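
    The 1% false-discovery-rate screening step can be illustrated with the Benjamini-Hochberg procedure; the simulated p-values below stand in for the study's 1,125 protein comparisons, and statsmodels' implementation stands in for its actual statistical workflow.

        import numpy as np
        from statsmodels.stats.multitest import multipletests

        rng = np.random.default_rng(9)
        # 1,125 hypothetical protein tests: mostly null, plus a block of true signals.
        pvals = np.concatenate([rng.uniform(0, 1, 1075), rng.uniform(0, 1e-5, 50)])

        reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.01, method="fdr_bh")
        print("proteins passing 1% FDR:", int(reject.sum()))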

  2. Predicting Hydrologic Function With Aquatic Gene Fragments

    NASA Astrophysics Data System (ADS)

    Good, S. P.; URycki, D. R.; Crump, B. C.

    2018-03-01

    Recent advances in microbiology techniques, such as genetic sequencing, allow for rapid and cost-effective collection of large quantities of genetic information carried within water samples. Here we posit that the unique composition of aquatic DNA material within a water sample contains relevant information about hydrologic function at multiple temporal scales. In this study, machine learning was used to develop discharge prediction models trained on the relative abundance of bacterial taxa classified into operational taxonomic units (OTUs) based on 16S rRNA gene sequences from six large arctic rivers. We term this approach "genohydrology," and show that OTU relative abundances can be used to predict river discharge at monthly and longer timescales. Based on a single DNA sample from each river, the average Nash-Sutcliffe efficiency (NSE) for predicted mean monthly discharge values throughout the year was 0.84, while the NSE for predicted discharge values across different return intervals was 0.67. These are considerable improvements over predictions based only on the area-scaled mean specific discharge of five similar rivers, which had average NSE values of 0.64 and -0.32 for seasonal and recurrence interval discharge values, respectively. The genohydrology approach demonstrates that genetic diversity within the aquatic microbiome is a large and underutilized data resource with benefits for prediction of hydrologic function.
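
    The skill metric quoted above, the Nash-Sutcliffe efficiency, is simple to compute alongside a stand-in regressor; the random-forest model and synthetic OTU table below are assumptions, not the paper's trained models.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def nse(observed, predicted):
            # Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance
            observed, predicted = np.asarray(observed), np.asarray(predicted)
            return 1.0 - (np.sum((observed - predicted) ** 2)
                          / np.sum((observed - observed.mean()) ** 2))

        rng = np.random.default_rng(10)
        X = rng.dirichlet(np.ones(40), size=60)   # OTU relative abundances (rows sum to 1)
        q = 100 * X[:, 0] + 50 * X[:, 1] + rng.normal(0, 1, 60)   # synthetic discharge

        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X[:45], q[:45])
        print("NSE on held-out samples:", round(nse(q[45:], model.predict(X[45:])), 2))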

  3. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    PubMed

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types, but given the large number of uncharacterized protein sequences in databases, these methods are very time consuming, expensive, and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are used here, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree, and REP (Reduced Error Pruning) tree, and ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time, with a good accuracy of 96.35%. Another finding is that the RUSBoost decision tree classifier is able to classify one or two samples in classes with very few samples, while the other classifiers, such as DT, Adaboost, Rotation forest and Random forest, are not sensitive to classes with fewer samples. The performance of the decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.
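
    A cross-validated comparison of this kind is straightforward to reproduce in outline; the sketch below evaluates several of the named classifiers on a synthetic imbalanced dataset. The features are placeholders for the PseAAC representations, and RUSBoost and Rotation forest (absent from scikit-learn) are omitted.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import GaussianNB
        from sklearn.svm import SVC
        from sklearn.tree import DecisionTreeClassifier

        # Imbalanced synthetic stand-in for the membrane-protein dataset.
        X, y = make_classification(n_samples=600, n_features=50, n_informative=10,
                                   weights=[0.9, 0.1], random_state=0)

        classifiers = {
            "Decision tree": DecisionTreeClassifier(random_state=0),
            "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
            "AdaBoost": AdaBoostClassifier(random_state=0),
            "SVM": SVC(),
            "Naive Bayes": GaussianNB(),
        }
        for name, clf in classifiers.items():
            scores = cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy")
            print(f"{name:14s} balanced accuracy: {scores.mean():.3f}")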

  4. Classification of Animal Movement Behavior through Residence in Space and Time.

    PubMed

    Torres, Leigh G; Orben, Rachael A; Tolkova, Irina; Thompson, David R

    2017-01-01

    Identification and classification of behavior states in animal movement data can be complex, temporally biased, time-intensive, scale-dependent, and unstandardized across studies and taxa. Large movement datasets are increasingly common and there is a need for efficient methods of data exploration that adjust to the individual variability of each track. We present the Residence in Space and Time (RST) method to classify behavior patterns in movement data based on the concept that behavior states can be partitioned by the amount of space and time occupied in an area of constant scale. Using normalized values of Residence Time and Residence Distance within a constant search radius, RST is able to differentiate behavior patterns that are time-intensive (e.g., rest), time & distance-intensive (e.g., area restricted search), and transit (short time and distance). We use grey-headed albatross (Thalassarche chrysostoma) GPS tracks to demonstrate RST's ability to classify behavior patterns and adjust to the inherent scale and individuality of each track. Next, we evaluate RST's ability to discriminate between behavior states relative to other classical movement metrics. We then temporally sub-sample albatross track data to illustrate RST's response to less resolved data. Finally, we evaluate RST's performance using datasets from four taxa with diverse ecology, functional scales, ecosystems, and data types. We conclude that RST is a robust, rapid, and flexible method for detailed exploratory analysis and meta-analyses of behavioral states in animal movement data based on its ability to integrate distance and time measurements into one descriptive metric of behavior groupings. Given the increasing amount of animal movement data collected, it is timely and useful to implement a consistent metric of behavior classification to enable efficient and comparative analyses. Overall, the application of RST to objectively explore and compare behavior patterns in movement data can enhance our fine- and broad-scale understanding of animal movement ecology.
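
    A minimal sketch of the core idea, using a simplified residence computation and toy thresholds; this is not the authors' published implementation, and the normalization and cut values here are assumptions for illustration.

    ```python
    import numpy as np

    def residence_metrics(xy, t, i, radius):
        """Time and along-track distance accrued within `radius` of fix i,
        scanning contiguously forward and backward from i. This is a
        simplification of the published RST computation."""
        n = len(t)
        time_in = dist_in = 0.0
        for step in (1, -1):
            j = i
            while 0 <= j + step < n:
                k = j + step
                if np.linalg.norm(xy[k] - xy[i]) > radius:
                    break  # left the circle; stop scanning in this direction
                time_in += abs(t[k] - t[j])
                dist_in += np.linalg.norm(xy[k] - xy[j])
                j = k
        return time_in, dist_in

    def label(norm_time, norm_dist, cut=0.5):
        """Toy thresholds on track-normalized residence values (illustrative)."""
        if norm_time >= cut:
            return "rest" if norm_dist < cut else "area-restricted search"
        return "transit"

    track = np.cumsum(np.random.default_rng(0).normal(size=(100, 2)), axis=0)
    times = np.arange(100.0)
    print(residence_metrics(track, times, 50, radius=3.0))
    ```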

  5. Role of Molecular Profiling in Soft Tissue Sarcoma.

    PubMed

    Lindsay, Timothy; Movva, Sujana

    2018-05-01

    Diagnosis and treatment of soft tissue sarcoma (STS) is a particularly daunting task, largely due to the profound heterogeneity that characterizes these malignancies. Molecular profiling has emerged as a useful tool to confirm histologic diagnoses and more accurately classify these malignancies. Recent large-scale, multiplatform analyses have begun the work of establishing a more complete understanding of molecular profiling in STS subtypes and to identify new molecular alterations that may guide the development of novel targeted therapies. This review provides a brief and general overview of the role that molecular profiling has in STS, highlighting select sarcoma subtypes that are notable for recent developments. The role of molecular profiling as it relates to diagnostic strategies is discussed, along with ways that molecular profiling may provide guidance for potential therapeutic interventions. Copyright © 2018 by the National Comprehensive Cancer Network.

  6. Zooniverse: Combining Human and Machine Classifiers for the Big Survey Era

    NASA Astrophysics Data System (ADS)

    Fortson, Lucy; Wright, Darryl; Beck, Melanie; Lintott, Chris; Scarlata, Claudia; Dickinson, Hugh; Trouille, Laura; Willi, Marco; Laraia, Michael; Boyer, Amy; Veldhuis, Marten; Zooniverse

    2018-01-01

    Many analyses of astronomical data sets, ranging from morphological classification of galaxies to identification of supernova candidates, have relied on humans to classify data into distinct categories. Crowdsourced galaxy classifications via the Galaxy Zoo project provided a solution that scaled visual classification for extant surveys by harnessing the combined power of thousands of volunteers. However, the much larger data sets anticipated from upcoming surveys will require a different approach. Automated classifiers using supervised machine learning have improved considerably over the past decade but their increasing sophistication comes at the expense of needing ever more training data. Crowdsourced classification by human volunteers is a critical technique for obtaining these training data. But several improvements can be made on this zeroth order solution. Efficiency gains can be achieved by implementing a “cascade filtering” approach whereby the task structure is reduced to a set of binary questions that are more suited to simpler machines while demanding lower cognitive loads for humans. Intelligent subject retirement based on quantitative metrics of volunteer skill and subject label reliability also leads to dramatic improvements in efficiency. We note that human and machine classifiers may retire subjects differently leading to trade-offs in performance space. Drawing on work with several Zooniverse projects including Galaxy Zoo and Supernova Hunter, we will present recent findings from experiments that combine cohorts of human and machine classifiers. We show that the most efficient system results when appropriate subsets of the data are intelligently assigned to each group according to their particular capabilities. With sufficient online training, simple machines can quickly classify “easy” subjects, leaving more difficult (and discovery-oriented) tasks for volunteers. We also find humans achieve higher classification purity while samples produced by machines are typically more complete. These findings set the stage for further investigations, with the ultimate goal of efficiently and accurately labeling the wide range of data classes that will arise from the planned large astronomical surveys.

  7. A deep learning and novelty detection framework for rapid phenotyping in high-content screening

    PubMed Central

    Sommer, Christoph; Hoefler, Rudolf; Samwer, Matthias; Gerlich, Daniel W.

    2017-01-01

    Supervised machine learning is a powerful and widely used method for analyzing high-content screening data. Despite its accuracy, efficiency, and versatility, supervised machine learning has drawbacks, most notably its dependence on a priori knowledge of expected phenotypes and time-consuming classifier training. We provide a solution to these limitations with CellCognition Explorer, a generic novelty detection and deep learning framework. Application to several large-scale screening data sets on nuclear and mitotic cell morphologies demonstrates that CellCognition Explorer enables discovery of rare phenotypes without user training, which has broad implications for improved assay development in high-content screening. PMID:28954863
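
    A hedged sketch of the general idea: novelty detection flags rare phenotypes without labelled training data. CellCognition Explorer's actual pipeline couples deep feature learning with its own novelty detector; scikit-learn's IsolationForest is used below purely for illustration on hypothetical per-cell features.

    ```python
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    normal = rng.normal(0.0, 1.0, size=(5000, 32))   # typical cell features
    rare = rng.normal(6.0, 1.0, size=(20, 32))       # rare phenotypes, far away
    features = np.vstack([normal, rare])

    # Fit on the bulk population, then flag outliers across all cells.
    detector = IsolationForest(contamination="auto", random_state=0).fit(normal)
    flags = detector.predict(features)               # -1 marks candidate novelties
    print(f"{np.sum(flags == -1)} cells flagged for expert review")
    ```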

  8. Epidemics and dimensionality in hierarchical networks

    NASA Astrophysics Data System (ADS)

    Zheng, Da-Fang; Hui, P. M.; Trimper, Steffen; Zheng, Bo

    2005-07-01

    Epidemiological processes are studied within a recently proposed hierarchical network model using the susceptible-infected-refractory dynamics of an epidemic. Within the network model, a population may be characterized by H independent hierarchies or dimensions, each of which consists of groupings of individuals into layers of subgroups. Detailed numerical simulations reveal that for H>1, global spreading results regardless of the degree of homophily of the individuals forming a social circle. For H=1, a transition from global to local spread occurs as the population becomes decomposed into increasingly homophilous groups. Multiple dimensions in classifying individuals (nodes) thus make a society (computer network) highly susceptible to large-scale outbreaks of infectious diseases (viruses).

  9. The MIDAS processor. [Multivariate Interactive Digital Analysis System for multispectral scanner data

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.; Gordon, M. F.; Mclaughlin, R. H.; Marshall, R. E.

    1975-01-01

    The MIDAS (Multivariate Interactive Digital Analysis System) processor is a high-speed processor designed to process multispectral scanner data (from Landsat, EOS, aircraft, etc.) quickly and cost-effectively to meet the requirements of users of remote sensor data, especially from very large areas. MIDAS consists of a fast multipipeline preprocessor and classifier, an interactive color display and color printer, and a medium scale computer system for analysis and control. The system is designed to process data having as many as 16 spectral bands per picture element at rates of 200,000 picture elements per second into as many as 17 classes using a maximum likelihood decision rule.
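
    MIDAS implements the maximum likelihood decision rule in hardware; a software sketch of that rule under Gaussian class models (illustrative means and covariances, not MIDAS's actual pipeline) might look like this:

    ```python
    import numpy as np
    from scipy.stats import multivariate_normal

    def ml_classify(pixels, means, covs):
        """Maximum likelihood decision rule: assign each pixel (n x bands)
        to the class whose Gaussian model gives the highest log-likelihood."""
        logp = np.column_stack([
            multivariate_normal.logpdf(pixels, mean=m, cov=c)
            for m, c in zip(means, covs)
        ])
        return logp.argmax(axis=1)

    # Two hypothetical classes in a 4-band feature space.
    rng = np.random.default_rng(0)
    means = [np.zeros(4), np.full(4, 3.0)]
    covs = [np.eye(4), 2.0 * np.eye(4)]
    pixels = rng.normal(size=(10, 4))
    print(ml_classify(pixels, means, covs))
    ```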

  10. Associative Pattern Recognition In Analog VLSI Circuits

    NASA Technical Reports Server (NTRS)

    Tawel, Raoul

    1995-01-01

    Winner-take-all circuit selects best-match stored pattern. Prototype cascadable very-large-scale integrated (VLSI) circuit chips built and tested to demonstrate concept of electronic associative pattern recognition. Based on low-power, sub-threshold analog complementary metal-oxide-semiconductor (CMOS) VLSI circuitry, each chip can store 128 sets (vectors) of 16 analog values (vector components), vectors representing known patterns as diverse as spectra, histograms, graphs, or brightnesses of pixels in images. Chips exploit parallel nature of vector quantization architecture to implement highly parallel processing in relatively simple computational cells. Through collective action, cells classify input pattern in fraction of microsecond while consuming power of few microwatts.

  11. Active aeolian processes on Mars: A regional study in Arabia and Meridiani Terrae

    USGS Publications Warehouse

    Silvestro, S.; Vaz, D.A.; Fenton, L.K.; Geissler, P.E.

    2011-01-01

    We present evidence of widespread aeolian activity in the Arabia Terra/Meridiani region (Mars), where different kinds of aeolian modifications have been detected and classified. Passing from the regional to the local scale, we describe one particular dune field in Meridiani Planum, where two ripple populations are distinguished by means of different migration rates. Moreover, a consistent change in the ripple pattern is accompanied by significant dune advancement (between 0.4 and 1 meter in one Martian year) that is locally triggered by large avalanche features. This suggests that dune advancement may be common throughout the Martian tropics. © 2011 by the American Geophysical Union.

  12. Clinical Research Informatics: Supporting the Research Study Lifecycle.

    PubMed

    Johnson, S B

    2017-08-01

    Objectives: The primary goal of this review is to summarize significant developments in the field of Clinical Research Informatics (CRI) over the years 2015-2016. The secondary goal is to contribute to a deeper understanding of CRI as a field, through the development of a strategy for searching and classifying CRI publications. Methods: A search strategy was developed to query the PubMed database, using medical subject headings to both select and exclude articles, and filtering publications by date and other characteristics. A manual review classified publications using stages in the "research study lifecycle", with key stages that include study definition, participant enrollment, data management, data analysis, and results dissemination. Results: The search strategy generated 510 publications. The manual classification identified 125 publications as relevant to CRI, which were classified into seven different stages of the research lifecycle, and one additional class that pertained to multiple stages, referring to general infrastructure or standards. Important cross-cutting themes included new applications of electronic media (Internet, social media, mobile devices), standardization of data and procedures, and increased automation through the use of data mining and big data methods. Conclusions: The review revealed increased interest and support for CRI in large-scale projects across institutions, regionally, nationally, and internationally. A search strategy based on medical subject headings can find many relevant papers, but a large number of non-relevant papers need to be detected using text words which pertain to closely related fields such as computational statistics and clinical informatics. The research lifecycle was useful as a classification scheme by highlighting the relevance to the users of clinical research informatics solutions. Georg Thieme Verlag KG Stuttgart.

  13. Open Land-Use Map: A Regional Land-Use Mapping Strategy for Incorporating OpenStreetMap with Earth Observations

    NASA Astrophysics Data System (ADS)

    Yang, D.; Fu, C. S.; Binford, M. W.

    2017-12-01

    The southeastern United States has high landscape heterogeneity, with heavily managed forestlands, highly developed agricultural lands, and multiple metropolitan areas. Human activities are transforming and altering land patterns and structures in both negative and positive ways. Producing a land-use map at this larger scale is a heavy computational task but is critical to landowners, researchers, and decision makers, enabling them to make informed decisions for varying objectives. There are two major difficulties in generating classification maps at the regional scale: the need for large training point sets and the expensive computational cost, in terms of both money and time, of classifier modeling. Volunteered Geographic Information (VGI) opens a new era in mapping and visualizing our world: the platform is open for volunteer citizens to collect valuable georeferenced information, and the data are freely available to the public. As one of the best-known VGI initiatives, OpenStreetMap (OSM) contributes not only road network distribution, but also the potential to use these data to support land cover and land-use classifications. Google Earth Engine (GEE) is a platform designed for cloud-based mapping with robust and fast computing power. Most large-scale and national mapping approaches conflate "land cover" and "land use", or build the land-use database from modeled land cover datasets. Unlike most other large-scale approaches, we distinguish and differentiate land use from land cover. With mapping land use and management practices as our prime objective, a robust regional land-use mapping approach is developed by incorporating the OpenStreetMap dataset into Earth observation remote sensing imagery instead of the often-used land cover base maps.

  14. A generalized approach for producing, quantifying, and validating citizen science data from wildlife images

    PubMed Central

    Kosmala, Margaret; Lintott, Chris; Packer, Craig

    2016-01-01

    Citizen science has the potential to expand the scope and scale of research in ecology and conservation, but many professional researchers remain skeptical of data produced by nonexperts. We devised an approach for producing accurate, reliable data from untrained, nonexpert volunteers. On the citizen science website www.snapshotserengeti.org, more than 28,000 volunteers classified 1.51 million images taken in a large‐scale camera‐trap survey in Serengeti National Park, Tanzania. Each image was circulated to, on average, 27 volunteers, and their classifications were aggregated using a simple plurality algorithm. We validated the aggregated answers against a data set of 3829 images verified by experts and calculated 3 certainty metrics—level of agreement among classifications (evenness), fraction of classifications supporting the aggregated answer (fraction support), and fraction of classifiers who reported “nothing here” for an image that was ultimately classified as containing an animal (fraction blank)—to measure confidence that an aggregated answer was correct. Overall, aggregated volunteer answers agreed with the expert‐verified data on 98% of images, but accuracy differed by species commonness such that rare species had higher rates of false positives and false negatives. Easily calculated analysis of variance and post‐hoc Tukey tests indicated that the certainty metrics were significant indicators of whether each image was correctly classified or classifiable. Thus, the certainty metrics can be used to identify images for expert review. Bootstrapping analyses further indicated that 90% of images were correctly classified with just 5 volunteers per image. Species classifications based on the plurality vote of multiple citizen scientists can provide a reliable foundation for large‐scale monitoring of African wildlife. PMID:27111678
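
    A hedged sketch of plurality aggregation plus two of the certainty metrics. Pielou's index is used here as a plausible reading of the evenness metric; the paper's exact definitions may differ.

    ```python
    from collections import Counter
    import numpy as np

    def aggregate(labels):
        """Plurality answer plus two certainty metrics: Pielou-style evenness
        of the vote distribution and the fraction supporting the winner."""
        counts = Counter(labels)
        winner, top = counts.most_common(1)[0]
        p = np.array(list(counts.values()), dtype=float) / len(labels)
        # Evenness 0 means unanimous agreement (by convention here).
        evenness = 0.0 if len(counts) == 1 else -np.sum(p * np.log(p)) / np.log(len(counts))
        fraction_support = top / len(labels)
        return winner, evenness, fraction_support

    print(aggregate(["lion"] * 20 + ["leopard"] * 5 + ["nothing here"] * 2))
    ```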

  15. Earthquake Hazard Class Mapping by Parcel in Las Vegas Valley

    NASA Astrophysics Data System (ADS)

    Pancha, A.; Pullammanappallil, S.; Louie, J. N.; Hellmer, W. K.

    2011-12-01

    Clark County, Nevada completed the very first effort in the United States to map earthquake hazard class systematically through an entire urban area. The map is used in development and disaster response planning, in addition to its direct use for building code implementation and enforcement. The County contracted with the Nevada System of Higher Education to classify about 500 square miles including urban Las Vegas Valley, and exurban areas considered for future development. The Parcel Map includes over 10,000 surface-wave array measurements accomplished over three years using Optim's SeisOpt® ReMi measurement and processing techniques adapted for large-scale data. These array measurements classify individual parcels on the NEHRP hazard scale. Parallel "blind" tests were conducted at 93 randomly selected sites. The rms difference between the Vs30 values from the blind analyses and those from the Parcel Map analyses is 4.92%. Only six of the blind-test sites showed a difference with a magnitude greater than 10%. We describe a "C+" Class for sites with Class B average velocities but soft surface soil. The measured Parcel Map shows a clearly definable C+ to C boundary on the west side of the Valley. The C to D boundary is much more complex. Using the parcel map in computing shaking in the Valley for scenario earthquakes is crucial for obtaining realistic predictions of ground motions.

  16. Wavelet-based statistical classification of skin images acquired with reflectance confocal microscopy

    PubMed Central

    Halimi, Abdelghafour; Batatia, Hadj; Le Digabel, Jimmy; Josse, Gwendal; Tourneret, Jean Yves

    2017-01-01

    Detecting skin lentigo in reflectance confocal microscopy images is an important and challenging problem. This imaging modality has not yet been widely investigated for this problem, and few automatic processing techniques exist. Those that do are mostly based on machine learning approaches and rely on numerous classical image features, leading to high computational costs given the very large resolution of these images. This paper presents a detection method with very low computational complexity that is able to identify the skin depth at which the lentigo can be detected. The proposed method performs multiresolution decomposition of the image obtained at each skin depth. The distribution of image pixels at a given depth can be approximated accurately by a generalized Gaussian distribution whose parameters depend on the decomposition scale, resulting in a very-low-dimension parameter space. SVM classifiers are then investigated to classify the scale parameter of this distribution, allowing real-time detection of lentigo. The method is applied to 45 healthy and lentigo patients from a clinical study, where sensitivity of 81.4% and specificity of 83.3% are achieved. Our results show that lentigo is identifiable at depths between 50 μm and 60 μm, corresponding to the average location of the dermoepidermal junction. This result is in agreement with the clinical practices that characterize the lentigo by assessing the disorganization of the dermoepidermal junction. PMID:29296480
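
    A minimal sketch of the modelling step, assuming the third-party PyWavelets package for the multiresolution decomposition and SciPy's `gennorm` for the generalized Gaussian fit; the clinical pipeline and SVM training on the fitted scale parameters are omitted.

    ```python
    import numpy as np
    import pywt
    from scipy.stats import gennorm

    # Hypothetical single-depth image; the study processes one per skin depth.
    image = np.random.default_rng(0).normal(size=(256, 256))

    coeffs = pywt.wavedec2(image, "db4", level=3)
    params = []
    for level_details in coeffs[1:]:          # skip the approximation band
        for band in level_details:            # horizontal, vertical, diagonal
            beta, loc, scale = gennorm.fit(band.ravel())
            params.append((beta, scale))
    # `params` would then feed an SVM separating healthy from lentigo depths.
    print(params[:2])
    ```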

  17. Galactic satellite systems: radial distribution and environment dependence of galaxy morphology

    NASA Astrophysics Data System (ADS)

    Ann, H. B.; Park, Changbom; Choi, Yun-Young

    2008-09-01

    We have studied the radial distribution of the early (E/S0) and late (S/Irr) types of satellites around bright host galaxies. We made a volume-limited sample of 4986 satellites brighter than Mr = -18.0 associated with 2254 hosts brighter than Mr = -19.0 from the Sloan Digital Sky Survey Data Release 5 sample. The morphology of satellites is determined by an automated morphology classifier, but the host galaxies are visually classified. We found segregation of satellite morphology as a function of the projected distance from the host galaxy. The amplitude and shape of the early-type satellite fraction profile are found to depend on the host luminosity. This is the morphology-radius/density relation at the galactic scale. There is a strong tendency for morphology conformity between the host galaxy and its satellites. The early-type fraction of satellites hosted by early-type galaxies is systematically larger than that of late-type hosts, and is a strong function of the distance from the host galaxies. Fainter satellites are more vulnerable to the morphology transformation effects of hosts. Dependence of satellite morphology on the large-scale background density was detected. The fraction of early-type satellites increases in high-density regions for both early- and late-type hosts. It is argued that the conformity in morphology of galactic satellite systems mainly originates from the hydrodynamical and radiative effects of hosts on satellites.

  18. Machine-Learning Classifier for Patients with Major Depressive Disorder: Multifeature Approach Based on a High-Order Minimum Spanning Tree Functional Brain Network.

    PubMed

    Guo, Hao; Qin, Mengna; Chen, Junjie; Xu, Yong; Xiang, Jie

    2017-01-01

    High-order functional connectivity networks are rich in time information that can reflect dynamic changes in functional connectivity between brain regions. Accordingly, such networks are widely used to classify brain diseases. However, traditional methods for processing high-order functional connectivity networks generally include the clustering method, which reduces data dimensionality. As a result, such networks cannot be effectively interpreted in the context of neurology. Additionally, due to the large scale of high-order functional connectivity networks, it can be computationally very expensive to use complex network or graph theory to calculate certain topological properties. Here, we propose a novel method of generating a high-order minimum spanning tree functional connectivity network. This method increases the neurological significance of the high-order functional connectivity network, reduces network computing consumption, and produces a network scale that is conducive to subsequent network analysis. To ensure the quality of the topological information in the network structure, we used frequent subgraph mining technology to capture the discriminative subnetworks as features and combined this with quantifiable local network features. Then we applied a multikernel learning technique to the corresponding selected features to obtain the final classification results. We evaluated our proposed method using a data set containing 38 patients with major depressive disorder and 28 healthy controls. The experimental results showed a classification accuracy of up to 97.54%.
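
    A hedged sketch of the minimum-spanning-tree step on a hypothetical functional connectivity matrix; the high-order network construction, subgraph mining, and multikernel learning described above are omitted.

    ```python
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree

    # Hypothetical connectivity: correlations between 90 regional time series.
    rng = np.random.default_rng(0)
    ts = rng.normal(size=(90, 200))
    conn = np.corrcoef(ts)

    # Convert similarity to cost so the *minimum* spanning tree keeps the
    # strongest connections; 1 - |r| is one common choice.
    cost = 1.0 - np.abs(conn)
    np.fill_diagonal(cost, 0.0)
    mst = minimum_spanning_tree(cost).toarray()
    print(f"edges kept: {np.count_nonzero(mst)} of {90 * 89 // 2}")
    ```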

  19. Machine-Learning Classifier for Patients with Major Depressive Disorder: Multifeature Approach Based on a High-Order Minimum Spanning Tree Functional Brain Network

    PubMed Central

    Qin, Mengna; Chen, Junjie; Xu, Yong; Xiang, Jie

    2017-01-01

    High-order functional connectivity networks are rich in time information that can reflect dynamic changes in functional connectivity between brain regions. Accordingly, such networks are widely used to classify brain diseases. However, traditional methods for processing high-order functional connectivity networks generally include the clustering method, which reduces data dimensionality. As a result, such networks cannot be effectively interpreted in the context of neurology. Additionally, due to the large scale of high-order functional connectivity networks, it can be computationally very expensive to use complex network or graph theory to calculate certain topological properties. Here, we propose a novel method of generating a high-order minimum spanning tree functional connectivity network. This method increases the neurological significance of the high-order functional connectivity network, reduces network computing consumption, and produces a network scale that is conducive to subsequent network analysis. To ensure the quality of the topological information in the network structure, we used frequent subgraph mining technology to capture the discriminative subnetworks as features and combined this with quantifiable local network features. Then we applied a multikernel learning technique to the corresponding selected features to obtain the final classification results. We evaluated our proposed method using a data set containing 38 patients with major depressive disorder and 28 healthy controls. The experimental results showed a classification accuracy of up to 97.54%. PMID:29387141

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Springmeyer, R R; Brugger, E; Cook, R

    The Data group provides data analysis and visualization support to its customers. This consists primarily of the development and support of VisIt, a data analysis and visualization tool. Support ranges from answering questions about the tool to providing classes on how to use it and performing data analysis and visualization for customers. The Information Management and Graphics Group supports and develops tools that enhance our ability to access, display, and understand large, complex data sets. Activities include applying visualization software for large-scale data exploration; running video production labs on two networks; supporting graphics libraries and tools for end users; maintaining PowerWalls and assorted other displays; and developing software for searching and managing scientific data. Researchers in the Center for Applied Scientific Computing (CASC) work on various projects including the development of visualization techniques for large-scale data exploration that are funded by the ASC program, among others. The researchers also have LDRD projects and collaborations with other lab researchers, academia, and industry. The IMG group is located in the Terascale Simulation Facility, home to Dawn, Atlas, BGL, and others, which includes both classified and unclassified visualization theaters, a visualization computer floor and deployment workshop, and video production labs. We continued to provide the traditional graphics group consulting and video production support. We maintained five PowerWalls and many other displays. We deployed a 576-node Opteron/IB cluster with 72 TB of memory providing a visualization production server on our classified network. We continue to support a 128-node Opteron/IB cluster providing a visualization production server for our unclassified systems and an older 256-node Opteron/IB cluster for the classified systems, as well as several smaller clusters to drive the PowerWalls. The visualization production systems include NFS servers to provide dedicated storage for data analysis and visualization. The ASC projects have delivered new versions of visualization and scientific data management tools to end users and continue to refine them. VisIt had four releases during the past year, ending with VisIt 2.0. We released version 2.4 of Hopper, a Java application for managing and transferring files. This release included a graphical disk usage view, which works on all types of connections, and an aggregated copy feature for transferring massive datasets quickly and efficiently to HPSS. We continue to use and develop Blockbuster and Telepath. Both the VisIt and IMG teams were engaged in a variety of movie production efforts during the past year in addition to the development tasks.

  1. Configurations and implementation of payroll system using open source erp: a case study of Koperasi PT Sri

    NASA Astrophysics Data System (ADS)

    Terminanto, A.; Swantoro, H. A.; Hidayanto, A. N.

    2017-12-01

    Enterprise Resource Planning (ERP) is an integrated information system for managing the business processes of companies of various scales. Because of the high cost of ERP investment, ERP implementation is usually undertaken by large-scale enterprises, yet due to the complexity of implementation problems, the success rate of ERP implementation remains low. Open Source System (OSS) ERP has become an alternative ERP choice for SME companies in terms of cost and customization. This study aims to identify the characteristics of, and to configure, the implementation of an OSS ERP payroll module at KKPS (Employee Cooperative PT SRI) using OSS ERP Odoo and the ASAP method. The study is classified as both case study research and action research. The OSS ERP payroll module was implemented because the HR section of KKPS had not been integrated with other parts of the organization. The results of this study are the characteristics and configuration of the OSS ERP payroll module at KKPS.

  2. From random microstructures to representative volume elements

    NASA Astrophysics Data System (ADS)

    Zeman, J.; Šejnoha, M.

    2007-06-01

    A unified treatment of random microstructures proposed in this contribution opens the way to efficient solutions of large-scale real world problems. The paper introduces a notion of statistically equivalent periodic unit cell (SEPUC) that replaces in a computational step the actual complex geometries on an arbitrary scale. A SEPUC is constructed such that its morphology conforms with images of real microstructures. Here, the appreciated two-point probability function and the lineal path function are employed to classify, from the statistical point of view, the geometrical arrangement of various material systems. Examples of statistically equivalent unit cells constructed for a unidirectional fibre tow, a plain weave textile composite and an irregular-coursed masonry wall are given. A specific result promoting the applicability of the SEPUC as a tool for the derivation of homogenized effective properties that are subsequently used in an independent macroscopic analysis is also presented.
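
    A minimal sketch of the two-point probability function S2 for a binary microstructure, computed by FFT autocorrelation under periodic boundary conditions; this is a standard estimator, and the paper's exact implementation is not specified here.

    ```python
    import numpy as np

    def two_point_probability(img):
        """S2(r): probability that two points separated by r both fall in
        the phase of interest, via FFT autocorrelation (periodic BCs)."""
        phase = img.astype(float)
        f = np.fft.fftn(phase)
        corr = np.fft.ifftn(f * np.conj(f)).real / phase.size
        return corr  # corr[0, 0] equals the phase volume fraction

    micro = np.random.default_rng(0).random((128, 128)) < 0.35
    S2 = two_point_probability(micro)
    print(S2[0, 0])  # ~0.35, the volume fraction
    ```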

  3. Using landscape limnology to classify freshwater ecosystems for multi-ecosystem management and conservation

    USGS Publications Warehouse

    Soranno, Patricia A.; Cheruvelil, Kendra Spence; Webster, Katherine E.; Bremigan, Mary T.; Wagner, Tyler; Stow, Craig A.

    2010-01-01

    Governmental entities are responsible for managing and conserving large numbers of lake, river, and wetland ecosystems that can be addressed only rarely on a case-by-case basis. We present a system for predictive classification modeling, grounded in the theoretical foundation of landscape limnology, that creates a tractable number of ecosystem classes to which management actions may be tailored. We demonstrate our system by applying two types of predictive classification modeling approaches to develop nutrient criteria for eutrophication management in 1,998 north temperate lakes. Our predictive classification system promotes the effective management of multiple ecosystems across broad geographic scales by explicitly connecting management and conservation goals to the classification modeling approach, considering multiple spatial scales as drivers of ecosystem dynamics, and acknowledging the hierarchical structure of freshwater ecosystems. Such a system is critical for adaptive management of complex mosaics of freshwater ecosystems and for balancing competing needs for ecosystem services in a changing world.

  4. The Equations of Oceanic Motions

    NASA Astrophysics Data System (ADS)

    Müller, Peter

    2006-10-01

    Modeling and prediction of oceanographic phenomena and climate is based on the integration of dynamic equations. The Equations of Oceanic Motions derives and systematically classifies the most common dynamic equations used in physical oceanography, from large scale thermohaline circulations to those governing small scale motions and turbulence. After establishing the basic dynamical equations that describe all oceanic motions, Müller then derives approximate equations, emphasizing the assumptions made and physical processes eliminated. He distinguishes between geometric, thermodynamic and dynamic approximations and between the acoustic, gravity, vortical and temperature-salinity modes of motion. Basic concepts and formulae of equilibrium thermodynamics, vector and tensor calculus, curvilinear coordinate systems, and the kinematics of fluid motion and wave propagation are covered in appendices. Providing the basic theoretical background for graduate students and researchers of physical oceanography and climate science, this book will serve as both a comprehensive text and an essential reference.

  5. Test-retest reliability of the Capute scales for neurodevelopmental screening of a high risk sample: Impact of test-retest interval and degree of neonatal risk.

    PubMed

    McCurdy, M; Bellows, A; Deng, D; Leppert, M; Mahone, E; Pritchard, A

    2015-01-01

    Reliable and valid screening and assessment tools are necessary to identify children at risk for neurodevelopmental disabilities who may require additional services. This study evaluated the test-retest reliability of the Capute Scales in a high-risk sample, hypothesizing adequate reliability across 6- and 12-month intervals. Capute Scales scores (N = 66) were collected via retrospective chart review from a NICU follow-up clinic within a large urban medical center spanning three age-ranges: 12-18, 19-24, and 25-36 months. On average, participants were classified as very low birth weight and premature. Reliability of the Capute Scales was evaluated with intraclass correlation coefficients across length of test-retest interval, age at testing, and degree of neonatal complications. The Capute Scales demonstrated high reliability, regardless of length of test-retest interval (ranging from 6 to 14 months) or age of participant, for all index scores, including overall Developmental Quotient (DQ), language-based skill index (CLAMS) and nonverbal reasoning index (CAT). Linear regressions revealed that greater neonatal risk was related to poorer test-retest reliability; however, reliability coefficients remained strong. The Capute Scales afford clinicians a reliable and valid means of screening and assessing for neurodevelopmental delay within high-risk infant populations.

  6. Scene-Aware Adaptive Updating for Visual Tracking via Correlation Filters

    PubMed Central

    Zhang, Sirou; Qiao, Xiaoya

    2017-01-01

    In recent years, visual object tracking has been widely used in military guidance, human-computer interaction, road traffic, scene monitoring and many other fields. Tracking algorithms based on correlation filters have shown good performance in terms of accuracy and tracking speed. However, their performance is not satisfactory in scenes with scale variation, deformation, and occlusion. In this paper, we propose a scene-aware adaptive updating mechanism for visual tracking via a kernel correlation filter (KCF). First, a low-complexity scale estimation method is presented, in which the corresponding weights at five scales are employed to determine the final target scale. Then, an adaptive updating mechanism based on scene classification is presented. We classify video scenes into four categories by video content analysis. According to the target scene, we exploit the adaptive updating mechanism to update the kernel correlation filter to improve the robustness of the tracker, especially in scenes with scale variation, deformation, and occlusion. We evaluate our tracker on the CVPR2013 benchmark. The experimental results obtained with the proposed algorithm are improved by 33.3%, 15%, 6%, 21.9% and 19.8% compared to those of the KCF tracker on scenes with scale variation, partial or long-duration large-area occlusion, deformation, fast motion and out-of-view motion, respectively. PMID:29140311

  7. Aerodynamic Simulation of Ice Accretion on Airfoils

    NASA Technical Reports Server (NTRS)

    Broeren, Andy P.; Addy, Harold E., Jr.; Bragg, Michael B.; Busch, Greg T.; Montreuil, Emmanuel

    2011-01-01

    This report describes recent improvements in aerodynamic scaling and simulation of ice accretion on airfoils. Ice accretions were classified into four types on the basis of aerodynamic effects: roughness, horn, streamwise, and spanwise ridge. The NASA Icing Research Tunnel (IRT) was used to generate ice accretions within these four types using both subscale and full-scale models. Large-scale, pressurized wind-tunnel testing was performed using a 72-in.- (1.83-m-) chord, NACA 23012 airfoil model with high-fidelity, three-dimensional castings of the IRT ice accretions. Performance data were recorded over Reynolds numbers from 4.5 × 10^6 to 15.9 × 10^6 and Mach numbers from 0.10 to 0.28. Lower fidelity ice-accretion simulation methods were developed and tested on an 18-in.- (0.46-m-) chord NACA 23012 airfoil model in a small-scale wind tunnel at a lower Reynolds number. The aerodynamic accuracy of the lower fidelity, subscale ice simulations was validated against the full-scale results for a factor of 4 reduction in model scale and a factor of 8 reduction in Reynolds number. This research has defined the level of geometric fidelity required for artificial ice shapes to yield aerodynamic performance results to within a known level of uncertainty and has culminated in a proposed methodology for subscale iced-airfoil aerodynamic simulation.

  8. Hunt for Federal Funds Gives Classified Research a Lift

    ERIC Educational Resources Information Center

    Basken, Paul

    2012-01-01

    For some colleges and professors, classified research promises prestige and money. Powerhouses like the Massachusetts Institute of Technology and the Johns Hopkins University have for decades run large classified laboratories. But most other universities either do not allow such research or conduct it quietly, and in small doses. The…

  9. Validation of Autism Spectrum Disorder Diagnoses in Large Healthcare Systems with Electronic Medical Records

    ERIC Educational Resources Information Center

    Coleman, Karen J.; Lutsky, Marta A.; Yau, Vincent; Qian, Yinge; Pomichowski, Magdalena E.; Crawford, Phillip M.; Lynch, Frances L.; Madden, Jeanne M.; Owen-Smith, Ashli; Pearson, John A.; Pearson, Kathryn A.; Rusinak, Donna; Quinn, Virginia P.; Croen, Lisa A.

    2015-01-01

    To identify factors associated with valid Autism Spectrum Disorder (ASD) diagnoses from electronic sources in large healthcare systems. We examined 1,272 charts from ASD diagnosed youth <18 years old. Expert reviewers classified diagnoses as confirmed, probable, possible, ruled out, or not enough information. A total of 845 were classified with…

  10. Variability in warm-season atmospheric circulation and precipitation patterns over subtropical South America: relationships between the South Atlantic convergence zone and large-scale organized convection over the La Plata basin

    NASA Astrophysics Data System (ADS)

    Mattingly, Kyle S.; Mote, Thomas L.

    2017-01-01

    Warm-season precipitation variability over subtropical South America is characterized by an inverse relationship between the South Atlantic convergence zone (SACZ) and precipitation over the central and western La Plata basin of southeastern South America. This study extends the analysis of this "South American Seesaw" precipitation dipole to relationships between the SACZ and large, long-lived mesoscale convective systems (LLCSs) over the La Plata basin. By classifying SACZ events into distinct continental and oceanic categories and building a logistic regression model that relates LLCS activity across the region to continental and oceanic SACZ precipitation, a detailed account of spatial variability in the out-of-phase coupling between the SACZ and large-scale organized convection over the La Plata basin is provided. Enhanced precipitation in the continental SACZ is found to result in increased LLCS activity over northern, northeastern, and western sections of the La Plata basin, in association with poleward atmospheric moisture flux from the Amazon basin toward these regions, and a decrease in the probability of LLCS occurrence over the southeastern La Plata basin. Increased oceanic SACZ precipitation, however, was strongly related to reduced atmospheric moisture and decreased probability of LLCS occurrence over nearly the entire La Plata basin. These results suggest that continental SACZ activity and large-scale organized convection over the northern and eastern sections of the La Plata basin are closely tied to atmospheric moisture transport from the Amazon basin, while the warm coastal Brazil Current may also play an important role as an evaporative moisture source for LLCSs over the central and western La Plata basin.

  11. Local-scale models reveal ecological niche variability in amphibian and reptile communities from two contrasting biogeographic regions

    PubMed Central

    Santos, Xavier; Felicísimo, Ángel M.

    2016-01-01

    Ecological Niche Models (ENMs) are widely used to describe how environmental factors influence species distribution. Modelling at a local scale, compared to a large scale within a high environmental gradient, can improve our understanding of ecological species niches. The main goal of this study is to assess and compare the contribution of environmental variables to amphibian and reptile ENMs in two Spanish national parks located in contrasting biogeographic regions, i.e., the Mediterranean and the Atlantic area. The ENMs were built with maximum entropy modelling using 11 environmental variables in each territory. The contributions of these variables to the models were analysed and classified using various statistical procedures (Mann–Whitney U tests, Principal Components Analysis and General Linear Models). Distance to the hydrological network was consistently the most relevant variable for both parks and taxonomic classes. Topographic variables (i.e., slope and altitude) were the second most predictive variables, followed by climatic variables. Differences in variable contribution were observed between parks and taxonomic classes. Variables related to water availability had the larger contribution to the models in the Mediterranean park, while topography variables were decisive in the Atlantic park. Specific response curves to environmental variables were in accordance with the biogeographic affinity of species (Mediterranean and non-Mediterranean species) and taxonomy (amphibians and reptiles). Interestingly, these results were observed for species located in both parks, particularly those situated at their range limits. Our findings show that ecological niche models built at local scale reveal differences in habitat preferences within a wide environmental gradient. Therefore, modelling at local scales rather than assuming large-scale models could be preferable for the establishment of conservation strategies for herptile species in natural parks. PMID:27761304

  12. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    PubMed

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (κ > .58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (κ > .68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. Copyright © 2013 Elsevier Ltd. All rights reserved.
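
    Inter-rater agreement of the kind reported above is typically quantified with Cohen's kappa; a minimal sketch on illustrative ratings (not the study's data):

    ```python
    from sklearn.metrics import cohen_kappa_score

    # Hypothetical Viking Speech Scale levels (1-4) given by two raters.
    rater_a = [1, 2, 2, 3, 4, 1, 2, 3, 3, 4, 1, 1]
    rater_b = [1, 2, 3, 3, 4, 1, 2, 3, 2, 4, 1, 2]
    print(f"kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")
    ```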

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolodrubetz, Daniel W.; Pietrulewicz, Piotr; Stewart, Iain W.

    To predict the jet mass spectrum at a hadron collider it is crucial to account for the resummation of logarithms between the transverse momentum of the jet and its invariant mass m_J. For small jet areas there are additional large logarithms of the jet radius R, which affect the convergence of the perturbative series. We present an analytic framework for exclusive jet production at the LHC which gives a complete description of the jet mass spectrum including realistic jet algorithms and jet vetoes. It factorizes the scales associated with m_J, R, and the jet veto, enabling in addition the systematic resummation of jet radius logarithms in the jet mass spectrum beyond leading logarithmic order. We discuss the factorization formulae for the peak and tail region of the jet mass spectrum and for small and large R, and the relations between the different regimes and how to combine them. Regions of experimental interest are classified which do not involve large nonglobal logarithms. We also present universal results for nonperturbative effects and discuss various jet vetoes.

  14. Classification of Different Degrees of Disability Following Intracerebral Hemorrhage: A Decision Tree Analysis from VISTA-ICH Collaboration.

    PubMed

    Phan, Thanh G; Chen, Jian; Beare, Richard; Ma, Henry; Clissold, Benjamin; Van Ly, John; Srikanth, Velandai

    2017-01-01

    Prognostication following intracerebral hemorrhage (ICH) has focused on poor outcome at the expense of lumping together mild and moderate disability. We aimed to develop a novel approach at classifying a range of disability following ICH. The Virtual International Stroke Trial Archive collaboration database was searched for patients with ICH and known volume of ICH on baseline CT scans. Disability was partitioned into mild [modified Rankin Scale (mRS) at 90 days of 0-2], moderate (mRS = 3-4), and severe disabilities (mRS = 5-6). We used binary and trichotomy decision tree methodology. The data were randomly divided into training (2/3 of data) and validation (1/3 data) datasets. The area under the receiver operating characteristic curve (AUC) was used to calculate the accuracy of the decision tree model. We identified 957 patients, age 65.9 ± 12.3 years, 63.7% males, and ICH volume 22.6 ± 22.1 ml. The binary tree showed that lower ICH volume (<13.7 ml), age (<66.5 years), serum glucose (<8.95 mmol/l), and systolic blood pressure (<170 mm Hg) discriminate between mild versus moderate-to-severe disabilities with AUC of 0.79 (95% CI 0.73-0.85). Large ICH volume (>27.9 ml), older age (>69.5 years), and low Glasgow Coma Scale (<15) classify severe disability with AUC of 0.80 (95% CI 0.75-0.86). The trichotomy tree showed that ICH volume, age, and serum glucose can separate mild, moderate, and severe disability groups with AUC 0.79 (95% CI 0.71-0.87). Both the binary and trichotomy methods provide equivalent discrimination of disability outcome after ICH. The trichotomy method can classify three categories at once, whereas this action was not possible with the binary method. The trichotomy method may be of use to clinicians and trialists for classifying a range of disability in ICH.
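
    For illustration only, a depth-limited decision tree over the predictors named above, fitted to synthetic data; the VISTA-ICH data and the learned thresholds are not reproduced, and the outcome labels here are hypothetical.

    ```python
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n = 957
    X = np.column_stack([
        rng.gamma(2.0, 11.0, n),     # ICH volume (ml)
        rng.normal(66, 12, n),       # age (years)
        rng.normal(7.5, 2.0, n),     # serum glucose (mmol/l)
        rng.normal(160, 25, n),      # systolic blood pressure (mm Hg)
        rng.integers(3, 16, n),      # Glasgow Coma Scale
    ])
    # Hypothetical trichotomized outcome driven mostly by volume and age.
    severity = X[:, 0] + 0.5 * (X[:, 1] - 66) + rng.normal(0, 8, n)
    y = np.digitize(severity, [15, 35])   # 0 mild, 1 moderate, 2 severe

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
    print(export_text(tree, feature_names=["volume", "age", "glucose", "sbp", "gcs"]))
    print("held-out accuracy:", tree.score(X_te, y_te))
    ```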

  15. WND-CHARM: Multi-purpose image classification using compound image transforms

    PubMed Central

    Orlov, Nikita; Shamir, Lior; Macura, Tomasz; Johnston, Josiah; Eckley, D. Mark; Goldberg, Ilya G.

    2008-01-01

    We describe a multi-purpose image classifier that can be applied to a wide variety of image classification tasks without modifications or fine-tuning, and yet provide classification accuracy comparable to state-of-the-art task-specific image classifiers. The proposed image classifier first extracts a large set of 1025 image features including polynomial decompositions, high contrast features, pixel statistics, and textures. These features are computed on the raw image, transforms of the image, and transforms of transforms of the image. The feature values are then used to classify test images into a set of pre-defined image classes. This classifier was tested on several different problems including biological image classification and face recognition. Although we cannot make a claim of universality, our experimental results show that this classifier performs as well or better than classifiers developed specifically for these image classification tasks. Our classifier’s high performance on a variety of classification problems is attributed to (i) a large set of features extracted from images; and (ii) an effective feature selection and weighting algorithm sensitive to specific image classification problems. The algorithms are available for free download from openmicroscopy.org. PMID:18958301

  16. New MYC IHC Classifier Integrating Quantitative Architecture Parameters to Predict MYC Gene Translocation in Diffuse Large B-Cell Lymphoma

    PubMed Central

    Dong, Wei-Feng; Canil, Sarah; Lai, Raymond; Morel, Didier; Swanson, Paul E.; Izevbaye, Iyare

    2018-01-01

    A new automated MYC IHC classifier based on bivariate logistic regression is presented. The predictor relies on image analysis developed with the open-source ImageJ platform. From a histologic section immunostained for MYC protein, 2 dimensionless quantitative variables are extracted: (a) relative distance between nuclei positive for MYC IHC based on a Euclidean minimum spanning tree graph and (b) coefficient of variation of the MYC IHC stain intensity among MYC IHC-positive nuclei. The distance between positive nuclei is suggested to correlate inversely with MYC gene rearrangement status, whereas the coefficient of variation is suggested to correlate inversely with physiological regulation of MYC protein expression. The bivariate classifier was compared with 2 other MYC IHC classifiers (based on percentage of MYC IHC positive nuclei), all tested on 113 lymphomas including mostly diffuse large B-cell lymphomas with known MYC fluorescent in situ hybridization (FISH) status. The bivariate classifier strongly outperformed the "percentage of MYC IHC-positive nuclei" methods in predicting MYC+ FISH status, with 100% sensitivity (95% confidence interval, 94-100) and 80% specificity. The test is rapidly performed and might at a minimum provide primary IHC screening for MYC gene rearrangement status in diffuse large B-cell lymphomas. Furthermore, as this bivariate classifier actually predicts "permanent overexpressed MYC protein status," it might identify nontranslocation-related chromosomal anomalies missed by FISH. PMID:27093450
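
    A hedged sketch of the two image-analysis features. The paper's exact normalization of the inter-nuclear distance is not given in the abstract, so scaling the mean spanning-tree edge length by the image diagonal is an assumption made here to keep the feature dimensionless.

    ```python
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree
    from scipy.spatial.distance import pdist, squareform

    def bivariate_features(centroids, intensities, image_diag):
        """Mean Euclidean-MST edge length between MYC-positive nuclei
        (normalized by the image diagonal, an assumption) and the
        coefficient of variation of their stain intensities."""
        d = squareform(pdist(centroids))
        mst = minimum_spanning_tree(d).toarray()
        edges = mst[mst > 0]
        rel_distance = edges.mean() / image_diag
        cv_intensity = intensities.std() / intensities.mean()
        return rel_distance, cv_intensity

    # Hypothetical nuclei: 200 MYC-positive centroids in a 1000x1000 field.
    rng = np.random.default_rng(0)
    pts = rng.uniform(0, 1000, size=(200, 2))
    inten = rng.gamma(5.0, 20.0, size=200)
    print(bivariate_features(pts, inten, image_diag=np.hypot(1000, 1000)))
    ```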

  17. Cross-cultural validation of the Child Abuse Potential Inventory in Greece: a preliminary study.

    PubMed

    Diareme, S; Tsiantis, J; Tsitoura, S

    1997-11-01

    The aim of this study was first, to provide preliminary findings on the reliability and validity of a Greek translation of the CAP Inventory (Milner, 1986), and second, to examine whether there were any differences between Greek and American scores in the CAP Inventory. A convenience sample of 320 Greek parents was recruited from the outpatient unit of a large Children's Hospital in Athens, Greece. Greek scores were compared with American scores taken from the test manual. Internal consistency reliability was high for the Abuse scale (.91), two factor scales (Distress = .93 and Rigidity = .86) and one Validity scale (Inconsistency = .80). The Greek version of the Abuse scale had a similar factorial structure with the American version. Also, 78.1% of Greek parents were classified correctly as nonabusive by the Abuse scale. This rate was increased to 88.6% when invalid questionnaires were excluded from the sample. Comparisons between Greek and American mean scale scores indicated that Greek scores were significantly higher than American scores in all but one scale. Greeks had significantly lower scores than Americans in the Problems with Child and Self scale. Current findings including the high reliability, relatively high correct classification rates and factorial structure of the Greek Abuse scale are promising and support the idea of continuation of research for the development and validation of the Greek CAP Inventory. The difference between Greek and American scores in particular indicates the need for adjustment of cut off scores in the Greek scale.
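
    Internal consistency of the kind reported above is usually Cronbach's alpha; a minimal sketch on simulated item scores (not the CAP data):

    ```python
    import numpy as np

    def cronbach_alpha(items):
        """items: (n_respondents, n_items) matrix of item scores."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

    rng = np.random.default_rng(0)
    latent = rng.normal(size=(320, 1))                     # shared trait
    scores = latent + rng.normal(0, 0.8, size=(320, 10))   # 10 correlated items
    print(f"alpha = {cronbach_alpha(scores):.2f}")
    ```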

  18. High-resolution simulation of deep pencil beam surveys - analysis of quasi-periodicity

    NASA Astrophysics Data System (ADS)

    Weiss, A. G.; Buchert, T.

    1993-07-01

    We carry out pencil beam constructions in a high-resolution simulation of the large-scale structure of galaxies. The initial density fluctuations are taken to have a truncated power spectrum. All the models have Ω = 1. As an example we present the results for the case of "Hot-Dark-Matter" (HDM) initial conditions with scale-free n = 1 power index on large scales as a representative of models with sufficient large-scale power. We use an analytic approximation for particle trajectories of a self-gravitating dust continuum and apply a local dynamical biasing of volume elements to identify luminous matter in the model. Using this method, we are able to resolve formally a simulation box of 1200 h^-1 Mpc (e.g. for HDM initial conditions) down to the scale of galactic halos using 2160^3 particles. We consider this the minimal resolution necessary for a sensible simulation of deep pencil beam data. Pencil beam probes are taken for a given epoch using the parameters of observed beams. In particular, our analysis concentrates on the detection of a quasi-periodicity in the beam probes using several different methods. The resulting beam ensembles are analyzed statistically using number distributions, pair-count histograms, unnormalized pair-counts, power spectrum analysis and trial-period folding. Periodicities are classified according to their significance level in the power spectrum of the beams. The simulation is designed for application to parameter studies which prepare future observational projects. We find that a large percentage of the beams show quasi-periodicities with periods which cluster at a certain length scale. The periods found range between one and eight times the cutoff length in the initial fluctuation spectrum. At significance levels similar to those of the data of Broadhurst et al. (1990), we find about 15% of the pencil beams to show periodicities, about 30% of which are around the mean separation of rich clusters, while the distribution of scales reaches values of more than 200 h^-1 Mpc. The detection of periodicities larger than the typical void size need not be due to missed "walls" (like the so-called "Great Wall" seen in the CfA catalogue of galaxies), but can be due to different clustering properties of galaxies along the beams.

  19. Validation of accuracy and community acceptance of the BIRTHweigh III scale for categorizing newborn weight in rural India.

    PubMed

    Darmstadt, G L; Kumar, V; Shearer, J C; Misra, R; Mohanty, S; Baqui, A H; Coffey, P S; Awasthi, S; Singh, J V; Santosham, M

    2007-10-01

    To determine the accuracy and acceptability of a handheld scale prototype designed for nonliterate users to classify newborns into three weight categories (≥2,500 g; 2,000 to 2,499 g; and <2,000 g). Weights of 1,100 newborns in Uttar Pradesh, India, were measured on the test scale and validated against a gold standard. Mothers, family members and community health stakeholders were interviewed to assess the acceptability of the test scale. The test scale was highly sensitive and specific at classifying newborn weight (normal weight: 95.3 and 96.3%, respectively; low birth weight: 90.4 and 99.2%, respectively; very low birth weight: 91.7 and 98.4%, respectively). The community broadly agreed that the test scale was more practical and easier to interpret than the gold standard. The BIRTHweigh III scale accurately identifies low birth weight and very low birth weight newborns to target weight-specific interventions. The scale is extremely practical and useful for resource-poor settings, especially those with low levels of literacy.

  20. Distinguishing centrarchid genera by use of lateral line scales

    USGS Publications Warehouse

    Roberts, N.M.; Rabeni, C.F.; Stanovick, J.S.

    2007-01-01

    Predator-prey relations involving fishes are often evaluated using scales remaining in gut contents or feces. While several reliable keys help identify North American freshwater fish scales to the family level, none attempt to separate the family Centrarchidae to the genus level. Centrarchidae is of particular concern in the midwestern United States because it contains several popular sport fishes, such as smallmouth bass Micropterus dolomieu, largemouth bass M. salmoides, and rock bass Ambloplites rupestris, as well as less-sought-after species of sunfishes Lepomis spp. and crappies Pomoxis spp. Differentiating sport fish from non-sport fish has important management implications. Morphological characteristics of lateral line scales (n = 1,581) from known centrarchid fishes were analyzed. The variability of measurements within and between genera was examined to select variables that were the most useful in further classifying unknown centrarchid scales. A linear discriminant analysis model was developed using 10 variables. Based on this model, 84.4% of Ambloplites scales, 81.2% of Lepomis scales, and 86.6% of Micropterus scales were classified correctly using a jackknife procedure. © Copyright 2007 by the American Fisheries Society.

  1. Neurons from the adult human dentate nucleus: neural networks in the neuron classification.

    PubMed

    Grbatinić, Ivan; Marić, Dušica L; Milošević, Nebojša T

    2015-04-07

    Topological (central vs. border neuron type) and morphological classification of adult human dentate nucleus neurons was performed according to their quantified histomorphological properties, using neural networks on real and virtual neuron samples. In the real sample, 53.1% of central and 14.1% of border neurons were classified correctly, with a total of 32.8% misclassified neurons. The most important result is the 62.2% of misclassified neurons in the border group, which exceeds the number of correctly classified neurons (37.8%) in that group and shows a clear failure of the network to classify these neurons correctly on the basis of the computational parameters used in our study. On the virtual sample, 97.3% of border neurons were misclassified, far more than the 2.7% classified correctly in that group, again confirming this failure. Statistical analysis shows no statistically significant difference between central and border neurons for any measured parameter (p > 0.05). In total, 96.74% of neurons were morphologically classified correctly by the neural networks, each belonging to one of four histomorphological types: (a) neurons with small soma and short dendrites, (b) neurons with small soma and long dendrites, (c) neurons with large soma and short dendrites, and (d) neurons with large soma and long dendrites. Statistical analysis supports these results (p < 0.05). Human dentate nucleus neurons can thus be classified into four types according to their quantitative histomorphological properties. These types comprise two sets, small and large with respect to their perikarya, with subtypes differing in dendrite length (short vs. long dendrites). Besides confirming the classification into small and large neurons already reported in the literature, we found two new subtypes: neurons with small soma and long dendrites, and neurons with large soma and short dendrites. These neurons are most probably distributed evenly throughout the dentate nucleus, as no significant difference in their topological distribution was observed. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. A Label Propagation Approach for Detecting Buried Objects in Handheld GPR Data

    DTIC Science & Technology

    2016-04-17

    regions of interest that correspond to locations with anomalous signatures. Second, a classifier (or an ensemble of classifiers) is used to assign a ... investigated for almost two decades and several classifiers have been developed. Most of these methods are based on the supervised learning paradigm where ... labeled target and clutter signatures are needed to train a classifier to discriminate between the two classes. Typically, large and diverse labeled

  3. Basic Scale on Insomnia complaints and Quality of Sleep (BaSIQS): reliability, initial validity and normative scores in higher education students.

    PubMed

    Allen Gomes, Ana; Ruivo Marques, Daniel; Meia-Via, Ana Maria; Meia-Via, Mariana; Tavares, José; Fernandes da Silva, Carlos; Pinto de Azevedo, Maria Helena

    2015-04-01

    Based on successive samples totaling more than 5000 higher education students, we scrutinized the reliability, structure, initial validity and normative scores of a brief self-report seven-item scale to screen for the continuum of nighttime insomnia complaints/perceived sleep quality, used by our team for more than a decade, henceforth labeled the Basic Scale on Insomnia complaints and Quality of Sleep (BaSIQS). In study/sample 1 (n = 1654), the items were developed based on part of a larger survey on higher education sleep-wake patterns. The test-retest study was conducted in an independent small group (n = 33) with a 2-8 week gap. In study/sample 2 (n = 360), focused mainly on validity, the BaSIQS was completed together with the Pittsburgh Sleep Quality Index (PSQI). In study 3, a large recent sample of students from universities all over the country (n = 2995) answered the BaSIQS items, from which normative scores were determined, together with an additional question on perceived sleep problems used to further analyze the scale's validity. Regarding reliability, Cronbach alpha coefficients were systematically higher than 0.7, and the test-retest correlation coefficient was greater than 0.8. Structure analyses revealed consistently satisfactory two-factor and single-factor solutions. Concerning validity analyses, BaSIQS scores were significantly correlated with PSQI component scores and the overall score (r = 0.652, corresponding to a large association); mean scores were significantly higher in students classifying themselves as having sleep problems (p < 0.0001, d = 0.99, corresponding to a large effect size). In conclusion, the BaSIQS is very easy to administer, and appears to be a reliable and valid scale in higher education students. It might be a convenient short tool in research and applied settings to rapidly assess sleep quality or screen for insomnia complaints, and it may be easily used in other populations with minor adaptations.

  4. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

    PubMed

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

    2018-02-15

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net, Source code: https://github.com/bio-ontology-research-group/deepgo. robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
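
    The key structural idea here, encoding GO's true-path rule so that a child class never outscores its ancestors, can be sketched in a few lines. The toy DAG, class names and scores below are illustrative assumptions, not DeepGO's actual ontology or API:

```python
# Minimal sketch of hierarchy-consistent multi-label scoring on a toy GO-style
# DAG; class names and raw scores are invented for illustration.
go_parents = {
    "GO:B": ["GO:A"],          # B is a child of A
    "GO:C": ["GO:A"],
    "GO:D": ["GO:B", "GO:C"],  # D has two parents
}

def consistent_scores(raw):
    """Propagate scores so no class outscores its ancestors (true-path rule)."""
    fixed = dict(raw)
    changed = True
    while changed:              # iterate until scores are hierarchy-consistent
        changed = False
        for child, parents in go_parents.items():
            cap = min(fixed[p] for p in parents)
            if fixed[child] > cap:
                fixed[child] = cap
                changed = True
    return fixed

print(consistent_scores({"GO:A": 0.6, "GO:B": 0.9, "GO:C": 0.4, "GO:D": 0.8}))
# GO:B is capped at 0.6 and GO:D at min(0.6, 0.4) = 0.4
```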

  5. A satellite geodetic survey of large-scale deformation of volcanic centres in the central Andes.

    PubMed

    Pritchard, Matthew E; Simons, Mark

    2002-07-11

    Surface deformation in volcanic areas usually indicates movement of magma or hydrothermal fluids at depth. Stratovolcanoes tend to exhibit a complex relationship between deformation and eruptive behaviour, and the characteristically long time spans between eruptions require a long time series of observations to determine whether deformation without an eruption is common at a given edifice. Such studies, however, are logistically difficult to carry out in most volcanic arcs, as these tend to be remote regions with large numbers of volcanoes (hundreds to even thousands). Here we present a satellite-based interferometric synthetic aperture radar (InSAR) survey of the remote central Andes volcanic arc, a region formed by subduction of the Nazca oceanic plate beneath continental South America. Spanning the years 1992 to 2000, our survey reveals the background level of activity of about 900 volcanoes, 50 of which have been classified as potentially active. We find four centres of broad (tens of kilometres wide), roughly axisymmetric surface deformation. None of these centres are at volcanoes currently classified as potentially active, although two lie within about 10 km of volcanoes with known activity. Source depths inferred from the patterns of deformation lie between 5 and 17 km. In contrast to the four new sources found, we do not observe any deformation associated with recent eruptions of Lascar, Chile.

  6. Visual Perception-Based Statistical Modeling of Complex Grain Image for Product Quality Monitoring and Supervision on Assembly Production Line

    PubMed Central

    Chen, Qing; Xu, Pengfei; Liu, Wenzhong

    2016-01-01

    Computer vision as a fast, low-cost, noncontact, and online monitoring technology has been an important tool to inspect product quality, particularly on a large-scale assembly production line. However, the current industrial vision system is far from satisfactory in the intelligent perception of complex grain images, comprising a large number of local homogeneous fragmentations or patches without distinct foreground and background. We attempt to solve this problem based on the statistical modeling of spatial structures of grain images. We present a physical explanation in advance to indicate that the spatial structures of the complex grain images are subject to a representative Weibull distribution according to the theory of sequential fragmentation, which is well known in the continued comminution of ore grinding. To delineate the spatial structure of the grain image, we present a method of multiscale and omnidirectional Gaussian derivative filtering. Then, a product quality classifier based on sparse multikernel–least squares support vector machine is proposed to solve the low-confidence classification problem of imbalanced data distribution. The proposed method is applied on the assembly line of a food-processing enterprise to classify (or identify) automatically the production quality of rice. The experiments on the real application case, compared with the commonly used methods, illustrate the validity of our method. PMID:26986726

  7. Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features

    PubMed Central

    Mohammad-Noori, Morteza; Beer, Michael A.

    2014-01-01

    Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem. PMID:25033408

  8. Enhanced regulatory sequence prediction using gapped k-mer features.

    PubMed

    Ghandi, Mahmoud; Lee, Dongwon; Mohammad-Noori, Morteza; Beer, Michael A

    2014-07-01

    Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.
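
    The gapped k-mer feature idea can be illustrated with a short sketch: every length-l window contributes one count for each choice of k informative positions, with the remaining positions wildcarded. The sequence and parameters below are toy assumptions, not the gkm-SVM implementation:

```python
# Toy gapped k-mer counting: l-mers with k informative positions and the rest
# wildcarded as "."; parameters and sequence are made up for illustration.
from itertools import combinations
from collections import Counter

def gapped_kmer_counts(seq, l=4, k=3):
    """Count all gapped k-mers over every length-l window of seq."""
    counts = Counter()
    for i in range(len(seq) - l + 1):
        window = seq[i:i + l]
        for keep in combinations(range(l), k):   # choose k informative positions
            feature = "".join(window[j] if j in keep else "." for j in range(l))
            counts[feature] += 1
    return counts

print(gapped_kmer_counts("GATTACA").most_common(3))
```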

  9. Present and Future Redshift Surveys: ORS, DOGS and 2dF

    NASA Astrophysics Data System (ADS)

    Lahav, O.

    Three galaxy redshift surveys and their analyses are discussed. (i) The recently completed Optical Redshift Survey (ORS) includes galaxies larger than 1.9 arcmin and/or brighter than 14.5 mag. It provides redshifts for ~8300 galaxies at Galactic latitude |b| > 20°. A new analysis of the survey explores the existence and extent of the Supergalactic Plane (SGP). Its orientation is found to be in good agreement with the standard SGP coordinates, and suggests that the SGP is at least as large as the survey (16000 km/s in diameter). (ii) The Dwingeloo Obscured Galaxy Survey is aimed at finding galaxies hidden behind the Milky Way using a blind search at 21 cm. The discovery of Dwingeloo 1 illustrates that the survey will allow us to systematically survey the region 30° < l < 200° out to 4000 km/s. (iii) The Anglo-Australian 2-degree-Field (2dF) survey will yield 250,000 redshifts for APM-selected galaxies brighter than 19.5 mag to map the large scale structure on scales larger than ~30 Mpc. To study morphological segregation and biasing, the spectra will be classified using Artificial Neural Networks.

  10. [Interrelationships between soil fauna and soil environmental factors in China: research advance].

    PubMed

    Wang, Yi; Wei, Wei; Yang, Xing-zhong; Chen, Li-ding; Yang, Lei

    2010-09-01

    Soil fauna is closely related to various environmental factors in the soil ecosystem. Exploring the interrelationships between soil fauna and soil environmental factors is of vital importance for a deep understanding of soil ecosystem dynamics and for assessing ecosystem functioning. The environmental factors affecting soil fauna can be classified as soil properties and the soil external environment. The former contains basic soil physical and chemical properties, soil moisture, and soil pollution; the latter includes vegetation, land use type, landform, and climate, etc. From these aspects, this paper summarized the literature published in China on the interrelationships between soil fauna and soil environmental factors. Several problems exist in the related studies: little research has integrated the bio-indicator function of soil fauna, research methods need improvement, and studies on multiple environmental factors and their large-scale spatial-temporal variability are lacking. Corresponding suggestions are proposed: more work should be done according to practical needs, advanced experience from abroad should be drawn upon, and comprehensive studies on multiple environmental factors and long-term monitoring should be conducted over large-scale areas.

  11. Automatic location of L/H transition times for physical studies with a large statistical basis

    NASA Astrophysics Data System (ADS)

    González, S.; Vega, J.; Murari, A.; Pereira, A.; Dormido-Canto, S.; Ramírez, J. M.; contributors, JET-EFDA

    2012-06-01

    Completely automatic techniques to estimate and validate L/H transition times can be essential in L/H transition analyses. The generation of databases with hundreds of transition times and without human intervention is an important step towards (a) L/H transition physics analysis, (b) validation of L/H theoretical models and (c) creation of L/H scaling laws. An entirely unattended methodology is presented in this paper to build large databases of transition times in JET using time series. The proposed technique has been applied to a dataset of 551 JET discharges between campaigns C21 and C26. For discharges that show a clear signature in the time series, the transition is located through the localization properties of the wavelet transform; this prediction is accurate, with an uncertainty interval of ±3.2 ms. Discharges without a clear pattern in the time series are handled by an L/H mode classifier trained on the discharges with a clear signature. In this case, the estimation error shows a distribution with mean and standard deviation of 27.9 ms and 37.62 ms, respectively. Two different regression methods have been applied to the measurements acquired at the transition times identified by the automatic system. The obtained scaling laws for the threshold power are not significantly different from those obtained using the data at the transition times determined manually by the experts. The automatic methods allow performing physical studies with a large number of discharges, showing, for example, that there are statistically different types of transitions characterized by different scaling laws.
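
    As a rough illustration of locating an abrupt transition in a 1-D signal, the sketch below uses a Haar-like moving-difference response, which peaks at step changes; the synthetic step signal and window size are assumptions standing in for a real JET trace, not the paper's wavelet pipeline:

```python
# Haar-like step localization on a synthetic signal with a step at index 600.
import numpy as np

def haar_response(x, w):
    """Difference of means over leading/trailing windows; peaks at step changes."""
    r = np.zeros(len(x))
    for t in range(w, len(x) - w):
        r[t] = x[t:t + w].mean() - x[t - w:t].mean()
    return r

t = np.arange(1000)
signal = np.where(t < 600, 1.0, 3.0) + 0.2 * np.random.randn(1000)  # noisy step
resp = haar_response(signal, w=50)
print("estimated transition index:", int(np.argmax(np.abs(resp))))  # ~600
```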

  12. SEGMENTATION OF MITOCHONDRIA IN ELECTRON MICROSCOPY IMAGES USING ALGEBRAIC CURVES.

    PubMed

    Seyedhosseini, Mojtaba; Ellisman, Mark H; Tasdizen, Tolga

    2013-01-01

    High-resolution microscopy techniques have been used to generate large volumes of data with enough details for understanding the complex structure of the nervous system. However, automatic techniques are required to segment cells and intracellular structures in these multi-terabyte datasets and make anatomical analysis possible on a large scale. We propose a fully automated method that exploits both shape information and regional statistics to segment irregularly shaped intracellular structures such as mitochondria in electron microscopy (EM) images. The main idea is to use algebraic curves to extract shape features together with texture features from image patches. Then, these powerful features are used to learn a random forest classifier, which can predict mitochondria locations precisely. Finally, the algebraic curves together with regional information are used to segment the mitochondria at the predicted locations. We demonstrate that our method outperforms the state-of-the-art algorithms in segmentation of mitochondria in EM images.

  13. Implementation of neuromorphic systems: from discrete components to analog VLSI chips (testing and communication issues).

    PubMed

    Dante, V; Del Giudice, P; Mattia, M

    2001-01-01

    We review a series of implementations of electronic devices aiming to imitate, to some extent, the structure and function of simple neural systems, with particular emphasis on communication issues. We first provide a short overview of the general features of such "neuromorphic" devices and the implications of setting up "tests" for them. We then review the developments directly related to our work at the Istituto Superiore di Sanità (ISS): a pilot electronic neural network implementing a simple classifier, autonomously developing internal representations of incoming stimuli; an output network, collecting information from the previous classifier and extracting the relevant part to be forwarded to the observer; an analog VLSI (very large scale integration) neural chip implementing a recurrent network of spiking neurons and plastic synapses, and the test setup for it; a board designed to interface the standard PCI (peripheral component interconnect) bus of a PC with a special purpose, asynchronous bus for communication among neuromorphic chips; and a short, preliminary account of an application-oriented device taking advantage of the above communication infrastructure.

  14. A compressed sensing method with analytical results for lidar feature classification

    NASA Astrophysics Data System (ADS)

    Allen, Josef D.; Yuan, Jiangbo; Liu, Xiuwen; Rahmes, Mark

    2011-04-01

    We present an innovative way to autonomously classify LiDAR points into bare earth, building, vegetation, and other categories. One desirable product of LiDAR data is the automatic classification of the points in the scene. Our algorithm automatically classifies scene points using compressed sensing methods via Orthogonal Matching Pursuit algorithms, utilizing a generalized K-Means clustering algorithm to extract buildings and foliage from a Digital Surface Model (DSM). This technology reduces manual editing while being cost effective for large scale automated global scene modeling. Quantitative analyses are provided using Receiver Operating Characteristic (ROC) curves to show probability of detection and false alarm for buildings vs. vegetation classification. Histograms are shown with sample size metrics. Our inpainting algorithms then fill the voids where buildings and vegetation were removed, utilizing Computational Fluid Dynamics (CFD) techniques and Partial Differential Equations (PDE) to create an accurate Digital Terrain Model (DTM) [6]. Inpainting preserves building height contour consistency and edge sharpness of identified inpainted regions. Qualitative results illustrate other benefits such as terrain inpainting's unique ability to minimize or eliminate undesirable terrain data artifacts.
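
    The greedy sparse-coding step named in the abstract, Orthogonal Matching Pursuit, can be sketched as follows; the random dictionary and synthetic sparse signal are assumptions for demonstration only:

```python
# Minimal Orthogonal Matching Pursuit: greedily pick the best-correlated atom,
# refit the selected atoms by least squares, repeat on the residual.
import numpy as np

def omp(D, y, n_nonzero):
    """Recover a sparse coefficient vector x such that D @ x ~ y."""
    residual, support = y.copy(), []
    for _ in range(n_nonzero):
        correlations = np.abs(D.T @ residual)
        support.append(int(np.argmax(correlations)))   # best-matching atom
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)                 # unit-norm dictionary atoms
x_true = np.zeros(256)
x_true[[10, 100, 200]] = [1.0, -0.5, 2.0]
x_hat = omp(D, D @ x_true, n_nonzero=3)
print(np.nonzero(x_hat)[0])                    # typically recovers atoms 10, 100, 200
```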

  15. Large-scale urban point cloud labeling and reconstruction

    NASA Astrophysics Data System (ADS)

    Zhang, Liqiang; Li, Zhuqiang; Li, Anjian; Liu, Fangyu

    2018-04-01

    The large number of object categories and many overlapping or closely neighboring objects in large-scale urban scenes pose great challenges for point cloud classification. In this paper, a novel framework is proposed for the classification and reconstruction of airborne laser scanning point cloud data. To label point clouds, we present a rectified linear units neural network named ReLu-NN, in which rectified linear units (ReLu) are taken as the activation function instead of the traditional sigmoid in order to speed up convergence. Since the features of the point cloud are sparse, we reduce the number of neurons by dropout to avoid over-fitting during training. The set of feature descriptors for each 3D point is encoded through self-taught learning and forms a discriminative feature representation which is taken as the input of the ReLu-NN. The segmented building points are consolidated through an edge-aware point set resampling algorithm, and then they are reconstructed into 3D lightweight models using the 2.5D contouring method (Zhou and Neumann, 2010). Compared with deep learning approaches, the ReLu-NN can easily classify unorganized point clouds without rasterizing the data, and it does not need a large number of training samples. Most of the parameters in the network are learned, and thus the intensive parameter tuning cost is significantly reduced. Experimental results on various datasets demonstrate that the proposed framework achieves better performance than other related algorithms in terms of classification accuracy and reconstruction quality.

  16. A characterization of workflow management systems for extreme-scale applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferreira da Silva, Rafael; Filgueira, Rosa; Pietri, Ilia

    The automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compelling case for a new generation of advances in high-performance computing, commonly termed extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.

  17. A characterization of workflow management systems for extreme-scale applications

    DOE PAGES

    Ferreira da Silva, Rafael; Filgueira, Rosa; Pietri, Ilia; ...

    2017-02-16

    The automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compelling case for a new generation of advances in high-performance computing, commonly termed extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.

  18. Scale-adaptive compressive tracking with feature integration

    NASA Astrophysics Data System (ADS)

    Liu, Wei; Li, Jicheng; Chen, Xiao; Li, Shuxin

    2016-05-01

    Numerous tracking-by-detection methods have been proposed for robust visual tracking, among which compressive tracking (CT) has obtained some promising results. A scale-adaptive CT method based on multi-feature integration is presented to improve the robustness and accuracy of CT. We introduce a keypoint-based model to achieve accurate scale estimation, which additionally gives a prior location of the target. Furthermore, exploiting the high efficiency of a data-independent random projection matrix, multiple features are integrated into an effective appearance model to construct the naïve Bayes classifier. Finally, an adaptive update scheme is proposed to update the classifier conservatively. Experiments on various challenging sequences demonstrate substantial improvements by our proposed tracker over CT and other state-of-the-art trackers in terms of dealing with scale variation, abrupt motion, deformation, and illumination changes.

  19. Floodplain Mapping for the Continental United States Using Machine Learning Techniques and Watershed Characteristics

    NASA Astrophysics Data System (ADS)

    Jafarzadegan, K.; Merwade, V.; Saksena, S.

    2017-12-01

    Using conventional hydrodynamic methods for floodplain mapping in large-scale and data-scarce regions is problematic due to the high cost of these methods, the lack of reliable data, and uncertainty propagation. In this study a new framework is proposed to generate 100-year floodplains for any gauged or ungauged watershed across the United States (U.S.). This framework uses Flood Insurance Rate Maps (FIRMs) together with topographic, climatic and land use data, which are freely available for the entire U.S., for floodplain mapping. The framework consists of three components: a Random Forest classifier for watershed classification, a Probabilistic Threshold Binary Classifier (PTBC) for generating the floodplains, and a lookup table for linking the Random Forest classifier to the PTBC. The effectiveness and reliability of the proposed framework are tested on 145 watersheds from various geographical locations in the U.S. The validation results show that around 80 percent of the watersheds are predicted well, 14 percent have an acceptable fit, and less than five percent are predicted poorly compared to FIRMs. Another advantage of this framework is its ability to generate floodplains for all small rivers and tributaries. Due to its high accuracy and efficiency, the framework can be used as a preliminary decision-making tool to generate 100-year floodplain maps for data-scarce regions and all tributaries where hydrodynamic methods are difficult to use.
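
    A minimal sketch of the framework's first component, a Random Forest assigning watersheds to classes from catchment descriptors, might look like the following; the feature names, labels and data are invented placeholders, not the study's actual predictors:

```python
# Hedged sketch: Random Forest classification of watersheds from synthetic
# catchment descriptors (columns are placeholder features, not the paper's).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
# columns: mean slope (deg), annual precip (mm), % developed land -- all fake
X = rng.random((145, 3)) * [30.0, 2000.0, 100.0]
y = (X[:, 0] / 30 + X[:, 1] / 2000 > 1.0).astype(int)   # invented class labels

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print("class of a new watershed:", rf.predict([[12.0, 900.0, 35.0]])[0])
```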

  20. A space-based classification system for RF transients

    NASA Astrophysics Data System (ADS)

    Moore, K. R.; Call, D.; Johnson, S.; Payne, T.; Ford, W.; Spencer, K.; Wilkerson, J. F.; Baumgart, C.

    The FORTE (Fast On-Orbit Recording of Transient Events) small satellite is scheduled for launch in mid 1995. The mission is to measure and classify VHF (30-300 MHz) electromagnetic pulses, primarily due to lightning, within a high noise environment dominated by continuous wave carriers such as TV and FM stations. The FORTE Event Classifier will use specialized hardware to implement signal processing and neural network algorithms that perform onboard classification of RF transients and carriers. Lightning events will also be characterized with optical data telemetered to the ground. A primary mission science goal is to develop a comprehensive understanding of the correlation between the optical flash and the VHF emissions from lightning. By combining FORTE measurements with ground measurements and/or active transmitters, other science issues can be addressed. Examples include the correlation of global precipitation rates with lightning flash rates and location, the effects of large scale structures within the ionosphere (such as traveling ionospheric disturbances and horizontal gradients in the total electron content) on the propagation of broad bandwidth RF signals, and various areas of lightning physics. Event classification is a key feature of the FORTE mission. Neural networks are promising candidates for this application. The authors describe the proposed FORTE Event Classifier flight system, which consists of a commercially available digital signal processing board and a custom board, and discuss work on signal processing and neural network algorithms.

  1. Objectively classifying Southern Hemisphere extratropical cyclones

    NASA Astrophysics Data System (ADS)

    Catto, Jennifer

    2016-04-01

    There has been a long tradition in attempting to separate extratropical cyclones into different classes depending on their cloud signatures, airflows, synoptic precursors, or upper-level flow features. Depending on these features, the cyclones may have different impacts, for example in their precipitation intensity. It is important, therefore, to understand how the distribution of different cyclone classes may change in the future. Many of the previous classifications have been performed manually. In order to be able to evaluate climate models and understand how extratropical cyclones might change in the future, we need to be able to use an automated method to classify cyclones. Extratropical cyclones have been identified in the Southern Hemisphere from the ERA-Interim reanalysis dataset with a commonly used identification and tracking algorithm that employs 850 hPa relative vorticity. A clustering method applied to large-scale fields from ERA-Interim at the time of cyclone genesis (when the cyclone is first detected), has been used to objectively classify identified cyclones. The results are compared to the manual classification of Sinclair and Revell (2000) and the four objectively identified classes shown in this presentation are found to match well. The relative importance of diabatic heating in the clusters is investigated, as well as the differing precipitation characteristics. The success of the objective classification shows its utility in climate model evaluation and climate change studies.

  2. A new ionospheric storm scale based on TEC and foF2 statistics

    NASA Astrophysics Data System (ADS)

    Nishioka, Michi; Tsugawa, Takuya; Jin, Hidekatsu; Ishii, Mamoru

    2017-01-01

    In this paper, we propose the I-scale, a new ionospheric storm scale for general users in various regions of the world. With the I-scale, ionospheric storms can be classified at any season, local time, and location. Since the ionospheric condition depends strongly on many factors, such as solar irradiance, energy input from the magnetosphere, and lower atmospheric activity, it had been difficult to scale ionospheric storms, which are mainly caused by solar and geomagnetic activities. In this study, a statistical analysis was carried out for total electron content (TEC) and the F2 layer critical frequency (foF2) in Japan over the 18 years from 1997 to 2014. Seasonal, local time, and latitudinal dependences of TEC and foF2 variabilities are excluded by normalizing each percentage variation using its statistical standard deviation. The I-scale is defined by setting thresholds on the normalized values, yielding seven categories: I0, IP1, IP2, IP3, IN1, IN2, and IN3. I0 represents a quiet state, and IP1 (IN1), IP2 (IN2), and IP3 (IN3) represent moderate, strong, and severe positive (negative) storms, respectively. The proposed I-scale can be used for other locations, such as polar and equatorial regions, and can serve as a standardized scale to help users assess the impact of space weather on their systems.
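
    The scaling logic can be sketched as: normalize a percentage deviation by its statistical standard deviation, then bin the result by sign and magnitude. The thresholds below are invented for illustration, since the abstract does not give the paper's exact cut-points:

```python
# Toy I-scale binning: the sigma-based cut-points are assumptions, not the
# thresholds used in the paper.
def i_scale(pct_deviation, sigma, cuts=(1.0, 2.0, 3.0)):
    """Map a normalized TEC/foF2 deviation to I0, IP1-IP3 or IN1-IN3."""
    z = pct_deviation / sigma
    sign = "P" if z > 0 else "N"
    mag = abs(z)
    if mag < cuts[0]:
        return "I0"                                   # quiet state
    level = 1 + (mag >= cuts[1]) + (mag >= cuts[2])   # moderate/strong/severe
    return f"I{sign}{level}"

print(i_scale(35.0, sigma=12.0))   # z ~ 2.9 -> 'IP2' (strong positive storm)
```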

  3. Evaluation of the MMPI-2-RF for Detecting Over-reported Symptoms in a Civil Forensic and Disability Setting.

    PubMed

    Nguyen, Constance T; Green, Debbie; Barr, William B

    2015-01-01

    This study investigated the classification accuracy of the Minnesota Multiphasic Personality Inventory-2-Restructured Form validity scales in a sample of disability claimants and civil forensic litigants. A criterion-groups design was used, classifying examinees as "Failed Slick Criteria" through low performance on at least two performance validity indices (stand-alone or embedded) and "Passed Slick Criteria." The stand-alone measures included the Test of Memory Malingering and the Dot Counting Test. The embedded indices were extracted from the Wechsler Adult Intelligence Scales Digit Span and Vocabulary subtests, the California Verbal Learning Test-II, and the Wisconsin Card Sorting Test. Among groups classified by primary complaints at the time of evaluation, those alleging neurological conditions were more frequently classified as Failed Slick Criteria than those alleging psychiatric or medical conditions. Among those with neurological or psychiatric complaints, the F-r, FBS-r, and RBS scales differentiated between those who Passed Slick Criteria from those who Failed Slick Criteria. The Fs scale was also significantly higher in the Failed Slick Criteria compared to Passed Slick Criteria examinees within the psychiatric complaints group. Results indicated that interpretation of scale scores should take into account the examinees' presenting illness. While this study has limitations, it highlights the possibility of different cutoffs depending on the presenting complaints and the need for further studies to cross-validate the results.

  4. Automated detection of microaneurysms using scale-adapted blob analysis and semi-supervised learning.

    PubMed

    Adal, Kedir M; Sidibé, Désiré; Ali, Sharib; Chaum, Edward; Karnowski, Thomas P; Mériaudeau, Fabrice

    2014-04-01

    Despite several attempts, automated detection of microaneurysms (MA) from digital fundus images remains an open issue, due to the subtle nature of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are introduced to characterize these blob regions. A semi-supervised learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier that can detect true MAs. The developed system is built using only a few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques as well as the applicability of the proposed features to fundus image analysis. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  5. Pedestrian detection in crowded scenes with the histogram of gradients principle

    NASA Astrophysics Data System (ADS)

    Sidla, O.; Rosner, M.; Lypetskyy, Y.

    2006-10-01

    This paper describes a close to real-time, scale-invariant implementation of a pedestrian detector system based on the Histogram of Oriented Gradients (HOG) principle. Salient HOG features are first selected from a manually created, very large database of samples with an evolutionary optimization procedure that directly trains a polynomial Support Vector Machine (SVM). Real-time operation is achieved by a cascaded 2-step classifier which first uses a very fast linear SVM (with the same features as the polynomial SVM) to reject most of the irrelevant detections and then computes the decision function with a polynomial SVM on the remaining set of candidate detections. Scale invariance is achieved by running the detector of constant size on scaled versions of the original input images and by clustering the results over all resolutions. The pedestrian detection system has been implemented in two versions: i) full body detection, and ii) upper body only detection. The latter is especially suited for very busy and crowded scenarios. On a state-of-the-art PC it is able to run at a frequency of 8-20 frames/sec.
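
    The cascade idea, a cheap linear stage rejecting most candidate windows before a costlier polynomial-kernel stage decides the survivors, can be sketched as follows; the synthetic data, features and margin value are assumptions, not the trained detector:

```python
# Two-stage cascade sketch: fast linear SVM rejects confident negatives,
# slower polynomial SVM classifies the remainder. Data is synthetic.
import numpy as np
from sklearn.svm import LinearSVC, SVC

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 36))                 # fake HOG-like features
y = (X[:, :6].sum(axis=1) > 0).astype(int)          # fake labels

fast = LinearSVC(dual=False).fit(X, y)              # stage 1: cheap linear SVM
slow = SVC(kernel="poly", degree=2).fit(X, y)       # stage 2: polynomial SVM

def cascade_predict(windows, margin=-0.5):
    """Stage 1 rejects confident negatives; stage 2 decides the survivors."""
    scores = fast.decision_function(windows)
    keep = scores > margin                  # only ambiguous/positive windows go on
    out = np.zeros(len(windows), dtype=int)
    out[keep] = slow.predict(windows[keep])
    return out

print(cascade_predict(X[:10]))
```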

  6. Scale-free correlations in the geographical spreading of obesity

    NASA Astrophysics Data System (ADS)

    Gallos, Lazaros; Barttfeld, Pablo; Havlin, Shlomo; Sigman, Mariano; Makse, Hernan

    2012-02-01

    Obesity levels have been universally increasing. A crucial problem is to determine the influence of global and local drivers behind the obesity epidemic, to properly guide effective policies. Despite the numerous factors that affect the obesity evolution, we show a remarkable regularity expressed in a predictable pattern of spatial long-range correlations in the geographical spreading of obesity. We study the spatial clustering of obesity and a number of related health and economic indicators, and we use statistical physics methods to characterize the growth of the resulting clusters. The resulting scaling exponents allow us to broadly classify these indicators into two separate universality classes, weakly or strongly correlated. Weak correlations are found in generic human activity such as population distribution and the growth of the whole economy. Strong correlations are recovered, among others, for obesity, diabetes, and the food industry sectors associated with food consumption. Obesity turns out to be a global problem where local details are of little importance. The long-range correlations suggest influence that extends to large scales, hinting that the physical model of obesity clustering can be mapped to a long-range correlated percolation process.

  7. Signal-3L: A 3-layer approach for predicting signal peptides.

    PubMed

    Shen, Hong-Bin; Chou, Kuo-Chen

    2007-11-16

    Functioning as an "address tag" that directs nascent proteins to their proper cellular and extracellular locations, signal peptides have become a crucial tool in finding new drugs or reprogramming cells for gene therapy. To use such a tool effectively and in a timely manner, however, the first important step is to develop an automated method for rapidly and accurately identifying the signal peptide of a given nascent protein. With the avalanche of new protein sequences generated in the post-genomic era, this challenge has become even more urgent and critical. In this paper, we have developed a novel method for predicting signal peptide sequences and their cleavage sites in human, plant, animal, eukaryotic, Gram-positive, and Gram-negative protein sequences, respectively. The new predictor is called Signal-3L and consists of three prediction engines working, respectively, for the following three progressively deepening layers: (1) identifying a query protein as secretory or non-secretory by an ensemble classifier formed by fusing many individual OET-KNN (optimized evidence-theoretic K nearest neighbor) classifiers operated in various dimensions of PseAA (pseudo amino acid) composition spaces; (2) selecting a set of candidates for the possible signal peptide cleavage sites of a query secretory protein by a subsite-coupled discrimination algorithm; (3) determining the final cleavage site by fusing the global sequence alignment outcome for each of the aforementioned candidates through a voting system. Signal-3L features high prediction success rates with short computational times, and hence is particularly useful for the analysis of large-scale datasets. Signal-3L is freely available as a web-server at http://chou.med.harvard.edu/bioinf/Signal-3L/ or http://202.120.37.186/bioinf/Signal-3L, where, to further support the demand of the related areas, the signal peptides identified by Signal-3L for all the protein entries in the Swiss-Prot databank that do not have signal peptide annotations or are annotated with uncertain terms, but are classified by Signal-3L as secretory proteins, are provided in a downloadable file. The large-scale file is prepared with Microsoft Excel and named "Tab-Signal-3L.xls", and will be updated once a year to include new protein entries and reflect the continuous development of Signal-3L.

  8. A technique for extrapolating and validating forest cover across large regions. Calibrating AVHRR data with TM data

    Treesearch

    L.R. Iverson; E.A. Cook; R.L. Graham

    1989-01-01

    An approach to extending high-resolution forest cover information across large regions is presented and validated. Landsat Thematic Mapper (TM) data were classified into forest and nonforest for a portion of Jackson County, Illinois. The classified TM image was then used to determine the relationship between forest cover and the spectral signature of Advanced Very High Resolution Radiometer (AVHRR) data.

  9. Parallel processing implementations of a contextual classifier for multispectral remote sensing data

    NASA Technical Reports Server (NTRS)

    Siegel, H. J.; Swain, P. H.; Smith, B. W.

    1980-01-01

    Contextual classifiers are being developed as a method to exploit the spatial/spectral context of a pixel to achieve accurate classification. Classification algorithms such as the contextual classifier typically require large amounts of computation time. One way to reduce the execution time of these tasks is through the use of parallelism. The applicability of the CDC flexible processor system and of a proposed multimicroprocessor system (PASM) for implementing contextual classifiers is examined.

  10. Statistical properties of a cloud ensemble - A numerical study

    NASA Technical Reports Server (NTRS)

    Tao, Wei-Kuo; Simpson, Joanne; Soong, Su-Tzai

    1987-01-01

    The statistical properties of cloud ensembles under a specified large-scale environment, such as mass flux by cloud drafts and vertical velocity as well as the condensation and evaporation associated with these cloud drafts, are examined using a three-dimensional numerical cloud ensemble model described by Soong and Ogura (1980) and Tao and Soong (1986). The cloud drafts are classified as active and inactive, and separate contributions to cloud statistics in areas of different cloud activity are then evaluated. The model results compare well with results obtained from aircraft measurements of a well-organized ITCZ rainband that occurred on August 12, 1974, during the Global Atmospheric Research Program's Atlantic Tropical Experiment.

  11. Order No. 0020 for the establishment of methods of reforestation, 19 March 1987.

    PubMed

    1988-01-01

    In light of the "Green Movement" policy and the development of the Colombia Major National Reforestation Plan, this Order establishes the following methods of promoting reforestation: 1) advice and guidance to those concerned with tree planting; 2) forestry planning for farms; and 3) grants of trees whenever possible in the proportions indicated in the Order. The Order classifies beneficiaries into small-size tree farmers (up to 20 hectares, and 80% of whose income derives from that activity); and medium- and large-scale tree farmers (more than 20 hectares). Other provisions of the Order deal with applications, dates of completion of requirements, and the actual methods of promoting reforestation.

  12. Classification of the Gabon SAR Mosaic Using a Wavelet Based Rule Classifier

    NASA Technical Reports Server (NTRS)

    Simard, Marc; Saatchi, Sasan; DeGrandi, Gianfranco

    2000-01-01

    A method is developed for semi-automated classification of SAR images of the tropical forest. Information is extracted using the wavelet transform (WT), which allows for the extraction of structural information in the image as a function of scale. To classify the SAR image, a Decision Tree Classifier is used, with pruning applied to optimize classification rate versus tree size. The results give explicit insight into the type of information useful for a given class.

  13. Natural streamflow simulation for two largest river basins in Poland: a baseline for identification of flow alterations

    NASA Astrophysics Data System (ADS)

    Piniewski, Mikołaj

    2016-05-01

    The objective of this study was to apply a previously developed large-scale and high-resolution SWAT model of the Vistula and the Odra basins, calibrated with a focus on natural flow simulation, in order to assess the impact of three different dam reservoirs on streamflow using the Indicators of Hydrologic Alteration (IHA). A tailored spatial calibration approach was designed, in which calibration was focused on a large set of relatively small, non-nested sub-catchments with a semi-natural flow regime. These were classified into calibration clusters based on flow statistics similarity. After calibration and validation gave overall positive results, the calibrated parameter values were transferred to the remaining part of the basins using an approach based on the hydrological similarity of donor and target catchments. The calibrated model was applied in three case studies to assess the effect of dam reservoirs (the Włocławek, Siemianówka and Czorsztyn Reservoirs) on streamflow alteration. Both the assessment based on gauged streamflow (Before-After design) and that based on simulated natural streamflow showed large alterations in selected flow statistics related to magnitude, duration, high and low flow pulses, and rate of change. Benefits of using a large-scale and high-resolution hydrological model for the assessment of streamflow alteration include: (1) providing an alternative or complementary approach to classical Before-After designs, (2) isolating the climate variability effect from the dam (or any other source of alteration) effect, and (3) providing a practical tool that can be applied at a range of spatial scales over a large area, such as a country, in a uniform way. Thus, the presented approach can be applied to designing more natural flow regimes, which is crucial for river and floodplain ecosystem restoration in the context of the European Union's policy on environmental flows.

  14. Nowcasting Earthquakes: A Comparison of Induced Earthquakes in Oklahoma and at the Geysers, California

    NASA Astrophysics Data System (ADS)

    Luginbuhl, Molly; Rundle, John B.; Hawkins, Angela; Turcotte, Donald L.

    2018-01-01

    Nowcasting is a new method of statistically classifying seismicity and seismic risk (Rundle et al. 2016). In this paper, the method is applied to the induced seismicity at the Geysers geothermal region in California and the induced seismicity due to fluid injection in Oklahoma. Nowcasting utilizes the catalogs of seismicity in these regions. Two earthquake magnitudes are selected, one large, say M_λ ≥ 4, and one small, say M_σ ≥ 2. The method utilizes the number of small earthquakes that occur between pairs of large earthquakes, and the cumulative probability distribution of these values is obtained. The earthquake potential score (EPS) is defined by the number of small earthquakes that have occurred since the last large earthquake: the point where this count falls on the cumulative probability distribution of interevent counts gives the EPS. A major advantage of nowcasting is that it utilizes "natural time", earthquake counts between events, rather than clock time. Thus, it is not necessary to decluster aftershocks, and the results remain applicable if the level of induced seismicity varies in time. The application of natural time to the accumulation of seismic hazard depends on the applicability of Gutenberg-Richter (GR) scaling. The increasing number of small earthquakes that occur after a large earthquake can be scaled to give the risk of a large earthquake occurring. To illustrate our approach, we utilize the number of M_σ ≥ 2.75 earthquakes in Oklahoma to nowcast the number of M_λ ≥ 4.0 earthquakes in Oklahoma. The applicability of the scaling is illustrated during the rapid build-up of injection-induced seismicity between 2012 and 2016, and the subsequent reduction in seismicity associated with a reduction in fluid injections. The same method is applied to the geothermal-induced seismicity at the Geysers, California, for comparison.
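
    The EPS computation itself is simple enough to sketch: it is the empirical cumulative fraction of past interevent counts that do not exceed the current count. The interevent counts below are invented, not the Oklahoma catalog:

```python
# Toy nowcasting: EPS as the empirical CDF value of the current small-event
# count among past interevent counts. All numbers are invented.
import numpy as np

# number of small (M >= 2.75) events between successive large (M >= 4) events
interevent_counts = np.array([12, 40, 7, 55, 23, 31, 90, 18, 44, 27])

def eps(current_count, history):
    """Fraction of past interevent counts not exceeding the current count."""
    return float(np.mean(history <= current_count))

print(f"EPS = {eps(35, interevent_counts):.2f}")  # 0.60 for this toy history
```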

  15. New Data Pre-processing on Assessing of Obstructive Sleep Apnea Syndrome: Line Based Normalization Method (LBNM)

    NASA Astrophysics Data System (ADS)

    Akdemir, Bayram; Güneş, Salih; Yosunkaya, Şebnem

    Sleep disorders are very common but often unrecognized illnesses among the public. Obstructive Sleep Apnea Syndrome (OSAS) is characterized by a decreased oxygen saturation level and repetitive upper respiratory tract obstruction episodes during full night sleep. In the present study, we have proposed a novel data normalization method called the Line Based Normalization Method (LBNM) to evaluate OSAS, using a real dataset obtained from a polysomnography device as a diagnostic tool for patients clinically suspected of suffering from OSAS. Here, we have combined the LBNM with classification methods comprising the C4.5 decision tree classifier and an Artificial Neural Network (ANN) to diagnose OSAS. Firstly, each clinical feature in the OSAS dataset is scaled by the LBNM to the range [0,1]. Secondly, the normalized OSAS dataset is classified using the different classifier algorithms, namely the C4.5 decision tree classifier and the ANN. The proposed normalization method was compared with the min-max normalization, z-score normalization, and decimal scaling methods existing in the literature on the diagnosis of OSAS. The LBNM produced very promising results in the assessment of OSAS. This method could also be applied to other biomedical datasets.
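
    The abstract does not spell out the LBNM formula itself, so the sketch below shows only the three baseline normalizations it is compared against, applied to one made-up clinical feature:

```python
# Baseline normalizations compared against LBNM in the paper, on made-up data.
import numpy as np

x = np.array([40.0, 55.0, 62.0, 71.0, 90.0])   # one hypothetical clinical feature

min_max = (x - x.min()) / (x.max() - x.min())           # rescales to [0, 1]
z_score = (x - x.mean()) / x.std()                      # zero mean, unit variance
decimal = x / 10 ** np.ceil(np.log10(np.abs(x).max()))  # shifts the decimal point

print(min_max, z_score, decimal, sep="\n")
```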

  16. Identification of stair climbing ability levels in community-dwelling older adults based on the geometric mean of stair ascent and descent speed: The GeMSS classifier.

    PubMed

    Mayagoitia, Ruth E; Harding, John; Kitchen, Sheila

    2017-01-01

    The aim was to develop a quantitative approach to identify three stair-climbing ability levels in older adults: no, somewhat and considerable difficulty. The timed-up-and-go test, six-minute-walk test, and Berg balance scale were used for statistical comparison with a new stair-climbing ability classifier based on the geometric mean of stair speeds (GeMSS) in ascent and descent, measured on a flight of eight stairs with a 28° pitch in the housing unit where the participants, 28 urban older adults (16 women; 62-94 years), lived. Ordinal logistic regression revealed that the thresholds between the three ability levels for each functional test were more stringent than thresholds found in the literature to classify walking ability levels. Though this was a small study, the classifier shows promise for early identification of difficulties with stairs, in order to make timely preventative interventions. Further studies are necessary to obtain scaling factors for stairs with other pitches. Copyright © 2016 Elsevier Ltd. All rights reserved.
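
    The classifier's core, taking the geometric mean of ascent and descent speeds and binning it with ordinal thresholds, can be sketched as follows; the speeds and cut-points are invented, not the study's fitted thresholds:

```python
# GeMSS-style sketch: geometric mean of two stair speeds, binned by invented
# ordinal cut-points.
import math

def gemss(ascent_speed, descent_speed, cuts=(0.35, 0.6)):
    """Classify stair-climbing ability from the geometric mean of two speeds (m/s)."""
    g = math.sqrt(ascent_speed * descent_speed)
    if g < cuts[0]:
        return g, "considerable difficulty"
    if g < cuts[1]:
        return g, "some difficulty"
    return g, "no difficulty"

print(gemss(0.55, 0.70))   # g ~ 0.62 -> 'no difficulty'
```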

  17. Comparison of fall prediction by the Hessisch Oldendorf Fall Risk Scale and the Fall Risk Scale by Huhn in neurological rehabilitation: an observational study.

    PubMed

    Hermann, Olena; Schmidt, Simone B; Boltzmann, Melanie; Rollnik, Jens D

    2018-05-01

    The objective was to evaluate the scale performance of the newly developed Hessisch Oldendorf Fall Risk Scale (HOSS) for classifying fallers and non-fallers, in comparison with the Risk of Falling Scale by Huhn (FSH), a frequently used assessment tool. A prospective observational trial was conducted in a large specialized neurological rehabilitation facility. The study population (n = 690) included neurological and neurosurgery patients undergoing neurological rehabilitation with varying levels of disability; roughly half of the patients were independent in the activities of daily living (ADL) and half were dependent. The fall risk of each patient was assessed by HOSS and FSH within the first seven days after admission. Falls during rehabilitation were compared with HOSS and FSH scores as well as the corresponding fall-risk classifications. Scale performance, including sensitivity and specificity, was calculated for both scales. A total of 107 (15.5%) patients experienced at least one fall. In general, fallers were characterized by an older age, a prolonged length of stay, and a lower Barthel Index (higher dependence in the ADL) on admission than non-fallers. The verification of fall prediction showed a sensitivity of 83% and a specificity of 64% for the HOSS, and a sensitivity of 98% with a specificity of 12% for the FSH. The HOSS shows adequate sensitivity, higher specificity, and therefore better scale performance than the FSH. Thus, the HOSS might be superior to existing assessments.
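
    For reference, the reported sensitivity and specificity follow directly from a 2x2 table of predicted risk versus observed falls; a generic sketch (variable names hypothetical):

        def sensitivity_specificity(predicted_high_risk, fell):
            """Parallel boolean sequences, one entry per patient."""
            pairs = list(zip(predicted_high_risk, fell))
            tp = sum(p and f for p, f in pairs)
            tn = sum(not p and not f for p, f in pairs)
            fp = sum(p and not f for p, f in pairs)
            fn = sum(not p and f for p, f in pairs)
            return tp / (tp + fn), tn / (tn + fp)  # sensitivity, specificity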

  18. Attitudes toward elective abortion: preliminary evidence of validity for the Personal Beliefs Scale.

    PubMed

    Embree, R A

    1998-06-01

    On the Personal Beliefs Scale of Embree and Embree, item ratings obtained from 100 Midwestern undergraduates were used to classify participants into neutral, other, or second-order psychosomaticism mind-body belief types. Responses to a hypothetical abnormal pregnancy were used to measure attitude toward abortion (prochoice vs antichoice), the meaning of abortion (not murder vs murder), and empathy for the unborn (low, moderate, or high). Values of χ² were statistically significant for students classified by mind-body belief type versus attitude toward abortion and the meaning of abortion, but not for empathy for the unborn.

  19. Coefficient of variation for use in crop area classification across multiple climates

    NASA Astrophysics Data System (ADS)

    Whelen, Tracy; Siqueira, Paul

    2018-05-01

    In this study, the coefficient of variation (CV) is introduced as a unitless statistical measure for the classification of croplands using synthetic aperture radar (SAR) data. As a measure of change, the CV captures the changing backscatter responses caused by cycles of planting, growing, and harvesting, and is thus able to differentiate these areas from more static forest or urban areas. Pixels with CV values above a given threshold are classified as crops, and those below the threshold as non-crops. This paper uses cross-polarized L-band SAR data from the ALOS PALSAR satellite to classify eleven regions across the United States, covering a wide range of major crops and climates. Two separate classifications were performed: the first targeted the optimum classification threshold for each dataset, and the second used a generalized threshold for all datasets to simulate a large-scale operational situation. Overall accuracies ranged from 66% to 81% for the first phase of classification and from 62% to 84% for the second. Visual inspection of the results shows numerous possibilities for improving the classifications while still using the same classification method, including increasing the number and temporal frequency of input images, in order to better capture phenological events and mitigate the effects of major precipitation events, as well as using more accurate ground truth data. These improvements would make the CV method a viable tool for monitoring agriculture throughout the year on a global scale.
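
    A minimal sketch of the CV-threshold classification over a temporal stack of backscatter images (array shapes and the threshold value are illustrative, and linear-power backscatter is assumed so that the mean stays positive):

        import numpy as np

        def classify_crops(stack, threshold=0.5):
            """stack: (time, rows, cols) array of SAR backscatter; returns a boolean crop mask."""
            mean = stack.mean(axis=0)
            std = stack.std(axis=0)
            cv = np.divide(std, mean, out=np.zeros_like(std), where=mean != 0)
            return cv > threshold  # high temporal variability -> cropland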

  20. Comparative analysis of the Parent Attitudes about Childhood Vaccines (PACV) short scale and the five categories of vaccine acceptance identified by Gust et al.

    PubMed

    Oladejo, Omolade; Allen, Kristen; Amin, Avnika; Frew, Paula M; Bednarczyk, Robert A; Omer, Saad B

    2016-09-22

    There is a need to develop a standardized tool to aid in identifying, measuring, and classifying the unique needs of vaccine-hesitant parents (VHPs); this will also assist in designing tailored interventions to address those needs. The Parent Attitudes about Childhood Vaccines (PACV) short scale developed by Opel et al. and the Gust et al. vaccine acceptance categories have been acknowledged as potentially useful tools to measure parental vaccine hesitancy, though the PACV short scale requires further validation. In our study, we evaluated how the Gust et al. vaccine acceptance categories correspond with the PACV short scale. As part of a larger study on vaccine attitudes using the PACV short scale and the Gust et al. vaccine acceptance categories, we assessed the correlation between the two measures using the Spearman correlation coefficient, and the association between them using the Cochran-Mantel-Haenszel test of association. We used logistic regression modelling to compare the association between a child's up-to-date immunization status and (a) the PACV short scale and (b) the Gust et al. vaccine acceptance categories. The PACV short scale and the Gust et al. vaccine acceptance categories were positively correlated (r=0.6, df=198, p<0.05), and the Cochran-Mantel-Haenszel test yielded a statistically significant association (p<0.05). The two scales similarly predicted children's up-to-date immunization status for all recommended childhood vaccines. The ability of the PACV short scale to identify and classify parental vaccine hesitancy is similar to classification using the Gust et al. vaccine acceptance categories, and both behave as linear measures. The PACV short scale is recommended for screening parents at their first pediatric visit because it is easier to administer. A clearer understanding of how to classify parental vaccine hesitancy can be used to design tailored interventions that address parents' specific needs. Copyright © 2016 Elsevier Ltd. All rights reserved.
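
    As a pointer for replicating the agreement analysis, the Spearman correlation is available in standard Python statistics libraries (the data below are made up):

        import numpy as np
        from scipy.stats import spearmanr

        pacv_scores = np.array([0, 2, 5, 7, 9, 1, 3, 8])      # hypothetical PACV short-scale scores
        gust_categories = np.array([1, 2, 3, 4, 5, 1, 2, 5])  # hypothetical Gust et al. categories

        rho, p_value = spearmanr(pacv_scores, gust_categories)
        print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")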

  1. Characterizing the Discussion of Antibiotics in the Twittersphere: What is the Bigger Picture?

    PubMed

    Kendra, Rachel Lynn; Karki, Suman; Eickholt, Jesse Lee; Gandy, Lisa

    2015-06-19

    User content posted through Twitter has been used for biosurveillance, to characterize public perception of health-related topics, and as a means of distributing information to the general public. Most of the existing work surrounding Twitter and health care has shown Twitter to be an effective medium for these problems, but more could be done to provide finer and more efficient access to all pertinent data. Given the diversity of user-generated content, small samples or summary presentations of the data arguably omit a large part of the virtual discussion taking place in the Twittersphere. Still, managing, processing, and querying large amounts of Twitter data is not a trivial task. This work describes tools and techniques capable of handling larger sets of Twitter data and demonstrates their use with the issue of antibiotics. This work has two principal objectives: (1) to provide an open-source means to efficiently explore all collected tweets and query health-related topics on Twitter, specifically, questions such as what users are saying and how messages are spread, and (2) to characterize the larger discourse taking place on Twitter with respect to antibiotics. The open-source software suites Hadoop, Flume, and Hive were used to collect and query a large number of Twitter posts. To classify tweets by topic, a deep network classifier was trained using a limited number of manually classified tweets; the particular machine learning approach also allowed a large number of unclassified tweets to be used to increase performance. Query-based analysis of the collected tweets revealed that a large number of users contributed to the online discussion and that a frequently mentioned topic was antibiotic resistance. A number of prominent events related to antibiotics led to spikes in activity, but these were short in duration. The category-based classifier was able to correctly classify 70% of manually labeled tweets (using a 10-fold cross-validation procedure and 9 classes) and also performed well when evaluated on a per-category basis. Using existing tools such as Hive, Flume, and Hadoop together with machine learning techniques, it is possible to construct tools and workflows to collect and query large amounts of Twitter data and to characterize the larger discussion taking place on Twitter with respect to a particular health-related topic. Furthermore, using newer machine learning techniques and a limited number of manually labeled tweets, an entire body of collected tweets can be classified to indicate what topics are driving the virtual, online discussion. The resulting classifier can also be used to efficiently explore collected tweets by category and search for messages of interest or exemplary content.
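
    The deep semi-supervised classifier itself is not specified in the abstract; as a hedged stand-in illustrating the same evaluation protocol (k-fold cross-validation over a small manually labeled set), a simple bag-of-words baseline might look like this, with made-up tweets and topic labels:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline

        tweets = [
            "antibiotics aren't working for my sinus infection anymore",
            "superbug resistance keeps growing, worrying trend",
            "another article on antibiotic resistance in livestock",
            "just picked up my amoxicillin prescription",
            "day 3 of penicillin and finally feeling better",
            "finished my antibiotic course today",
        ]
        labels = ["resistance"] * 3 + ["personal_use"] * 3  # hypothetical topic categories

        model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        print(cross_val_score(model, tweets, labels, cv=3).mean())  # the study used 10-fold CV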

  2. Early diagnosis of osteoporosis using radiogrammetry and texture analysis from hand and wrist radiographs in Indian population.

    PubMed

    Areeckal, A S; Jayasheelan, N; Kamath, J; Zawadynski, S; Kocher, M; David S, S

    2018-03-01

    We propose an automated, low-cost tool for early diagnosis of the onset of osteoporosis using cortical radiogrammetry and cancellous texture analysis from hand and wrist radiographs. Reduction in bone mass can lead to osteoporosis, a disease increasingly observed at younger ages in recent times. Dual X-ray absorptiometry (DXA), currently used in clinical practice, is expensive and available only in urban areas in India. Therefore, there is a need for a low-cost diagnostic tool to facilitate large-scale screening for early diagnosis of osteoporosis at primary health centers. Cortical radiogrammetry of the third metacarpal bone shaft and cancellous texture analysis of the distal radius are used to detect low bone mass. Cortical bone indices and cancellous texture features based on Gray Level Run Length Matrices and Laws' masks are extracted, and a neural network classifier is trained on these features to distinguish healthy subjects from subjects with low bone mass. In our pilot study, the proposed segmentation method shows 89.9% and 93.5% accuracy in detecting the third metacarpal bone shaft and the distal radius ROI, respectively. The trained classifier shows a training accuracy of 94.3% and a test accuracy of 88.5%. The work shows that a combination of cortical and cancellous features improves diagnostic ability and is a promising low-cost tool for early diagnosis of increased risk of osteoporosis.
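
    A hedged sketch of the final classification stage, training a small neural network on the combined cortical and cancellous feature vectors (the feature dimensions and data here are stand-ins, not the paper's):

        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPClassifier

        # Columns: cortical indices plus GLRLM/Laws' texture features (hypothetical)
        X = np.random.rand(100, 6)        # stand-in for extracted radiograph features
        y = np.random.randint(0, 2, 100)  # 0 = healthy, 1 = low bone mass

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        clf.fit(X_tr, y_tr)
        print("test accuracy:", clf.score(X_te, y_te))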

  3. Vehicle classification in WAMI imagery using deep network

    NASA Astrophysics Data System (ADS)

    Yi, Meng; Yang, Fan; Blasch, Erik; Sheaff, Carolyn; Liu, Kui; Chen, Genshe; Ling, Haibin

    2016-05-01

    Humans have always had a keen interest in understanding activities and the surrounding environment for mobility, communication, and survival. Thanks to recent progress in photography and breakthroughs in aviation, we are now able to capture tens of megapixels of ground imagery, namely Wide Area Motion Imagery (WAMI), at multiple frames per second from unmanned aerial vehicles (UAVs). WAMI serves as a great source for many applications, including security, urban planning, and route planning. These applications require fast and accurate image understanding, which is time consuming for humans due to the large data volume and city-scale area coverage. Therefore, automatic processing and understanding of WAMI imagery has been gaining attention in both industry and the research community. This paper focuses on an essential step in WAMI imagery analysis, namely vehicle classification: deciding whether a given image patch contains a vehicle or not. We collect a set of positive and negative sample image patches for training and testing the detector. Positive samples are 64 × 64 image patches centered on annotated vehicles. We generate two sets of negative images: the first is generated from positive images with some location shift, and the second from randomly sampled patches, discarding any patch in which a vehicle happens to lie at the center. Both positive and negative samples are randomly divided into 9000 training images and 3000 testing images. We propose to train a deep convolutional network for classifying these patches. The classifier is based on a pre-trained AlexNet model in the Caffe library, with an adapted loss function for vehicle classification. The performance of our classifier is compared to several traditional image classification methods using Support Vector Machine (SVM) and Histogram of Oriented Gradients (HOG) features. While the SVM+HOG method achieves an accuracy of 91.2%, the accuracy of our deep network-based classifier reaches 97.9%.
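
    A hedged sketch of the SVM+HOG baseline that the deep network is compared against (patch size follows the abstract's 64 × 64; the data and parameters are illustrative):

        import numpy as np
        from skimage.feature import hog
        from sklearn.svm import LinearSVC

        def hog_features(patches):
            """patches: (n, 64, 64) grayscale image chips."""
            return np.array([hog(p, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                             for p in patches])

        patches = np.random.rand(200, 64, 64)  # stand-in for vehicle/background chips
        labels = np.random.randint(0, 2, 200)  # 1 = vehicle, 0 = background

        clf = LinearSVC(max_iter=10000).fit(hog_features(patches), labels)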

  4. Comparing Pixel- and Object-Based Approaches in Effectively Classifying Wetland-Dominated Landscapes

    PubMed Central

    Berhane, Tedros M.; Lane, Charles R.; Wu, Qiusheng; Anenkhonov, Oleg A.; Chepinoga, Victor V.; Autrey, Bradley C.; Liu, Hongxing

    2018-01-01

    Wetland ecosystems straddle both terrestrial and aquatic habitats, performing many ecological functions directly and indirectly benefitting humans. However, global wetland losses are substantial. Satellite remote sensing and classification informs wise wetland management and monitoring. Both pixel- and object-based classification approaches using parametric and non-parametric algorithms may be effectively used in describing wetland structure and habitat, but which approach should one select? We conducted both pixel- and object-based image analyses (OBIA) using parametric (Iterative Self-Organizing Data Analysis Technique, ISODATA, and maximum likelihood, ML) and non-parametric (random forest, RF) approaches in the Barguzin Valley, a large wetland (~500 km2) in the Lake Baikal, Russia, drainage basin. Four Quickbird multispectral bands plus various spatial and spectral metrics (e.g., texture, Non-Differentiated Vegetation Index, slope, aspect, etc.) were analyzed using field-based regions of interest sampled to characterize an initial 18 ISODATA-based classes. Parsimoniously using a three-layer stack (Quickbird band 3, water ratio index (WRI), and mean texture) in the analyses resulted in the highest accuracy, 87.9% with pixel-based RF, followed by OBIA RF (segmentation scale 5, 84.6% overall accuracy), followed by pixel-based ML (83.9% overall accuracy). Increasing the predictors from three to five by adding Quickbird bands 2 and 4 decreased the pixel-based overall accuracy while increasing the OBIA RF accuracy to 90.4%. However, McNemar’s chi-square test confirmed no statistically significant difference in overall accuracy among the classifiers (pixel-based ML, RF, or object-based RF) for either the three- or five-layer analyses. Although potentially useful in some circumstances, the OBIA approach requires substantial resources and user input (such as segmentation scale selection—which was found to substantially affect overall accuracy). Hence, we conclude that pixel-based RF approaches are likely satisfactory for classifying wetland-dominated landscapes. PMID:29707381

  6. Combining Human and Machine Learning to Map Cropland in the 21st Century's Major Agricultural Frontier

    NASA Astrophysics Data System (ADS)

    Estes, L. D.; Debats, S. R.; Caylor, K. K.; Evans, T. P.; Gower, D.; McRitchie, D.; Searchinger, T.; Thompson, D. R.; Wood, E. F.; Zeng, L.

    2016-12-01

    In the coming decades, large areas of new cropland will be created to meet the world's rapidly growing food demands. Much of this new cropland will be in sub-Saharan Africa, where food needs will increase most and the area of remaining potential farmland is greatest. If we are to understand the impacts of global change, it is critical to accurately identify Africa's existing croplands and how they are changing. Yet the continent's smallholder-dominated agricultural systems are unusually challenging for remote sensing analyses, making accurate area estimates difficult to obtain, let alone important details related to field size and geometry. Fortunately, the rapidly growing archives of moderate to high-resolution satellite imagery hosted on open servers now offer an unprecedented opportunity to improve landcover maps. We present a system that integrates two critical components needed to capitalize on this opportunity: 1) human image interpretation and 2) machine learning (ML). Human judgment is needed to accurately delineate training sites within noisy imagery and a highly variable cover type, while ML provides the ability to scale and to interpret large feature spaces that defy human comprehension. Because large amounts of training data are needed (a major impediment for analysts), we use a crowdsourcing platform that connects amazon.com's Mechanical Turk service to satellite imagery hosted on open image servers. Workers map visible fields at pre-assigned sites, and are paid according to their mapping accuracy. Initial tests show overall high map accuracy and mapping rates >1800 km2/hour. The ML classifier uses random forests and randomized quasi-exhaustive feature selection, and is highly effective in classifying diverse agricultural types in southern Africa (AUC > 0.9). We connect the ML and crowdsourcing components to make an interactive learning framework. The ML algorithm performs an initial classification using a first batch of crowd-sourced maps, using thresholds of posterior probabilities to segregate sub-images classified with high or low confidence. Workers are then directed to collect new training data in low confidence sub-images, after which classification is repeated and re-assessed, and the entire process iterated until maximum possible accuracy is realized.
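
    A schematic of the interactive learning loop described above, using posterior-probability thresholding to flag low-confidence tiles for further crowdsourced labeling (the threshold and data shapes are illustrative):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def active_iteration(X_labeled, y_labeled, X_unlabeled, low_conf=0.7):
            """One round: train, then return indices of tiles needing more crowd labels."""
            clf = RandomForestClassifier(n_estimators=200).fit(X_labeled, y_labeled)
            posterior = clf.predict_proba(X_unlabeled).max(axis=1)
            return clf, np.where(posterior < low_conf)[0]  # low-confidence tiles to re-map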

  7. Texture descriptions of lunar surface derived from LOLA data: Kilometer-scale roughness and entropy maps

    NASA Astrophysics Data System (ADS)

    Li, Bo; Ling, Zongcheng; Zhang, Jiang; Chen, Jian; Wu, Zhongchen; Ni, Yuheng; Zhao, Haowei

    2015-11-01

    Global lunar texture maps of roughness and entropy are derived at kilometer scales from Digital Elevation Model (DEM) data obtained by the Lunar Orbiter Laser Altimeter (LOLA) aboard the Lunar Reconnaissance Orbiter (LRO) spacecraft. We use statistical moments of a gray-level histogram of elevations in a neighborhood to compute the roughness and entropy values. The texture descriptors are mapped globally for square neighborhoods with side lengths of 3, 5, 10, 20, 40 and 80 pixels. We find that large-scale topographic changes are only visible in maps with larger neighborhoods, whereas the small-scale global texture maps appear more disordered because they capture finer textural detail. Frequency curves of the texture maps were then computed; their shapes and distributions change as the spatial scale increases. The entropy frequency curve at the minimum 3-pixel scale has large fluctuations and six peaks, and according to this curve we can preliminarily classify the lunar surface into maria, highlands, and different parts of craters. The most obvious textures in the middle-scale roughness and entropy maps are the two typical morphological units, smooth maria and rough highlands. For an impact crater, the roughness and entropy values clearly show a multiple-ring structure, and its different parts have different texture results. Finally, we made a 2D scatter plot of the two texture measures for typical lunar maria and highlands. The two clusters with the largest dot density correspond to the lunar highlands and maria separately. In the lunar mare regions (cluster A), there is a high correlation between roughness and entropy, but in the highlands (cluster B) the entropy shows little change; this may reflect the different geological processes by which maria and highlands formed different landforms.
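
    A minimal sketch of the window-based descriptors described above: roughness as the standard deviation of elevation and entropy from the gray-level histogram of the neighborhood (window size and bin count are illustrative):

        import numpy as np
        from scipy.ndimage import generic_filter

        def local_entropy(window, bins=16):
            hist, _ = np.histogram(window, bins=bins)
            p = hist / hist.sum()
            p = p[p > 0]
            return -(p * np.log2(p)).sum()

        dem = np.random.rand(128, 128)  # stand-in for a LOLA DEM tile
        roughness = generic_filter(dem, np.std, size=5)  # 5 x 5-pixel neighborhood
        entropy = generic_filter(dem, local_entropy, size=5)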

  8. Development of a simple measurement scale to evaluate the severity of non-specific low back pain for industrial ergonomics.

    PubMed

    Higuchi, Yoshiyuki; Izumi, Hiroyuki; Kumashiro, Mashaharu

    2010-06-01

    This study developed an assessment scale that hierarchically classifies degrees of low back pain severity. The scale consists of two subscales: 1) pain intensity and 2) pain interference. First, the scale devised by the authors was administered as a self-report questionnaire to 773 male workers in the car manufacturing industry; the validity of the measurement items was then examined and some of them were revised. Next, the revised low back pain scale was administered as a self-report questionnaire to 5053 general workers. The hierarchical validity of the measurement items was checked based on the results of a Mokken scale analysis. Finally, a low back pain assessment scale consisting of seven items was finalized. Quantitative assessment is made possible by scoring the items, and low back pain severity can be classified into four hierarchical levels: none, mild, moderate, and severe. STATEMENT OF RELEVANCE: The use of this scale allows a more detailed assessment of the degree of risk factor effect and should prove useful both in selecting remedial measures for occupational low back pain and in evaluating their efficacy.

  9. Integrating visual learning within a model-based ATR system

    NASA Astrophysics Data System (ADS)

    Carlotto, Mark; Nebrich, Mark

    2017-05-01

    Automatic target recognition (ATR) systems, like human photo-interpreters, rely on a variety of visual information for detecting, classifying, and identifying manmade objects in aerial imagery. We describe the integration of a visual learning component into the Image Data Conditioner (IDC) for target/clutter and other visual classification tasks. The component is based on an implementation of a model of the visual cortex developed by Serre, Wolf, and Poggio. Visual learning in an ATR context requires the ability to recognize objects independent of location, scale, and rotation. Our method uses IDC to extract, rotate, and scale image chips at candidate target locations. A bootstrap learning method effectively extends the operation of the classifier beyond the training set and provides a measure of confidence. We show how the classifier can be used to learn other features that are difficult to compute from imagery such as target direction, and to assess the performance of the visual learning process itself.

  10. Particle swarm optimization-based automatic parameter selection for deep neural networks and its applications in large-scale and high-dimensional data

    PubMed Central

    2017-01-01

    In this paper, we propose a new automatic hyperparameter selection approach for determining the optimal network configuration (network structure and hyperparameters) for deep neural networks (DNNs) using particle swarm optimization (PSO) in combination with a steepest gradient descent algorithm. In the proposed approach, network configurations are coded as real-valued m-dimensional vectors serving as the individuals of the PSO algorithm in the search procedure. During the search, the PSO algorithm seeks optimal network configurations via particles moving in a finite search space, and the steepest gradient descent algorithm trains the DNN classifier for a few training epochs (to find a local optimal solution) during the population evaluation of PSO. After the optimization, the steepest gradient descent algorithm is run with more epochs on the final solutions (pbest and gbest) of the PSO algorithm to train a final ensemble model and individual DNN classifiers, respectively. The local search ability of the steepest gradient descent algorithm and the global search capabilities of the PSO algorithm are thereby exploited to determine a solution that is close to the global optimum. We conducted several experiments on hand-written character and biological activity prediction datasets to show that the DNN classifiers trained with the network configurations expressed by the final solutions of the PSO algorithm, employed to construct an ensemble model and individual classifiers, outperform the random approach in terms of generalization performance. Therefore, the proposed approach can be regarded as an alternative tool for automatic network structure and parameter selection for deep neural networks. PMID:29236718
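
    A condensed sketch of the PSO search loop described above; the fitness function, which in the paper would train a DNN for a few gradient descent epochs and return validation error, is stubbed out here, and all constants are illustrative:

        import numpy as np

        def pso(fitness, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
            """Minimize fitness over [0,1]^dim; returns the gbest position and value."""
            x = np.random.rand(n_particles, dim)  # encoded network configurations
            v = np.zeros_like(x)
            pbest, pbest_val = x.copy(), np.array([fitness(p) for p in x])
            gbest = pbest[pbest_val.argmin()].copy()
            for _ in range(iters):
                r1 = np.random.rand(n_particles, dim)
                r2 = np.random.rand(n_particles, dim)
                v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
                x = np.clip(x + v, 0.0, 1.0)
                vals = np.array([fitness(p) for p in x])
                better = vals < pbest_val
                pbest[better], pbest_val[better] = x[better], vals[better]
                gbest = pbest[pbest_val.argmin()].copy()
            return gbest, pbest_val.min()

        # A cheap quadratic stands in for the expensive DNN-training fitness call.
        best_config, best_err = pso(lambda p: ((p - 0.3) ** 2).sum(), dim=4)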

  11. ExSTraCS 2.0: Description and Evaluation of a Scalable Learning Classifier System.

    PubMed

    Urbanowicz, Ryan J; Moore, Jason H

    2015-09-01

    Algorithmic scalability is a major concern for any machine learning strategy in this age of 'big data'. A large number of potentially predictive attributes is emblematic of problems in bioinformatics, genetic epidemiology, and many other fields. Previously, ExSTraCS was introduced as an extended Michigan-style supervised learning classifier system that combined a set of powerful heuristics to successfully tackle the challenges of classification, prediction, and knowledge discovery in complex, noisy, and heterogeneous problem domains. While Michigan-style learning classifier systems are powerful and flexible learners, they are not considered to be particularly scalable. For the first time, this paper presents a complete description of the ExSTraCS algorithm and introduces an effective strategy to dramatically improve learning classifier system scalability. ExSTraCS 2.0 addresses scalability with (1) a rule specificity limit, (2) new approaches to expert-knowledge-guided covering and mutation mechanisms, and (3) the implementation and utilization of the TuRF algorithm for improving the quality of expert knowledge discovery in larger datasets. Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improve nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to related 200- and 2000-attribute datasets. ExSTraCS 2.0 was also able to reliably solve the 6, 11, 20, 37, 70, and 135 multiplexer problems, and did so in similar or fewer learning iterations than previously reported, with smaller finite training sets, and without using building blocks discovered from simpler multiplexer problems. Furthermore, ExSTraCS usability was made simpler through the elimination of previously critical run parameters.

  12. Assessment and monitoring practices of Australian fitness professionals.

    PubMed

    Bennie, Jason A; Wiesner, Glen H; van Uffelen, Jannique G Z; Harvey, Jack T; Craike, Melinda J; Biddle, Stuart J H

    2018-04-01

    Assessment and monitoring of client health and fitness is a key part of fitness professionals' practice, yet little is known about the prevalence of this practice. This cross-sectional study describes the assessment/monitoring practices of a large sample of Australian fitness professionals. In 2014, 1206 fitness professionals completed an online survey. Respondents reported their frequency (4-point scale: [1] 'never' to [4] 'always') of assessing/monitoring eight health and fitness constructs (e.g. body composition, aerobic fitness). This was classified as: (i) 'high' ('always' assessing/monitoring ≥5 constructs); (ii) 'medium' (1-4 constructs); or (iii) 'low' (0 constructs). Classifications are reported by demographic and fitness industry characteristics. The odds of being classified as a 'high assessor/monitor' according to social-ecological correlates were examined using a multiple-factor logistic regression model. The mean age of respondents was 39.3 (±11.6) years and 71.6% were female. A total of 15.8% (95% CI: 13.7%-17.9%) were classified as a 'high' assessor/monitor. The constructs with the largest proportion 'always' assessed were body composition (47.7%; 95% CI: 45.0%-50.1%) and aerobic fitness (42.5%; 95% CI: 39.6%-45.3%); those with the lowest proportion were balance (24.0%; 95% CI: 24.7%-26.5%) and mental health (20.2%; 95% CI: 18.1%-29.6%). A perceived lack of client interest and fitness professionals not considering assessment to be their responsibility were associated with lower odds of being classified as a 'high assessor/monitor'. Most fitness professionals do not routinely assess/monitor client fitness and health. Key factors limiting client health assessment and monitoring include a perceived lack of client interest and professionals not considering this to be their role. Copyright © 2017. Published by Elsevier Ltd.

  13. Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

    NASA Astrophysics Data System (ADS)

    Richards, Joseph W.; Starr, Dan L.; Brink, Henrik; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; James, J. Berian; Long, James P.; Rice, John

    2012-01-01

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.
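
    A compact sketch of the AL query step, using margin-based uncertainty sampling with a random forest as a stand-in for the paper's classifier (names and setup are illustrative):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def query_for_followup(X_train, y_train, X_test, n_queries=10):
            """Return indices of test-set light curves whose manual labels would help most."""
            clf = RandomForestClassifier(n_estimators=300).fit(X_train, y_train)
            proba = np.sort(clf.predict_proba(X_test), axis=1)
            margin = proba[:, -1] - proba[:, -2]   # gap between top two class probabilities
            return np.argsort(margin)[:n_queries]  # smallest margin = most uncertain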

  14. Hum-mPLoc: an ensemble classifier for large-scale human protein subcellular location prediction by incorporating samples with multiple sites.

    PubMed

    Shen, Hong-Bin; Chou, Kuo-Chen

    2007-04-20

    Proteins may simultaneously exist at, or move between, two or more different subcellular locations. Proteins with multiple locations or dynamic features of this kind are particularly interesting because they may have some very special biological functions intriguing to investigators in both basic research and drug discovery. For instance, among the 6408 human protein entries that have experimentally observed subcellular location annotations in the Swiss-Prot database (version 50.7, released 19-Sept-2006), 973 (approximately 15%) have multiple location sites. The number of total human protein entries (excluding those annotated as "fragment" or those with fewer than 50 amino acids) in the same database is 14,370, leaving a gap of 14,370 - 6408 = 7962 entries for which no knowledge is available about their subcellular locations. Although one can use a computational approach to predict the desired information for this gap, so far all the existing methods for predicting human protein subcellular localization are limited to the case of a single location site only. To overcome this barrier, a new ensemble classifier, named Hum-mPLoc, was developed that can also deal with the case of multiple location sites. Hum-mPLoc is freely accessible to the public as a web server at http://202.120.37.186/bioinf/hum-multi. Meanwhile, for the convenience of people working in the relevant areas, Hum-mPLoc has been used to identify all human protein entries in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results thus obtained have been deposited in a downloadable file prepared with Microsoft Excel and named "Tab_Hum-mPLoc.xls". This file is available at the same website and will be updated twice a year to include new entries of human proteins and reflect the continuous development of Hum-mPLoc.

  15. Mapping raised bogs with an iterative one-class classification approach

    NASA Astrophysics Data System (ADS)

    Mack, Benjamin; Roscher, Ribana; Stenzel, Stefanie; Feilhauer, Hannes; Schmidtlein, Sebastian; Waske, Björn

    2016-10-01

    Land use and land cover maps are among the most commonly used remote sensing products. In many applications the user only requires a map of one particular class of interest, e.g. a specific vegetation type or an invasive species. One-class classifiers are appealing alternatives to common supervised classifiers because they can be trained with labeled training data of the class of interest only. However, training an accurate one-class classification (OCC) model is challenging, particularly when facing a large image, a small class, and few training samples. To tackle these problems we propose an iterative OCC approach. The presented approach uses a biased Support Vector Machine as its core classifier. In an iterative pre-classification step, a large part of the pixels not belonging to the class of interest is classified; the remaining data are classified by a final classifier with a novel model and threshold selection approach. The specific objective of our study is the classification of raised bogs in a study site in southeast Germany, using multi-seasonal RapidEye data and a small number of training samples. Results demonstrate that the iterative OCC outperforms other state-of-the-art one-class classifiers and approaches for model selection. The study highlights the potential of the proposed approach for an efficient and improved mapping of small classes such as raised bogs. Overall, the proposed scheme constitutes a feasible approach and a useful modification of a regular one-class classifier.
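
    The paper's core classifier is a biased Support Vector Machine with a custom model- and threshold-selection step; as a hedged stand-in, the stock one-class SVM in scikit-learn illustrates the general OCC setup (band count and parameters are made up):

        import numpy as np
        from sklearn.svm import OneClassSVM

        X_bog = np.random.rand(50, 5)       # multi-seasonal spectra of labeled target pixels
        X_scene = np.random.rand(10000, 5)  # all remaining image pixels

        occ = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_bog)
        is_target = occ.predict(X_scene) == 1  # +1 = classified as the class of interest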

  16. Integrating remotely acquired and field data to assess effects of setback levees on riparian and aquatic habitat in glacial-melt water rivers

    USGS Publications Warehouse

    Konrad, C.P.; Black, R.W.; Voss, F.; Neale, C. M. U.

    2008-01-01

    Setback levees, in which levees are reconstructed at a greater distance from a river channel, are a promising restoration technique, particularly for alluvial rivers with broad floodplains where river-floodplain connectivity is essential to ecological processes. Documenting the ecological outcomes of restoration activities is essential for assessing the comparative benefits of different restoration approaches and for justifying new restoration projects. Remote sensing of aquatic habitats offers one approach for comprehensive, objective documentation of river and floodplain habitats, but is difficult in glacial rivers because of high suspended-sediment concentrations, braiding, and a lack of large, well-differentiated channel forms such as riffles and pools. Remote imagery and field surveys were used to assess the effects of recent and planned setback levees along the Puyallup River and, more generally, the application of multispectral imagery for classifying aquatic and riparian habitats in glacial-melt water rivers. Airborne images were acquired with a horizontal ground resolution of 0.5 m in three spectral bands (0.545-0.555, 0.665-0.675 and 0.790-0.810 µm) spanning green to near-infrared (NIR) wavelengths. Field surveys identified river and floodplain habitat features and provided the basis for a comparative hydraulic analysis. Broad categories of aquatic habitat (smooth and rough water surface), exposed sediment (sand and boulder) and vegetated surfaces (herbaceous and deciduous shrub/forest) were classified accurately using the airborne images. Other categories [e.g. conifers, boulders, large woody debris (LWD)] and subdivisions of broad categories (e.g. riffles and runs) were not successfully classified, either because these features did not form large patches that could be identified on the imagery or because their spectral reflectances were not distinct from those of other habitat types. Airborne imagery was critical for assessing fine-scale aquatic habitat heterogeneity, including shallow, low-velocity regions that were not feasible or practical to map in the field in many cases due to their widespread distribution, small size, and poorly defined boundaries with other habitat types. At the reach scale, the setback levee affected the amount and distribution of riparian and aquatic habitats: (1) where levees had been set back, the total area of all habitats was greater, with relatively more vegetated floodplain habitat and relatively less exposed sediment and aquatic habitat; (2) where levees confine the river, less low-velocity aquatic habitat is present over a range of flows, with a higher degree of bed instability during high flows. As river restoration proceeds in the Pacific Northwest and elsewhere, remotely acquired imagery will be important for documenting its effects on the amount and distribution of aquatic and floodplain habitats, complementing field data as a quantitative basis for evaluating project efficacy.

  17. Lava Morphology Classification of a Fast-Spreading Ridge Using Deep-Towed Sonar Data: East Pacific Rise

    NASA Astrophysics Data System (ADS)

    Meyer, J.; White, S.

    2005-05-01

    Classification of lava morphology on a regional scale contributes to the understanding of the distribution and extent of lava flows at a mid-ocean ridge, and seafloor classification is essential for understanding the regional undersea environment at mid-ocean ridges. In this study, a classification scheme is developed to identify and extract textural patterns of different lava morphologies along the East Pacific Rise using DSL-120 side-scan sonar and ARGO camera imagery. Applying an accurate image classification technique to side-scan sonar allows us to expand upon the locally available visual ground reference data to make the first comprehensive regional maps of small-scale lava morphology present at a mid-ocean ridge. The submarine lava morphologies focused upon in this study (sheet flows, lobate flows, and pillow flows) have unique textures. Several algorithms were applied to the sonar backscatter intensity images to produce multiple textural image layers useful in distinguishing the different lava morphologies. The intensity and spatially enhanced images were then combined and fed to a hybrid classification technique involving two integrated classifiers, a rule-based expert system classifier and a machine learning classifier. The complementary capabilities of the two integrated classifiers provided a higher accuracy of regional seafloor classification than either classifier alone. Once trained, the hybrid classifier can be applied to classify neighboring images with relative ease. This classification technique has been used to map the lava morphology distribution and infer spatial variability of lava effusion rates along two segments of the East Pacific Rise, 17 deg S and 9 deg N. The technique may also prove useful for attaining temporal information: repeated documentation of morphology classification in this dynamic environment can be compared to detect regional seafloor change.

  18. FR-type radio sources in COSMOS: relation of radio structure to size, accretion modes and large-scale environment

    NASA Astrophysics Data System (ADS)

    Vardoulaki, Eleni; Faustino Jimenez Andrade, Eric; Delvecchio, Ivan; Karim, Alexander; Smolčić, Vernesa; Magnelli, Benjamin; Bertoldi, Frank; Schinnerer, Eva; Sargent, Mark; Finoguenov, Alexis; VLA COSMOS Team

    2018-01-01

    The radio sources associated with active galactic nuclei (AGN) can exhibit a variety of radio structures, from simple to more complex, giving rise to a variety of classification schemes. The question which still remains open, given deeper surveys revealing new populations of radio sources, is whether this plethora of radio structures can be attributed to the physical properties of the host or to the environment. Here we present an analysis of the radio structure of radio-selected AGN from the VLA-COSMOS Large Project at 3 GHz (JVLA-COSMOS; Smolčić et al.) in relation to: 1) their linear projected size, 2) the Eddington ratio, and 3) the environment their hosts lie within. We classify these as FRI (jet-like) and FRII (lobe-like) based on the FR-type classification scheme, and compare them to a sample of jet-less radio AGN in JVLA-COSMOS. We measure their linear projected sizes using a semi-automatic machine learning technique. Their Eddington ratios are calculated from X-ray data available for COSMOS. As environmental probes we take the X-ray groups (hundreds of kpc) and the density fields (~Mpc scale) in COSMOS. We find that FRII radio sources are on average larger than FRIs, which agrees with the literature. But contrary to past studies, we find no dichotomy in FR objects in JVLA-COSMOS given their Eddington ratios, as on average they exhibit similar values. Furthermore, our results show that the large-scale environment does not explain the observed dichotomy in lobe- and jet-like FR-type objects, as both types are found in similar environments, but it does affect the shape of the radio structure, introducing bends for objects closer to the centre of an X-ray group.

  19. Effects of Turbulence on Settling Velocities of Synthetic and Natural Particles

    NASA Astrophysics Data System (ADS)

    Jacobs, C.; Jendrassak, M.; Gurka, R.; Hackett, E. E.

    2014-12-01

    For large-scale sediment transport predictions, an important parameter is the settling or terminal velocity of particles, because it plays a key role in determining the concentration of sediment particles within the water column as well as the deposition rate of particles onto the seabed. The settling velocity of particles is influenced by the fluid dynamic environment as well as by attributes of the particle, such as its size, shape, and density. This laboratory study examines the effects of turbulence, generated by an oscillating grid, on both synthetic and natural particles over a range of flow conditions. Because the synthetic particles are spherical, they serve as a reference for the natural particles, which are irregular in shape. Particle image velocimetry (PIV) and high-speed imaging systems were used simultaneously to study the interaction between the fluid mechanics and sediment particle dynamics in a tank. The particle dynamics were analyzed using a custom two-dimensional tracking algorithm to obtain distributions of particle velocity and acceleration. Turbulence properties, such as the root-mean-square turbulent velocity and vorticity, were calculated from the PIV data. Results are classified by Stokes number, based on the integral time scale deduced from the autocorrelation function of velocity. We find that particles with large Stokes numbers are unaffected by the turbulence, while particles with small Stokes numbers primarily show an increase in settling velocity in comparison to stagnant flow. The results also show an inverse relationship between the Stokes number and the standard deviation of the settling velocity. This research enables a better understanding of the interdependence between particles and turbulent flow, which can be used to improve parameterizations in large-scale sediment transport models.
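
    For reference, the Stokes number used to classify these results is conventionally the ratio of the particle response time to a flow time scale; the abstract does not give the exact definitions, so the textbook form with the integral time scale T_L is shown here:

        St = \frac{\tau_p}{T_L}, \qquad \tau_p = \frac{\rho_p d_p^2}{18\,\mu}

    where τ_p is the Stokes response time of a small sphere of diameter d_p and density ρ_p in a fluid of dynamic viscosity μ, and T_L is the integral time scale obtained from the velocity autocorrelation function.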

  20. Measuring Poverty in Southern India: A Comparison of Socio-Economic Scales Evaluated against Childhood Stunting.

    PubMed

    Kattula, Deepthi; Venugopal, Srinivasan; Velusamy, Vasanthakumar; Sarkar, Rajiv; Jiang, Victoria; S, Mahasampath Gowri; Henry, Ankita; Deosaran, Jordanna Devi; Muliyil, Jayaprakash; Kang, Gagandeep

    2016-01-01

    Socioeconomic status (SES) scales measure poverty, wealth, and economic inequality in a population to guide appropriate economic and public health policies. Measurement of poverty and comparison of material deprivation across nations is a challenge. This study compared four SES scales that have been used locally and internationally and evaluated them against childhood stunting, used as an indicator of chronic deprivation, in urban southern India. A door-to-door survey collected information on socio-demographic indicators such as education, occupation, assets, income, and living conditions in a semi-urban slum area in Vellore, Tamil Nadu, in southern India. A total of 7925 households were categorized by four SES scales (the Kuppuswamy scale, the Below Poverty Line (BPL) scale, the modified Kuppuswamy scale, and the multidimensional poverty index (MDPI)) and the level of agreement was compared between scales. Logistic regression was used to test the association of the SES scales with stunting. The Kuppuswamy, BPL, MDPI, and modified Kuppuswamy scales classified 7.1%, 1%, 5.5%, and 55.3% of families as low SES respectively, indicating conservative estimation of low SES by the BPL and MDPI scales in comparison with the modified Kuppuswamy scale, which had the highest sensitivity (89%). Children from low-SES families as classified by all scales had higher odds of stunting, but the level of agreement between scales was very poor, ranging from 1% to 15%. There is great non-uniformity between existing SES scales, and cautious interpretation of SES scales is needed in the context of social, cultural, and economic realities.

  1. A decision support system using combined-classifier for high-speed data stream in smart grid

    NASA Astrophysics Data System (ADS)

    Yang, Hang; Li, Peng; He, Zhian; Guo, Xiaobin; Fong, Simon; Chen, Huajun

    2016-11-01

    Large volumes of high-speed streaming data are generated continuously by big power grids. In order to detect and avoid power grid failures, decision support systems (DSSs) are commonly adopted in power grid enterprises, and among the decision-making algorithms the incremental decision tree is the most widely used. In this paper, we propose a combined classifier that is a composite of a cache-based classifier (CBC) and a main tree classifier (MTC). We integrate this classifier into a stream processing engine on top of the DSS so that high-speed streaming data can be transformed into operational intelligence efficiently. Experimental results show that our proposed classifier returns more accurate answers than existing alternatives.
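
    The abstract does not detail how the CBC and MTC interact; one plausible arrangement, sketched here purely as an illustration, is a bounded cache of recently seen labeled patterns consulted before falling back to a periodically retrained main tree:

        from collections import OrderedDict
        from sklearn.tree import DecisionTreeClassifier  # stand-in for an incremental tree

        class CombinedClassifier:
            def __init__(self, cache_size=1000):
                self.cache = OrderedDict()            # CBC: recent (pattern -> label) pairs
                self.cache_size = cache_size
                self.tree = DecisionTreeClassifier()  # MTC: periodically retrained main tree

            def update(self, x, y):
                self.cache[tuple(x)] = y
                if len(self.cache) > self.cache_size:
                    self.cache.popitem(last=False)    # evict the oldest entry

            def retrain(self):
                self.tree.fit([list(k) for k in self.cache], list(self.cache.values()))

            def predict(self, x):
                if tuple(x) in self.cache:            # cache hit: answer immediately
                    return self.cache[tuple(x)]
                return self.tree.predict([list(x)])[0]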

  2. Factorization for jet radius logarithms in jet mass spectra at the LHC

    DOE PAGES

    Kolodrubetz, Daniel W.; Pietrulewicz, Piotr; Stewart, Iain W.; ...

    2016-12-14

    To predict the jet mass spectrum at a hadron collider it is crucial to account for the resummation of logarithms between the transverse momentum of the jet and its invariant mass m_J. For small jet areas there are additional large logarithms of the jet radius R, which affect the convergence of the perturbative series. We present an analytic framework for exclusive jet production at the LHC which gives a complete description of the jet mass spectrum including realistic jet algorithms and jet vetoes. It factorizes the scales associated with m_J, R, and the jet veto, enabling in addition the systematic resummation of jet radius logarithms in the jet mass spectrum beyond leading logarithmic order. We discuss the factorization formulae for the peak and tail regions of the jet mass spectrum and for small and large R, and the relations between the different regimes and how to combine them. Regions of experimental interest are classified which do not involve large nonglobal logarithms. We also present universal results for nonperturbative effects and discuss various jet vetoes.

  3. A New View of the Dynamics of Reynolds Stress Generation in Turbulent Boundary Layers

    NASA Technical Reports Server (NTRS)

    Cantwell, Brian J.; Chacin, Juan M.

    1998-01-01

    The structure of a numerically simulated turbulent boundary layer over a flat plate at Re_{θ} = 670 was studied using the invariants of the velocity gradient tensor (Q and R) and a related scalar quantity, the cubic discriminant (D = 27R^{2}/4 + Q^{3}). These invariants have previously been used to study the properties of the small-scale motions responsible for the dissipation of turbulent kinetic energy. In addition, these scalar quantities allow the local flow patterns to be unambiguously classified according to the terminology proposed by Chong et al. The use of the discriminant as a marker of coherent motions reveals complex, large-scale flow structures that are shown to be associated with the generation of the Reynolds shear stress -\overline{u'v'}. These motions are characterized by high spatial gradients of the discriminant and are believed to be an important part of the mechanism that sustains turbulence in the near-wall region.
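
    A minimal sketch of these invariants for an incompressible velocity gradient tensor A = ∇u, under the standard convention Q = -tr(A²)/2 and R = -det(A) that the discriminant above assumes:

        import numpy as np

        def qrd_invariants(A):
            """A: 3x3 velocity gradient tensor with tr(A) = 0 (incompressible flow)."""
            Q = -0.5 * np.trace(A @ A)
            R = -np.linalg.det(A)
            D = 27.0 * R**2 / 4.0 + Q**3  # D > 0: focal (vortical) local flow topology
            return Q, R, D

        A = np.array([[0.1, 0.5, 0.0],
                      [-0.4, 0.2, 0.1],
                      [0.0, 0.3, -0.3]])  # illustrative trace-free tensor
        print(qrd_invariants(A))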

  4. Palm vein recognition based on directional empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Lee, Jen-Chun; Chang, Chien-Ping; Chen, Wei-Kuei

    2014-04-01

    Directional empirical mode decomposition (DEMD) has recently been proposed to make empirical mode decomposition suitable for texture analysis. Using DEMD, samples are decomposed into a series of images, referred to as two-dimensional intrinsic mode functions (2-D IMFs), from fine to large scale. A DEMD-based two-directional linear discriminant analysis (2DLDA) for palm vein recognition is proposed. The proposed method progresses through three steps: (i) a set of 2-D IMF features of various scales and orientations is extracted using DEMD, (ii) the 2DLDA method is applied to reduce the dimensionality of the feature space in both the row and column directions, and (iii) the nearest neighbor classifier is used for classification. We also propose two strategies for using the set of 2-D IMF features: ensemble DEMD vein representation (EDVR) and multichannel DEMD vein representation (MDVR). In experiments using palm vein databases, the proposed MDVR-based 2DLDA method achieved a recognition accuracy of 99.73%, thereby demonstrating its feasibility for palm vein recognition.

  5. Mesoscale monitoring of the soil freeze/thaw boundary from orbital microwave radiometry

    NASA Technical Reports Server (NTRS)

    Dobson, Craig; Ulaby, Fawwaz T.; Zuerndorfer, Brian; England, Anthony W.

    1990-01-01

    A technique was developed for mapping the spatial extent of frozen soils from the spectral characteristics of the 10.7 to 37 GHz radiobrightness. Through computational models for the spectral radiobrightness of diurnally heated freezing soils, a distinctive radiobrightness signature was identified for frozen soils, and the signature was cast as a discriminant for unsupervised classification. In addition to large-area images, local-area spatial averages of radiobrightness were calculated for each radiobrightness channel at 7 meteorological sites within the test region. The local-area averages at the meteorological sites were used to define the preliminary boundaries in the Freeze Indicator discriminant. Freeze Indicator images based upon Nimbus-7 Scanning Multichannel Microwave Radiometer (SMMR) data effectively map temporal variations in the freeze/thaw pattern for the northern Great Plains at a time scale of days. Diurnal thermal gradients have a small but measurable effect upon the SMMR spectral gradient. Scale-space filtering can be used to improve the spatial resolution of a freeze/thaw classified image.

  6. Burke-Fahn-Marsden dystonia severity, Gross Motor, Manual Ability, and Communication Function Classification scales in childhood hyperkinetic movement disorders including cerebral palsy: a 'Rosetta Stone' study.

    PubMed

    Elze, Markus C; Gimeno, Hortensia; Tustin, Kylee; Baker, Lesley; Lumsden, Daniel E; Hutton, Jane L; Lin, Jean-Pierre S-M

    2016-02-01

    Hyperkinetic movement disorders (HMDs) can be assessed using impairment-based scales or functional classifications. The Burke-Fahn-Marsden Dystonia Rating Scale-movement (BFM-M) evaluates dystonia impairment, but may not reflect functional ability. The Gross Motor Function Classification System (GMFCS), Manual Ability Classification System (MACS), and Communication Function Classification System (CFCS) are widely used in the literature on cerebral palsy to classify functional ability, but not in childhood movement disorders. We explore the concordance of these three functional scales in a large sample of paediatric HMDs and the impact of dystonia severity on these scales. Children with HMDs (n=161; median age 10y 3mo, range 2y 6mo-21y) were assessed using the BFM-M, GMFCS, MACS, and CFCS from 2007 to 2013. This cross-sectional study contrasts the information provided by these scales. All four scales were strongly associated (all Spearman's rank correlation coefficients rs > 0.72, p < 0.001), with worse dystonia severity implying worse function. Secondary dystonias showed worse dystonia and less function than primary dystonias (p < 0.001), and a larger proportion of life lived with dystonia was associated with more severe dystonia (rs = 0.42, p < 0.001). The BFM-M is strongly linked with the GMFCS, MACS, and CFCS, irrespective of aetiology. Each scale offers interrelated but complementary information and is applicable to all aetiologies. Movement disorders including cerebral palsy can be effectively evaluated using these scales. © 2015 Mac Keith Press.

  7. Automatic Matching of Large Scale Images and Terrestrial LIDAR Based on App Synergy of Mobile Phone

    NASA Astrophysics Data System (ADS)

    Xia, G.; Hu, C.

    2018-04-01

    The digitalization of cultural heritage based on ground laser scanning technology has been widely applied. High-precision scanning and high-resolution photography of cultural relics are the main methods of data acquisition. Reconstruction from a complete point cloud and high-resolution images requires the matching of images and point clouds, the acquisition of homonymous feature points, data registration, and so on. However, establishing the one-to-one correspondence between an image and its corresponding point cloud currently depends on inefficient manual search. The effective classification and management of large numbers of images, and the matching of these images to their corresponding point clouds, are therefore the focus of this research. In this paper, we propose the automatic matching of large-scale images and terrestrial LiDAR data based on the synergy of a mobile phone app. Firstly, we develop an Android-based app to take pictures and record the related classification information. Secondly, all the images are automatically grouped using the recorded information. Thirdly, a matching algorithm is used to match the global and local images. Based on the one-to-one correspondence between the global image and the point cloud reflection intensity image, the automatic matching of each image and its corresponding LiDAR point cloud is realized. Finally, the mapping relationship between the global image, the local images and the intensity image is established from the homonymous feature points. In this way we can build a data structure linking the global image, the local images within it, and the point cloud corresponding to each local image, and carry out visual management and querying of the images.

  8. Environmental Controls on Multi-Scale Soil Nutrient Variability in the Tropics: the Importance of Land-Cover Change

    NASA Astrophysics Data System (ADS)

    Holmes, K. W.; Kyriakidis, P. C.; Chadwick, O. A.; Matricardi, E.; Soares, J. V.; Roberts, D. A.

    2003-12-01

    The natural controls on soil variability and the spatial scales at which correlation exists among soil and environmental variables are critical information for evaluating the effects of deforestation. We detect different spatial scales of variability in soil nutrient levels over a large region (hundreds of thousands of km2) in the Amazon, analyze correlations among soil properties at these different scales, and evaluate scale-specific relationships among soil properties and the factors potentially driving soil development. Statistical relationships among physical drivers of soil formation, namely geology, precipitation, terrain attributes, classified soil types, and land cover derived from remote sensing, were included to determine which factors are related to soil biogeochemistry at each spatial scale. Surface and subsurface soil profile data from a 3000-sample database collected in Rondônia, Brazil, were used to investigate patterns in pH, phosphorus, nitrogen, organic carbon, effective cation exchange capacity, calcium, magnesium, potassium, aluminum, sand, and clay in this environment grading from closed-canopy tropical forest to savanna. We focus on pH in this presentation for simplicity, because pH is the single most important soil characteristic for determining the chemical environment of higher plants and soil microbial activity. We determined four spatial scales which characterize integrated patterns of soil chemistry: less than 3 km; 3 to 10 km; 10 to 68 km; and from 68 to 550 km (the extent of the study area). Although the finest observable scale was fixed by the field sampling density, the coarser scales were determined from relationships in the data through coregionalization modeling, rather than being imposed by the researcher. Processes which affect soils over short distances, such as land cover and terrain attributes, were good predictors of fine-scale spatial components of nutrients; processes which affect soils over very large distances, such as precipitation and geology, were better predictors at coarse spatial scales. However, this result may be affected by the resolution of the available predictor maps. Land-cover change exerted a strong influence on soil chemistry at fine spatial scales, and had progressively less of an effect at coarser scales. It is important to note that land cover, and interactions between land cover and the other predictors, continued to be a significant predictor of soil chemistry at every spatial scale examined, up to the 550 km extent of the study area.

  9. Bayesian approach to MSD-based analysis of particle motion in live cells.

    PubMed

    Monnier, Nilah; Guo, Syuan-Ming; Mori, Masashi; He, Jun; Lénárt, Péter; Bathe, Mark

    2012-08-08

    Quantitative tracking of particle motion using live-cell imaging is a powerful approach to understanding the mechanism of transport of biological molecules, organelles, and cells. However, inferring complex stochastic motion models from single-particle trajectories in an objective manner is nontrivial due to noise from sampling limitations and biological heterogeneity. Here, we present a systematic Bayesian approach to multiple-hypothesis testing of a general set of competing motion models based on particle mean-square displacements that automatically classifies particle motion, properly accounting for sampling limitations and correlated noise while appropriately penalizing model complexity according to Occam's Razor to avoid over-fitting. We test the procedure rigorously using simulated trajectories for which the underlying physical process is known, demonstrating that it chooses the simplest physical model that explains the observed data. Further, we show that computed model probabilities provide a reliability test for the downstream biological interpretation of associated parameter values. We subsequently illustrate the broad utility of the approach by applying it to disparate biological systems including experimental particle trajectories from chromosomes, kinetochores, and membrane receptors undergoing a variety of complex motions. This automated and objective Bayesian framework easily scales to large numbers of particle trajectories, making it ideal for classifying the complex motion of large numbers of single molecules and cells from high-throughput screens, as well as single-cell-, tissue-, and organism-level studies. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
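
    A stripped-down version of the model-comparison idea is sketched below; BIC stands in for the paper's full Bayesian treatment (which also handles correlated noise between MSD points), and the two candidate motion models are illustrative.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def msd(track):
        """Mean-square displacement of a 2-D trajectory (n x 2) per lag."""
        n = len(track)
        return np.array([np.mean(np.sum((track[lag:] - track[:-lag])**2, axis=1))
                         for lag in range(1, n)])

    # Two candidate models for MSD(t): free vs. anomalous diffusion.
    models = {
        "diffusion": lambda t, D: 4 * D * t,
        "anomalous": lambda t, D, a: 4 * D * t**a,
    }

    def compare_models(t, y):
        """Rank candidate motion models by BIC (lower is better)."""
        scores = {}
        for name, f in models.items():
            p, _ = curve_fit(f, t, y, maxfev=10000)
            rss = np.sum((y - f(t, *p))**2)
            scores[name] = len(t) * np.log(rss / len(t)) + len(p) * np.log(len(t))
        return scores
    ```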

  10. A Frequency-Domain Implementation of a Sliding-Window Traffic Sign Detector for Large Scale Panoramic Datasets

    NASA Astrophysics Data System (ADS)

    Creusen, I. M.; Hazelhoff, L.; De With, P. H. N.

    2013-10-01

    In large-scale automatic traffic sign surveying systems, the primary computational effort is concentrated at the traffic sign detection stage. This paper focuses on reducing the computational load of the sliding-window object detection algorithm employed for traffic sign detection. Sliding-window object detectors often use a linear SVM to classify the features in a window. In this case, the classification can be seen as a convolution of the feature maps with the SVM kernel. It is well known that convolution can be efficiently implemented in the frequency domain for kernels larger than a certain size. We show that by careful reordering of sliding-window operations, most of the frequency-domain transformations can be eliminated, leading to a substantial increase in efficiency. Additionally, we suggest using the overlap-add method to keep memory use within reasonable bounds. This allows us to keep all the transformed kernels in memory, thereby eliminating even more domain transformations, and allows all scales in a multiscale pyramid to be processed using the same set of transformed kernels. For a typical sliding-window implementation, we have found that detector execution performance improves by a factor of 5.3. As a bonus, many of the detector improvements from the literature, e.g. chi-squared kernel approximations and sub-class splitting algorithms, can be applied more easily and at a lower performance penalty because of the improved scalability.
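
    The core observation, that evaluating a linear SVM at every window position is a cross-correlation that can run in the frequency domain, is sketched below; note that scipy's fftconvolve redoes the transforms on every call, whereas the paper's contribution is precisely to reorder operations so that most transforms are eliminated.

    ```python
    import numpy as np
    from scipy.signal import fftconvolve

    def svm_response_map(feature_maps, svm_weights, bias=0.0):
        """Linear-SVM score at every window position. feature_maps: (C, H, W),
        svm_weights: (C, h, w); correlation is computed as convolution with a
        flipped kernel, which fftconvolve evaluates via FFT internally."""
        out_h = feature_maps.shape[1] - svm_weights.shape[1] + 1
        out_w = feature_maps.shape[2] - svm_weights.shape[2] + 1
        resp = np.full((out_h, out_w), bias)
        for fmap, w in zip(feature_maps, svm_weights):
            resp += fftconvolve(fmap, w[::-1, ::-1], mode="valid")
        return resp   # threshold at 0 for detections
    ```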

  11. An innovative procedure to assess multi-scale temporal trends in groundwater quality: Example of the nitrate in the Seine-Normandy basin, France

    NASA Astrophysics Data System (ADS)

    Lopez, Benjamin; Baran, Nicole; Bourgine, Bernard

    2015-03-01

    The European Water Framework Directive (WFD) asks Member States to identify trends in contaminant concentrations in groundwater and to take measures to reach good chemical status by 2015. In this study, carried out in a large hydrological basin (95,300 km2), an innovative procedure is described for the assessment of recent trends in groundwater nitrate concentrations at both the sampling-point and regional scales. Temporal variograms of piezometric and nitrate concentration time series are automatically calculated and fitted in order to classify groundwater bodies according to their temporal pattern. These results are then coupled with aquifer lithology to map spatial units within which the modes of diffuse transport of contaminants towards groundwater are assumed to be the same at all points. These spatial units are suitable for evaluating regional trends. The stability over time of the time series is tested based on the cumulative sum principle, to determine the time period during which the trend should be sought. The Mann-Kendall and Regional-Kendall nonparametric tests for monotonic trends, coupled with the Sen slope estimator, are applied to the periods following the break points thus determined, at both the sampling-point and regional scales. This novel procedure is robust and enables rapid processing of large databases of raw data. It would therefore be useful for managing groundwater quality in compliance with the aims of the WFD.
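
    A minimal version of the trend step might look as follows, assuming a regularly indexed concentration series; the full procedure additionally handles ties, seasonality, the break-point search and regional aggregation.

    ```python
    import numpy as np
    from itertools import combinations

    def mann_kendall_sen(values):
        """Mann-Kendall S statistic with a Sen slope estimate for a 1-D
        series; S > 0 suggests an upward trend (no tie correction here)."""
        values = np.asarray(values, dtype=float)
        pairs = list(combinations(range(len(values)), 2))
        s = sum(np.sign(values[j] - values[i]) for i, j in pairs)
        slopes = [(values[j] - values[i]) / (j - i) for i, j in pairs]
        return s, float(np.median(slopes))
    ```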

  12. Automatic detection of ischemic stroke based on scaling exponent electroencephalogram using extreme learning machine

    NASA Astrophysics Data System (ADS)

    Adhi, H. A.; Wijaya, S. K.; Prawito; Badri, C.; Rezal, M.

    2017-03-01

    Stroke is a cerebrovascular disease caused by the obstruction of blood flow to the brain. It is the leading cause of death in Indonesia and the second leading cause worldwide, and is also a major cause of disability. Ischemic stroke accounts for most of all stroke cases. Obstruction of blood flow can cause tissue damage, which results in electrical changes in the brain that can be observed through the electroencephalogram (EEG). In this study, we present the results of automatic discrimination between ischemic stroke patients and normal subjects based on EEG scaling exponents obtained through detrended fluctuation analysis (DFA), using an extreme learning machine (ELM) as the classifier. The signal processing was performed on 18 channels of EEG in the range of 0-30 Hz. The subjects' scaling exponents were used as the input for the ELM to classify ischemic stroke. Detection performance was assessed in terms of accuracy, sensitivity and specificity. The results showed that the proposed method classified ischemic stroke with 84% accuracy, 82% sensitivity and 87% specificity, using 120 hidden neurons and the sine activation function in the ELM.
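
    The scaling exponent itself comes from standard DFA, which can be sketched in a few lines; the window sizes are illustrative.

    ```python
    import numpy as np

    def dfa_exponent(signal, scales=(4, 8, 16, 32, 64)):
        """Detrended fluctuation analysis: the exponent is the slope of
        log F(n) vs. log n, where F(n) is the RMS of the linearly detrended
        integrated signal over windows of length n."""
        profile = np.cumsum(signal - np.mean(signal))
        fluct = []
        for n in scales:
            windows = profile[:len(profile) // n * n].reshape(-1, n)
            t = np.arange(n)
            trend = np.array([np.polyval(np.polyfit(t, w, 1), t) for w in windows])
            fluct.append(np.sqrt(np.mean((windows - trend)**2)))
        slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
        return slope
    ```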

  13. Error minimizing algorithms for nearest neighbor classifiers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Porter, Reid B; Hush, Don; Zimmer, G. Beate

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous-domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical-error-based loss functions. We use the framework to investigate a new cost-sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  14. Contextual classification on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Tilton, James C.

    1987-01-01

    Classifiers are often used to produce land cover maps from multispectral Earth observation imagery. Conventionally, these classifiers have been designed to exploit the spectral information contained in the imagery. Very few classifiers exploit the spatial information content of the imagery, and the few that do rarely exploit spatial information content in conjunction with spectral and/or temporal information. A contextual classifier that exploits spatial and spectral information in combination through a general statistical approach was studied. Early test results obtained from an implementation of the classifier on a VAX-11/780 minicomputer were encouraging, but they are of limited meaning because they were produced from small data sets. An implementation of the contextual classifier is presented on the Massively Parallel Processor (MPP) at Goddard that for the first time makes feasible the testing of the classifier on large data sets.

  15. Deploying a quantum annealing processor to detect tree cover in aerial imagery of California

    PubMed Central

    Basu, Saikat; Ganguly, Sangram; Michaelis, Andrew; Mukhopadhyay, Supratik; Nemani, Ramakrishna R.

    2017-01-01

    Quantum annealing is an experimental and potentially breakthrough computational technology for handling hard optimization problems, including problems of computer vision. We present a case study in training a production-scale classifier of tree cover in remote sensing imagery, using early-generation quantum annealing hardware built by D-Wave Systems, Inc. Beginning within a known boosting framework, we train decision stumps on texture features and vegetation indices extracted from four-band, one-meter-resolution aerial imagery from the state of California. We then impose a regularized quadratic training objective to select an optimal voting subset from among these stumps. The votes of the subset define the classifier. For optimization, the logical variables in the objective function map to quantum bits in the hardware device, while quadratic couplings encode as the strength of physical interactions between the quantum bits. Hardware design limits the number of couplings between these basic physical entities to five or six. To account for this limitation in mapping large problems to the hardware architecture, we propose a truncation and rescaling of the training objective through a trainable metaparameter. The boosting process on our basic 108- and 508-variable problems, thus constituted, returns classifiers that incorporate a diverse range of color- and texture-based metrics and discriminate tree cover with accuracies as high as 92% in validation and 90% on a test scene encompassing the open space preserves and dense suburban build of Mill Valley, CA. PMID:28241028
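
    The subset-selection step maps naturally onto a QUBO. The sketch below follows a generic QBoost-style formulation (squared loss over averaged stump votes plus a sparsity penalty); the paper's exact objective, truncation and rescaling are not reproduced here.

    ```python
    import numpy as np

    def boosting_qubo(stump_preds, labels, lam=0.1):
        """Build a QUBO matrix for selecting a voting subset of weak
        classifiers: minimize ||(1/M) H.T w - y||^2 + lam * sum(w) over
        binary w. stump_preds H: (M, N) in {-1,+1}; labels y: (N,)."""
        M = stump_preds.shape[0]
        P = stump_preds / M
        Q = P @ P.T                        # couplings between stumps
        linear = -2 * P @ labels + lam     # per-stump terms (w_i^2 == w_i)
        np.fill_diagonal(Q, np.diag(Q) + linear)
        return Q   # energy of a subset w is w @ Q @ w; hand to an annealer
    ```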

  16. A machine learning approach for viral genome classification.

    PubMed

    Remita, Mohamed Amine; Halioui, Ahmed; Malick Diouara, Abou Abdallah; Daigle, Bruno; Kiani, Golrokh; Diallo, Abdoulaye Baniré

    2017-04-11

    Advances in cloning and sequencing technology are yielding a massive number of viral genomes. The classification and annotation of these genomes constitute important assets in the discovery of genomic variability, taxonomic characteristics and disease mechanisms. Existing classification methods are often designed for a specific, well-studied family of viruses. Thus, viral comparative genomic studies could benefit from more generic, fast and accurate tools for classifying and typing newly sequenced strains of diverse virus families. Here, we introduce a virus classification platform, CASTOR, based on machine learning methods. CASTOR is inspired by a well-known technique in molecular biology: restriction fragment length polymorphism (RFLP). It simulates, in silico, the restriction digestion of genomic material by different enzymes into fragments. It uses two metrics to construct feature vectors for machine learning algorithms in the classification step. We benchmark CASTOR for the classification of distinct datasets of human papillomaviruses (HPV), hepatitis B viruses (HBV) and human immunodeficiency viruses type 1 (HIV-1). Results reveal true positive rates of 99%, 99% and 98% for HPV Alpha species, HBV genotyping and HIV-1 M subtyping, respectively. Furthermore, CASTOR shows a competitive performance compared to well-known HIV-1 specific classifiers (REGA and COMET) on whole genomes and pol fragments. The performance of CASTOR, together with its genericity and robustness, could permit novel and accurate large-scale virus studies. The CASTOR web platform provides open-access, collaborative and reproducible machine learning classifiers. CASTOR can be accessed at http://castor.bioinfo.uqam.ca .
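
    The in silico digestion idea can be sketched with Biopython's Bio.Restriction module; the enzyme panel and the two summary statistics below are illustrative stand-ins for the platform's actual choices.

    ```python
    from Bio.Seq import Seq
    from Bio.Restriction import EcoRI, HindIII, TaqI

    def rflp_features(genome_str, enzymes=(EcoRI, HindIII, TaqI)):
        """Digest a genome with each enzyme and record simple fragment-length
        statistics as a feature vector (cut positions are treated as plain
        fragment boundaries for brevity)."""
        seq = Seq(genome_str)
        features = []
        for enz in enzymes:
            cuts = sorted(enz.search(seq))            # cut positions
            bounds = [0] + cuts + [len(seq)]
            lengths = [b - a for a, b in zip(bounds, bounds[1:])]
            features += [len(lengths), sum(lengths) / len(lengths)]
        return features   # feed to any standard classifier
    ```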

  17. Classifying proteins into functional groups based on all-versus-all BLAST of 10 million proteins.

    PubMed

    Kolker, Natali; Higdon, Roger; Broomall, William; Stanberry, Larissa; Welch, Dean; Lu, Wei; Haynes, Winston; Barga, Roger; Kolker, Eugene

    2011-01-01

    To address the monumental challenge of assigning function to millions of sequenced proteins, we completed the first-of-its-kind all-versus-all sequence alignments using BLAST for 9.9 million proteins in the UniRef100 database. Microsoft Windows Azure produced over 3 billion filtered records in 6 days using 475 eight-core virtual machines. Protein classification into functional groups was then performed using Hive and custom jars implemented on top of Apache Hadoop utilizing the MapReduce paradigm. First, using the Clusters of Orthologous Genes (COG) database, a length normalized bit score (LNBS) was determined to be the best similarity measure for classification of proteins. LNBS achieved sensitivity and specificity of 98% each. Second, out of 5.1 million bacterial proteins, about two-thirds were assigned to significantly extended COG groups, encompassing 30 times more assigned proteins. Third, the remaining proteins were classified into protein functional groups using an innovative implementation of a single-linkage algorithm on an in-house Hadoop compute cluster. This implementation significantly reduces the run time for nonindexed queries and optimizes efficient clustering on a large scale. The performance was also verified on Amazon Elastic MapReduce. This clustering assigned nearly 2 million proteins to approximately half a million different functional groups. A similar approach was applied to classify 2.8 million eukaryotic sequences, resulting in over 1 million proteins being assigned to existing KOG groups and the remainder clustered into 100,000 functional groups.

  18. Radiomic modeling of BI-RADS density categories

    NASA Astrophysics Data System (ADS)

    Wei, Jun; Chan, Heang-Ping; Helvie, Mark A.; Roubidoux, Marilyn A.; Zhou, Chuan; Hadjiiski, Lubomir

    2017-03-01

    Screening mammography is the most effective and low-cost method to date for early cancer detection. Mammographic breast density has been shown to be highly correlated with breast cancer risk. We are developing a radiomic model for BI-RADS density categorization on digital mammography (FFDM) with a supervised machine learning approach. With IRB approval, we retrospectively collected 478 FFDMs from 478 women. As a gold standard, breast density was assessed by an MQSA radiologist based on BI-RADS categories. The raw FFDMs were used for computerized density assessment. Each raw FFDM first underwent a log-transform to approximate the x-ray sensitometric response, followed by multiscale processing to enhance the fibroglandular densities and parenchymal patterns. Three ROIs were automatically identified based on the keypoint distribution, where the keypoints were obtained as the extrema in the image Gaussian scale-space. A total of 73 features, including intensity and texture features that describe the density and the parenchymal pattern, were extracted from each breast. Our BI-RADS density estimator was constructed using a random forest classifier. We used a 10-fold cross validation resampling approach to estimate the errors. With the random forest classifier, computerized density categories for 412 of the 478 cases agreed with the radiologist's assessment (weighted kappa = 0.93). The machine learning method with radiomic features as predictors demonstrated a high accuracy in classifying FFDMs into BI-RADS density categories. Further work is underway to improve our system performance as well as to perform independent testing using a large unseen FFDM set.
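
    The evaluation setup described above is straightforward to reproduce in outline; the arrays below are random placeholders standing in for the 478-case, 73-feature radiomic dataset.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import cohen_kappa_score
    from sklearn.model_selection import cross_val_predict

    X = np.random.rand(478, 73)            # placeholder radiomic features
    y = np.random.randint(0, 4, 478)       # placeholder BI-RADS categories

    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    pred = cross_val_predict(clf, X, y, cv=10)           # 10-fold CV
    print(cohen_kappa_score(y, pred, weights="linear"))  # weighted kappa
    ```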

  19. Toward a Rapid Synthesis of Field and Desktop Data for Classifying Streams in the Pacific Northwest: Guiding the Sampling and Management of Salmonid Habitat

    NASA Astrophysics Data System (ADS)

    Kasprak, A.; Wheaton, J. M.; Bouwes, N.; Weber, N. P.; Trahan, N. C.; Jordan, C. E.

    2012-12-01

    River managers often seek to understand habitat availability and quality for riverine organisms within the physical template provided by their landscape. Yet the large amount of natural heterogeneity in landscapes gives rise to stream systems which are highly variable over small spatial scales, potentially complicating site selection for surveying aquatic habitat while simultaneously making a simple, wide-reaching management strategy elusive. This is particularly true in the rugged John Day River Basin of northern Oregon, where efforts as part of the Columbia Habitat Monitoring Program to conduct site-based surveys of physical habitat for endangered steelhead salmon (Oncorhynchus mykiss) are underway. As a complete understanding of the type and distribution of habitat available to these fish would require visits to all streams in the basin (impractical due to its large size), here we develop an approach for classifying channel types which combines remote desktop GIS analyses with rapid field-based stream and landscape surveys. At the core of this method, we build off of the River Styles Framework, an open-ended and process-based approach for classifying streams and informing management decisions. This framework is combined with on-the-ground fluvial audits, which aim to quickly and continuously map sediment dynamics and channel behavior along selected channels. Validation of this classification method is completed by on-the-ground stream surveys using a digital iPad platform and by rapid small aircraft overflights to confirm or refine predictions. We further compare this method with existing channel classification approaches for the region (e.g. Beechie, Montgomery and Buffington). The results of this study will help guide both the refinement of site stratification and selection for salmonid habitat monitoring within the basin, and will be vital in designing and prioritizing restoration and management strategies tailored to the distribution of river styles found across the region.

  20. Crowdsourcing as a novel technique for retinal fundus photography classification: analysis of images in the EPIC Norfolk cohort on behalf of the UK Biobank Eye and Vision Consortium.

    PubMed

    Mitry, Danny; Peto, Tunde; Hayat, Shabina; Morgan, James E; Khaw, Kay-Tee; Foster, Paul J

    2013-01-01

    Crowdsourcing is the process of outsourcing numerous tasks to many untrained individuals. Our aim was to assess the performance and repeatability of crowdsourcing for the classification of retinal fundus photography. One hundred retinal fundus photograph images with pre-determined disease criteria were selected by experts from a large cohort study. After reading brief instructions and an example classification, we requested that knowledge workers (KWs) from a crowdsourcing platform classify each image as normal or abnormal with grades of severity. Each image was classified 20 times by different KWs. Four study designs were examined to assess the effect of varying incentive and KW experience on classification accuracy. All study designs were conducted twice to examine repeatability. Performance was assessed by comparing the sensitivity, specificity and area under the receiver operating characteristic curve (AUC). Without restriction on eligible participants, two thousand classifications of 100 images were received in under 24 hours at minimal cost. In trial 1, all study designs had an AUC (95% CI) of 0.701 (0.680-0.721) or greater for classification of normal/abnormal. In trial 1, the highest AUC (95% CI) for normal/abnormal classification was 0.757 (0.738-0.776), for KWs with moderate experience. Comparable results were observed in trial 2. In trial 1, between 64% and 86% of abnormal images were correctly classified by over half of all KWs. In trial 2, this ranged between 74% and 97%. Sensitivity was ≥96% for normal versus severely abnormal detections across all trials. Sensitivity for normal versus mildly abnormal varied between 61% and 79% across trials. With minimal training, crowdsourcing represents an accurate, rapid and cost-effective method of retinal image analysis which demonstrates good repeatability. Larger studies with more comprehensive participant training are needed to explore the utility of this compelling technique in large-scale medical image analysis.

  1. Preliminary classification of forest vegetation of the Kenai Peninsula, Alaska.

    Treesearch

    K.M. Reynolds

    1990-01-01

    A total of 5,597 photo points were systematically located on 1:60,000-scale high-altitude photographs of the Kenai Peninsula, Alaska; photo interpretation was used to classify the vegetation at each grid position. Of the total grid points, 12.3 percent were classified as timberland; 129 photo points within the timberland class were randomly selected for field survey....

  2. The spatial distribution of dwarf galaxies in the CfA slice of the universe

    NASA Technical Reports Server (NTRS)

    Thuan, Trinh X.; Gott, J. Richard, III; Schneider, Stephen E.

    1987-01-01

    A complete (with the exception of one) redshift sample of 58 galaxies in the Nilson catalog classified as dwarf, irregular, or Magellanic irregular is used to investigate the large-scale clustering properties of these low-surface-brightness galaxies in the CfA slice of the universe (alpha in the range of 8-17 h, delta in the range of 26.5-32.5 deg). It is found that the low-surface-brightness dwarf galaxies also lie on the structures delineated by the high-surface-brightness normal galaxies and that they do not fill in the voids. This is inconsistent with a class of biased galaxy formation theories which predict that dwarf galaxies should be present everywhere, including in the voids.

  3. An Operational Definition of Learning Disabilities (Cognitive Domain) Using WISC Full Scale IQ and Peabody Individual Achievement Test Scores

    ERIC Educational Resources Information Center

    Brenton, Beatrice White; Gilmore, Doug

    1976-01-01

    An operational index of discrepancy to assist in identifying learning disabilities was derived using the Full Scale IQ, Wechsler Intelligence Scale for Children, and relevant subtest scores on the Peabody Individual Achievement Test. Considerable caution should be exercised when classifying children, especially females, as learning disabled.…

  4. Training set extension for SVM ensemble in P300-speller with familiar face paradigm.

    PubMed

    Li, Qi; Shi, Kaiyang; Gao, Ning; Li, Jian; Bai, Ou

    2018-03-27

    P300-spellers are brain-computer interface (BCI)-based character input systems. Support vector machine (SVM) ensembles are trained with large-scale training sets and used as classifiers in these systems. However, the required large-scale training data necessitate a prolonged collection time for each subject, which results in data collected toward the end of the period being contaminated by the subject's fatigue. This study aimed to develop a method for acquiring more training data from a small collected training set. A new method was developed in which corresponding training data from two sequences are superposed and averaged to extend the training set. The proposed method was tested offline on a P300-speller with the familiar face paradigm. The SVM ensemble with the extended training set achieved 85% classification accuracy for the averaged results of four sequences, and 100% for 11 sequences in the P300-speller. In contrast, the conventional SVM ensemble with the non-extended training set achieved only 65% accuracy for four sequences, and 92% for 11 sequences. The SVM ensemble with the extended training set achieves higher classification accuracies than the conventional SVM ensemble, which verifies that the proposed method effectively improves the classification performance of BCI P300-spellers, thus enhancing their practicality.
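
    The superposition idea reduces to averaging corresponding epochs, roughly as below; array shapes and names are assumptions, not the paper's notation.

    ```python
    import numpy as np

    def extend_training_set(epochs_seq1, epochs_seq2, labels):
        """Average corresponding epochs from two stimulation sequences to
        synthesize additional, higher-SNR training examples.
        epochs_seqX: (n_trials, n_channels, n_samples), matching labels."""
        averaged = (epochs_seq1 + epochs_seq2) / 2.0
        X = np.concatenate([epochs_seq1, epochs_seq2, averaged], axis=0)
        y = np.concatenate([labels, labels, labels], axis=0)
        return X, y   # train the SVM ensemble on the extended set
    ```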

  5. SparkText: Biomedical Text Mining on Big Data Framework.

    PubMed

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.
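
    A minimal Spark ML pipeline in the spirit of SparkText might look as follows; the real system also integrates Cassandra storage and streaming, and the column names and toy rows here are assumptions.

    ```python
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import Tokenizer, HashingTF, IDF
    from pyspark.ml.classification import NaiveBayes

    spark = SparkSession.builder.appName("sparktext-sketch").getOrCreate()
    train = spark.createDataFrame(
        [("infiltrating ductal carcinoma of the breast ...", 0.0),
         ("prostate-specific antigen screening cohort ...", 1.0)],
        ["text", "label"])
    pipeline = Pipeline(stages=[
        Tokenizer(inputCol="text", outputCol="words"),
        HashingTF(inputCol="words", outputCol="tf"),
        IDF(inputCol="tf", outputCol="features"),
        NaiveBayes(featuresCol="features", labelCol="label"),
    ])
    model = pipeline.fit(train)   # model.transform(test) scores new articles
    ```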

  6. APDA's Contribution to Current Research and Citizen Science

    NASA Astrophysics Data System (ADS)

    Barker, Thurburn; Castelaz, M. W.; Cline, J. D.; Hudec, R.

    2010-01-01

    The Astronomical Photographic Data Archive (APDA) is dedicated to the collection, restoration, preservation, and digitization of astronomical photographic data that eventually can be accessed via the Internet by the global community of scientists, researchers and students. Located on the Pisgah Astronomical Research Institute campus, APDA now includes collections from North America totaling more than 100,000 photographic plates and films. Two new large-scale research projects and one citizen science project have now been developed from the archived data. One unique photographic data collection covering the southern hemisphere contains the signatures of diffuse interstellar bands (DIBs) within the stellar spectra on objective prism plates. We plan to digitize the spectra, identify the DIBs, and map out the large-scale spatial extent of DIBs. The goal is to understand the Galactic environment suitable for the DIB molecules. Another collection contains spectra with nearly the same dispersion as the GAIA satellite's low-dispersion slitless spectrophotometers, BP and RP. The plates will be used to develop standards for GAIA spectra. To bring the data from APDA to the general public, we have developed the citizen science project called Stellar Classification Online - Public Exploration (SCOPE). SCOPE allows citizen scientists to classify up to half a million stars on objective prism plates. We will present the status of each of these projects.

  8. Operational monitoring of land-cover change using multitemporal remote sensing data

    NASA Astrophysics Data System (ADS)

    Rogan, John

    2005-11-01

    Land-cover change, manifested as either land-cover modification or conversion, can occur at all spatial scales, and changes at local scales can have profound, cumulative impacts at broader scales. The implication of operational land-cover monitoring is that researchers have access to a continuous stream of remote sensing data, with the long-term goal of providing consistent and repetitive mapping. Effective large-area monitoring of land cover (i.e., >1000 km2) can only be accomplished by using remotely sensed images as an indirect data source in land-cover change mapping and as a source for land-cover change model projections. Large-area monitoring programs face several challenges: (1) the choice of an appropriate classification scheme/map legend over large, topographically and phenologically diverse areas; (2) issues concerning data consistency and map accuracy (i.e., calibration and validation); (3) very large data volumes; and (4) time-consuming data processing and interpretation. This dissertation research therefore broadly addresses these challenges in the context of examining state-of-the-art image pre-processing, spectral enhancement, classification, and accuracy assessment techniques to assist the California Land-cover Mapping and Monitoring Program (LCMMP). The results of this dissertation revealed that spatially varying haze can be effectively corrected in Landsat data for the purposes of change detection. The Multitemporal Spectral Mixture Analysis (MSMA) spectral enhancement technique produced more accurate land-cover maps than those derived from the Multitemporal Kauth Thomas (MKT) transformation in northern and southern California study areas. A comparison of machine learning classifiers showed that Fuzzy ARTMAP outperformed two classification tree algorithms, based on map accuracy and algorithm robustness. Variation in spatial data error (positional and thematic) was explored in relation to environmental variables using geostatistical interpolation techniques. Finally, the land-cover modification maps generated for three time intervals (1985-1990, 1990-1996, 1996-2000), with nine change classes, revealed important variations in land-cover gain and loss between the northern and southern California study areas.

  9. Large-scale circulation classification and its links to observed precipitation in the eastern and central Tibetan Plateau

    NASA Astrophysics Data System (ADS)

    Liu, Wenbin; Wang, Lei; Chen, Deliang; Tu, Kai; Ruan, Chengqing; Hu, Zengyun

    2016-06-01

    The relationship between large-scale circulation dynamics and the regional precipitation regime in the Tibetan Plateau (TP) has so far not been well understood. In this study, we classify the circulation types using self-organizing maps based on the daily field of 500 hPa geopotential height and link them to the precipitation climatology in the eastern and central TP. By virtue of an objective determining method, 18 circulation types are quantified. The results show that the large amount of precipitation in summer is closely related to circulation types in which an enhanced and northward-shifted subtropical high (SH) over the northwest Pacific and a pronounced cyclonic circulation anomaly over the Bay of Bengal help the Indian summer monsoon and East Asian summer monsoon transport abundant low-latitude moisture to the eastern and southern TP. On the contrary, the dry winter in central and eastern Tibet corresponds to circulation types with divergence over the central and eastern TP, when the water vapor transport by the East Asian winter monsoon and the mid-latitude westerlies is very weak. Some circulation types are associated with well-known circulation patterns/monsoons influencing the TP (e.g. the East Atlantic Pattern, El Niño Southern Oscillation, Indian Summer Monsoon and the mid-latitude westerlies), and exhibit an overall good potential for explaining the variability of regional seasonal precipitation. Moreover, the climate shift signals in the late 1970s over the eastern Pacific/North Pacific Oceans are also reflected in both the variability of some circulation types and their corresponding composite precipitation. This study extends our understanding of large-scale atmospheric dynamics and their linkages with regional precipitation, and is beneficial for climate change projection and related adaptation activities in the highest and largest plateau in the world.
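
    Circulation typing with a SOM can be sketched using the third-party MiniSom package; the 3x6 grid, training length and random placeholder fields below are illustrative, not the paper's settings.

    ```python
    import numpy as np
    from minisom import MiniSom

    # Each daily 500 hPa geopotential height field, flattened to a vector,
    # is mapped to one of 3 x 6 = 18 SOM nodes, which then serve as the
    # circulation types. Placeholder data: 10 years x (lat * lon) grid.
    daily_z500 = np.random.rand(3650, 20 * 30)
    som = MiniSom(3, 6, daily_z500.shape[1], sigma=1.0, learning_rate=0.5)
    som.train_random(daily_z500, 5000)
    types = [som.winner(day) for day in daily_z500]   # (row, col) per day
    ```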

  10. Evaluation of linear classifiers on articles containing pharmacokinetic evidence of drug-drug interactions.

    PubMed

    Kolchinsky, A; Lourenço, A; Li, L; Rocha, L M

    2013-01-01

    Drug-drug interaction (DDI) is a major cause of morbidity and mortality. DDI research includes the study of different aspects of drug interactions, from in vitro pharmacology, which deals with drug interaction mechanisms, to pharmaco-epidemiology, which investigates the effects of DDI on drug efficacy and adverse drug reactions. Biomedical literature mining can aid both kinds of approaches by extracting relevant DDI signals from either the published literature or large clinical databases. However, though drug interaction is an ideal area for translational research, the inclusion of literature mining methodologies in DDI workflows is still very preliminary. One area that can benefit from literature mining is the automatic identification of a large number of potential DDIs, whose pharmacological mechanisms and clinical significance can then be studied via in vitro pharmacology and in populo pharmaco-epidemiology. We implemented a set of classifiers for identifying published articles relevant to experimental pharmacokinetic DDI evidence. These documents are important for identifying causal mechanisms behind putative drug-drug interactions, an important step in the extraction of large numbers of potential DDIs. We evaluate performance of several linear classifiers on PubMed abstracts, under different feature transformation and dimensionality reduction methods. In addition, we investigate the performance benefits of including various publicly-available named entity recognition features, as well as a set of internally-developed pharmacokinetic dictionaries. We found that several classifiers performed well in distinguishing relevant and irrelevant abstracts. We found that the combination of unigram and bigram textual features gave better performance than unigram features alone, and also that normalization transforms that adjusted for feature frequency and document length improved classification. For some classifiers, such as linear discriminant analysis (LDA), proper dimensionality reduction had a large impact on performance. Finally, the inclusion of NER features and dictionaries was found not to help classification.
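
    The best-performing configuration reported above (unigram plus bigram features with frequency and document-length normalization feeding a linear model) is easy to outline; logistic regression stands in for the several linear classifiers compared, and the toy abstracts are placeholders.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    abstracts = ["ketoconazole markedly increased midazolam AUC ...",
                 "population survey of adverse events ..."]
    labels = [1, 0]   # relevant vs. irrelevant to pharmacokinetic DDI evidence

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),  # normalization
        LogisticRegression(max_iter=1000),
    )
    clf.fit(abstracts, labels)   # clf.predict(new_abstracts) to screen
    ```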

  11. MIDAS, prototype Multivariate Interactive Digital Analysis System for large area earth resources surveys. Volume 1: System description

    NASA Technical Reports Server (NTRS)

    Christenson, D.; Gordon, M.; Kistler, R.; Kriegler, F.; Lampert, S.; Marshall, R.; Mclaughlin, R.

    1977-01-01

    A third-generation, fast, low-cost multispectral recognition system (MIDAS), able to keep pace with the large quantity and high rates of data acquisition from large regions with present and projected sensors, is described. The program can process a complete ERTS frame in forty seconds and provide a color map of sixteen constituent categories in a few minutes. A principal objective of the MIDAS program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in the overall program are described. The system contains a midi-computer to control the various high-speed processing elements in the data path, a preprocessor to condition data, and a classifier which implements an all-digital prototype of a multivariate Gaussian maximum-likelihood or Bayesian decision algorithm. Sufficient software was developed to perform signature extraction, control the preprocessor, compute classifier coefficients, control the classifier operation, operate the color display and printer, and diagnose operation.
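
    A software analogue of the decision rule that MIDAS implements in hardware is compact; the sketch below is a plain multivariate Gaussian maximum-likelihood classifier and makes no attempt to mirror the MIDAS pipeline architecture.

    ```python
    import numpy as np

    def train_gaussian_ml(X, y):
        """Per-class mean and covariance from training pixels (X: n x bands)."""
        return {c: (X[y == c].mean(axis=0), np.cov(X[y == c].T))
                for c in np.unique(y)}

    def classify_pixel(x, params):
        """Assign the class with the highest Gaussian log-likelihood."""
        def loglik(mean, cov):
            d = x - mean
            return -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.solve(cov, d))
        return max(params, key=lambda c: loglik(*params[c]))
    ```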

  12. Optical differentiation between malignant and benign lymphadenopathy by grey scale texture analysis of endobronchial ultrasound convex probe images.

    PubMed

    Nguyen, Phan; Bashirzadeh, Farzad; Hundloe, Justin; Salvado, Olivier; Dowson, Nicholas; Ware, Robert; Masters, Ian Brent; Bhatt, Manoj; Kumar, Aravind Ravi; Fielding, David

    2012-03-01

    Morphologic and sonographic features of endobronchial ultrasound (EBUS) convex probe images are helpful in predicting metastatic lymph nodes. Grey scale texture analysis is a well-established methodology that has been applied to ultrasound images in other fields of medicine. The aim of this study was to determine if this methodology could differentiate between benign and malignant lymphadenopathy in EBUS images. Lymph nodes from digital images of EBUS procedures were manually mapped to obtain a region of interest and were analyzed in a prediction set. The regions of interest were analyzed for the following grey scale texture features in MATLAB (version 7.8.0.347 [R2009a]): mean pixel value, difference between maximal and minimal pixel value, SEM pixel value, entropy, correlation, energy, and homogeneity. Significant grey scale texture features were used to assess a validation set, compared with fluorodeoxyglucose (FDG)-PET-CT scan findings where available. Fifty-two malignant nodes and 48 benign nodes were in the prediction set. Malignant nodes had a greater difference in the maximal and minimal pixel values, SEM pixel value, entropy, and correlation, and a lower energy (P < .0001 for all values). Fifty-one lymph nodes were in the validation set; 44 of 51 (86.3%) were classified correctly. Eighteen of these lymph nodes also had FDG-PET-CT scan assessment, which correctly classified 14 of 18 nodes (77.8%), compared with grey scale texture analysis, which correctly classified 16 of 18 nodes (88.9%). Grey scale texture analysis of EBUS convex probe images can be used to differentiate malignant and benign lymphadenopathy. Preliminary results are comparable to FDG-PET-CT scanning.
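
    Grey-level co-occurrence matrices are one common route to texture measures like those listed; the sketch below uses scikit-image (0.19 or later, where the functions are spelled graycomatrix/graycoprops), and the exact feature definitions may differ from the paper's.

    ```python
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def texture_features(roi_8bit):
        """Co-occurrence texture features for a 2-D uint8 ultrasound ROI."""
        glcm = graycomatrix(roi_8bit, distances=[1], angles=[0],
                            levels=256, symmetric=True, normed=True)
        feats = {p: graycoprops(glcm, p)[0, 0]
                 for p in ("correlation", "energy", "homogeneity")}
        p = glcm[:, :, 0, 0]
        feats["entropy"] = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        feats["mean"] = roi_8bit.mean()
        feats["range"] = int(roi_8bit.max()) - int(roi_8bit.min())
        return feats
    ```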

  14. A morphometric analysis of vegetation patterns in dryland ecosystems.

    PubMed

    Mander, Luke; Dekker, Stefan C; Li, Mao; Mio, Washington; Punyasena, Surangi W; Lenton, Timothy M

    2017-02-01

    Vegetation in dryland ecosystems often forms remarkable spatial patterns. These range from regular bands of vegetation alternating with bare ground, to vegetated spots and labyrinths, to regular gaps of bare ground within an otherwise continuous expanse of vegetation. It has been suggested that spotted vegetation patterns could indicate that collapse into a bare ground state is imminent, and the morphology of spatial vegetation patterns, therefore, represents a potentially valuable source of information on the proximity of regime shifts in dryland ecosystems. In this paper, we have developed quantitative methods to characterize the morphology of spatial patterns in dryland vegetation. Our approach is based on algorithmic techniques that have been used to classify pollen grains on the basis of textural patterning, and involves constructing feature vectors to quantify the shapes formed by vegetation patterns. We have analysed images of patterned vegetation produced by a computational model and a small set of satellite images from South Kordofan (South Sudan), which illustrates that our methods are applicable to both simulated and real-world data. Our approach provides a means of quantifying patterns that are frequently described using qualitative terminology, and could be used to classify vegetation patterns in large-scale satellite surveys of dryland ecosystems.

  15. A shape-based segmentation method for mobile laser scanning point clouds

    NASA Astrophysics Data System (ADS)

    Yang, Bisheng; Dong, Zhen

    2013-07-01

    Segmentation of mobile laser point clouds of urban scenes into objects is an important step for post-processing (e.g., interpretation) of point clouds. Point clouds of urban scenes contain numerous objects with significant size variability, complex and incomplete structures, and holes or variable point densities, raising great challenges for the segmentation of mobile laser point clouds. This paper addresses these challenges by proposing a shape-based segmentation method. The proposed method first calculates the optimal neighborhood size of each point to derive the geometric features associated with it, and then classifies the point clouds according to geometric features using support vector machines (SVMs). Second, a set of rules are defined to segment the classified point clouds, and a similarity criterion for segments is proposed to overcome over-segmentation. Finally, the segmentation output is merged based on topological connectivity into a meaningful geometrical abstraction. The proposed method has been tested on point clouds of two urban scenes obtained by different mobile laser scanners. The results show that the proposed method segments large-scale mobile laser point clouds with good accuracy and at an acceptable computational cost, and that it segments pole-like objects particularly well.
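
    The geometric features in the first step are typically eigenvalue-based shape descriptors of each point's neighborhood; the sketch below fixes k for brevity, whereas the paper optimizes the neighborhood size per point.

    ```python
    import numpy as np
    from scipy.spatial import cKDTree

    def geometric_features(points, k=20):
        """Linearity/planarity/scattering descriptors from the covariance
        eigenvalues of each point's k-nearest-neighbor set (points: n x 3)."""
        _, idx = cKDTree(points).query(points, k=k)
        feats = []
        for neighbors in points[idx]:
            evals = np.sort(np.linalg.eigvalsh(np.cov(neighbors.T)))[::-1]
            l1, l2, l3 = evals / evals.sum()
            feats.append([(l1 - l2) / l1,   # linearity (pole-like)
                          (l2 - l3) / l1,   # planarity (facade-like)
                          l3 / l1])         # scattering (vegetation-like)
        return np.array(feats)   # feed to the SVM stage
    ```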

  17. [Using neural networks based template matching method to obtain redshifts of normal galaxies].

    PubMed

    Xu, Xin; Luo, A-li; Wu, Fu-chao; Zhao, Yong-heng

    2005-06-01

    Galaxies can be divided into two classes: normal galaxies (NG) and active galaxies (AG). In order to determine NG redshifts, an automatic and effective method is proposed in this paper, which consists of the following three main steps: (1) from the template of a normal galaxy, two sets of samples are simulated, one with redshifts of 0.0-0.3, the other of 0.3-0.5; PCA is then used to extract the main components, and training samples are projected onto the main-component subspace to obtain characteristic spectra. (2) The characteristic spectra are used to train a probabilistic neural network to obtain a Bayes classifier. (3) An unknown real NG spectrum is first input to this Bayes classifier to determine the possible range of redshift, then template matching is invoked to locate the redshift value within the estimated range. Compared with the traditional template matching technique with an unconstrained range, our proposed method not only halves the computational load, but also increases the estimation accuracy. As a result, the proposed method is particularly useful for the automatic processing of spectra produced by large-scale sky survey projects.

  18. Pathological speech signal analysis and classification using empirical mode decomposition.

    PubMed

    Kaleem, Muhammad; Ghoraani, Behnaz; Guergachi, Aziz; Krishnan, Sridhar

    2013-07-01

    Automated classification of normal and pathological speech signals can provide an objective and accurate mechanism for pathological speech diagnosis, and is an active area of research. A large part of this research is based on analysis of acoustic measures extracted from sustained vowels. However, sustained vowels do not reflect real-world attributes of voice as effectively as continuous speech, which can take into account important attributes of speech such as rapid voice onset and termination, changes in voice frequency and amplitude, and sudden discontinuities in speech. This paper presents a methodology based on empirical mode decomposition (EMD) for classification of continuous normal and pathological speech signals obtained from a well-known database. EMD is used to decompose randomly chosen portions of speech signals into intrinsic mode functions, which are then analyzed to extract meaningful temporal and spectral features, including true instantaneous features which can capture discriminative information in signals hidden at local time-scales. A total of six features are extracted, and a linear classifier is used with the feature vector to classify continuous speech portions obtained from a database consisting of 51 normal and 161 pathological speakers. A classification accuracy of 95.7% is obtained, thus demonstrating the effectiveness of the methodology.
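
    The decomposition step can be sketched with the third-party PyEMD package; the per-IMF summaries below are simpler stand-ins for the paper's six temporal, spectral and instantaneous features.

    ```python
    import numpy as np
    from PyEMD import EMD

    def emd_features(speech_segment):
        """Decompose a speech portion into IMFs and summarize the finest
        few with energy and zero-crossing rate."""
        imfs = EMD().emd(np.asarray(speech_segment, dtype=float))
        feats = []
        for imf in imfs[:3]:
            feats.append(float(np.sum(imf**2)))                      # energy
            feats.append(float(np.mean(np.diff(np.sign(imf)) != 0)))  # ZCR
        return feats   # pair with a linear classifier, as in the paper
    ```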

  19. A multiple maximum scatter difference discriminant criterion for facial feature extraction.

    PubMed

    Song, Fengxi; Zhang, David; Mei, Dayong; Guo, Zhongwei

    2007-12-01

    The maximum scatter difference (MSD) discriminant criterion is a recently presented binary discriminant criterion for pattern classification that utilizes the generalized scatter difference rather than the generalized Rayleigh quotient as a class separability measure, thereby avoiding the singularity problem when addressing small-sample-size problems. MSD classifiers based on this criterion have been quite effective on face-recognition tasks, but as they are binary classifiers, they are not as efficient on large-scale classification tasks. To address this problem, this paper generalizes the classification-oriented binary criterion to its multiple counterpart, the multiple MSD (MMSD) discriminant criterion, for facial feature extraction. The MMSD feature-extraction method, which is based on this novel discriminant criterion, is a new subspace-based feature-extraction method. Unlike most other subspace-based feature-extraction methods, the MMSD computes its discriminant vectors from both the range of the between-class scatter matrix and the null space of the within-class scatter matrix. The MMSD is theoretically elegant and easy to calculate. Extensive experimental studies conducted on the benchmark database FERET show that the MMSD outperforms state-of-the-art facial feature-extraction methods such as the null space method, direct linear discriminant analysis (LDA), eigenface, Fisherface, and complete LDA.
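
    For reference, the underlying binary criterion can be stated compactly; this is the standard MSD form, with C a balancing parameter and S_b, S_w the between- and within-class scatter matrices.

    ```latex
    % MSD seeks a projection w maximizing the scatter difference
    \[
      J(\mathbf{w}) \;=\; \mathbf{w}^{\mathsf T} S_b\, \mathbf{w} \;-\; C\,\mathbf{w}^{\mathsf T} S_w\, \mathbf{w},
    \]
    % which avoids inverting S_w, unlike the Rayleigh-quotient criterion
    % w^T S_b w / (w^T S_w w) of classical LDA.
    ```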

  20. Mexican Hat Wavelet Kernel ELM for Multiclass Classification.

    PubMed

    Wang, Jie; Song, Yi-Fan; Ma, Tian-Lei

    2017-01-01

    Kernel extreme learning machine (KELM) is a novel feedforward neural network widely used in classification problems. To some extent, it addresses ELM's existing problems of invalid nodes and large computational complexity. However, the traditional KELM classifier usually has low test accuracy when facing multiclass classification problems. In order to solve this problem, a new classifier, the Mexican Hat wavelet KELM classifier, is proposed in this paper. The proposed classifier successfully improves the training accuracy and reduces the training time in multiclass classification problems. Moreover, the validity of the Mexican Hat wavelet as a kernel function of ELM is rigorously proved. Experimental results on different data sets show that the performance of the proposed classifier is significantly superior to that of the compared classifiers.
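
    A wavelet kernel of this general family can be written down directly; the product-form parameterization below is the usual construction and may differ from the paper's exact definition.

    ```python
    import numpy as np

    def mexican_hat_kernel(X, Y, a=1.0):
        """K(x, y) = prod_i psi((x_i - y_i) / a), with the Mexican Hat
        mother wavelet psi(t) = (1 - t^2) * exp(-t^2 / 2); a is a width
        hyperparameter. X: (n, d), Y: (m, d)."""
        K = np.ones((len(X), len(Y)))
        for i, x in enumerate(X):
            t = (x - Y) / a
            K[i] = np.prod((1 - t**2) * np.exp(-t**2 / 2), axis=1)
        return K   # plug into KELM in place of, e.g., an RBF kernel
    ```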

  1. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.

    PubMed

    Li, Qiang; Gu, Yu; Jia, Jing

    2017-01-30

    Chinese liquors are internationally well-known fermented alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is therefore of practical significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier in predicting all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed the superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and the moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability and good fitting and prediction (generalization) performance in the classification of Chinese liquors. Taking both the application of the e-nose and the validation of the MDS-SVM classifier into account, we have created a useful method for the classification of multiple Chinese liquors.
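
    The MDS-SVM combination can be outlined in a few lines of scikit-learn; the arrays are random placeholders for the QCM sensor responses, and note that projecting genuinely new samples through MDS requires extra care in a real deployment.

    ```python
    import numpy as np
    from sklearn.manifold import MDS
    from sklearn.svm import SVC

    X_sensors = np.random.rand(100, 16)        # placeholder e-nose responses
    y_brand = np.random.randint(0, 10, 100)    # placeholder brand labels

    X_embedded = MDS(n_components=3, random_state=0).fit_transform(X_sensors)
    clf = SVC(kernel="rbf").fit(X_embedded, y_brand)
    ```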

  2. The twisted radio structure of PSO J334.2028+01.4075, still a supermassive binary black hole candidate

    NASA Astrophysics Data System (ADS)

    Mooley, K. P.; Wrobel, J. M.; Anderson, M. M.; Hallinan, G.

    2018-01-01

    Supermassive binary black holes (BBHs) on sub-parsec scales are prime targets for gravitational wave experiments. They also provide insights on close binary evolution and hierarchical structure formation. Sub-parsec BBHs cannot be spatially resolved, but indirect methods can identify candidates. In 2015, Liu et al. reported an optical-continuum periodicity in the quasar PSO J334.2028+01.4075, with the estimated mass and rest-frame period suggesting an orbital separation of about 0.006 pc (0.7 μarcsec). The persistence of the quasar's optical periodicity has recently been disfavoured over an extended baseline. However, if a radio jet is launched from a sub-parsec BBH, the binary's properties can influence the radio structure on larger scales. Here, we use the Very Long Baseline Array (VLBA) and Karl G. Jansky Very Large Array (VLA) to study the parsec- and kiloparsec-scale emission energized by the quasar's putative BBH. We find two VLBA components separated by 3.6 mas (30 pc), tentatively identifying one as the VLBA 'core' from which the other was ejected. The VLBA components contribute to a point-like, time-variable VLA source that is straddled by lobes spanning 8 arcsec (66 kpc). We classify PSO J334.2028+01.4075 as a lobe-dominated quasar, albeit with an atypically large twist of 39° between its elongation position angles on parsec and kiloparsec scales. By analogy with 3C 207, a well-studied lobe-dominated quasar with a similarly rare twist, we speculate that PSO J334.2028+01.4075 could be ejecting jet components over an inner cone that traces a precessing jet in a BBH system.

  3. Complexity analysis on public transport networks of 97 large- and medium-sized cities in China

    NASA Astrophysics Data System (ADS)

    Tian, Zhanwei; Zhang, Zhuo; Wang, Hongfei; Ma, Li

    2018-04-01

    The traffic situation in Chinese urban areas continues to deteriorate. To improve the planning and design of public transport systems, it is necessary to conduct in-depth research on the structure of urban public transport networks (PTNs). We investigate the PTNs of 97 large- and medium-sized cities in China, construct three types of network models (bus stop network, bus transit network and bus line network), and then analyze their structural characteristics. It is revealed that the bus stop network is small-world and scale-free, while the bus transit network and bus line network are both small-world. The betweenness centrality of each city's PTN shows a similar distribution pattern, although these networks vary in size. When cities are classified according to the characteristics of their PTNs or their economic development level, the results are similar, indicating a strong correlation between the development of a city's economy and its transport network: PTNs expand in a characteristic pattern as the economy develops.
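
    The first two network models correspond to standard PTN representations in the literature; the sketch below (our illustration with made-up routes, assuming the usual L-space/P-space constructions for the bus stop and bus transit networks) shows how they can be built and measured with networkx:

    ```python
    import networkx as nx

    # hypothetical bus routes, each an ordered list of stop IDs
    routes = [["A", "B", "C", "D"], ["B", "E", "F"], ["C", "E", "G", "D"]]

    # bus stop network: consecutive stops on a route are linked
    L = nx.Graph()
    for r in routes:
        L.add_edges_from(zip(r, r[1:]))

    # bus transit network: all stops sharing a route are linked, so path
    # length counts the number of transfers between stops
    P = nx.Graph()
    for r in routes:
        P.add_edges_from((u, v) for i, u in enumerate(r) for v in r[i+1:])

    print(nx.betweenness_centrality(L))
    print(nx.average_shortest_path_length(P))
    ```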

  4. Invertebrate Iridoviruses: A Glance over the Last Decade

    PubMed Central

    Özcan, Orhan; Ilter-Akulke, Ayca Zeynep; Scully, Erin D.; Özgen, Arzu

    2018-01-01

    Members of the family Iridoviridae (iridovirids) are large dsDNA viruses that infect both invertebrate and vertebrate ectotherms and whose symptoms range in severity from minor reductions in host fitness to systemic disease and large-scale mortality. Several characteristics have been useful for classifying iridoviruses; however, novel strains are continuously being discovered and, in many cases, reliable classification has been challenging. Further impeding classification, invertebrate iridoviruses (IIVs) can occasionally infect vertebrates; thus, host range is often not a useful criterion for classification. In this review, we discuss the current classification of iridovirids, focusing on genomic and structural features that distinguish vertebrate and invertebrate iridovirids and viral factors linked to host interactions in IIV6 (Invertebrate iridescent virus 6). In addition, we show for the first time how complete genome sequences of viral isolates can be leveraged to improve classification of new iridovirid isolates and resolve ambiguous relations. Improved classification of the iridoviruses may facilitate the identification of genus-specific virulence factors linked with diverse host phenotypes and host interactions. PMID:29601483

  5. Invertebrate Iridoviruses: A Glance over the Last Decade.

    PubMed

    İnce, İkbal Agah; Özcan, Orhan; Ilter-Akulke, Ayca Zeynep; Scully, Erin D; Özgen, Arzu

    2018-03-30

    Members of the family Iridoviridae (iridovirids) are large dsDNA viruses that infect both invertebrate and vertebrate ectotherms and whose symptoms range in severity from minor reductions in host fitness to systemic disease and large-scale mortality. Several characteristics have been useful for classifying iridoviruses; however, novel strains are continuously being discovered and, in many cases, reliable classification has been challenging. Further impeding classification, invertebrate iridoviruses (IIVs) can occasionally infect vertebrates; thus, host range is often not a useful criterion for classification. In this review, we discuss the current classification of iridovirids, focusing on genomic and structural features that distinguish vertebrate and invertebrate iridovirids and viral factors linked to host interactions in IIV6 (Invertebrate iridescent virus 6). In addition, we show for the first time how complete genome sequences of viral isolates can be leveraged to improve classification of new iridovirid isolates and resolve ambiguous relations. Improved classification of the iridoviruses may facilitate the identification of genus-specific virulence factors linked with diverse host phenotypes and host interactions.

  6. Identification of consensus biomarkers for predicting non-genotoxic hepatocarcinogens

    PubMed Central

    Huang, Shan-Han; Tung, Chun-Wei

    2017-01-01

    The assessment of non-genotoxic hepatocarcinogens (NGHCs) currently relies on two-year rodent bioassays. Toxicogenomics biomarkers provide a potential alternative method for the prioritization of NGHCs that could be useful for risk assessment. However, previous studies, which used inconsistently classified chemicals as the training set and a single microarray dataset, found no consensus biomarkers. In this study, four consensus biomarkers (A2m, Ca3, Cxcl1, and Cyp8b1) were identified from four large-scale microarray datasets of the one-day single maximum tolerated dose and a large set of chemicals without inconsistent classifications. Machine learning techniques were subsequently applied to develop prediction models for NGHCs. The final bagging decision tree models were constructed with an average AUC performance of 0.803 on an independent test. A set of 16 chemicals with controversial classifications were reclassified according to the consensus biomarkers. The developed prediction models and identified consensus biomarkers are expected to serve as potential alternative methods for the prioritization of NGHCs for further experimental validation. PMID:28117354
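
    A bagging-of-decision-trees predictor evaluated by AUC can be prototyped directly in scikit-learn; the sketch below uses random stand-in data with one column per consensus biomarker (our illustration, not the authors' pipeline; `estimator=` is named `base_estimator=` in scikit-learn < 1.2):

    ```python
    import numpy as np
    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    # rows = chemicals; columns = expression of the four consensus
    # biomarkers (A2m, Ca3, Cxcl1, Cyp8b1); values here are random
    X = np.random.randn(120, 4)
    y = np.random.randint(0, 2, 120)     # 1 = NGHC, 0 = non-carcinogen

    model = BaggingClassifier(estimator=DecisionTreeClassifier(),
                              n_estimators=50, random_state=0)
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print("mean AUC:", auc.mean())
    ```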

  7. Learning directed acyclic graphs from large-scale genomics data.

    PubMed

    Nikolay, Fabio; Pesavento, Marius; Kritikos, George; Typas, Nassos

    2017-09-20

    In this paper, we consider the problem of learning the genetic interaction map, i.e., the topology of a directed acyclic graph (DAG) of genetic interactions, from noisy double-knockout (DK) data. Based on a set of well-established biological interaction models, we detect and classify the interactions between genes. We propose a novel linear integer optimization program called the Genetic-Interactions-Detector (GENIE) to identify the complex biological dependencies among genes and to compute the DAG topology that best matches the DK measurements. Furthermore, we extend the GENIE program by incorporating genetic interaction profile (GI-profile) data to further enhance detection performance. In addition, we propose a sequential scalability technique for large sets of genes under study, in order to provide statistically significant results for real measurement data. Finally, we show via numerical simulations that the GENIE program and the GI-profile data extended GENIE (GI-GENIE) program clearly outperform conventional techniques, and we present real-data results for our proposed sequential scalability technique.

  8. Exploiting Language Models to Classify Events from Twitter

    PubMed Central

    Vo, Duc-Thuan; Hai, Vo Thuan; Ock, Cheol-Young

    2015-01-01

    Classifying events is challenging in Twitter because tweet texts contain a large amount of temporal data with substantial noise and a wide variety of topics. In this paper, we propose a method to classify events from Twitter. We first find the distinguishing terms between tweets in events and measure their similarities with language models such as ConceptNet and a latent Dirichlet allocation method for selectional preferences (LDA-SP), which have been widely studied based on large text corpora within computational linguistic relations. The relationships of term words in tweets are discovered by checking them under each model. We then propose a method to compute the similarity between tweets based on tweets' features, including common term words and relationships among their distinguishing term words. This makes the features explicit and convenient to apply to k-nearest neighbor techniques for classification. We conducted experiments on the Edinburgh Twitter Corpus showing that our method achieves competitive results for classifying events. PMID:26451139

  9. Classification of Community Hospitals by Scope of Service

    PubMed Central

    Edwards, Mary; Miller, Jon D.; Schumacher, Rex

    1972-01-01

    Four indexes are presented for classifying short-term nonfederal general hospitals by the scope of service they provide. The indexes, constructed by the application of Guttman scaling to data from 5439 hospitals, are tested for cohesiveness and unidimensionality and their relation to hospital expenses and staffing is examined. The usefulness of the indexes for classifying hospitals and as stratification variables is discussed. PMID:4631546

  10. Rank Determination of Mental Functions by 1D Wavelets and Partial Correlation.

    PubMed

    Karaca, Y; Aslan, Z; Cattani, C; Galletta, D; Zhang, Y

    2017-01-01

    The main aim of this paper is to classify mental functions measured by the Wechsler Adult Intelligence Scale-Revised tests with a mixed method based on wavelets and partial correlation. The Wechsler Adult Intelligence Scale-Revised is a widely used test designed and applied for the comprehensive classification of adults' cognitive skills. In this paper, many different intellectual profiles have been taken into consideration to measure the relationship between mental functioning and psychological disorder. We propose a method based on wavelets and correlation analysis for classifying mental functioning through the analysis of selected parameters measured by the Wechsler Adult Intelligence Scale-Revised tests. In particular, 1-D Continuous Wavelet Analysis, the 1-D Wavelet Coefficient Method, and the Partial Correlation Method have been applied to Wechsler Adult Intelligence Scale-Revised parameters such as School Education, Gender, Age, Performance Information Verbal, and Full Scale Intelligence Quotient. We show that the gender variable has a negative but significant role on the age and Performance Information Verbal factors. The age parameter also plays a significant role in the change of Performance Information Verbal and Full Scale Intelligence Quotient.

  11. Integrating human and machine intelligence in galaxy morphology classification tasks

    NASA Astrophysics Data System (ADS)

    Beck, Melanie R.; Scarlata, Claudia; Fortson, Lucy F.; Lintott, Chris J.; Simmons, B. D.; Galloway, Melanie A.; Willett, Kyle W.; Dickinson, Hugh; Masters, Karen L.; Marshall, Philip J.; Wright, Darryl

    2018-06-01

    Quantifying galaxy morphology is a challenging yet scientifically rewarding task. As the scale of data continues to increase with upcoming surveys, traditional classification methods will struggle to handle the load. We present a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme, we increase the classification rate nearly 5-fold, classifying 226 124 galaxies in 92 d of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7 per cent accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides at least a factor of 8 increase in the classification rate, classifying 210 803 galaxies in just 32 d of GZ2 project time with 93.1 per cent accuracy. As the Random Forest algorithm requires minimal computational cost, this result has important implications for galaxy morphology identification tasks in the era of Euclid and other large-scale surveys.
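
    The machine half of such a system is conceptually simple; below is a minimal sketch (ours, with made-up indicator values and illustrative confidence thresholds) of a Random Forest scoring galaxies and routing only the uncertain ones to human classifiers:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # hypothetical non-parametric morphology indicators per galaxy,
    # e.g. concentration, asymmetry, Gini, M20 (columns illustrative)
    X = np.random.rand(1000, 4)
    y = np.random.randint(0, 2, 1000)    # e.g. smooth vs featured

    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    proba = rf.predict_proba(X)[:, 1]

    # decision engine: retire confident machine labels, send the
    # ambiguous remainder to human volunteers
    confident = (proba > 0.9) | (proba < 0.1)
    needs_humans = ~confident
    print(needs_humans.sum(), "galaxies routed to volunteers")
    ```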

  12. Probabilistic classifiers with high-dimensional data

    PubMed Central

    Kim, Kyung In; Simon, Richard

    2011-01-01

    For medical classification problems, it is often desirable to have a probability associated with each class. Probabilistic classifiers have received relatively little attention for small-n, large-p classification problems despite their importance in medical decision making. In this paper, we introduce two criteria for the assessment of probabilistic classifiers, well-calibratedness and refinement, and develop corresponding evaluation measures. We evaluated several published high-dimensional probabilistic classifiers and developed two extensions of the Bayesian compound covariate classifier. Based on simulation studies and analysis of gene expression microarray data, we found that proper probabilistic classification is more difficult than deterministic classification. It is important to ensure that a probabilistic classifier is well calibrated, or at least not "anticonservative," using the methods developed here. We provide this evaluation for several probabilistic classifiers and also evaluate their refinement as a function of sample size under weak and strong signal conditions. We also present a cross-validation method for evaluating the calibration and refinement of any probabilistic classifier on any data set. PMID:21087946
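
    Both criteria are easy to probe numerically. A minimal sketch follows (ours; the synthetic probabilities and the simple refinement proxy are illustrative, not the paper's exact measures): well-calibratedness compares predicted probabilities with observed class frequencies per bin, while a sharper (more refined) classifier concentrates its probabilities near 0 and 1:

    ```python
    import numpy as np
    from sklearn.calibration import calibration_curve

    y_true = np.random.randint(0, 2, 500)                 # true labels
    y_prob = np.clip(y_true * 0.7 + np.random.rand(500) * 0.3, 0, 1)

    # calibration: among cases predicted ~p, a fraction ~p should be positive
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)
    for p_hat, p_obs in zip(mean_pred, frac_pos):
        print(f"predicted {p_hat:.2f}  observed {p_obs:.2f}")

    # one simple refinement proxy: mean p(1-p), lower = sharper
    print("refinement proxy:", np.mean(y_prob * (1 - y_prob)))
    ```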

  13. Measuring Poverty in Southern India: A Comparison of Socio-Economic Scales Evaluated against Childhood Stunting

    PubMed Central

    Kattula, Deepthi; Venugopal, Srinivasan; Velusamy, Vasanthakumar; Sarkar, Rajiv; Jiang, Victoria; S., Mahasampath Gowri; Henry, Ankita; Deosaran, Jordanna Devi; Muliyil, Jayaprakash; Kang, Gagandeep

    2016-01-01

    Introduction Socioeconomic status (SES) scales measure poverty, wealth and economic inequality in a population to guide appropriate economic and public health policies. Measurement of poverty and comparison of material deprivation across nations is a challenge. This study compared four SES scales that have been used locally and internationally and evaluated them against childhood stunting, used as an indicator of chronic deprivation, in urban southern India. Methods A door-to-door survey collected information on socio-demographic indicators such as education, occupation, assets, income and living conditions in a semi-urban slum area in Vellore, Tamil Nadu in southern India. A total of 7925 households were categorized by four SES scales (the Kuppuswamy scale, the Below Poverty Line scale (BPL), the modified Kuppuswamy scale, and the multidimensional poverty index (MDPI)), and the level of agreement between scales was compared. Logistic regression was used to test the association of the SES scales with stunting. Findings The Kuppuswamy, BPL, MDPI and modified Kuppuswamy scales classified 7.1%, 1%, 5.5%, and 55.3% of families as low SES, respectively, indicating conservative estimation of low SES by the BPL and MDPI scales in comparison with the modified Kuppuswamy scale, which had the highest sensitivity (89%). Children from low SES families as classified by all scales had higher odds of stunting, but the level of agreement between scales was very poor, ranging from 1%-15%. Conclusion There is great non-uniformity among existing SES scales, and cautious interpretation of SES scales is needed in the context of social, cultural, and economic realities. PMID:27490200

  14. Neighbourhood-Scale Urban Forest Ecosystem Classification

    Treesearch

    James W.N. Steenberg; Andrew A. Millward; Peter N. Duinker; David J. Nowak; Pamela J. Robinson

    2015-01-01

    Urban forests are now recognized as essential components of sustainable cities, but there remains uncertainty concerning how to stratify and classify urban landscapes into units of ecological significance at spatial scales appropriate for management. Ecosystem classification is an approach that entails quantifying the social and ecological processes that shape...

  15. A systematic comparison of different object-based classification techniques using high spatial resolution imagery in agricultural environments

    NASA Astrophysics Data System (ADS)

    Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk

    2016-07-01

    Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to lower accuracy due to broken objects at fine scales. In contrast to some previous studies, the RF classifier yielded the best results and the k-nearest neighbor classifier the worst, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of the training sample analyses indicated that RF and AdaBoost.M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.
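
    The core comparison can be reproduced in outline with cross-validated accuracies over per-object features (our sketch with random stand-in data; the paper's features come from image segmentation and its statistical analysis is more elaborate):

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier

    # one row per image object: spectral means, texture, geometry, ...
    X = np.random.rand(300, 12)
    y = np.random.randint(0, 5, 300)     # five land-cover classes

    for name, clf in [("RF", RandomForestClassifier(random_state=0)),
                      ("SVM", SVC(kernel="rbf")),
                      ("kNN", KNeighborsClassifier(n_neighbors=5))]:
        acc = cross_val_score(clf, X, y, cv=5).mean()
        print(f"{name}: {acc:.3f}")
    ```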

  16. Land use/land cover mapping using multi-scale texture processing of high resolution data

    NASA Astrophysics Data System (ADS)

    Wong, S. N.; Sarker, M. L. R.

    2014-02-01

    Land use/land cover (LULC) maps are useful for many purposes, and remote sensing techniques have long been used for LULC mapping with different types of data and image processing techniques. In this research, high-resolution satellite data from IKONOS were used to perform LULC mapping in Johor Bahru city and adjacent areas (Malaysia). Spatial image processing was carried out using six texture algorithms (mean, variance, contrast, homogeneity, entropy, and GLDV angular second moment) with five different window sizes (from 3×3 to 11×11). Three different classifiers, i.e., Maximum Likelihood Classifier (MLC), Artificial Neural Network (ANN) and Support Vector Machine (SVM), were used to classify the texture parameters of different spectral bands individually and of all bands together, using the same training and validation samples. Results indicated that texture parameters of all bands together generally showed a better performance (overall accuracy = 90.10%) for LULC mapping, whereas a single spectral band could only achieve an overall accuracy of 72.67%. This research also found an improvement of the overall accuracy (OA) using a single-texture multi-scale approach (OA = 89.10%) and a single-scale multi-texture approach (OA = 90.10%) compared with all original bands (OA = 84.02%), because of the complementary information from different bands and different texture algorithms. On the other hand, all three classifiers showed high accuracy when using different texture approaches, but SVM generally showed higher accuracy (90.10%) compared to MLC (89.10%) and ANN (89.67%), especially for complex classes such as urban and road.
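
    Texture measures of this kind are typically derived from a grey-level co-occurrence matrix computed in a sliding window; here is a minimal scikit-image sketch for one window (ours; among the six measures above, contrast and homogeneity are available directly as GLCM properties, and the functions are spelled `greycomatrix`/`greycoprops` in scikit-image < 0.19):

    ```python
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    # a single 11x11 window of an 8-bit band (random stand-in values)
    window = np.random.randint(0, 256, (11, 11), dtype=np.uint8)

    # co-occurrence matrix at distance 1 over four directions
    glcm = graycomatrix(window, distances=[1],
                        angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                        levels=256, symmetric=True, normed=True)

    for prop in ("contrast", "homogeneity", "ASM"):
        print(prop, graycoprops(glcm, prop).mean())
    ```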

  17. Examples of Deep Seated Gravitational Slope Deformations in the central part of the Lower Beskids, (the Polish Flysch Carpathians)

    NASA Astrophysics Data System (ADS)

    Zatorski, Michał

    2016-04-01

    The Lower Beskids are located between the western and eastern parts of the Carpathian flysch belt, and the low altitudes of passes and ridges in this region have until now been attributed mainly to differences in bedrock resistance. In the light of contemporary information regarding the geology of this area, the hypothesis of the gravitational emplacement of large tectonic elements has become topical again. A particularly interesting area is the ridge and foreland of the Magura Wątkowska, bordered in the north by the Sanok-Jasło Pits (a denudation valley). This edge zone of the Lower Beskids has a complicated geological structure, i.e. it constitutes a tectonic contact of the Magura Unit and the Central Carpathian Depression (the depressed part of the Silesian nappe). During field research and analyses aimed at identifying morphostructural elements, the important role of various kinds of lineaments was observed. The inventoried lineaments included, for example, large faults and the effects of tectonic processes on bedrock. The accompanying structures in the rock (cracks, faults) are important in determining the type of macro-scale gravitational movements. The outer part of the fold structures in the foreland of the Magura Wątkowska shows rotation around the longitudinal syncline axis, and is an excellent research field for a comprehensive analysis of gravitational movements, both of the basin type and of the DSGSD (Deep Seated Gravitational Slope Deformations) type. Determining the types of tectonic lineaments was based on a review of selected directions in the context of the course of tectonic structures in the study area. On that basis, lineaments were classified into two morphogenetic groups, i.e. structures that do not result in visible movements relative to the analyzed rock massif (cracks), and those causing displacement of the rock massif (faults, overthrusts). Using directional and contour diagrams generated by measuring the spatial orientation of joint planes, gravitational macrocomplexes with a characteristic joint system were singled out. Next, by correlating them with fault zones, a morphogenetic analysis was performed, the result of which was a precise characterization of the types of gravitational morphogenetic processes at the meso scale (e.g. large rock landslides) as well as at the macro scale (the basin type or DSGSD). Ultimately, the research results were used to classify lineaments in the context of the structural control of the Carpathian Mountains (gravitational development of macro-scale landforms) and to reinterpret the spatial interdependence of landforms (e.g. ridges, ridge-top trenches and rifts) with the geological structure. The research conducted so far indicates a variety of macro-scale movements in the edge zone of the research area. Based on the morphotectonic analysis performed so far, the following examples of displacement have been found: lateral spreading, toppling, and rotational movement. The effects of these movements are associated with both the basin phases and the DSGSD, so they play an important morphogenetic role, leading to the fragmentation of the morphological threshold of the Lower Beskids and to the development of characteristic structural landforms.

  18. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
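
    The backbone of the method (L1-regularized logistic base learners inside a bagging ensemble, scored by AUC) is easy to sketch; the clustering-based balancing step described above is omitted here, and the data are random stand-ins (`estimator=` is `base_estimator=` in scikit-learn < 1.2):

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score

    X = np.random.randn(2000, 20)                    # applicant features
    y = (np.random.rand(2000) < 0.1).astype(int)     # ~10% defaulters

    lasso_lr = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    ensemble = BaggingClassifier(estimator=lasso_lr, n_estimators=30,
                                 random_state=0)
    print(cross_val_score(ensemble, X, y, cv=5, scoring="roc_auc").mean())
    ```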

  19. Assessing Impression Management With the MMPI-2 in Child Custody Litigation.

    PubMed

    Arce, Ramón; Fariña, Francisca; Seijo, Dolores; Novo, Mercedes

    2015-12-01

    Forensic psychological evaluation of parents in child custody litigation is primarily focused on evaluating parenting capacity and underreporting. Biased responses of underreporting have been classified as Impression Management (IM) or as Self-Deceptive Positivity (S-DP), which are regarded as conscious or unconscious in nature, respectively. A field study was undertaken to assess impression management on the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) in child custody cases, the accuracy of the MMPI-2 scales in classifying IM, and what parents in child custody litigation actually manipulate in terms of IM. A total of 244 parents in child custody litigation and 244 parents under standard instructions were administered the MMPI-2. The results revealed that the L, Mp, Wsd, and Od scales discriminated between the two samples of parents, with satisfactory classification rates (odds ratios ranging from 5.7 for Wsd to 23.3 for Od) and incremental validity of Od over Mp and Wsd. As for the effects of IM, the results show IM effects on the Basic Clinical Scales, the Restructured Clinical Scales, the Personality Psychopathology Five Scales, the Content Scales, and the Supplementary Scales. The implications of the results are discussed in relation to the forensic evaluation of parents in child custody litigation.

  20. Towards a critical transition theory under different temporal scales and noise strengths

    NASA Astrophysics Data System (ADS)

    Shi, Jifan; Li, Tiejun; Chen, Luonan

    2016-03-01

    The mechanism of critical phenomena or critical transitions has recently been studied from various aspects, in particular considering slow parameter change and small noise. In this article, we systematically classify critical transitions into three types based on the temporal scales and noise strengths of dynamical systems. Specifically, the classification is made by comparing three important time scales τλ, τtran, and τergo, where τλ is the time scale of parameter change (e.g., the change of environment), τtran is the time scale on which a particle or state transits from one metastable state to another, and τergo is the time scale on which the system becomes ergodic. According to these time scales, we classify critical transition behaviors into three types, i.e., state transition, basin transition, and distribution transition. Moreover, for each type of transition, there are two cases, i.e., single-trajectory transition and multitrajectory ensemble transition, which correspond to transitions of individual behavior and population behavior, respectively. We also define the critical point for each type of critical transition, derive several properties, and further propose indicators for predicting critical transitions with numerical simulations. In addition, we show that the noise-to-signal ratio is effective for classifying critical transitions in real systems.

  1. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation.

    PubMed

    Azmi, Reza; Pishgoo, Boshra; Norozi, Narges; Yeganeh, Samira

    2013-04-01

    Brain magnetic resonance image (MRI) tissue segmentation is one of the most important parts of clinical diagnostic tools. Pixel classification methods, in both supervised and unsupervised variants, have frequently been used for image segmentation. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain; moreover, they cannot use unlabeled data to train classifiers. On the other hand, unsupervised segmentation methods have no prior knowledge and lead to a low level of performance. However, semi-supervised learning, which uses a few labeled data together with a large amount of unlabeled data, achieves higher accuracy with less effort. In this paper, we propose an ensemble semi-supervised framework for segmenting brain magnetic resonance imaging (MRI) tissues that uses the results of several semi-supervised classifiers simultaneously. Selecting appropriate classifiers plays a significant role in the performance of this framework. Hence, we present two semi-supervised algorithms, expectation filtering maximization and MCo_Training, which are improved versions of the semi-supervised methods expectation maximization and Co_Training and increase segmentation accuracy. Afterward, we use these improved classifiers together with a graph-based semi-supervised classifier as components of the ensemble framework. Experimental results show that the segmentation performance of this approach is higher than that of both supervised methods and the individual semi-supervised classifiers.
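
    For readers unfamiliar with the setting, the generic semi-supervised recipe (a base classifier that iteratively labels its own confident predictions) looks as follows in scikit-learn; this is plain self-training as a baseline, not the paper's expectation filtering maximization or MCo_Training algorithms, and the voxel features are random stand-ins:

    ```python
    import numpy as np
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    X = np.random.randn(500, 6)          # per-voxel intensity features
    y = np.random.randint(0, 3, 500)     # 3 tissue classes (GM, WM, CSF)
    y[50:] = -1                          # -1 marks unlabeled voxels

    base = SVC(probability=True)         # must expose predict_proba
    model = SelfTrainingClassifier(base, threshold=0.8).fit(X, y)
    labels = model.predict(X)
    ```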

  2. Calculation of wind speeds required to damage or destroy buildings

    NASA Astrophysics Data System (ADS)

    Liu, Henry

    Determination of wind speeds required to damage or destroy a building is important not only for the improvement of building design and construction but also for the estimation of wind speeds in tornadoes and other damaging storms. For instance, since 1973 the U.S. National Weather Service has been using the well-known Fujita scale (F scale) to estimate the maximum wind speeds of tornadoes [Fujita, 1981]. The F scale classifies tornadoes into 13 numbers, F-0 through F-12. The wind speed (maximum gust speed) associated with each F number is given in Table 1. Note that F-6 through F-12 are for wind speeds between 319 mi/hr (mph) and the sonic velocity (approximately 760 mph; 1 mph ≈ 1.6 km/hr). However, since no tornadoes have been classified to exceed F-5, the F-6 through F-12 categories have no practical meaning [Fujita, 1981].
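
    The category boundaries quoted above come from Fujita's interpolation formula v = 6.30(F + 2)^{3/2} m/s [Fujita, 1981]; a few lines of Python (ours, for illustration) reproduce the 319-mph threshold at F-6:

    ```python
    # Fujita (1981): wind speed at the bottom of category F, in m/s
    def fujita_speed_ms(f_number):
        return 6.30 * (f_number + 2) ** 1.5

    for f in range(7):
        v = fujita_speed_ms(f)
        print(f"F-{f}: {v:6.1f} m/s = {v / 0.44704:5.0f} mph")
    # F-6 evaluates to ~319 mph, matching the threshold quoted above
    ```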

  3. Directional Multi-scale Modeling of High-Resolution Computed Tomography (HRCT) Lung Images for Diffuse Lung Disease Classification

    NASA Astrophysics Data System (ADS)

    Vo, Kiet T.; Sowmya, Arcot

    A directional multi-scale modeling scheme based on wavelet and contourlet transforms is employed to describe HRCT lung image textures for classifying four diffuse lung disease patterns: normal, emphysema, ground glass opacity (GGO) and honey-combing. Generalized Gaussian density parameters are used to represent the detail sub-band features obtained by wavelet and contourlet transforms. In addition, support vector machines (SVMs), with excellent performance in a variety of pattern classification problems, are used as the classifier. The method is tested on a collection of 89 slices from 38 patients, each slice of size 512×512, 16 bits/pixel in DICOM format. The dataset contains 70,000 ROIs of those slices marked by experienced radiologists. We employ this technique at different wavelet and contourlet transform scales for diffuse lung disease classification. The technique presented here achieves a best overall sensitivity of 93.40% and specificity of 98.40%.

  4. Effect of the Modified Glasgow Coma Scale Score Criteria for Mild Traumatic Brain Injury on Mortality Prediction: Comparing Classic and Modified Glasgow Coma Scale Score Model Scores of 13

    PubMed Central

    Mena, Jorge Humberto; Sanchez, Alvaro Ignacio; Rubiano, Andres M.; Peitzman, Andrew B.; Sperry, Jason L.; Gutierrez, Maria Isabel; Puyana, Juan Carlos

    2011-01-01

    Objective The Glasgow Coma Scale (GCS) classifies Traumatic Brain Injuries (TBI) as Mild (14–15), Moderate (9–13) or Severe (3–8). The ATLS modified this classification so that a GCS score of 13 is categorized as mild TBI. We investigated the effect of this modification on mortality prediction, comparing patients with a GCS of 13 classified as moderate TBI (classic model) to patients with a GCS of 13 classified as mild TBI (modified model). Methods We selected adult TBI patients from the Pennsylvania Outcome Study database (PTOS). Logistic regressions adjusting for age, sex, cause, severity, trauma center level, comorbidities, and isolated TBI were performed. A second evaluation included the time trend of mortality. A third evaluation also included hypothermia, hypotension, mechanical ventilation, screening for drugs, and severity of TBI. Discrimination of the models was evaluated using the area under the receiver operating characteristic curve (AUC). Calibration was evaluated using the Hosmer-Lemeshow goodness-of-fit (GOF) test. Results In the first evaluation, the AUCs were 0.922 (95% CI, 0.917–0.926) and 0.908 (95% CI, 0.903–0.912) for the classic and modified models, respectively. Both models showed poor calibration (p<0.001). In the third evaluation, the AUCs were 0.946 (95% CI, 0.943–0.949) and 0.938 (95% CI, 0.934–0.940) for the classic and modified models, respectively, with improvements in calibration (p=0.30 and p=0.02 for the classic and modified models, respectively). Conclusion The lack of overlap between the ROC curves of both models reveals a statistically significant difference in their ability to predict mortality. The classic model demonstrated better GOF than the modified model. A GCS of 13 classified as moderate TBI in a multivariate logistic regression model performed better than a GCS of 13 classified as mild. PMID:22071923

  5. Predicting Classifier Performance with Limited Training Data: Applications to Computer-Aided Diagnosis in Breast and Prostate Cancer

    PubMed Central

    Basavanhally, Ajay; Viswanath, Satish; Madabhushi, Anant

    2015-01-01

    Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets. PMID:25993029

  6. Linking Capabilities to Functionings: Adapting Narrative Forms from Role-Playing Games to Education

    ERIC Educational Resources Information Center

    Cheville, R. Alan

    2016-01-01

    This paper explores science, technology, engineering, and mathematics education in the context of inequality of opportunity by examining educational systems through two lenses: curricular mode and system scale. Curricular mode classifies learning experiences as addressing knowing, acting, or being, while system scale captures how learning…

  7. A proposed ethogram of large-carnivore predatory behavior, exemplified by the wolf

    USGS Publications Warehouse

    MacNulty, D.R.; Mech, L.D.; Smith, D.W.

    2007-01-01

    Although predatory behavior is traditionally described by a basic ethogram composed of 3 phases (search, pursue, and capture), behavioral studies of large terrestrial carnivores generally use the concept of a "hunt" to classify and measure foraging. This approach is problematic because there is no consensus on what behaviors constitute a hunt. We therefore examined how the basic ethogram could be used as a common framework for classifying large-carnivore behavior. We used >2,150 h of observed wolf (Canis lupus) behavior in Yellowstone National Park, including 517 and 134 encounters with elk (Cervus elaphus) and American bison (Bison bison), respectively, to demonstrate the functional importance of several frequently described, but rarely quantified, patterns of large-carnivore behavior not explicitly described by the basic ethogram (approaching, watching, and attacking groups). To account for these additionally important behaviors we propose a modified form of the basic ethogram (search, approach, watch, attack-group, attack-individual, and capture). We tested the applicability of this ethogram by comparing it to 31 previous classifications and descriptions involving 7 other species and 5 other wolf populations. Close correspondence among studies suggests that this ethogram may provide a generally useful scheme for classifying large-carnivore predatory behavior that is behaviorally less ambiguous than the concept of a hunt.

  8. Beneficial use of classified paper waste for training land rehabilitation

    USDA-ARS?s Scientific Manuscript database

    This project will demonstrate and validate utilization of pulverized waste paper as an organic soil amendment for rehabilitation of disturbed training lands. Large quantities of classified documents are landfilled by the Department of Defense (DoD) since they have been pulverized too finely to be re...

  9. Naïve observers' perceptions of family drawings by 7-year-olds with disorganized attachment histories.

    PubMed

    Madigan, Sheri; Goldberg, Susan; Moran, Greg; Pederson, David R

    2004-09-01

    Previous research has succeeded in distinguishing among drawings made by children with histories of organized attachment relationships (secure, avoidant, and resistant); however, drawings of children with histories of disorganized attachment have yet to be systematically investigated. The purpose of this study was to determine whether naïve observers would respond differentially to family drawings of 7-year-olds who were classified in infancy as disorganized vs. organized. Seventy-three undergraduate students from one university and 78 from a second viewed 50 family drawings of 7-year-olds (25 by children with organized infant attachment and 25 by children with disorganized infant attachment). Participants were asked to (1) circle the emotion that best described their reaction to the drawings and (2) rate the drawings on 6 bipolar scales. Drawings from children classified as disorganized in infancy evoked positive emotion labels less often and negative emotion labels more often than those from children classified as organized. Furthermore, drawings from children classified as disorganized in infancy received higher ratings on scales for disorganization, carelessness, family chaos, bizarreness, uneasiness, and dysfunction. These data indicate that naïve observers are relatively successful in distinguishing selected features of drawings by children with histories of disorganized vs. organized attachment.

  10. A MBD-seq protocol for large-scale methylome-wide studies with (very) low amounts of DNA.

    PubMed

    Aberg, Karolina A; Chan, Robin F; Shabalin, Andrey A; Zhao, Min; Turecki, Gustavo; Staunstrup, Nicklas Heine; Starnawska, Anna; Mors, Ole; Xie, Lin Y; van den Oord, Edwin Jcg

    2017-09-01

    We recently showed that, after optimization, our methyl-CpG binding domain sequencing (MBD-seq) application approximates the methylome-wide coverage obtained with whole-genome bisulfite sequencing (WGB-seq), but at a cost that enables adequately powered large-scale association studies. A prior drawback of MBD-seq is the relatively large amount of genomic DNA (ideally >1 µg) required to obtain high-quality data. Biomaterials are typically expensive to collect, provide a finite amount of DNA, and may simply not yield sufficient starting material. The ability to use low amounts of DNA will increase the breadth and number of studies that can be conducted. Therefore, we further optimized the enrichment step. With this low-starting-material protocol, MBD-seq performed as well as, or better than, the protocol requiring ample starting material (>1 µg). Using only 15 ng of DNA as input, there is minimal loss in data quality, achieving 93% of the coverage of WGB-seq (with standard amounts of input DNA) at similar false-positive rates. Furthermore, across a large number of genomic features, the MBD-seq methylation profiles closely tracked those observed for WGB-seq, with even slightly larger effect sizes. This suggests that MBD-seq provides similar information about the methylome and classifies methylation status somewhat more accurately. Performance decreases with <15 ng DNA as starting material but, even with as little as 5 ng, MBD-seq still achieves 90% of the coverage of WGB-seq with comparable genome-wide methylation profiles. Thus, the proposed protocol is an attractive option for adequately powered and cost-effective methylome-wide investigations using (very) low amounts of DNA.

  11. Rotation-invariant convolutional neural networks for galaxy morphology prediction

    NASA Astrophysics Data System (ADS)

    Dieleman, Sander; Willett, Kyle W.; Dambre, Joni

    2015-06-01

    Measuring the morphological parameters of galaxies is a key requirement for studying their formation and evolution. Surveys such as the Sloan Digital Sky Survey have resulted in the availability of very large collections of images, which have permitted population-wide analyses of galaxy morphology. Morphological analysis has traditionally been carried out mostly via visual inspection by trained experts, which is time consuming and does not scale to large (≳10^4) numbers of images. Although attempts have been made to build automated classification systems, these have not been able to achieve the desired level of accuracy. The Galaxy Zoo project successfully applied a crowdsourcing strategy, inviting online users to classify images by answering a series of questions. Unfortunately, even this approach does not scale well enough to keep up with the increasing availability of galaxy images. We present a deep neural network model for galaxy morphology classification which exploits translational and rotational symmetry. It was developed in the context of the Galaxy Challenge, an international competition to build the best model for morphology classification based on annotated images from the Galaxy Zoo project. For images with high agreement among the Galaxy Zoo participants, our model is able to reproduce their consensus with near-perfect accuracy (>99 per cent) for most questions. Confident model predictions are highly accurate, which makes the model suitable for filtering large collections of images and forwarding challenging images to experts for manual annotation. This approach greatly reduces the experts' workload without affecting accuracy. The application of these algorithms to larger sets of training data will be critical for analysing results from future surveys such as the Large Synoptic Survey Telescope.
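
    The paper builds the rotational symmetry into the network itself (rotated and flipped copies of the input are processed by shared weights); a much simpler prediction-time variant of the same idea, averaging a model's outputs over the dihedral views of an image, can be sketched as follows (our illustration, with `model` standing in for any image-to-probabilities function):

    ```python
    import numpy as np

    def rotation_averaged_predict(model, image):
        """Average class probabilities over the 4 cardinal rotations and
        their mirror images (8 views in total)."""
        views = [np.rot90(image, k) for k in range(4)]
        views += [np.fliplr(v) for v in views]
        probs = np.stack([model(v) for v in views])
        return probs.mean(axis=0)

    # usage: pred = rotation_averaged_predict(
    #     lambda im: cnn.predict(im[None])[0], galaxy_image)
    ```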

  12. Learn ++.NC: combining ensemble of classifiers with dynamically weighted consult-and-vote for efficient incremental learning of new classes.

    PubMed

    Muhlbaier, Michael D; Topalis, Apostolos; Polikar, Robi

    2009-01-01

    We have previously introduced an incremental learning algorithm Learn++, which learns novel information from consecutive data sets by generating an ensemble of classifiers with each data set, and combining them by weighted majority voting. However, Learn++ suffers from an inherent "outvoting" problem when asked to learn a new class ω_new introduced by a subsequent data set, as earlier classifiers not trained on this class are guaranteed to misclassify ω_new instances. The collective votes of earlier classifiers, for an inevitably incorrect decision, then outweigh the votes of the new classifiers' correct decision on ω_new instances, until there are enough new classifiers to counteract the unfair outvoting. This forces Learn++ to generate an unnecessarily large number of classifiers. This paper describes Learn++.NC, specifically designed for efficient incremental learning of multiple new classes using significantly fewer classifiers. To do so, Learn++.NC introduces dynamically weighted consult and vote (DW-CAV), a novel voting mechanism for combining classifiers: individual classifiers consult with each other to determine which ones are most qualified to classify a given instance, and decide how much weight, if any, each classifier's decision should carry. Experiments on real-world problems indicate that the new algorithm performs remarkably well with substantially fewer classifiers, not only as compared to its predecessor Learn++, but also as compared to several other algorithms recently proposed for similar problems.
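
    The essence of the fix can be shown in a few lines: if a classifier simply abstains on classes it never saw during training, old ensemble members cannot outvote new ones on a newly introduced class. The sketch below is our deliberate simplification of that idea, not the full DW-CAV weighting scheme:

    ```python
    import numpy as np

    def consult_and_vote(classifiers, known_classes, weights, x, n_classes):
        """Each classifier votes with its weight only on classes it was
        trained on; unseen classes draw no (out)votes from it."""
        votes = np.zeros(n_classes)
        for clf, seen, w in zip(classifiers, known_classes, weights):
            pred = clf(x)              # clf: sample -> predicted class index
            if pred in seen:           # abstain on classes never trained on
                votes[pred] += w
        return int(votes.argmax())
    ```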

  13. Orographic barriers GIS-based definition of the Campania-Lucanian Apennine Range (Southern Italy)

    NASA Astrophysics Data System (ADS)

    Cuomo, Albina; Guida, Domenico

    2010-05-01

    The presence of mountains on land surfaces plays a central role in the space-time dynamics of hydrological, geomorphic and ecological systems (Roe, 2005). The aim of this paper is to identify, delimit and classify the orographic relief of the Campania-Lucanian Apennine (Southern Italy) in order to investigate the effects of large-scale orographic and small-scale windward-leeward phenomena on the distribution, frequency and duration of rainfall. The scale-dependent effects derived from the topographic relief favor a hierarchical and multi-scale approach. The approach is based on a GIS procedure applied to a Digital Elevation Model (DEM) with a 20-meter cell size, derived from the Regional Technical Map (CTR) of the Campania region (1:5000). The DEM was smoothed to remove data spikes and pits, and we then proceeded to: a) identify the three basic landforms of the relief (summit, hillslope and plain) by generalizing a previous 10-type landform scheme using the TPI method (Weiss, 2001) and by simplifying the established rules of differential geometry on the topographic surface; b) delimit the mountain relief by modifying the method proposed by Chaudhry and Mackaness (2008), based on three concepts: prominence, morphological variability and parent-child relationship. Graphical results have shown a good spatial correspondence between the digital definition of mountains and their morpho-tectonic structure derived from tectonic geomorphological studies; c) classify the relief by using a set of spatial statistics rules (cluster analysis) on geomorphometric parameters (elevation, curvature, slope, aspect, relative relief and form factor). Finally, we have recognized three prototypal orographic barrier shapes, cone, tableland and ridge, which are fundamental to improving models of orographic rainfall in the Southern Apennines. References: Chaudhry, O. Z. and Mackaness, W. A. (2008). Creating Mountains out of Mole Hills: Automatic Identification of Hills and Ranges Using Morphometric Analysis. Transactions in GIS, 12(5), 567-589. Roe, G. H. (2005). Orographic precipitation. Annual Review of Earth and Planetary Sciences, 33, 645-671. Weiss, A. (2001). Topographic position and landform analysis. Poster presentation, ESRI User Conference, San Diego, CA.
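
    Step (a) hinges on the Topographic Position Index, which is just the cell elevation minus the mean elevation of a surrounding neighborhood; a minimal sketch follows (ours; the window size and thresholds are illustrative, not the calibrated values used in the study):

    ```python
    import numpy as np
    from scipy.ndimage import uniform_filter

    def tpi(dem, size=15):
        """Topographic Position Index (Weiss, 2001): elevation minus the
        neighborhood mean elevation."""
        return dem - uniform_filter(dem, size=size)

    def basic_landforms(dem, size=15, t=2.0, slope_t=5.0, cell=20.0):
        """Collapse TPI + slope into the three basic landforms above."""
        idx = tpi(dem, size)
        gy, gx = np.gradient(dem, cell)               # 20 m cell size
        slope = np.degrees(np.arctan(np.hypot(gx, gy)))
        classes = np.full(dem.shape, "hillslope", dtype=object)
        classes[idx > t] = "summit"
        classes[(np.abs(idx) <= t) & (slope < slope_t)] = "plain"
        return classes
    ```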

  14. Wavelet-based multiscale window transform and energy and vorticity analysis

    NASA Astrophysics Data System (ADS)

    Liang, Xiang San

    A new methodology, Multiscale Energy and Vorticity Analysis (MS-EVA), is developed to investigate sub-mesoscale, meso-scale, and large-scale dynamical interactions in geophysical fluid flows which are intermittent in space and time. The development begins with the construction of a wavelet-based functional analysis tool, the multiscale window transform (MWT), which is local, orthonormal, self-similar, and windowed on scale. The MWT is first built over the real line and then modified onto a finite domain. Its properties are explored, the most important being the property of marginalization, which brings together a quadratic quantity in physical space with its phase-space representation. Based on the MWT, the MS-EVA is developed. Energy and enstrophy equations for the large-, meso-, and sub-meso-scale windows are derived and their terms interpreted. The processes thus represented are classified into four categories: transport, transfer, conversion, and dissipation/diffusion. The separation of transport from transfer is made possible with the introduction of the concept of perfect transfer. By the property of marginalization, the classical energetic analysis proves to be a particular case of the MS-EVA. The MS-EVA is validated with classical instability problems in two steps. First, it is established that the barotropic and baroclinic instabilities are indicated by the spatial averages of certain transfer term interaction analyses. Then calculations of these indicators are made for an Eady model and a Kuo model. The results agree precisely with what is expected from the analytical solutions, and the energetics reproduced reveal a consistent and important aspect of the previously unknown dynamic structure of instability processes. As an application, the MS-EVA is used to investigate the Iceland-Faeroe frontal (IFF) variability. An MS-EVA-ready dataset is first generated through a forecasting study with the Harvard Ocean Prediction System, using the data gathered during the 1993 NRV Alliance cruise. The application starts with a determination of the scale window bounds, which characterize a double-peak structure in either the time wavelet spectrum or the space wavelet spectrum. The resulting energetics, when locally averaged, reveal a clear baroclinic instability around the cold tongue intrusion observed in the forecast. Moreover, an interaction analysis shows that the energy released by the instability indeed goes to the meso-scale window and fuels the growth of the intrusion. The sensitivity study shows that, in this case, the key to a successful application is a correct decomposition of the large-scale window from the meso-scale window.

  15. Programmed self-assembly of large π-conjugated molecules into electroactive one-dimensional nanostructures

    PubMed Central

    Yamamoto, Yohei

    2012-01-01

    Electroactive one-dimensional (1D) nano-objects possess inherent unidirectional charge and energy transport capabilities along with anisotropic absorption and emission of light, which are of great advantage for the development of nanometer-scale electronics and optoelectronics. In particular, molecular nanowires formed by self-assembly of π-conjugated molecules attract increasing attention for application in supramolecular electronics. This review introduces recent topics related to electroactive molecular nanowires. The nanowires are classified into four categories with respect to the electronic states of the constituent molecules: electron donors, acceptors, donor–acceptor pairs and miscellaneous molecules that display interesting electronic properties. Although many challenges still remain for practical use, state-of-the-art 1D supramolecular nanomaterials have already brought significant advances to both fundamental chemical sciences and technological applications. PMID:27877488

  16. Exposure control strategies in the carbonaceous nanomaterial industry.

    PubMed

    Dahm, Matthew M; Yencken, Marianne S; Schubauer-Berigan, Mary K

    2011-06-01

    Little is known about exposure control strategies currently being implemented to minimize exposures during the production or use of nanomaterials in the United States. Our goal was to estimate types and quantities of materials used and factors related to workplace exposure reductions among companies manufacturing or using engineered carbonaceous nanomaterials (ECNs). Information was collected through phone surveys on work practices and exposure control strategies from 30 participating producers and users of ECN. The participants were classified into three groups for further examination. We report here the use of exposure control strategies. Observed patterns suggest that large-scale manufacturers report greater use of nanospecific exposure control strategies particularly for respiratory protection. Workplaces producing or using ECN generally report using engineering and administrative controls as well as personal protective equipment to control workplace employee exposure.

  17. New features of the Moon revealed and identified by CLTM-s01

    NASA Astrophysics Data System (ADS)

    Huang, Qian; Ping, Jinsong; Su, Xiaoli; Shu, Rong; Tang, Geshi

    2009-12-01

    Previous analyses showed a clear asymmetry in topography, geological material distribution, and crustal thickness between the nearside and farside of the Moon. Lunar exploration data, such as topography and gravity, have made it possible to interpret this hemispheric dichotomy. The high-resolution lunar topographic model CLTM-s01 has revealed four previously unidentified features, namely the quasi-impact basin Sternfeld-Lewis (20°S, 232°E), the confirmed impact basin Fitzgerald-Jackson (25°N, 191°E), the crater Wugang (13°N, 189°E) and the volcanically deposited highland Yutu (14°N, 308°E). Furthermore, we analyzed and identified about eleven large-scale impact basins that have been proposed since 1994, and classified them according to their circular characteristics.

  18. Classifying Structures in the ISM with Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Beaumont, Christopher; Goodman, A. A.; Williams, J. P.

    2011-01-01

    The processes which govern molecular cloud evolution and star formation often sculpt structures in the ISM: filaments, pillars, shells, outflows, etc. Because of their morphological complexity, these objects are often identified manually. Manual classification has several disadvantages; the process is subjective, not easily reproducible, and does not scale well to handle increasingly large datasets. We have explored to what extent machine learning algorithms can be trained to autonomously identify specific morphological features in molecular cloud datasets. We show that the Support Vector Machine algorithm can successfully locate filaments and outflows blended with other emission structures. When the objects of interest are morphologically distinct from the surrounding emission, this autonomous classification achieves >90% accuracy. We have developed a set of IDL-based tools to apply this technique to other datasets.
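
    A hedged sketch of the general technique (the authors' tools are IDL-based; the image, labels, and feature choice below are invented): a Support Vector Machine trained on local-neighborhood intensity features to flag a morphological class.

      # Label pixels "filament" vs "background" from their local neighborhood.
      import numpy as np
      from sklearn.svm import SVC

      def patch_features(img, i, j, r=2):
          """Flatten the (2r+1)x(2r+1) neighborhood around pixel (i, j)."""
          return img[i - r:i + r + 1, j - r:j + r + 1].ravel()

      rng = np.random.default_rng(1)
      img = rng.random((64, 64))
      img[30:34, :] += 2.0                        # toy "filament": a bright band

      pix = [(i, j) for i in range(2, 62) for j in range(2, 62)]
      X = np.array([patch_features(img, i, j) for i, j in pix])
      y = np.array([1 if 30 <= i < 34 else 0 for i, j in pix])   # synthetic truth

      clf = SVC(kernel='rbf', gamma='scale').fit(X, y)
      print('training accuracy:', clf.score(X, y))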

  19. Automated analysis of clonal cancer cells by intravital imaging

    PubMed Central

    Coffey, Sarah Earley; Giedt, Randy J; Weissleder, Ralph

    2013-01-01

    Longitudinal analyses of single-cell lineages over prolonged periods have been challenging, particularly in processes characterized by high cell turnover such as inflammation, proliferation, or cancer. RGB marking has emerged as an elegant approach for enabling such investigations. However, methods for automated image analysis continue to be lacking. Here, to address this, we created a number of different multicolored poly- and monoclonal cancer cell lines for in vitro and in vivo use. To classify these cells in large-scale datasets, we subsequently developed and tested an automated algorithm based on hue selection. Our results showed that this method allows accurate analyses at a fraction of the computational time required by more complex color classification methods. Moreover, the methodology should be broadly applicable to both in vitro and in vivo analyses. PMID:24349895
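
    An illustrative sketch of hue selection under stated assumptions (mean RGB values in [0, 1] and an invented bin count; the published algorithm may differ in detail): reduce each segmented cell to its mean color, convert to HSV, and bin the hue.

      # Hue-based clone classification: one color conversion and one binning
      # step per cell, far cheaper than full color-space clustering.
      import numpy as np
      from matplotlib.colors import rgb_to_hsv

      def classify_by_hue(mean_rgb, n_bins=6):
          """Return a clone label from the hue of a cell's mean RGB color."""
          h, s, v = rgb_to_hsv(np.asarray(mean_rgb, dtype=float))
          return int(h * n_bins) % n_bins

      print(classify_by_hue([0.9, 0.1, 0.1]))     # reddish clone
      print(classify_by_hue([0.1, 0.2, 0.9]))     # bluish clone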

  20. Erosion mechanisms of monocrystalline silicon under a microparticle laden air jet

    NASA Astrophysics Data System (ADS)

    Li, Q. L.; Wang, J.; Huang, C. Z.

    2008-08-01

    Microabrasive air-jet machining is considered a promising precision processing technology for silicon substrates. In this paper, the impressions produced on monocrystalline silicon by the impacts of microsolid particles entrained by an air jet, and the associated microscopic erosion mechanisms, are presented and discussed. It is shown that the impressions can be classified into three categories, namely craters, scratches, and microdents, of which two types of craters and two types of scratches can lead to large-scale fractures. Craters with cleavage fracture surfaces are found to play an important role in the material removal process. In addition, it is shown that most particles bounce away from the target surface without sliding or rolling during an impact, so that most impressions formed are crater-type erosion marks.

  1. Anticipating the emergence of infectious diseases

    PubMed Central

    Drake, John M.; Rohani, Pejman

    2017-01-01

    In spite of medical breakthroughs, the emergence of pathogens continues to pose threats to both human and animal populations. We present candidate approaches for anticipating disease emergence prior to large-scale outbreaks. Using ideas from the theories of dynamical systems and stochastic processes, we develop approaches that are not specific to a particular disease system or model but instead have general applicability. The indicators of disease emergence detailed in this paper can be classified into two parallel approaches: a set of early-warning signals based on the theory of critical slowing down, and a likelihood-based approach. To test the reliability of these two approaches we contrast theoretical predictions with simulated data. We find good support for our methods across a range of model structures and parameter values. PMID:28679666
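
    A sketch of the generic critical-slowing-down indicators, not the paper's exact estimators (the series and window length below are invented): rolling variance and lag-1 autocorrelation, both of which tend to rise as a system approaches a transition.

      # Rolling-window early-warning indicators for a case-report time series.
      import numpy as np

      def rolling_ews(x, w=50):
          """Return rolling variance and lag-1 autocorrelation (window length w)."""
          var, ac1 = [], []
          for k in range(len(x) - w + 1):
              seg = x[k:k + w] - np.mean(x[k:k + w])
              var.append(np.var(seg))
              ac1.append(np.corrcoef(seg[:-1], seg[1:])[0, 1])
          return np.array(var), np.array(ac1)

      rng = np.random.default_rng(2)
      series = np.cumsum(rng.standard_normal(500) * np.linspace(0.5, 2.0, 500))
      v, a = rolling_ews(series)
      print('variance: %.2f -> %.2f' % (v[0], v[-1]))   # a rising trend is the warning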

  2. Water utilization, evapotranspiration and soil moisture monitoring in the south east region of south Australia

    NASA Technical Reports Server (NTRS)

    Mccloy, K. R.; Shepherd, K. J.; Mcintosh, G. F. (Principal Investigator)

    1977-01-01

    The author has identified the following significant results. It was established that reliable estimates of sand and coastal scrub areas can be determined from LANDSAT image classification by the Vec classifier more economically than by conventional means from a map of the coastal zone produced by photointerpretation of 1:10,000 aerial photography. Current LANDSAT imagery is also suitable for monitoring large-scale storm damage to the zone. However, the normal change in the extent of sand areas due to human activity or other causes is about 5 to 10 m per year, occasionally as great as 30 m per year; it is therefore considered that LANDSAT-D will have the resolution necessary to monitor these changes, whereas current imagery does not.

  3. Nuclear security

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dingell, J.D.

    1991-02-01

    The Department of Energy's (DOE) Lawrence Livermore National Laboratory, located in Livermore, California, generates and controls large numbers of classified documents associated with the research and testing of nuclear weapons. Concern has been raised about the potential for espionage at the laboratory and the national security implications of classified documents being stolen. This paper determines the extent of missing classified documents at the laboratory and assesses the adequacy of accountability over classified documents in the laboratory's custody. Audit coverage was limited to the approximately 600,000 secret documents in the laboratory's custody. The adequacy of DOE's oversight of the laboratory's secret document control program was also assessed.

  4. Determining Effects of Non-synonymous SNPs on Protein-Protein Interactions using Supervised and Semi-supervised Learning

    PubMed Central

    Zhao, Nan; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside protein-protein interaction (PPI) interfaces. Determining whether such an nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed, resulting in promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating the prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of the SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. The accurate and balanced performance of the SNP-IN tool makes it readily applicable to studying the rewiring of large-scale protein-protein interaction networks, and it can be useful for the functional annotation of disease-associated SNPs. The SNP-IN tool is freely accessible as a web server at http://korkinlab.org/snpintool/. PMID:24784581
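
    A hedged sketch of a Random Forest self-learning (self-training) loop of the general kind described here (features, threshold, and round count are invented; the published protocol may differ): the classifier's most confident predictions on unlabeled data are folded back into the training set.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      def rf_self_train(X_lab, y_lab, X_unlab, rounds=3, thresh=0.9):
          X, y = X_lab.copy(), y_lab.copy()
          rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
          for _ in range(rounds):
              if len(X_unlab) == 0:
                  break
              proba = rf.predict_proba(X_unlab)
              keep = proba.max(axis=1) >= thresh     # confident pseudo-labels only
              if not keep.any():
                  break
              X = np.vstack([X, X_unlab[keep]])
              y = np.concatenate([y, rf.classes_[proba[keep].argmax(axis=1)]])
              X_unlab = X_unlab[~keep]
              rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
          return rf

      rng = np.random.default_rng(3)
      model = rf_self_train(rng.random((60, 8)), rng.integers(0, 2, 60),
                            rng.random((200, 8)))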

  5. Intelligent query by humming system based on score level fusion of multiple classifiers

    NASA Astrophysics Data System (ADS)

    Pyo Nam, Gi; Thu Trang Luong, Thi; Ha Nam, Hyun; Ryoung Park, Kang; Park, Sung-Joo

    2011-12-01

    The need for content-based music retrieval that can return results even when a user does not know information such as the title or singer has recently increased. Query-by-humming (QBH) systems have been introduced to address this need, as they allow the user to simply hum snatches of the tune to find the right song. Even though there have been many studies on QBH, few have combined multiple classifiers based on various fusion methods. Here we propose a new QBH system based on the score-level fusion of multiple classifiers. This research is novel in the following three respects: three local classifiers [quantized binary (QB) code-based linear scaling (LS), pitch-based dynamic time warping (DTW), and LS] are employed; local maximum and minimum point-based LS and pitch distribution feature-based LS are used as global classifiers; and the combination of local and global classifiers based on score-level fusion by the PRODUCT rule is used to achieve enhanced matching accuracy. Experimental results with the 2006 MIREX QBSH and 2009 MIR-QBSH corpus databases show that the performance of the proposed method is better than that of single classifiers and other fusion methods.
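
    A minimal sketch of PRODUCT-rule score-level fusion, assuming min-max-normalized similarity scores and toy numbers (not the paper's full pipeline): each classifier's score for a candidate song is multiplied together, so a song must score well under every classifier to rank highly.

      import numpy as np

      def product_fusion(score_matrix):
          """score_matrix: (n_classifiers, n_songs); higher score = better match."""
          s = np.asarray(score_matrix, dtype=float)
          s = (s - s.min(axis=1, keepdims=True)) / np.ptp(s, axis=1, keepdims=True)
          return s.prod(axis=0)                    # PRODUCT rule across classifiers

      scores = [[0.2, 0.9, 0.4],                   # local classifier 1
                [0.3, 0.8, 0.7],                   # local classifier 2
                [0.1, 0.95, 0.5]]                  # global classifier
      print('best match: song', int(np.argmax(product_fusion(scores))))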

  6. Investigation of environmental change pattern in Japan

    NASA Technical Reports Server (NTRS)

    Maruyasu, T.; Ochiai, H.; Sugimori, Y.; Shoji, D.; Takeda, K.; Tsuchiya, K.; Nakajima, I.; Nakano, T.; Hayashi, S.; Horikawa, S. (Principal Investigator)

    1976-01-01

    The author has identified the following significant results. A detailed land use classification for the large urban area of Tokyo was made using MSS digital data. It was found that residential, commercial, industrial, and wooded areas and grasslands can be successfully classified. A mesoscale vortex associated with the large ocean current Kuroshio, a rare phenomenon, was recognized visually through the analysis of MSS data. It was found that this vortex affects the effluent patterns of rivers. Lava flows from Sakurajima Volcano were clearly classified for three major eruptions (1779, 1914, and 1946) using MSS data.

  7. Progressive deficit in isolated pontine infarction: the association with etiological subtype, lesion topography and outcome.

    PubMed

    Gökçal, Elif; Niftaliyev, Elvin; Baran, Gözde; Deniz, Çiğdem; Asil, Talip

    2017-09-01

    It is important to predict progressive deficit (PD) in isolated pontine infarction, a relatively common problem in clinical stroke practice. Traditionally, lacunar infarctions are known for their progressive course. However, few studies have separately analyzed branch atheromatous disease as a subtype of lacunar infarction. There are also conflicting results regarding the relationship between lesion topography and PD. In this study, we classified etiological subtypes and lesion topography in isolated pontine infarction and aimed to investigate the association of etiological subtype, lesion topography, and clinical outcome with PD. We retrospectively analyzed the demographics, laboratory parameters, and risk factors of 120 patients with isolated pontine infarction admitted within 24 h of onset. PD was defined as an increase in the National Institutes of Health Stroke Scale of ≥2 units within 5 days after onset. Patients were classified as follows: large artery disease (LAA), basilar artery branch disease (BABD), and small vessel disease (SVD). Upper, middle, and lower pontine infarcts were identified longitudinally. Functional outcome at 3 months was determined according to modified Rankin scores. Of the 120 patients, 41.7% were classified as BABD, 30.8% as SVD, and 27.5% as LAA. Twenty-three patients (19.2%) exhibited PD. PD was significantly more frequent in patients with BABD (p = 0.006) and was numerically more frequent in patients with lower pontine infarction. PD was associated with BABD and with poor functional outcome. It is important to discriminate BABD neuroradiologically from other stroke subtypes in order to predict PD, which is associated with poor functional outcome in patients with isolated pontine infarction.

  8. Using Imaging Spectrometry to Approach Crop Classification from a Water Management Perspective

    NASA Astrophysics Data System (ADS)

    Shivers, S.; Roberts, D. A.

    2017-12-01

    We use hyperspectral remote sensing imagery to classify crops in the Central Valley of California at a level that would be of use to water managers. In California, irrigated agriculture uses 80 percent of the state's water supply, with water application rates varying by as much as a factor of three depending on crop type. Accurate water resource accounting is therefore dependent upon accurate crop mapping. While on-the-ground crop accounting at the county level requires significant labor and time, remote sensing has the potential to map crops over a greater spatial area at more frequent time intervals. Specifically, imaging spectrometry, with its wide spectral range, can detect small spectral differences at the field scale that may be indiscernible to multispectral sensors such as Landsat. In this study, crops in the Central Valley were classified into nine categories defined and used by the California Department of Water Resources as having similar water usage. We used the random forest classifier on Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) imagery from June 2013, 2014, and 2015 to analyze the accuracy of multi-temporal images and to investigate the extent to which cropping patterns changed over the course of the 2013-2015 drought. Initial results show accuracies of over 90% for all three years, indicating that hyperspectral imagery has the potential to identify crops by water-use group at a single time step with a single sensor, allowing cropping patterns to be monitored in anticipation of water needs.
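
    An illustrative sketch of the per-pixel setup with stand-in data (the reflectance values and labels below are random; AVIRIS does provide 224 spectral bands): a random forest mapping each pixel's spectrum to a water-use class.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(4)
      n_pixels, n_bands, n_classes = 3000, 224, 9  # nine DWR water-use groups
      X = rng.random((n_pixels, n_bands))          # stand-in reflectance spectra
      y = rng.integers(0, n_classes, n_pixels)     # stand-in crop-group labels

      Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
      rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(Xtr, ytr)
      print('held-out accuracy:', rf.score(Xte, yte))   # ~chance on random data;
                                                        # >0.9 reported on real spectra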

  9. Exploiting the systematic review protocol for classification of medical abstracts.

    PubMed

    Frunza, Oana; Inkpen, Diana; Matwin, Stan; Klement, William; O'Blenis, Peter

    2011-01-01

    To determine whether the automatic classification of documents can be useful in systematic reviews on medical topics, and specifically whether the performance of the automatic classification can be enhanced by using the particular protocol of questions employed by the human reviewers to create multiple classifiers. The test collection is the data used in a large-scale systematic review on the topic of the dissemination strategy of health care services for elderly people. From a group of 47,274 abstracts marked by human reviewers to be included in or excluded from further screening, we randomly selected 20,000 as a training set, with the remaining 27,274 becoming a separate test set. As the machine learning algorithm we used complement naïve Bayes. We tested both a global classification method, where a single classifier is trained on instances of abstracts and their classification (i.e., included or excluded), and a novel per-question classification method that trains a separate classifier for each question in the specific protocol of the systematic review. For the per-question method we tested four ways of combining the results of the classifiers trained for the individual questions. As evaluation measures, we calculated precision and recall for several settings of the two methods. It is most important not to exclude any relevant documents (i.e., to attain high recall for the class of interest), but it is also desirable to exclude most of the non-relevant documents (i.e., to attain high precision on the class of interest) in order to reduce the human workload. For the global method, the highest recall was 67.8% and the highest precision was 37.9%. For the per-question method, the highest recall was 99.2% and the highest precision was 63%. The human-machine workflow proposed in this paper achieved a recall value of 99.6% and a precision value of 17.8%. The per-question method that combines classifiers following the specific protocol of the review leads to better results than the global method in terms of recall. Because neither method is efficient enough to classify abstracts reliably by itself, the technology should be applied in a semi-automatic way, with a human expert still involved. When the workflow includes one human expert and the trained automatic classifier, recall improves to an acceptable level, showing that automatic classification techniques can reduce the human workload in the process of building a systematic review. Copyright © 2010 Elsevier B.V. All rights reserved.
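
    A sketch of the per-question idea under stated assumptions (the toy abstracts, protocol questions, and OR-combination rule below are invented; the paper tested four combination schemes): one complement naïve Bayes classifier per protocol question, with votes combined to favor recall.

      import numpy as np
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.naive_bayes import ComplementNB

      abstracts = ["home care for elderly patients", "surgical technique in mice",
                   "community health services for older adults", "protein folding study"]
      # One label vector per protocol question (toy stand-ins for reviewer answers).
      labels_per_question = {"q1_elderly": [1, 0, 1, 0], "q2_delivery": [1, 0, 1, 0]}

      vec = CountVectorizer()
      X = vec.fit_transform(abstracts)
      votes = np.zeros(len(abstracts), dtype=int)
      for q, y in labels_per_question.items():
          votes |= ComplementNB().fit(X, y).predict(X)  # include if ANY question votes yes
      print('included for screening:', votes)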

  10. Assessment of shoreline vegetation in relation to use by molting black brant Branta bernicla nigricans on the Alaska Coastal Plain

    USGS Publications Warehouse

    Weller, Milton W.; Jensen, K.C.; Taylor, Eric J.; Miller, Mark W.; Bollinger, Karen S.; Derksen, Dirk V.; Esler, Daniel N.; Markon, Carl J.

    1994-01-01

    To evaluate the importance of large thaw lakes on the Alaska Coastal Plain for molting Pacific black brant Branta bernicla nigricans, the distribution and life form of shoreline vegetation were assessed at several scales: satellite imagery, point-intercept transects, cover quadrats, and a parameter for water regime. Brant population and distribution estimates from aerial surveys were used to classify large lakes into high, moderate, and low use. Correlations between brant and the abundance of their preferred feeding site - moss flats - were best demonstrated by satellite imagery. Intercepts and cover ratings were not correlated, presumably because these techniques were less efficient at assessing area. General observations suggested that the presence of islands, large ice floes, and possibly other physical attributes of the habitat influenced brant distribution. The area is unique because of low-lying, drained-lake basins that have ideal combinations of moss flats and large water areas where brant seek protection in water or on ice floes. Protection of the area from disturbance is vital to the success of this declining species because alternate habitats may not be available elsewhere on the Coastal Plain.

  11. MIDAS, prototype Multivariate Interactive Digital Analysis System, phase 1. Volume 1: System description

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.

    1974-01-01

    The MIDAS System is described as a third-generation fast multispectral recognition system able to keep pace with the large quantity and high rates of data acquisition from present and projected sensors. A principal objective of the MIDAS program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turnaround time and significant gains in throughput. The hardware and software are described. The system contains a mini-computer to control the various high-speed processing elements in the data path, and a classifier which implements an all-digital prototype multivariate-Gaussian maximum likelihood decision algorithm operating at 200,000 pixels/sec. Sufficient hardware was developed to perform signature extraction from computer-compatible tapes, compute classifier coefficients, control the classifier operation, and diagnose operation.
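
    A software restatement of the decision rule that MIDAS implemented in digital hardware, with invented band counts and class names: assign each pixel to the class whose multivariate-Gaussian signature gives it the highest log-likelihood.

      import numpy as np
      from scipy.stats import multivariate_normal

      def train_signatures(samples_by_class):
          """Estimate a mean vector and covariance matrix per class."""
          return {c: (s.mean(axis=0), np.cov(s, rowvar=False))
                  for c, s in samples_by_class.items()}

      def classify(pixels, sigs):
          ll = np.column_stack([multivariate_normal.logpdf(pixels, m, cov)
                                for m, cov in sigs.values()])
          names = list(sigs)
          return [names[i] for i in ll.argmax(axis=1)]

      rng = np.random.default_rng(5)
      sigs = train_signatures({'water': rng.normal(0, 1, (50, 4)),
                               'crop': rng.normal(3, 1, (50, 4))})
      print(classify(rng.normal(3, 1, (5, 4)), sigs))    # expect mostly 'crop'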

  12. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988
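
    A hedged sketch of the ensemble idea (the balancing step below is a plain class-balanced bootstrap, simpler than the paper's clustering-plus-bagging scheme): bag L1-regularized logistic regressions and average their scores.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      def lasso_logit_ensemble(X, y, n_models=10, seed=0):
          rng = np.random.default_rng(seed)
          pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
          models = []
          for _ in range(n_models):
              idx = np.concatenate([rng.choice(pos, len(pos), replace=True),
                                    rng.choice(neg, len(pos), replace=True)])  # balanced
              m = LogisticRegression(penalty='l1', solver='liblinear')
              models.append(m.fit(X[idx], y[idx]))
          return models

      def ensemble_score(models, X):
          return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)

      rng = np.random.default_rng(6)
      X = rng.random((500, 10))
      y = (rng.random(500) < 0.1).astype(int)            # unbalanced toy labels
      print(ensemble_score(lasso_logit_ensemble(X, y), X[:5]))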

  13. A comparison of gradual sedation levels using the Comfort-B scale and bispectral index in children on mechanical ventilation in the pediatric intensive care unit

    PubMed Central

    Silva, Cláudia da Costa; Alves, Marta Maria Osório; El Halal, Michel Georges dos Santos; Pinheiro, Sabrina dos Santos; Carvalho, Paulo Roberto Antonacci

    2013-01-01

    Objective Compare the scores resulting from the Comfort-B scale with the bispectral index in children in an intensive care unit. Methods Eleven children between the ages of 1 month and 16 years requiring mechanical ventilation and sedation were simultaneously classified based on the bispectral index and the Comfort-B scale. Their behavior was recorded using digital photography, and the record was later evaluated by three independent evaluators. Agreement tests (Bland-Altman and Kappa) were then performed. The correlation between the two methods (Pearson correlation) was tested. Results In total, 35 observations were performed on 11 patients. Based on the Kappa coefficient, the agreement among evaluators ranged from 0.56 to 0.75 (p<0.001). There was a positive and consistent association between the bispectral index and the Comfort-B scale [r=0.424 (p=0.011) to r=0.498 (p=0.002)]. Conclusion Due to the strong correlation between the independent evaluators and the consistent correlation between the two methods, the results suggest that the Comfort-B scale is reproducible and useful in classifying the level of sedation in children requiring mechanical ventilation. PMID:24553512

  14. Accelerometry-based gait analysis, an additional objective approach to screen subjects at risk for falling.

    PubMed

    Senden, R; Savelberg, H H C M; Grimm, B; Heyligers, I C; Meijer, K

    2012-06-01

    This study investigated whether the Tinetti scale, as a subjective measure of fall risk, is associated with objectively measured gait characteristics. We examined whether gait parameters differ between groups stratified for fall risk using the Tinetti scale, and investigated the discriminative power of gait parameters to classify the elderly according to the Tinetti scale. The gait of 50 elderly subjects with a Tinetti score >24 and 50 with a Tinetti score ≤24 was analyzed using acceleration-based gait analysis. Validated algorithms were used to derive spatio-temporal gait parameters, the harmonic ratio, inter-stride amplitude variability, and root mean square (RMS) from the accelerometer data. Clear differences in gait were found between the groups. All gait parameters correlated with the Tinetti scale (r range: 0.20-0.73). Only walking speed, step length, and RMS showed moderate to strong correlations and high discriminative power for classifying the elderly according to the Tinetti scale. It is concluded that subtle gait changes that have previously been related to fall risk are not captured by the subjective assessment. It is therefore worthwhile to include objective gait assessment in fall-risk screening. Copyright © 2012 Elsevier B.V. All rights reserved.

  15. First in vivo traumatic brain injury imaging via magnetic particle imaging

    NASA Astrophysics Data System (ADS)

    Orendorff, Ryan; Peck, Austin J.; Zheng, Bo; Shirazi, Shawn N.; Ferguson, R. Matthew; Khandhar, Amit P.; Kemp, Scott J.; Goodwill, Patrick; Krishnan, Kannan M.; Brooks, George A.; Kaufer, Daniela; Conolly, Steven

    2017-05-01

    Emergency room visits due to traumatic brain injury (TBI) are common, but classifying the severity of the injury remains an open challenge. Subjective methods such as the Glasgow Coma Scale attempt to classify traumatic brain injuries, as do imaging-based modalities such as computed tomography and magnetic resonance imaging. However, to date it is still difficult to detect and monitor mild to moderate injuries. In this report, we demonstrate that the magnetic particle imaging (MPI) modality can be applied to imaging TBI events with excellent contrast. MPI can monitor injected iron nanoparticles over long time scales without signal loss, allowing researchers and clinicians to monitor the change in blood pools as the wound heals.

  16. Cracking up (and down): Linking multi-domain hydraulic properties with multi-scale hydrological processes in shrink-swell soils

    NASA Astrophysics Data System (ADS)

    Stewart, R. D.; Rupp, D. E.; Abou Najm, M. R.; Selker, J. S.

    2017-12-01

    Shrink-swell soils, often classified as Vertisols or vertic intergrades, are found on every continent except Antarctica and within many agricultural and urban regions. These soils are characterized by cyclical shrinking and swelling, in which bulk density and porosity distributions vary as functions of time and soil moisture. Crack networks that form in these soils act as dominant environmental controls on the movement of water, contaminants, and gases, making it important to develop fundamental understanding and tractable models of their hydrologic characteristics and behaviors. In this study, which took place primarily in the Secano Interior region of South-Central Chile, we quantified soil-water interactions across scales using a diverse and innovative dataset. These measurements were then utilized to develop a set of parsimonious multi-domain models for describing hydraulic properties and hydrological processes in shrink-swell soils. In a series of examples, we show how this model can predict porosity distributions, crack widths, saturated hydraulic conductivities, and surface runoff (i.e., overland flow) thresholds, by capturing the dominant mechanisms by which water moves through and interacts with clayey soils. Altogether, these models successfully link small-scale shrinkage/swelling behaviors with large-scale thresholds, and can be applied to describe important processes such as infiltration, overland flow development, and the preferential flow and transport of fluids and gases.

  17. Accounting for "land-grabbing" from a biocapacity viewpoint.

    PubMed

    Coscieme, Luca; Pulselli, Federico M; Niccolucci, Valentina; Patrizi, Nicoletta; Sutton, Paul C

    2016-01-01

    The comparison of the Ecological Footprint with its counterpart (i.e., biocapacity) allows for a classification of the world's countries as ecological creditors (Ecological Footprint lower than biocapacity) or debtors (Ecological Footprint higher than biocapacity). This classification is a national-scale assessment on an annual time scale that provides a view of the ecological assets appropriated by the local population versus the natural ecological endowment of a country. We show that GDP per capita above a certain threshold is related to a worsening of the footprint balance in countries classified as ecological debtors. On the other hand, this correlation is lost when ecological creditor nations are considered. There is evidence that governments and investors from high-GDP countries are playing a crucial role in impacting the environment at the global scale, significantly affecting the geography of sustainability and preventing equal opportunities for development. In particular, international market dynamics and the concentration of economic power facilitate the transfer of biocapacity related to “land grabbing”, i.e., the large-scale acquisition of agricultural land. This transfer mainly occurs from low- to high-GDP countries, regardless of the actual need for foreign biocapacity as expressed by the national footprint balance. A first estimation of the amount of biocapacity involved in this phenomenon is provided in this paper in order to better understand its implications for global sustainability and for national and international land use policy.

  18. Building Multiclass Classifiers for Remote Homology Detection and Fold Recognition

    DTIC Science & Technology

    2006-04-05

    classes. In this study we evaluate the effectiveness of one of these formulations that was developed by Crammer and Singer [9], which leads to... significantly more complex model can be learned by directly applying the Crammer-Singer multiclass formulation on the outputs of the binary classifiers... will refer to this as the Crammer-Singer (CS) model. Comparing the scaling approach to the Crammer-Singer approach we can see that the Crammer-Singer...

  19. Nonlinear Dynamics Used to Classify Effects of Mild Traumatic Brain Injury

    DTIC Science & Technology

    2012-01-11

    evaluate random fractal characteristics, and scale-dependent Lyapunov exponents (SDLE) to evaluate chaotic characteristics. Both Shannon and Renyi entropy... fluctuation analysis to evaluate random fractal characteristics... often called the Hurst parameter [32]. When the scaling law described by Eq. (2) holds, the...

  20. Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces

    PubMed Central

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
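
    A minimal sketch contrasting overlapped with naive partitioning (window sizes and data below are illustrative): overlapping chunks let each base LDA classifier see more of a small training set.

      import numpy as np
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

      def partition_indices(n, n_parts, overlap=0.5):
          """Yield index windows; overlap=0 reproduces naive (disjoint) partitioning."""
          size = int(n / (n_parts - (n_parts - 1) * overlap))
          step = int(size * (1 - overlap)) or 1
          for start in range(0, n - size + 1, step):
              yield np.arange(start, start + size)

      rng = np.random.default_rng(7)
      X = rng.random((900, 20))                    # "900 training data", toy features
      y = rng.integers(0, 2, 900)
      ldas = [LinearDiscriminantAnalysis().fit(X[idx], y[idx])
              for idx in partition_indices(len(X), n_parts=5)]
      votes = np.mean([m.predict(X[:10]) for m in ldas], axis=0)
      print('ensemble decisions:', (votes > 0.5).astype(int))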


  2. Extended X-ray emission in PKS 1718-649

    NASA Astrophysics Data System (ADS)

    Beuchert, T.; Rodríguez-Ardila, A.; Moss, V. A.; Schulz, R.; Kadler, M.; Wilms, J.; Angioni, R.; Callingham, J. R.; Gräfe, C.; Krauß, F.; Kreikenbohm, A.; Langejahn, M.; Leiter, K.; Maccagni, F. M.; Müller, C.; Ojha, R.; Ros, E.; Tingay, S. J.

    2018-04-01

    PKS 1718-649 is one of the closest and most comprehensively studied candidates of a young active galactic nucleus (AGN) that is still embedded in its optical host galaxy. The compact radio structure, with a maximal extent of a few parsecs, makes it a member of the group of compact symmetric objects (CSO). Its environment imposes a turnover of the radio synchrotron spectrum towards lower frequencies, also classifying PKS 1718-649 as gigahertz-peaked radio spectrum (GPS) source. Its close proximity has allowed the first detection of extended X-ray emission in a GPS/CSO source with Chandra that is for the most part unrelated to nuclear feedback. However, not much is known about the nature of this emission. By co-adding all archival Chandra data and complementing these datasets with the large effective area of XMM-Newton, we are able to study the detailed physics of the environment of PKS 1718-649. Not only can we confirm that the bulk of the ≲kiloparsec-scale environment emits in the soft X-rays, but we also identify the emitting gas to form a hot, collisionally ionized medium. While the feedback of the central AGN still seems to be constrained to the inner few parsecs, we argue that supernovae are capable of producing the observed large-scale X-ray emission at a rate inferred from its estimated star formation rate.

  3. Large-scale online semantic indexing of biomedical articles via an ensemble of multi-label classification models.

    PubMed

    Papanikolaou, Yannis; Tsoumakas, Grigorios; Laliotis, Manos; Markantonatos, Nikos; Vlahavas, Ioannis

    2017-09-22

    In this paper we present the approach that we employed to deal with large-scale multi-label semantic indexing of biomedical papers. This work was mainly implemented within the context of the BioASQ challenge (2013-2017), a challenge concerned with biomedical semantic indexing and question answering. Our main contribution is a MUlti-Label Ensemble method (MULE) that incorporates a McNemar statistical significance test in order to validate the combination of the constituent machine learning algorithms. Secondary contributions include a study of the temporal aspects of the BioASQ corpus (the observations also apply to BioASQ's super-set, the PubMed articles collection) and the proper parametrization of the algorithms used to deal with this challenging classification task. The ensemble method that we developed is compared to other approaches in experimental scenarios with subsets of the BioASQ corpus, giving positive results. In our participation in the BioASQ challenge we obtained first place in 2013 and second place in the four following years, steadily outperforming MTI, the indexing system of the National Library of Medicine (NLM). The results of our experimental comparisons suggest that employing a statistical significance test to validate the ensemble method's choices is the optimal approach for ensembling multi-label classifiers, especially in contexts with many rare labels.
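
    A sketch of a McNemar check of the kind used to validate classifier combinations (the toy models below are invented; the paper's exact procedure may differ): the test looks only at the items on which two classifiers disagree.

      import numpy as np
      from scipy.stats import chi2

      def mcnemar_p(y_true, pred_a, pred_b):
          """Two-sided McNemar test (continuity-corrected) on the disagreements."""
          a_only = np.sum((pred_a == y_true) & (pred_b != y_true))
          b_only = np.sum((pred_a != y_true) & (pred_b == y_true))
          if a_only + b_only == 0:
              return 1.0
          stat = (abs(a_only - b_only) - 1) ** 2 / (a_only + b_only)
          return float(chi2.sf(stat, df=1))

      rng = np.random.default_rng(8)
      y = rng.integers(0, 2, 1000)
      pa = np.where(rng.random(1000) < 0.8, y, 1 - y)    # ~80% accurate model A
      pb = np.where(rng.random(1000) < 0.7, y, 1 - y)    # ~70% accurate model B
      print('p-value:', mcnemar_p(y, pa, pb))            # small p -> difference is real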

  4. Automatic detection of anomalies in screening mammograms

    PubMed Central

    2013-01-01

    Background Diagnostic performance in breast screening programs may be influenced by the prior probability of disease. Since breast cancer incidence is roughly half a percent in the general population, there is a large probability that a screening exam will be normal; that factor may contribute to false negatives. Screening programs typically exhibit about 83% sensitivity and 91% specificity. This investigation was undertaken to determine whether a system could be developed to pre-sort screening images into normal and suspicious bins based on their likelihood of containing disease. Wavelets were investigated as a method to parse the image data, potentially removing confounding information. The development of a classification system based on features extracted from wavelet-transformed mammograms is reported. Methods In the multi-step procedure, images were processed using 2D discrete wavelet transforms to create a set of maps at different size scales. Next, statistical features were computed from each map, and a subset of these features was the input for a concerted-effort set of naïve Bayesian classifiers. The classifier network was constructed to calculate the probability that the parent mammography image contained an abnormality. The abnormalities were not identified, nor were they regionalized. The algorithm was tested on two publicly available databases: the Digital Database for Screening Mammography (DDSM) and the Mammographic Image Analysis Society (MIAS) database. These databases contain radiologist-verified images and feature common abnormalities including spiculations, masses, geometric deformations, and fibroid tissues. Results The classifier-network designs tested achieved sensitivities and specificities sufficient to be potentially useful in a clinical setting. This first series of tests identified networks with 100% sensitivity and up to 79% specificity for abnormalities. This performance significantly exceeds the mean sensitivity reported in the literature for the unaided human expert. Conclusions Classifiers based on wavelet-derived features proved to be highly sensitive to a range of pathologies; as a result, Type II errors were nearly eliminated. Pre-sorting the images changed the prior probability in the sorted database from 37% to 74%. PMID:24330643
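
    An illustrative sketch of the pipeline's shape, simplified to a single naïve Bayes classifier instead of the paper's network of them (images, labels, wavelet choice, and statistics below are assumptions): 2D wavelet maps, per-map statistics, then a probability that the mammogram is suspicious.

      import numpy as np
      import pywt
      from sklearn.naive_bayes import GaussianNB

      def wavelet_stats(img, levels=3):
          """Mean, std and energy of every detail map of a 2-D wavelet transform."""
          feats = []
          for level in pywt.wavedec2(img, 'db2', level=levels)[1:]:  # skip approximation
              for d in level:                                        # (cH, cV, cD)
                  feats += [d.mean(), d.std(), float(np.sum(d ** 2))]
          return feats

      rng = np.random.default_rng(9)
      imgs = rng.random((40, 64, 64))              # stand-ins for mammograms
      y = rng.integers(0, 2, 40)                   # toy normal/suspicious labels
      X = np.array([wavelet_stats(im) for im in imgs])
      print('training accuracy:', GaussianNB().fit(X, y).score(X, y))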

  5. Electroencephalography as a clinical tool for diagnosing and monitoring attention deficit hyperactivity disorder: a cross-sectional study

    PubMed Central

    Helgadóttir, Halla; Gudmundsson, Ólafur Ó; Baldursson, Gísli; Magnússon, Páll; Blin, Nicolas; Brynjólfsdóttir, Berglind; Emilsdóttir, Ásdís; Gudmundsdóttir, Gudrún B; Lorange, Málfrídur; Newman, Paula K; Jóhannesson, Gísli H; Johnsen, Kristinn

    2015-01-01

    Objectives The aim of this study was to develop and test, for the first time, a multivariate diagnostic classifier of attention deficit hyperactivity disorder (ADHD) based on EEG coherence measures and chronological age. Setting The participants were recruited in two specialised centres and three schools in Reykjavik. Participants The data are from a large cross-sectional cohort of 310 patients with ADHD and 351 controls, covering an age range from 5.8 to 14 years. ADHD was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders fourth edition (DSM-IV) criteria using the K-SADS-PL semistructured interview. Participants in the control group were reported to be free of any mental or developmental disorders by their parents and had a score of less than 1.5 SDs above the age-appropriate norm on the ADHD Rating Scale-IV. Other than moderate or severe intellectual disability, no additional exclusion criteria were applied in order that the cohort reflected the typical cross section of patients with ADHD. Results Diagnostic classifiers were developed using statistical pattern recognition for the entire age range and for specific age ranges and were tested using cross-validation and by application to a separate cohort of recordings not used in the development process. The age-specific classification approach was more accurate (76% accuracy in the independent test cohort; 81% cross-validation accuracy) than the age-independent version (76%; 73%). Chronological age was found to be an important classification feature. Conclusions The novel application of EEG-based classification methods presented here can offer significant benefit to the clinician by improving both the accuracy of initial diagnosis and ongoing monitoring of children and adolescents with ADHD. The most accurate possible diagnosis at a single point in time can be obtained by the age-specific classifiers, but the age-independent classifiers are also useful as they enable longitudinal monitoring of brain function. PMID:25596195

  6. The Blurred Line between Form and Process: A Comparison of Stream Channel Classification Frameworks

    PubMed Central

    Kasprak, Alan; Hough-Snee, Nate

    2016-01-01

    Stream classification provides a means to understand the diversity and distribution of channels and floodplains that occur across a landscape while identifying links between geomorphic form and process. Accordingly, stream classification is frequently employed as a watershed planning, management, and restoration tool. At the same time, there has been intense debate and criticism of particular frameworks on the grounds that they classify stream reaches based largely on their physical form rather than on direct measurements of their component hydrogeomorphic processes. Despite this debate, and the ongoing use of classifications in watershed management, direct comparisons of channel classification frameworks are rare. Here we implement four stream classification frameworks and explore the degree to which each makes inferences about hydrogeomorphic process from channel form within the Middle Fork John Day Basin, a watershed of high conservation interest within the Columbia River Basin, U.S.A. We compare the results of the River Styles Framework, Natural Channel Classification, Rosgen Classification System, and a channel form-based statistical classification at 33 field-monitored sites. We found that the four frameworks consistently classified reach types into similar groups based on each reach or segment's dominant hydrogeomorphic elements. Where classified channel types diverged, differences could be attributed to (a) the spatial scale of the input data used, (b) the requisite metrics and their order in completing a framework's decision tree, and/or (c) whether the framework attempts to classify current or historic channel form. Divergence in framework agreement was also observed at reaches where channel planform was decoupled from valley setting. Overall, the relative agreement between frameworks indicates that criticism of individual classifications for their use of form in grouping stream channels may be overstated. These form-based criticisms may also ignore the geomorphic tenet that channel form reflects formative hydrogeomorphic processes across a given landscape. PMID:26982076

  7. Extraction of Pharmacokinetic Evidence of Drug–Drug Interactions from the Literature

    PubMed Central

    Kolchinsky, Artemy; Lourenço, Anália; Wu, Heng-Yi; Li, Lang; Rocha, Luis M.

    2015-01-01

    Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmacoepidemiology investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1≈0.93, MCC≈0.74, iAUC≈0.99) and sentences (F1≈0.76, MCC≈0.65, iAUC≈0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification. We also found that some drug-related named entity recognition tools and dictionaries led to slight but significant improvements, especially in classification of evidence sentences. Based on our thorough analysis of classifiers and feature transforms and the high classification performance achieved, we demonstrate that literature mining can aid DDI discovery by supporting automatic extraction of specific types of experimental evidence. PMID:25961290
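
    A minimal sketch of the abstract-screening step under stated assumptions (the corpus, labels, and choice of logistic regression below are invented; the study compared several linear classifiers and feature transforms): word uni/bigram TF-IDF features feeding a linear model.

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.pipeline import make_pipeline

      docs = ["midazolam clearance decreased with ketoconazole coadministration",
              "the protein was crystallized at 4 degrees",
              "AUC of drug X increased 2-fold when given with inhibitor Y",
              "survey of hospital admission rates"]
      labels = [1, 0, 1, 0]            # 1 = contains pharmacokinetic DDI evidence

      clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),  # word bigrams mattered
                          LogisticRegression())
      clf.fit(docs, labels)
      print(clf.predict(["caffeine AUC rose after fluvoxamine dosing"]))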

  8. The fragmented nature of tundra landscape

    NASA Astrophysics Data System (ADS)

    Virtanen, Tarmo; Ek, Malin

    2014-04-01

    The vegetation and land cover structure of tundra areas is fragmented compared to other biomes. Thus, high-resolution satellite images are required to produce land cover classifications that reveal the actual distribution of land cover types across these large and remote areas. We produced and compared land cover classifications using three satellite images (QuickBird, Aster, and Landsat TM5) with different pixel sizes (2.4 m, 15 m, and 30 m, respectively). The study area, in north-eastern European Russia, was visited in July 2007 to obtain ground reference data. The QuickBird image was classified using supervised segmentation techniques, while the Aster and Landsat TM5 images were classified using a pixel-based supervised classification method. The QuickBird classification showed the highest accuracy when tested against field data, while the Aster image was generally more problematic to classify than the Landsat TM5 image. Images with smaller pixel sizes revealed much greater levels of landscape fragmentation. The overall mean patch sizes in the QuickBird, Aster, and Landsat TM5 classifications were 871 m2, 2141 m2, and 7433 m2, respectively. In the QuickBird classification, the mean patch size of all the tundra and peatland vegetation classes was smaller than one pixel of the Landsat TM5 image. Water bodies and fens in particular occur in the landscape as small or elongated patches, and thus cannot be realistically classified from larger-pixel images. Land cover patterns vary considerably at such a fine scale, so a lot of information is lost if only medium-resolution satellite images are used. It is crucial to know the amount and spatial distribution of different vegetation types in arctic landscapes, as carbon dynamics and other climate-related physical, geological, and biological processes are known to vary greatly between vegetation types.

  9. The Planform Mobility of Large River Channel Confluences

    NASA Astrophysics Data System (ADS)

    Sambrook Smith, Greg; Dixon, Simon; Nicholas, Andrew; Bull, Jon; Vardy, Mark; Best, James; Goodbred, Steven; Sarker, Maminul

    2017-04-01

    Large river confluences are widely acknowledged as exerting a controlling influence upon both upstream and downstream morphology and thus channel planform evolution. Despite their importance, little is known about their longer-term evolution and planform morphodynamics, with much of the literature treating confluences as fixed, nodal points in the fluvial network. In contrast, some studies of large sand-bed rivers in India and Bangladesh have shown that large river confluences can be highly mobile, although the extent to which this is representative of large confluences around the world is unknown. Confluences have also been shown to generate substantial bed scours, and if the confluence location is mobile these scours could 'comb' across wide areas. This paper presents field data on large confluence morphologies in the Ganges-Brahmaputra-Meghna river basin, illustrating the spatial extent of large river bed scours and showing that scour depth can extend below base level, enhancing long-term preservation potential. Based on a global review of the planform of large river confluences using Landsat imagery from 1972 to 2014, this study demonstrates that such scour features can be highly mobile and that there is an array of confluence morphodynamic types: from freely migrating confluences, through confluences migrating on decadal timescales, to fixed confluences. Based on this analysis, a conceptual model of large river confluence types is proposed, which shows that large river confluences can be sites of extensive bank erosion and avulsion, creating substantial management challenges. We quantify the abundance of mobile confluence types by classifying all large confluences in both the Amazon and Ganges-Brahmaputra-Meghna basins, showing that these two large rivers have contrasting confluence morphodynamics. We show that large river confluences have multiple scales of planform adjustment, with important implications for river management, infrastructure, and interpretation of the rock record.

  10. Interobserver Agreement on Endoscopic Classification of Oesophageal Varices in Children.

    PubMed

    D'Antiga, Lorenzo; Betalli, Pietro; De Angelis, Paola; Davenport, Mark; Di Giorgio, Angelo; McKiernan, Patrick J; McLin, Valerie; Ravelli, Paolo; Durmaz, Ozlem; Talbotec, Cecile; Sturm, Ekkehard; Woynarowski, Marek; Burroughs, Andrew K

    2015-08-01

    Data regarding agreement on endoscopic features of oesophageal varices in children with portal hypertension (PH) are scant. The aim of this study was to evaluate endoscopic visualisation and classification of oesophageal varices in children by several European clinicians, to build a rational basis for future multicentre trials. Endoscopic pictures of the distal oesophagus of 100 children with a clinical diagnosis of PH were distributed to 10 endoscopists. Observers were requested to classify variceal size according to a 3-degree scale (small, medium, and large, class A), a 2-degree scale (small and large, class B), and to recognise red wales (presence or absence, class Red). Overall agreement was considered fair if Fleiss and Cohen κ test was ≥0.30, good if ≥0.40, excellent if ≥0.60, and perfect if ≥0.80. Agreement between observers was fair with class A (κ = 0.34) and class B (κ = 0.38), and good with class Red (κ = 0.49). The agreement was good on presence versus absence of varices (class A = 0.53, class B = 0.48). The agreement among the observers was good in class A when endoscopic features of severe PH (medium and large sizes, red marks) were grouped and compared with mild features (absent and small varices) (κ = 0.58). Experts working in different centres show a fairly good agreement on endoscopic features of PH in children, although a better training of paediatric endoscopists may improve the agreement in grading severity of varices in this setting.

  11. High-accuracy single-pass InSAR DEM for large-scale flood hazard applications

    NASA Astrophysics Data System (ADS)

    Schumann, G.; Faherty, D.; Moller, D.

    2017-12-01

    In this study, we used a unique opportunity of the GLISTIN-A (a NASA airborne mission designed to characterize the cryosphere) track to Greenland to acquire a high-resolution InSAR DEM of a large area in the Red River of the North Basin (north of Grand Forks, ND, USA), a very flood-vulnerable valley, particularly in springtime due to soil moisture content near saturation and/or, typically for this region, snowmelt. An InSAR DEM that meets flood inundation modeling and mapping requirements comparable to LiDAR would demonstrate great application potential of the new radar technology for national agencies with an operational flood forecasting mandate, as well as for local state governments active in flood event prediction, disaster response, and mitigation. Specifically, we derived a bare-earth DEM in SAR geometry by first removing the inherent far-range bias related to airborne operation, which at the more typical large-scale DEM resolution of 30 m leaves a sensor accuracy of plus or minus 2.5 cm. Subsequently, an intelligent classifier based on informed relationships between InSAR height, intensity, and correlation was used to distinguish between bare earth, roads or embankments, buildings, and tall vegetation, in order to facilitate the creation of a bare-earth DEM that meets the requirements for accurate floodplain inundation mapping. Using state-of-the-art LiDAR terrain data, we demonstrate that capability by achieving a root mean squared error of approximately 25 cm and further illustrate its applicability to flood modeling.

  12. Supervised de novo reconstruction of metabolic pathways from metabolome-scale compound sets

    PubMed Central

    Kotera, Masaaki; Tabei, Yasuo; Yamanishi, Yoshihiro; Tokimatsu, Toshiaki; Goto, Susumu

    2013-01-01

    Motivation: The metabolic pathway is an important biochemical reaction network involving enzymatic reactions among chemical compounds. However, it is assumed that a large number of metabolic pathways remain unknown, and many reactions are still missing even in known pathways. Therefore, the most important challenge in metabolomics is the automated de novo reconstruction of metabolic pathways, which includes the elucidation of previously unknown reactions to bridge the metabolic gaps. Results: In this article, we develop a novel method to reconstruct metabolic pathways from a large compound set in the reaction-filling framework. We define feature vectors representing the chemical transformation patterns of compound–compound pairs in enzymatic reactions using chemical fingerprints. We apply a sparsity-induced classifier to learn what we refer to as ‘enzymatic-reaction likeness’, i.e. whether compound pairs are possibly converted to each other by enzymatic reactions. The originality of our method lies in the search for potential reactions among many compounds at a time, in the extraction of reaction-related chemical transformation patterns, and in the large-scale applicability owing to its computational efficiency. In the results, we demonstrate the usefulness of our proposed method on the de novo reconstruction of 134 metabolic pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Our comprehensively predicted reaction networks of 15,698 compounds enable us to suggest many potential pathways and to increase research productivity in metabolomics. Availability: Software is available on request. Supplementary materials are available at http://web.kuicr.kyoto-u.ac.jp/supp/kot/ismb2013/. Contact: goto@kuicr.kyoto-u.ac.jp PMID:23812977

  13. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation

    PubMed Central

    Azmi, Reza; Pishgoo, Boshra; Norozi, Narges; Yeganeh, Samira

    2013-01-01

    Brain magnetic resonance imaging (MRI) tissue segmentation is one of the most important parts of clinical diagnostic tools. Pixel classification methods have frequently been used in image segmentation with two approaches, supervised and unsupervised. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain; moreover, they cannot use unlabeled data to train classifiers. On the other hand, unsupervised segmentation methods have no prior knowledge and lead to a low level of performance. However, semi-supervised learning, which uses a few labeled data together with a large amount of unlabeled data, achieves higher accuracy with less trouble. In this paper, we propose an ensemble semi-supervised framework for segmenting brain MRI tissues that uses the results of several semi-supervised classifiers simultaneously. Selecting appropriate classifiers plays a significant role in the performance of this framework. Hence, we present two semi-supervised algorithms, expectation filtering maximization and MCo_Training, which are improved versions of the semi-supervised methods expectation maximization and Co_Training and which increase segmentation accuracy. Afterward, we use these improved classifiers together with a graph-based semi-supervised classifier as components of the ensemble framework. Experimental results show that the segmentation performance of this approach is higher than that of both supervised methods and the individual semi-supervised classifiers. PMID:24098863

  14. Land use-based landscape planning and restoration in mine closure areas.

    PubMed

    Zhang, Jianjun; Fu, Meichen; Hassani, Ferri P; Zeng, Hui; Geng, Yuhuan; Bai, Zhongke

    2011-05-01

    Landscape planning and restoration in mine closure areas is not only an inevitable choice for sustaining mining areas but also an important path to maximizing landscape resources and improving ecological function in mine closure areas. Analysis of current mine development shows that many mines in China are unavoidably facing closure. This paper analyzes the periodic impact of mining activities on landscapes and then proposes planning concepts and principles. According to the landscape characteristics in mine closure areas, this paper classifies available landscape resources in mine closure areas into landscapes for restoration, for limited restoration and for protection, and then summarizes directions for their use. This paper establishes a framework of spatial control planning and design of landscape elements based on "macro control, medium allocation and micro optimization" for the purpose of managing and using this kind of special landscape resource. Finally, this paper applies the theories and methods to a case study in Wu'an from two aspects: the construction of a sustainable land-use pattern on a large scale and the optimized allocation of typical mine landscape resources on a small scale.

  15. Chemicals to enhance microalgal growth and accumulation of high-value bioproducts

    PubMed Central

    Yu, Xinheng; Chen, Lei; Zhang, Weiwen

    2015-01-01

    Photosynthetic microalgae have attracted significant attention as they can serve as important sources of cosmetic, food and pharmaceutical products, industrial materials and even biofuels such as biodiesel. However, the current productivity of microalga-based processes is still very low, which has restricted their scale-up and application. In addition to various efforts in strain improvement and cultivation optimization, it has been proposed that the productivity of microalga-based processes can also be increased using various chemicals to trigger or enhance cell growth and accumulation of bioproducts. Herein, we summarize recent progress in applying chemical triggers or enhancers to improve cell growth and accumulation of bioproducts in algal cultures. Based on their enhancing mechanisms, these chemicals can be classified into four categories: chemicals regulating biosynthetic pathways, chemicals inducing oxidative stress responses, phytohormones and analogs regulating multiple aspects of microalgal metabolism, and chemicals acting directly as metabolic precursors. Taken together, early research demonstrates that the use of chemical stimulants could be a very effective and economical way to improve cell growth and accumulation of high-value bioproducts in large-scale cultivation of microalgae. PMID:25741321

  16. Machine Learning Approach to Automated Quality Identification of Human Induced Pluripotent Stem Cell Colony Images.

    PubMed

    Joutsijoki, Henry; Haponen, Markus; Rasku, Jyrki; Aalto-Setälä, Katriina; Juhola, Martti

    2016-01-01

    The focus of this research is on automated identification of the quality of human induced pluripotent stem cell (iPSC) colony images. iPS cell technology is a contemporary method by which the patient's cells are reprogrammed back to stem cells and differentiated into any desired cell type. iPS cell technology will be used in the future for patient-specific drug screening, disease modeling, and tissue repair, for instance. However, there are technical challenges before iPS cell technology can be used in practice, and one of them is quality control of growing iPSC colonies, which is currently done manually but is an infeasible solution for large-scale cultures. The monitoring problem reduces to an image analysis and classification problem. In this paper, we tackle this problem using machine learning methods such as multiclass Support Vector Machines and several baseline methods, together with Scale-Invariant Feature Transform (SIFT)-based features. We perform over 80 test arrangements and do a thorough parameter value search. The best classification accuracy (62.4%) was obtained by using a k-NN classifier, showing improved accuracy compared to earlier studies.
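    A hedged sketch of this kind of pipeline, assuming OpenCV and scikit-learn: SIFT descriptors pooled per image, then a k-NN classifier. Mean pooling is a simplification of the paper's feature handling, and the noise images below are stand-ins for colony micrographs.

    ```python
    import cv2
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def image_feature(img_gray):
        """Pool an image's SIFT descriptors into one fixed-length vector."""
        sift = cv2.SIFT_create()
        _, desc = sift.detectAndCompute(img_gray, None)
        if desc is None:                        # no keypoints found
            return np.zeros(128, dtype=np.float32)
        return desc.mean(axis=0)                # simple mean pooling

    # Toy stand-ins for colony images; real input would be microscope frames.
    rng = np.random.default_rng(2)
    images = [rng.integers(0, 256, (64, 64), dtype=np.uint8) for _ in range(20)]
    labels = [i % 2 for i in range(20)]         # e.g. 0 = good colony, 1 = bad

    X = np.stack([image_feature(im) for im in images])
    clf = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
    print(clf.predict(X[:5]))
    ```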

  17. Why square lattices are not seen on curved ionic membranes

    NASA Astrophysics Data System (ADS)

    Thomas, Creighton; Olvera de La Cruz, Monica

    2013-03-01

    Ionic crystalline membranes on curved surfaces are ubiquitous in nature, appearing for example on the membranes of halophilic organisms. Even when these membranes buckle into polyhedra with square or rectangular sides, the crystalline structure is seen to have hexagonal symmetry. Here, we theoretically and numerically investigate the effects of curvature on square lattices. Our model system consists of both positive and negative ions with a 1:1 charge ratio adsorbed onto the surface of a sphere. In flat space, the lowest-energy configuration of this system can be a square lattice. This bipartite arrangement is favored because there are two types of ions. It leads to a fundamentally different defect structure than what has been seen when triangular lattices are favored. We classify these defects and find that curvature disrupts long-range square symmetry in a crystal. Through numerical simulations, we see that small square regions are possible in some cases, but this phase coexists with other structures, limiting the scale of these square-lattice microstructures. Thus, at large length scales, curvature leads to triangular structures.

  18. Evaluation of Skylab (EREP) data for forest and rangeland surveys. [Georgia, South Dakota, Colorado, and California

    NASA Technical Reports Server (NTRS)

    Aldrich, R. C. (Principal Investigator); Dana, R. W.; Greentree, W. J.; Roberts, E. H.; Norick, N. X.; Waite, T. H.; Francis, R. E.; Driscoll, R. S.; Weber, F. P.

    1975-01-01

    The author has identified the following significant results. Four widely separated sites (near Augusta, Georgia; Lead, South Dakota; Manitou, Colorado; and Redding, California) were selected as typical sites for forest inventory, forest stress, rangeland inventory, and atmospheric and solar measurements, respectively. Results indicated that Skylab S190B color photography is good for classification of Level 1 forest and nonforest land (90 to 95 percent correct) and could be used as a data base for sampling by small and medium scale photography using regression techniques. The accuracy of Level 2 forest and nonforest classes, however, varied from fair to poor. Results of plant community classification tests indicate that both visual and microdensitometric techniques can separate deciduous, coniferous, and grassland classes to the region level in the Ecoclass hierarchical classification system. There was no consistency in classifying tree categories at the series level by visual photointerpretation. The relationship between ground measurements and large scale photo measurements of foliar cover had a correlation coefficient of greater than 0.75. Some of the relationships, however, were site dependent.

  19. Sensitivity and specificity of the 'knee-up test' for estimation of the American Spinal Injury Association Impairment Scale in patients with acute motor incomplete cervical spinal cord injury.

    PubMed

    Yugué, Itaru; Okada, Seiji; Maeda, Takeshi; Ueta, Takayoshi; Shiba, Keiichiro

    2018-04-01

    A retrospective study. Precise classification of the neurological state of patients with acute cervical spinal cord injury (CSCI) can be challenging. This study proposed a useful and simple clinical method to help classify patients with incomplete CSCI. Spinal Injuries Centre, Japan. The sensitivity and specificity of the 'knee-up test' were evaluated in patients with acute CSCI classified as American Spinal Injury Association Impairment Scale (AIS) C or D. The result is positive if the patient can lift the knee in one or both legs to an upright position, whereas the result is negative if the patient is unable to lift the knee in either leg to an upright position. The AIS of these patients was classified according to a strict computerised algorithm designed by Walden et al., and the knee-up test was performed by non-expert examiners. Among the 200 patients, 95 and 105 were classified as AIS C and AIS D, respectively. Overall, 126 and 74 patients demonstrated positive and negative results, respectively, when evaluated using the knee-up test. A total of 104 patients with positive results and 73 patients with negative results were classified as AIS D and AIS C, respectively. The sensitivity, specificity, positive predictive and negative predictive values of this test for all patients were 99.1%, 76.8%, 82.5% and 98.7%, respectively. The knee-up test may allow easy and highly accurate estimation, without the need for special skills, of the AIS classification for patients with incomplete CSCI.
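    The reported accuracy figures can be checked directly from the counts given in the abstract (104 of 126 positive tests were AIS D, 73 of 74 negative tests were AIS C, out of 105 AIS D and 95 AIS C patients); small rounding differences against the published values are expected.

    ```python
    # 2x2 table reconstructed from the abstract.
    tp = 104          # positive knee-up test, AIS D
    fp = 126 - 104    # positive test, AIS C
    tn = 73           # negative test, AIS C
    fn = 105 - 104    # negative test, AIS D

    sensitivity = tp / (tp + fn)   # 104/105 ~ 99.0%
    specificity = tn / (tn + fp)   # 73/95  ~ 76.8%
    ppv = tp / (tp + fp)           # 104/126 ~ 82.5%
    npv = tn / (tn + fn)           # 73/74  ~ 98.6%
    print(f"{sensitivity:.1%} {specificity:.1%} {ppv:.1%} {npv:.1%}")
    ```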

  20. Sexual orientation and boyhood gender conformity: development of the Boyhood Gender Conformity Scale (BGCS)

    PubMed

    Hockenberry, S L; Billingham, R E

    1987-12-01

    Two hundred twenty-five respondents (109 heterosexuals and 116 homosexuals) completed a survey containing a 20-item Boyhood Gender Conformity Scale (BGCS). This scale was largely composed of edited and abridged gender items from Part A of Freund et al.'s Feminine Gender Identity Scale (FGIS-A) and Whitam's "childhood indicators." The combined scale was developed in an attempt to obtain a reliable, valid, and potent discriminating instrument for accurately classifying adult male respondents for sexual orientation on the basis of their reported boyhood gender conformity or nonconforming behavior and identity. In addition, 33% of these respondents were administered the original FGIS-A and Whitam inventory during a 2-week test-retest analysis conducted to determine the validity and reliability of the new instrument. All the original items significantly discriminated between heterosexual and homosexual respondents. From these, a 13-item function and a 5-item function proved to be the most powerful discriminators between the two groups. Significant correlations between each of the three scales and a very high test-retest correlation coefficient supported the reliability and validity assumption for the BGCS. The conclusion was made that the five-item function (playing with boys, preferring boys' games, imagining self as sports figure, reading adventure and sports stories, considered a "sissy") was the most potent and parsimonious discriminator among adult males for sexual orientation. It was similarly noted that the absence of masculine behaviors and traits appeared to be a more powerful predictor of later homosexual orientation than traditionally feminine or cross-sexed traits and behaviors.

  1. Neural network classifications and correlation analysis of EEG and MEG activity accompanying spontaneous reversals of the Necker cube.

    PubMed

    Gaetz, M; Weinberg, H; Rzempoluck, E; Jantzen, K J

    1998-04-01

    It has recently been suggested that reentrant connections are essential in systems that process complex information [A. Damasio, H. Damasio, Cortical systems for the retrieval of concrete knowledge: the convergence zone framework, in: C. Koch, J.L. Davis (Eds.), Large Scale Neuronal Theories of the Brain, The MIT Press, Cambridge, 1995, pp. 61-74; G. Edelman, The Remembered Present, Basic Books, New York, 1989; M.I. Posner, M. Rothbart, Constructing neuronal theories of mind, in: C. Koch, J.L. Davis (Eds.), Large Scale Neuronal Theories of the Brain, The MIT Press, Cambridge, 1995, pp. 183-199; C. von der Malsburg, W. Schneider, A neuronal cocktail party processor, Biol. Cybern., 54 (1986) 29-40]. Reentry is not feedback, but parallel signalling in the time domain between spatially distributed maps, similar to a process of correlation between distributed systems. Accordingly, it was expected that during spontaneous reversals of the Necker cube, complex patterns of correlations between distributed systems would be present in the cortex. The present study included EEG (n=4) and MEG recordings (n=5). Two experimental questions were posed: (1) Can distributed cortical patterns present during perceptual reversals be classified differently using a generalised regression neural network (GRNN) compared to processing of a two-dimensional figure? (2) Does correlated cortical activity increase significantly during perception of a Necker cube reversal? One-second duration single trials of EEG and MEG data were analysed using the GRNN. Electrode/sensor pairings based on cortico-cortical connections were selected to assess correlated activity in each condition. The GRNN significantly classified single trials recorded during Necker cube reversals as different from single trials recorded during perception of a two-dimensional figure for both EEG and MEG. In addition, correlated cortical activity increased significantly in the Necker cube reversal condition for EEG and MEG compared to the perception of a non-reversing stimulus. Coherent MEG activity observed over occipital, parietal and temporal regions is believed to represent neural systems related to the perception of Necker cube reversals. Copyright 1998 Elsevier Science B.V.
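    For reference, a compact NumPy version of a generalized regression neural network (Specht 1991), the classifier family used here: a Gaussian-kernel weighted average of training targets. The trial features are random placeholders for single-trial EEG/MEG data, and the bandwidth sigma is a free parameter.

    ```python
    import numpy as np

    def grnn_predict(X_train, y_onehot, X_test, sigma=1.0):
        # Pairwise squared distances between test and training patterns.
        d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * sigma**2))             # kernel weights
        probs = w @ y_onehot / w.sum(1, keepdims=True)
        return probs.argmax(1)

    rng = np.random.default_rng(3)
    X_train = rng.normal(size=(40, 16))              # 40 one-second trials
    y_train = rng.integers(0, 2, 40)                 # 0 = 2-D figure, 1 = reversal
    y_onehot = np.eye(2)[y_train]
    X_test = rng.normal(size=(10, 16))
    print(grnn_predict(X_train, y_onehot, X_test))
    ```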

  2. Rubber and Land-Cover Land-Use Change in Mainland Southeast Asia

    NASA Astrophysics Data System (ADS)

    Fox, J. M.; Hurni, K.

    2017-12-01

    Over the past half century, the five countries of Mainland Southeast Asia (MSEA) - Cambodia, Laos, Myanmar, Thailand, and Vietnam - have witnessed major shifts from predominantly subsistence agrarian economies to increasingly commercialized agriculture. Major drivers of change include policy initiatives that fostered regional economic integration and promoted, among other changes, the rapid expansion of boom-crop plantations. Among the many types of commercial boom crops promoted and grown in MSEA are numerous tree-based products such as rubber, coffee, tree species for pulp and paper (particularly eucalyptus and acacia), cashews, and fruits such as oranges, lychees, and longans. The project proposal hypothesized that most (but not all) tree crops replaced swidden cultivation fields and hence are not necessarily accompanied by deforestation. We used MODIS EVI and SWIR time-series from 2001-2014 to classify changes in tree cover across MSEA; of a total of 6849 sample points, 75% were used to train the classifier and 25% for verification. The classification consists of 24 classes, of which 17 represent tree crops. Project results suggest that 4.4 m ha of rubber have been planted since 2003; 50% of rubber is planted on former evergreen forest land, 18% on deciduous forest land, and 32% on low vegetation areas (former crop lands, bushes, scrub). Tree crops occupy about 8% of the landscape (half of that is rubber). Due to the differences in their political and economic histories, these countries display different LCLUCs. In northern Laos, smallholder rubber plantations dominate and shifting cultivation is common in the uplands. In southern Laos, large-scale plantations of rubber, coffee, eucalyptus, and sugarcane are widespread. In Thailand, vast areas are covered by annual agriculture; fruit trees and rubber are the prevailing tree crops and are mostly planted by smallholders. In Cambodia, large-scale rubber plantations have expanded in recent years on forest lands; smallholder plantations of cashews and rubber also occur. In Vietnam, smallholder tree crops (e.g. rubber, cashews, coffee) were already established before 2000, but since then have continued to expand. Contrary to our hypothesis, boom crops are planted primarily on forest lands and are a cause of deforestation in MSEA.

  3. How little pain and disability do patients with low back pain have to experience to feel that they have recovered?

    PubMed Central

    Maher, Christopher G.; Herbert, Robert D.; Hancock, Mark J.; Hush, Julia M.; Smeets, Robert J.

    2010-01-01

    Epidemiological and clinical studies of people with low back pain (LBP) commonly measure the incidence of recovery. The pain numerical rating scale (NRS), scored from 0 to 10, and the Roland Morris disability questionnaire (RMDQ), scored from 0 to 24, are two instruments often used to define recovery. On both scales higher scores indicate greater severity. There is no consensus, however, on the cutoff scores on these scales that classify people as having recovered. The aim of this study was to determine which cutoff scores most accurately classify those who had recovered from LBP. Subjects from four clinical studies were categorized as 'recovered' or 'unrecovered' according to their self-rating on a global perceived effect scale. Odds ratios were calculated for scores of 0, 1, 2, 3 and 4 on the NRS and RMDQ to predict perceived recovery. Scores of 0 on the NRS and ≤2 on the RMDQ most accurately identify patients who consider themselves completely recovered. The diagnostic odds ratio (OR) for predicting recovery was 43.9 for a score of 0 on the NRS and 17.6 for a score of ≤2 on the RMDQ. There was no apparent effect of LBP duration or length of follow-up period on the optimal cutoff score. ORs for the NRS were generally higher than those for the RMDQ. Cutoffs of 0 on the NRS and 2 on the RMDQ most accurately classify subjects as recovered from LBP. Subjects consider pain more than disability when determining their recovery status. PMID:20229120
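    A sketch of the cutoff analysis: for each candidate NRS cutoff, cross-tabulate the test result against self-rated recovery and compute the diagnostic odds ratio (TP·TN)/(FP·FN). The scores and recovery labels below are synthetic, so the printed ORs will not match the paper's values.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    recovered = rng.integers(0, 2, 300).astype(bool)
    # Recovered subjects tend to report low NRS scores in this toy data.
    nrs = np.where(recovered, rng.integers(0, 3, 300), rng.integers(2, 11, 300))

    for cutoff in range(5):                  # classify "recovered" if NRS <= cutoff
        test_pos = nrs <= cutoff
        tp = np.sum(test_pos & recovered)
        fp = np.sum(test_pos & ~recovered)
        fn = np.sum(~test_pos & recovered)
        tn = np.sum(~test_pos & ~recovered)
        dor = (tp * tn) / max(fp * fn, 1)    # guard against division by zero
        print(f"NRS <= {cutoff}: diagnostic OR = {dor:.1f}")
    ```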

  4. Employing Machine-Learning Methods to Study Young Stellar Objects

    NASA Astrophysics Data System (ADS)

    Moore, Nicholas

    2018-01-01

    Vast amounts of data exist in the astronomical data archives, and yet a large number of sources remain unclassified. We developed a multi-wavelength pipeline to classify infrared sources. The pipeline uses supervised machine learning methods to classify objects into the appropriate categories. The program is fed data that is already classified to train it, and is then applied to unknown catalogues. The primary use for such a pipeline is the rapid classification and cataloging of data that would take a much longer time to classify otherwise. While our primary goal is to study young stellar objects (YSOs), the applications extend beyond the scope of this project. We present preliminary results from our analysis and discuss future applications.

  5. High Stimulus-Related Information in Barrel Cortex Inhibitory Interneurons

    PubMed Central

    Reyes-Puerta, Vicente; Kim, Suam; Sun, Jyh-Jang; Imbrosci, Barbara; Kilb, Werner; Luhmann, Heiko J.

    2015-01-01

    The manner in which populations of inhibitory (INH) and excitatory (EXC) neocortical neurons collectively encode stimulus-related information is a fundamental, yet still unresolved question. Here we address this question by simultaneously recording with large-scale multi-electrode arrays (of up to 128 channels) the activity of cell ensembles (of up to 74 neurons) distributed along all layers of 3–4 neighboring cortical columns in the anesthetized adult rat somatosensory barrel cortex in vivo. Using two different whisker stimulus modalities (location and frequency) we show that individual INH neurons – classified as such according to their distinct extracellular spike waveforms – discriminate better between restricted sets of stimuli (≤6 stimulus classes) than EXC neurons in granular and infra-granular layers. We also demonstrate that ensembles of INH cells jointly provide as much information about such stimuli as comparable ensembles containing the ~20% most informative EXC neurons, however presenting less information redundancy – a result which was consistent when applying both theoretical information measurements and linear discriminant analysis classifiers. These results suggest that a consortium of INH neurons dominates the information conveyed to the neocortical network, thereby efficiently processing incoming sensory activity. This conclusion extends our view on the role of the inhibitory system to orchestrate cortical activity. PMID:26098109

  6. Pre-trained convolutional neural networks as feature extractors toward improved malaria parasite detection in thin blood smear images.

    PubMed

    Rajaraman, Sivaramakrishnan; Antani, Sameer K; Poostchi, Mahdieh; Silamut, Kamolrat; Hossain, Md A; Maude, Richard J; Jaeger, Stefan; Thoma, George R

    2018-01-01

    Malaria is a blood disease caused by Plasmodium parasites transmitted through the bite of the female Anopheles mosquito. Microscopists commonly examine thick and thin blood smears to diagnose disease and compute parasitemia. However, their accuracy depends on smear quality and expertise in classifying and counting parasitized and uninfected cells. Such an examination can be arduous for large-scale diagnoses, resulting in poor quality. State-of-the-art image-analysis-based computer-aided diagnosis (CADx) methods that apply machine learning (ML) techniques to microscopic images of the smears using hand-engineered features demand expertise in analyzing morphological, textural, and positional variations of the region of interest (ROI). In contrast, Convolutional Neural Networks (CNNs), a class of deep learning (DL) models, promise highly scalable and superior results with end-to-end feature extraction and classification. Automated malaria screening using DL techniques could therefore serve as an effective diagnostic aid. In this study, we evaluate the performance of pre-trained CNN-based DL models as feature extractors toward classifying parasitized and uninfected cells to aid in improved disease screening. We experimentally determine the optimal model layers for feature extraction from the underlying data. Statistical validation of the results demonstrates the use of pre-trained CNNs as a promising tool for feature extraction for this purpose.
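    A generic sketch of the approach, assuming a recent PyTorch/torchvision and scikit-learn: a pre-trained CNN used as a frozen feature extractor with a simple classifier on top. The ResNet-18 backbone and the random tensors are illustrative choices, not the architectures or data used in the study.

    ```python
    import torch
    import torchvision.models as models
    from sklearn.linear_model import LogisticRegression

    # ImageNet-pretrained backbone (weights are downloaded on first use).
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()      # drop the ImageNet classification head
    backbone.eval()

    # Stand-ins for pre-processed 224x224 RGB cell patches and their labels.
    images = torch.randn(32, 3, 224, 224)
    labels = torch.randint(0, 2, (32,))    # 0 = uninfected, 1 = parasitized

    with torch.no_grad():
        feats = backbone(images).numpy()   # one 512-d feature vector per patch

    clf = LogisticRegression(max_iter=1000).fit(feats, labels.numpy())
    print("train accuracy:", clf.score(feats, labels.numpy()))
    ```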

  7. Classifying elephant behaviour through seismic vibrations.

    PubMed

    Mortimer, Beth; Rees, William Lake; Koelemeijer, Paula; Nissen-Meyer, Tarje

    2018-05-07

    Seismic waves - vibrations within and along the Earth's surface - are ubiquitous sources of information. During propagation, physical factors can obscure information transfer via vibrations and influence propagation range [1]. Here, we explore how terrain type and background seismic noise influence the propagation of seismic vibrations generated by African elephants. In Kenya, we recorded the ground-based vibrations of different wild elephant behaviours, such as locomotion and infrasonic vocalisations [2], as well as natural and anthropogenic seismic noise. We employed techniques from seismology to transform the geophone recordings into source functions - the time-varying seismic signature generated at the source. We used computer modelling to constrain the propagation ranges of elephant seismic vibrations for different terrains and noise levels. Behaviours that generate a high force on a sandy terrain with low noise propagate the furthest, over the kilometre scale. Our modelling also predicts that specific elephant behaviours can be distinguished and monitored over a range of propagation distances and noise levels. We conclude that seismic cues have considerable potential for both behavioural classification and remote monitoring of wildlife. In particular, classifying the seismic signatures of specific behaviours of large mammals remotely in real time, such as elephant running, could inform on poaching threats. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  8. Large-Scale Propagation of Ultrasound in a 3-D Breast Model Based on High-Resolution MRI Data

    PubMed Central

    Tillett, Jason C.; Metlay, Leon A.; Waag, Robert C.

    2010-01-01

    A 40 × 35 × 25-mm³ specimen of human breast consisting mostly of fat and connective tissue was imaged using a 3-T magnetic resonance scanner. The resolutions in the image plane and in the orthogonal direction were 130 μm and 150 μm, respectively. Initial processing to prepare the data for segmentation consisted of contrast inversion, interpolation, and noise reduction. Noise reduction used a multilevel bidirectional median filter to preserve edges. The volume of data was segmented into regions of fat and connective tissue by using a combination of local and global thresholding. Local thresholding was performed to preserve fine detail, while global thresholding was performed to minimize the within-class intensity variance of voxels classified as background and voxels classified as object. After smoothing the data to avoid aliasing artifacts, the segmented data volume was visualized using isosurfaces. The isosurfaces were enhanced using transparency, lighting, shading, reflectance, and animation. Computations of pulse propagation through the model illustrate its utility for the study of ultrasound aberration. The results show the feasibility of using the described combination of methods to demonstrate tissue morphology in a form that provides insight about the way ultrasound beams are aberrated in three dimensions by tissue. PMID:20172794
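    The global step as described, choosing the threshold that best separates the two voxel classes by variance, is essentially Otsu's method. A minimal sketch assuming scikit-image; the bimodal volume below is a synthetic stand-in for the MRI data.

    ```python
    import numpy as np
    from skimage.filters import threshold_otsu

    rng = np.random.default_rng(5)
    # Bimodal toy volume: "fat" voxels around 0.3, "connective tissue" around 0.7.
    volume = np.concatenate([rng.normal(0.3, 0.05, 5000),
                             rng.normal(0.7, 0.05, 5000)]).reshape(10, 10, 100)

    t = threshold_otsu(volume)             # optimal class-separating threshold
    segmented = volume > t                 # True = object, False = background
    print(f"threshold {t:.2f}, object fraction {segmented.mean():.2f}")
    ```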

  9. Prevalence of infection and genotypes of GBV-C/HGV among homosexual men.

    PubMed

    Hattori, Junko; Ibe, Shiro; Nagai, Hiromi; Wada, Kaoru; Morishita, Takayuki; Sato, Katsuhiko; Utsumi, Makoto; Kaneda, Tsuguhiro

    2003-01-01

    Since the discovery of GB virus-C (GBV-C) and hepatitis G virus (HGV), many studies have been performed. These viruses are now known to be parenterally, as well as sexually transmitted. A phylogenetic analysis also revealed that GBV-C has five major genotypes: type 1 predominates in West Africa, type 2 in Europe and the United States, type 3 in parts of Asia, type 4 in Southeast Asia, and type 5 in South Africa. Despite the number of reports so far, there have been few large-scale surveys of homosexual men to determine the prevalence of the GBV-C/HGV infections. We examined the levels of GBV-C/HGV viremia in 297 homosexual men who attended the Nagoya Lesbian and Gay Revolution held in Nagoya, Japan. Reverse transcription-polymerase chain reaction (RT-PCR)/nested PCR of the GBV-C/HGV 5'-non-coding region (NCR), and base sequence analyses showed that the infection rate was 12.5%, and genotypes in this population were classified into type 2 (32%) and type 3 (68%). None were classified as types 1, 4, or 5 in this study. Our results indicate that the GBV-C/HGV type 2 seen mainly in Europe and the US is spreading widely in Japan, especially in the Nagoya district.

  10. Large-scale propagation of ultrasound in a 3-D breast model based on high-resolution MRI data.

    PubMed

    Salahura, Gheorghe; Tillett, Jason C; Metlay, Leon A; Waag, Robert C

    2010-06-01

    A 40 × 35 × 25-mm³ specimen of human breast consisting mostly of fat and connective tissue was imaged using a 3-T magnetic resonance scanner. The resolutions in the image plane and in the orthogonal direction were 130 μm and 150 μm, respectively. Initial processing to prepare the data for segmentation consisted of contrast inversion, interpolation, and noise reduction. Noise reduction used a multilevel bidirectional median filter to preserve edges. The volume of data was segmented into regions of fat and connective tissue by using a combination of local and global thresholding. Local thresholding was performed to preserve fine detail, while global thresholding was performed to minimize the within-class intensity variance of voxels classified as background and voxels classified as object. After smoothing the data to avoid aliasing artifacts, the segmented data volume was visualized using isosurfaces. The isosurfaces were enhanced using transparency, lighting, shading, reflectance, and animation. Computations of pulse propagation through the model illustrate its utility for the study of ultrasound aberration. The results show the feasibility of using the described combination of methods to demonstrate tissue morphology in a form that provides insight about the way ultrasound beams are aberrated in three dimensions by tissue.

  11. Pelvic artery calcification detection on CT scans using convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Liu, Jiamin; Lu, Le; Yao, Jianhua; Bagheri, Mohammadhadi; Summers, Ronald M.

    2017-03-01

    Artery calcification is observed commonly in elderly patients, especially in patients with chronic kidney disease, and may affect coronary, carotid and peripheral arteries. Vascular calcification has been associated with many clinical outcomes. Manual identification of calcification in CT scans requires substantial expert interaction, which makes it time-consuming and infeasible for large-scale studies. Many methods have been proposed for coronary artery calcification detection in cardiac CT scans, and they commonly require coronary artery extraction before calcification detection. However, there are few works on abdominal or pelvic artery calcification detection. In this work, we present a method for automatic pelvic artery calcification detection on CT scans. The method uses the recently introduced Faster region-based convolutional neural network (Faster R-CNN) to identify artery calcification directly, without artery extraction, since pelvic artery extraction itself is challenging. Our method first generates category-independent region proposals for each slice of the input CT scan using a region proposal network (RPN). Then, each region proposal is jointly classified and refined by a softmax classifier and a bounding-box regressor. We applied the detection method to 500 images from 20 patient CT scans for evaluation. The detection system achieved a 77.4% average precision and an 85% sensitivity at 1 false positive per image.
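    A sketch of the detection stage using torchvision's stock Faster R-CNN, not the study's own network or data. The model below is untrained (random weights), so the forward pass only illustrates the RPN-plus-classifier/regressor interface on a placeholder CT slice.

    ```python
    import torch
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    # Two classes: background + calcification. No pretrained weights loaded,
    # so detections are meaningless; this only shows the pipeline's shape.
    model = fasterrcnn_resnet50_fpn(weights=None, weights_backbone=None,
                                    num_classes=2)
    model.eval()

    ct_slice = torch.rand(1, 512, 512)           # one CT slice, values in [0, 1]
    with torch.no_grad():
        out = model([ct_slice.expand(3, -1, -1)])[0]

    # Each detection is a region proposal jointly classified and box-regressed.
    print(out["boxes"].shape, out["scores"].shape)
    ```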

  12. An investigation of Hebbian phase sequences as assembly graphs

    PubMed Central

    Almeida-Filho, Daniel G.; Lopes-dos-Santos, Vitor; Vasconcelos, Nivaldo A. P.; Miranda, José G. V.; Tort, Adriano B. L.; Ribeiro, Sidarta

    2014-01-01

    Hebb proposed that synapses between neurons that fire synchronously are strengthened, forming cell assemblies and phase sequences. The former, on a shorter scale, are ensembles of synchronized cells that function transiently as a closed processing system; the latter, on a larger scale, correspond to the sequential activation of cell assemblies able to represent percepts and behaviors. Nowadays, the recording of large neuronal populations allows for the detection of multiple cell assemblies. Within Hebb's theory, the next logical step is the analysis of phase sequences. Here we detected phase sequences as consecutive assembly activation patterns, and then analyzed their graph attributes in relation to behavior. We investigated action potentials recorded from the adult rat hippocampus and neocortex before, during and after novel object exploration (experimental periods). Within assembly graphs, each assembly corresponded to a node, and each edge corresponded to the temporal sequence of consecutive node activations. The sum of all assembly activations was proportional to firing rates, but the activity of individual assemblies was not. Assembly repertoire was stable across experimental periods, suggesting that novel experience does not create new assemblies in the adult rat. Assembly graph attributes, on the other hand, varied significantly across behavioral states and experimental periods, and were separable enough to correctly classify experimental periods (Naïve Bayes classifier; maximum AUROCs ranging from 0.55 to 0.99) and behavioral states (waking, slow wave sleep, and rapid eye movement sleep; maximum AUROCs ranging from 0.64 to 0.98). Our findings agree with Hebb's view that assemblies correspond to primitive building blocks of representation, nearly unchanged in the adult, while phase sequences are labile across behavioral states and change after novel experience. The results are compatible with a role for phase sequences in behavior and cognition. PMID:24782715

  13. LOFAR discovery of an ultra-steep radio halo and giant head-tail radio galaxy in Abell 1132

    NASA Astrophysics Data System (ADS)

    Wilber, A.; Brüggen, M.; Bonafede, A.; Savini, F.; Shimwell, T.; van Weeren, R. J.; Rafferty, D.; Mechev, A. P.; Intema, H.; Andrade-Santos, F.; Clarke, A. O.; Mahony, E. K.; Morganti, R.; Prandoni, I.; Brunetti, G.; Röttgering, H.; Mandal, S.; de Gasperin, F.; Hoeft, M.

    2018-01-01

    Low-Frequency Array (LOFAR) observations at 144 MHz have revealed large-scale radio sources in the unrelaxed galaxy cluster Abell 1132. The cluster hosts diffuse radio emission on scales of ∼650 kpc near the cluster centre and a head-tail (HT) radio galaxy, extending up to 1 Mpc, south of the cluster centre. The central diffuse radio emission is not seen in NRAO VLA FIRST Survey, Westerbork Northern Sky Survey, nor in C & D array VLA observations at 1.4 GHz, but is detected in our follow-up Giant Meterwave Radio Telescope (GMRT) observations at 325 MHz. Using LOFAR and GMRT data, we determine the spectral index of the central diffuse emission to be α = -1.75 ± 0.19 (S ∝ να). We classify this emission as an ultra-steep spectrum radio halo and discuss the possible implications for the physical origin of radio haloes. The HT radio galaxy shows narrow, collimated emission extending up to 1 Mpc and another 300 kpc of more diffuse, disturbed emission, giving a full projected linear size of 1.3 Mpc - classifying it as a giant radio galaxy (GRG) and making it the longest HT found to date. The head of the GRG coincides with an elliptical galaxy (SDSS J105851.01+564308.5) belonging to Abell 1132. In our LOFAR image, there appears to be a connection between the radio halo and the GRG. The turbulence that may have produced the halo may have also affected the tail of the GRG. In turn, the GRG may have provided seed electrons for the radio halo.
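    Given flux densities at the two observing frequencies, the spectral index follows directly from S ∝ ν^α. The flux values below are hypothetical, chosen only so the arithmetic reproduces an index of roughly -1.75; they are not the measured fluxes.

    ```python
    import math

    nu1, nu2 = 144e6, 325e6        # LOFAR and GMRT observing frequencies (Hz)
    S1, S2 = 1.0, 0.24             # hypothetical flux densities (Jy)

    alpha = math.log(S1 / S2) / math.log(nu1 / nu2)
    print(f"alpha = {alpha:.2f}")  # ultra-steep spectrum: alpha well below -1
    ```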

  14. Non-parametric transient classification using adaptive wavelets

    NASA Astrophysics Data System (ADS)

    Varughese, Melvin M.; von Sachs, Rainer; Stephanou, Michael; Bassett, Bruce A.

    2015-11-01

    Classifying transients based on multiband light curves is a challenging but crucial problem in the era of Gaia and the Large Synoptic Survey Telescope, since the sheer volume of transients will make spectroscopic classification unfeasible. We present a non-parametric classifier that predicts the transient's class given training data. It implements two novel components: the use of the BAGIDIS wavelet methodology - a characterization of functional data using hierarchical wavelet coefficients - as well as the introduction of a ranked probability classifier on the wavelet coefficients that handles both the heteroscedasticity of the data and the potential non-representativity of the training set. The classifier is simple to implement, while a major advantage of the BAGIDIS wavelets is that they are translation invariant; hence, BAGIDIS does not need the light curves to be aligned to extract features. Further, BAGIDIS is non-parametric, so it can be used effectively in blind searches for new objects. We demonstrate the effectiveness of our classifier on the Supernova Photometric Classification Challenge, correctly classifying supernova light curves as Type Ia or non-Ia. We train our classifier on the spectroscopically confirmed subsample (which is not representative) and show that it works well for supernovae with observed light-curve time spans greater than 100 d (roughly 55 per cent of the data set). For such data, we obtain a Ia efficiency of 80.5 per cent and a purity of 82.4 per cent, yielding a highly competitive challenge score of 0.49. This indicates that our `model-blind' approach may be particularly suitable for the general classification of astronomical transients in the era of large synoptic sky surveys.

  15. Linear Subpixel Learning Algorithm for Land Cover Classification from WELD using High Performance Computing

    NASA Technical Reports Server (NTRS)

    Kumar, Uttam; Nemani, Ramakrishna R.; Ganguly, Sangram; Kalia, Subodh; Michaelis, Andrew

    2017-01-01

    In this work, we use a Fully Constrained Least Squares Subpixel Learning Algorithm to unmix global WELD (Web Enabled Landsat Data) to obtain fractions or abundances of substrate (S), vegetation (V) and dark objects (D) classes. Because of the sheer volume of data and compute needs, we leveraged the NASA Earth Exchange (NEX) high performance computing architecture to optimize and scale our algorithm for large-scale processing. Subsequently, the S-V-D abundance maps were characterized into 4 classes, namely forest, farmland, water and urban areas (with NPP-VIIRS - national polar orbiting partnership visible infrared imaging radiometer suite - nighttime lights data) over California, USA using a Random Forest classifier. Validation of these land cover maps with NLCD (National Land Cover Database) 2011 products and NAFD (North American Forest Dynamics) static forest cover maps showed that an overall classification accuracy of over 91 percent was achieved, which is a 6 percent improvement in unmixing-based classification relative to per-pixel-based classification. As such, abundance maps continue to offer a useful alternative to classification maps derived from high-spatial-resolution data for forest inventory analysis, multi-class mapping for eco-climatic models and applications, fast multi-temporal trend analysis, and societal and policy-relevant applications needed at the watershed scale.
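    A minimal sketch of fully constrained least squares (FCLS) unmixing, assuming NumPy and SciPy: non-negative abundances that sum to one, enforced through the standard augmentation of the endmember matrix (as in Heinz and Chang). The endmember spectra here are synthetic placeholders, not WELD signatures.

    ```python
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(6)
    E = rng.random((6, 3))                  # 6 bands x 3 endmembers (S, V, D)
    true_a = np.array([0.5, 0.3, 0.2])
    pixel = E @ true_a + rng.normal(0, 0.01, 6)

    delta = 1e3                             # large weight enforcing sum-to-one
    E_aug = np.vstack([E, delta * np.ones(3)])
    p_aug = np.append(pixel, delta)
    abundances, _ = nnls(E_aug, p_aug)      # non-negative least squares
    print(abundances.round(3))              # ~ [0.5, 0.3, 0.2]
    ```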

  16. Linear Subpixel Learning Algorithm for Land Cover Classification from WELD using High Performance Computing

    NASA Astrophysics Data System (ADS)

    Ganguly, S.; Kumar, U.; Nemani, R. R.; Kalia, S.; Michaelis, A.

    2017-12-01

    In this work, we use a Fully Constrained Least Squares Subpixel Learning Algorithm to unmix global WELD (Web Enabled Landsat Data) to obtain fractions or abundances of substrate (S), vegetation (V) and dark objects (D) classes. Because of the sheer volume of data and compute needs, we leveraged the NASA Earth Exchange (NEX) high performance computing architecture to optimize and scale our algorithm for large-scale processing. Subsequently, the S-V-D abundance maps were characterized into 4 classes, namely forest, farmland, water and urban areas (with NPP-VIIRS - national polar orbiting partnership visible infrared imaging radiometer suite - nighttime lights data) over California, USA using a Random Forest classifier. Validation of these land cover maps with NLCD (National Land Cover Database) 2011 products and NAFD (North American Forest Dynamics) static forest cover maps showed that an overall classification accuracy of over 91% was achieved, which is a 6% improvement in unmixing-based classification relative to per-pixel-based classification. As such, abundance maps continue to offer a useful alternative to classification maps derived from high-spatial-resolution data for forest inventory analysis, multi-class mapping for eco-climatic models and applications, fast multi-temporal trend analysis, and societal and policy-relevant applications needed at the watershed scale.

  17. Evaluating the Effectiveness of Flood Control Strategies in Contrasting Urban Watersheds and Implications for Houston's Future Flood Vulnerability

    NASA Astrophysics Data System (ADS)

    Ganguly, S.; Kumar, U.; Nemani, R. R.; Kalia, S.; Michaelis, A.

    2016-12-01

    In this work, we use a Fully Constrained Least Squares Subpixel Learning Algorithm to unmix global WELD (Web Enabled Landsat Data) to obtain fractions or abundances of substrate (S), vegetation (V) and dark objects (D) classes. Because of the sheer volume of data and compute needs, we leveraged the NASA Earth Exchange (NEX) high performance computing architecture to optimize and scale our algorithm for large-scale processing. Subsequently, the S-V-D abundance maps were characterized into 4 classes, namely forest, farmland, water and urban areas (with NPP-VIIRS - national polar orbiting partnership visible infrared imaging radiometer suite - nighttime lights data) over California, USA using a Random Forest classifier. Validation of these land cover maps with NLCD (National Land Cover Database) 2011 products and NAFD (North American Forest Dynamics) static forest cover maps showed that an overall classification accuracy of over 91% was achieved, which is a 6% improvement in unmixing-based classification relative to per-pixel-based classification. As such, abundance maps continue to offer a useful alternative to classification maps derived from high-spatial-resolution data for forest inventory analysis, multi-class mapping for eco-climatic models and applications, fast multi-temporal trend analysis, and societal and policy-relevant applications needed at the watershed scale.

  18. A simple atomic-level hydrophobicity scale reveals protein interfacial structure.

    PubMed

    Kapcha, Lauren H; Rossky, Peter J

    2014-01-23

    Many amino acid residue hydrophobicity scales have been created in an effort to better understand and rapidly characterize water-protein interactions based only on protein structure and sequence. There is surprisingly low consistency in the ranking of residue hydrophobicity between scales, and their ability to provide insightful characterization varies substantially across subject proteins. All current scales characterize hydrophobicity based on entire amino acid residue units. We introduce a simple binary but atomic-level hydrophobicity scale that allows for the classification of polar and non-polar moieties within single residues, including backbone atoms. This simple scale is first shown to capture the anticipated hydrophobic character for those whole residues that align in classification among most scales. Examination of a set of protein binding interfaces establishes good agreement between residue-based and atomic-level descriptions of hydrophobicity for five residues, while the remaining residues produce discrepancies. We then show that the atomistic scale properly classifies the hydrophobicity of functionally important regions where residue-based scales fail. To illustrate the utility of the new approach, we show that the atomic-level scale rationalizes the hydration of two hydrophobic pockets and the presence of a void in a third pocket within a single protein and that it appropriately classifies all of the functionally important hydrophilic sites within two otherwise hydrophobic pores. We suggest that an atomic level of detail is, in general, necessary for the reliable depiction of hydrophobicity for all protein surfaces. The present formulation can be implemented simply in a manner no more complex than current residue-based approaches. © 2013.

  19. Landscape effects on mallard habitat selection at multiple spatial scales during the non-breeding period

    USGS Publications Warehouse

    Beatty, William S.; Webb, Elisabeth B.; Kesler, Dylan C.; Raedeke, Andrew H.; Naylor, Luke W.; Humburg, Dale D.

    2014-01-01

    Previous studies that evaluated effects of landscape-scale habitat heterogeneity on migratory waterbird distributions were spatially limited and temporally restricted to one major life-history phase. However, effects of landscape-scale habitat heterogeneity on long-distance migratory waterbirds can be studied across the annual cycle using new technologies, including global positioning system satellite transmitters. We used Bayesian discrete choice models to examine the influence of local habitats and landscape composition on habitat selection by a generalist dabbling duck, the mallard (Anas platyrhynchos), in the midcontinent of North America during the non-breeding period. Using a previously published empirical movement metric, we separated the non-breeding period into three seasons, including autumn migration, winter, and spring migration. We defined spatial scales based on movement patterns such that movements >0.25 and <30.00 km were classified as local scale and movements >30.00 km were classified as relocation scale. Habitat selection at the local scale was generally influenced by local and landscape-level variables across all seasons. Variables in top models at the local scale included proximities to cropland, emergent wetland, open water, and woody wetland. Similarly, variables associated with area of cropland, emergent wetland, open water, and woody wetland were also included at the local scale. At the relocation scale, mallards selected resource units based on more generalized variables, including proximity to wetlands and total wetland area. Our results emphasize the role of landscape composition in waterbird habitat selection and provide further support for local wetland landscapes to be considered functional units of waterbird conservation and management.

  20. Invariant object recognition based on the generalized discrete radon transform

    NASA Astrophysics Data System (ADS)

    Easley, Glenn R.; Colonna, Flavia

    2004-04-01

    We introduce a method for classifying objects based on special cases of the generalized discrete Radon transform. We adjust the transform and the corresponding ridgelet transform by means of circular shifting and a singular value decomposition (SVD) to obtain a translation, rotation and scaling invariant set of feature vectors. We then use a back-propagation neural network to classify the input feature vectors. We conclude with experimental results and compare these with other invariant recognition methods.
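    A simplified stand-in for the idea, assuming scikit-image and NumPy: the classical Radon transform over a full circle of angles turns a rotation of the image into a circular shift of sinogram columns, and singular values are invariant to such column permutations, so they serve as rotation-insensitive features. This is not the paper's generalized discrete Radon or ridgelet construction.

    ```python
    import numpy as np
    from skimage.transform import radon, rotate

    img = np.zeros((64, 64))
    img[20:44, 28:36] = 1.0                    # a simple bar-shaped object

    def svd_feature(image, n=8):
        sino = radon(image, theta=np.arange(360.0))   # angles 0..359 degrees
        return np.linalg.svd(sino, compute_uv=False)[:n]

    f0 = svd_feature(img)
    f1 = svd_feature(rotate(img, 30))          # rotated copy of the object
    print(np.abs(f0 - f1).max())               # small: features nearly invariant
    ```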

  1. More robust regional precipitation projection from selected CMIP5 models based on multiple-dimensional metrics

    NASA Astrophysics Data System (ADS)

    Qian, Y.; Wang, L.; Leung, L. R.; Lin, G.; Lu, J.; Gao, Y.; Zhang, Y.

    2017-12-01

    Projecting precipitation changes is challenging because of incomplete understanding of the climate system and biases and uncertainty in climate models. In East Asia, summer precipitation is dominantly influenced by the monsoon circulation, and the global models from the Coupled Model Intercomparison Project Phase 5 (CMIP5) give varying projections of precipitation change for the 21st century. It is critical for the community to know which models' projections are more reliable in response to natural and anthropogenic forcings. In this study we defined multiple-dimensional metrics measuring model performance in simulating the present-day large-scale circulation, regional precipitation and the relationship between them. The large-scale circulation features examined in this study include the lower tropospheric southwesterly winds, the western North Pacific subtropical high, the South China Sea subtropical high, and the East Asian westerly jet in the upper troposphere. Each of these circulation features transports moisture to East Asia, enhancing the moist static energy and strengthening the Meiyu moisture front that is the primary mechanism for precipitation generation in eastern China. Based on these metrics, the 30 models in the CMIP5 ensemble are classified into three groups. Models in the top performing group projected regional precipitation patterns that are more similar to each other than those of the bottom or middle performing groups, and consistently projected statistically significant increasing trends in two of the large-scale circulation indices and in precipitation. In contrast, models in the bottom or middle performing groups projected small drying or no trends in precipitation. We also find that models merely reproducing the observed precipitation climatology reasonably well do not guarantee a more reliable projection of future precipitation, because good simulation skill can be achieved through compensating errors from multiple sources. Herein, the potential for more robust projections of precipitation changes at regional scale is demonstrated through the use of discriminating metrics to subsample the multi-model ensemble. The results from this study provide insight into how to select models from the CMIP ensemble to project regional climate and hydrological cycle changes.

  2. VizieR Online Data Catalog: IRAS Point Source Identifications (MacConnell, 1993; rev. 2009)

    NASA Astrophysics Data System (ADS)

    MacConnell, D. J.

    2010-08-01

    Most of the sources are south of the celestial equator and were classified in increasing galactic longitude over the period Sept. 1985 to May 1992. They were classified on Kodak I-N objective-prism plates taken primarily with the Curtis Schmidt telescope at Cerro Tololo, but some northern plates taken with the Burrell Schmidt at Kitt Peak were also used for classification. The spectra cover the range 680-880 nm at a dispersion of 340 nm/mm at the A-band, and the plate scale is 96.6"/mm. They are ideal for classifying M stars of type M3 and cooler (increasing strength of TiO and VO bands) and carbon stars (CN bands), but stars warmer than M2 and most S stars cannot be classified or identified as such. The M stars M3 and cooler can be separated into about five groups. The limiting magnitude of the deepest plates is about I = 13.5. The IRAS point sources (PS) were identified on transparent overlays made to the plate scale for each plate center, and the association of a spectrum with a given PS is usually unambiguous. In cases of doubt or offset, a comment is made. Note that there are some cases where the PSC gives an incorrect association on the basis of position, and the correct association is with a faint, uncatalogued M star. (3 data files).

  3. Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.

    PubMed

    Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan

    2017-07-01

    Detection and classification of breast lesions using mammographic images is one of the most difficult problems in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of detection/classification still needs improvement. In this paper, we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied to dense scale-invariant feature transform (DSIFT) features. A linear classifier is also learned simultaneously with the dictionary, which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.

  4. Choosing the Most Effective Pattern Classification Model under Learning-Time Constraint.

    PubMed

    Saito, Priscila T M; Nakamura, Rodrigo Y M; Amorim, Willian P; Papa, João P; de Rezende, Pedro J; Falcão, Alexandre X

    2015-01-01

    Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.

  5. MIDAS, prototype Multivariate Interactive Digital Analysis System, phase 1. Volume 3: Wiring diagrams

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.; Christenson, D.; Gordon, M.; Kistler, R.; Lampert, S.; Marshall, R.; Mclaughlin, R.

    1974-01-01

    The MIDAS System is a third-generation, fast, multispectral recognition system able to keep pace with the large quantity and high rates of data acquisition from present and projected sensors. A principal objective of the MIDAS Program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in Phase I of the overall program are described. The system contains a mini-computer to control the various high-speed processing elements in the data path and a classifier which implements an all-digital prototype multivariate-Gaussian maximum likelihood decision algorithm operating at 2 × 10⁵ pixels/sec. Sufficient hardware was developed to perform signature extraction from computer-compatible tapes, compute classifier coefficients, control the classifier operation, and diagnose operation. The MIDAS construction and wiring diagrams are given.

  6. MIDAS, prototype Multivariate Interactive Digital Analysis System, Phase 1. Volume 2: Diagnostic system

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.; Christenson, D.; Gordon, M.; Kistler, R.; Lampert, S.; Marshall, R.; Mclaughlin, R.

    1974-01-01

    The MIDAS System is a third-generation, fast, multispectral recognition system able to keep pace with the large quantity and high rates of data acquisition from present and projected sensors. A principal objective of the MIDAS Program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in Phase I of the overall program are described. The system contains a mini-computer to control the various high-speed processing elements in the data path and a classifier which implements an all-digital prototype multivariate-Gaussian maximum likelihood decision algorithm operating at 2 × 10⁵ pixels/sec. Sufficient hardware was developed to perform signature extraction from computer-compatible tapes, compute classifier coefficients, control the classifier operation, and diagnose operation. Diagnostic programs used to test MIDAS' operations are presented.
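    The decision rule MIDAS implemented in hardware, sketched in software with NumPy: assign each pixel to the class maximizing the multivariate-Gaussian log-likelihood. The two-class multispectral "signatures" are synthetic, and equal class priors are assumed.

    ```python
    import numpy as np

    rng = np.random.default_rng(8)
    classes = {0: rng.normal([1, 2, 3, 4], 0.5, (200, 4)),
               1: rng.normal([4, 3, 2, 1], 0.5, (200, 4))}

    # "Signature extraction": per-class mean vector and covariance matrix.
    params = {c: (X.mean(0), np.cov(X.T)) for c, X in classes.items()}

    def log_likelihood(x, mean, cov):
        d = x - mean
        return -0.5 * (np.log(np.linalg.det(cov))
                       + d @ np.linalg.solve(cov, d))

    pixel = np.array([1.2, 2.1, 2.9, 3.8])      # one multispectral pixel
    scores = {c: log_likelihood(pixel, m, S) for c, (m, S) in params.items()}
    print(max(scores, key=scores.get))           # -> class 0
    ```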

  7. Delinquency Level Classification Via the HEW Community Program Youth Impact Scales.

    ERIC Educational Resources Information Center

    Truckenmiller, James L.

    The former HEW National Strategy for Youth Development (NSYD) model was created as a community-based planning and procedural tool to promote youth development and prevent delinquency. To assess the predictive power of NSYD Impact Scales in classifying youths into low, medium, and high delinquency levels, male and female students aged 10-19 years…

  8. Receiver Operating Characteristic Curve Analysis of Wechsler Memory Scale-Revised Scores in Epilepsy Surgery Candidates.

    ERIC Educational Resources Information Center

    Barr, William B.

    1997-01-01

    Wechsler Memory Scale-Revised (WMS-R) scores were analyzed for 82 epilepsy surgery candidates and used in combination with receiver operating characteristic curves to classify patients with left (LTL) and right (RTL) temporal lobe seizure onset. Results indicate that WMS-R scores used alone or in combination provide relatively poor discrimination…

  9. PNAC: a protein nucleolar association classifier

    PubMed Central

    2011-01-01

    Background Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional. Results To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions. Conclusions Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments. PMID:21272300
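    A sketch of leave-one-out validation in the style reported, assuming scikit-learn; the features, labels, and the Gaussian naive Bayes stand-in are placeholders for PNAC's actual gene/protein characteristics and probabilistic predictor.

    ```python
    import numpy as np
    from sklearn.model_selection import LeaveOneOut, cross_val_predict
    from sklearn.naive_bayes import GaussianNB
    from sklearn.metrics import recall_score

    rng = np.random.default_rng(9)
    X = rng.normal(size=(120, 10))            # gene/protein feature vectors
    y = rng.integers(0, 4, 120)               # 4 nucleolar-association classes

    # One model fit per held-out protein, predictions reassembled in order.
    pred = cross_val_predict(GaussianNB(), X, y, cv=LeaveOneOut())

    # Per-class sensitivity (recall); the paper reports 0.72-0.90 on real data.
    print(recall_score(y, pred, average=None).round(2))
    ```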

  10. A review of downscaling procedures - a contribution to the research on climate change impacts at city scale

    NASA Astrophysics Data System (ADS)

    Smid, Marek; Costa, Ana; Pebesma, Edzer; Granell, Carlos; Bhattacharya, Devanjan

    2016-04-01

    Humankind is currently predominantly urban-based, and the majority of continuing population growth will take place in urban agglomerations. Urban systems are not only major drivers of climate change, but also impact hot spots. Furthermore, climate change impacts are commonly managed at city scale. Therefore, assessing climate change impacts on urban systems is a very relevant subject of research. Climate and its impacts at all levels (local, meso and global scale), as well as the inter-scale dependencies of those processes, should be subject to detailed analysis. While global and regional projections of future climate are currently available, local-scale information is lacking. Hence, statistical downscaling methodologies represent a potentially efficient way to help close this gap. In general, methodological reviews of downscaling procedures cover the various methods according to their application (e.g. downscaling for hydrological modelling). Some of the most recent and comprehensive studies, such as the ESSEM COST Action ES1102 (VALUE), use the concepts of Perfect Prognosis (Perfect Prog) and Model Output Statistics (MOS). Other classification schemes of downscaling techniques consider three main categories: linear methods, weather classifications and weather generators. Downscaling and climate modelling represent a multidisciplinary field in which researchers from various backgrounds intersect their efforts, resulting in specific terminology that may be somewhat confusing. For instance, Polynomial Regression (also called Surface Trend Analysis) is a statistical technique, yet in the context of spatial interpolation procedures it is commonly classified as deterministic, while kriging approaches are classified as stochastic. Furthermore, the terms "statistical" and "stochastic" (frequently used as names of sub-classes in downscaling methodological reviews) are not always treated as synonymous, even though both could be seen as identical, since both refer to methods that handle input modelling factors as variables with certain probability distributions. In addition, recent development is moving towards multi-step methodologies containing both deterministic and stochastic components. This evolution has led to the introduction of new terms such as hybrid or semi-stochastic approaches, which makes systematically classifying downscaling methods into the previously defined categories even more challenging. This work presents a review of statistical downscaling procedures that classifies the methods in two steps. In the first step, we describe several techniques that produce a single climatic surface based on observations; these methods are classified into two categories using an approximation to the broadest consensual statistical terms: linear and non-linear methods. The second step covers techniques that use simulations to generate alternative surfaces corresponding to different realizations of the same processes. Those simulations are essential because real observational data are limited in number, and such procedures are crucial for modelling extremes. This work emphasises the link between statistical downscaling methods and the research of climate change impacts at city scale.
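
    As a concrete illustration of one technique the review mentions, Polynomial Regression (Surface Trend Analysis), the sketch below fits a second-order trend surface to hypothetical station observations and evaluates it on a finer grid; all coordinates and values are invented:

        import numpy as np
        from sklearn.preprocessing import PolynomialFeatures
        from sklearn.linear_model import LinearRegression
        from sklearn.pipeline import make_pipeline

        # Hypothetical stations: (lon, lat) coordinates and a climate variable.
        coords = np.array([[6.1, 45.0], [6.3, 45.2], [6.0, 45.4],
                           [6.5, 45.1], [6.2, 45.3], [6.4, 45.5]])
        temp = np.array([12.1, 11.4, 10.9, 11.8, 11.0, 10.2])

        # Second-order trend surface: T(lon, lat) ~ polynomial in coordinates.
        trend = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
        trend.fit(coords, temp)

        # Evaluate the fitted surface on a finer grid (the "downscaled" field).
        lon, lat = np.meshgrid(np.linspace(6.0, 6.5, 50),
                               np.linspace(45.0, 45.5, 50))
        grid = np.column_stack([lon.ravel(), lat.ravel()])
        field = trend.predict(grid).reshape(lon.shape)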

  11. How large a training set is needed to develop a classifier for microarray data?

    PubMed

    Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M

    2008-01-01

    A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.
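
    The paper derives sample size analytically from standardized fold change, class prevalence and the number of genes; the sketch below addresses the same question by brute force instead, simulating expected accuracy as a function of training-set size under an assumed two-class model. This is not the authors' method, and all parameters are illustrative:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def expected_accuracy(n_train, n_genes=1000, n_informative=20,
                              fold=1.0, prevalence=0.5, reps=20, seed=0):
            """Simulate mean test accuracy for a given training-set size."""
            rng = np.random.default_rng(seed)
            accs = []
            for _ in range(reps):
                def draw(n):
                    y = (rng.random(n) < prevalence).astype(int)
                    X = rng.normal(size=(n, n_genes))
                    X[:, :n_informative] += fold * y[:, None]  # shifted genes
                    return X, y
                Xtr, ytr = draw(n_train)
                Xte, yte = draw(500)
                clf = LogisticRegression(penalty="l2", C=0.1, max_iter=1000)
                accs.append(clf.fit(Xtr, ytr).score(Xte, yte))
            return np.mean(accs)

        # Increase n_train until the expected accuracy plateaus.
        for n in (20, 40, 80, 160):
            print(n, round(expected_accuracy(n), 3))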

  12. Mathematical foundations of hybrid data assimilation from a synchronization perspective

    NASA Astrophysics Data System (ADS)

    Penny, Stephen G.

    2017-12-01

    The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.
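
    A common concrete form of the hybrid approach discussed here blends a flow-dependent ensemble covariance with a static climatological covariance before applying the standard Kalman gain. A minimal sketch with an invented 3-variable example follows (the function name and blending weight are assumptions, not the paper's formulation):

        import numpy as np

        def hybrid_kalman_update(xb, ensemble, B_clim, H, R, y, alpha=0.5):
            """One hybrid analysis step: blend ensemble and climatological
            background covariances, then apply the Kalman gain.
            alpha = 1 recovers a pure EnKF-style update, alpha = 0 pure 3D-Var."""
            P_ens = np.cov(ensemble, rowvar=False)       # flow-dependent estimate
            P = alpha * P_ens + (1.0 - alpha) * B_clim   # hybrid covariance
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
            return xb + K @ (y - H @ xb)

        # Toy 3-variable system with two observed components (values invented).
        rng = np.random.default_rng(1)
        ensemble = rng.normal(size=(10, 3))
        xb = ensemble.mean(axis=0)
        B_clim = np.eye(3)
        H = np.array([[1., 0., 0.], [0., 1., 0.]])
        R = 0.1 * np.eye(2)
        y = np.array([0.5, -0.2])
        xa = hybrid_kalman_update(xb, ensemble, B_clim, H, R, y)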

  13. Hyperswitch communication network

    NASA Technical Reports Server (NTRS)

    Peterson, J.; Pniel, M.; Upchurch, E.

    1991-01-01

    The Hyperswitch Communication Network (HCN) is a large-scale parallel computer prototype being developed at JPL, with commercial versions planned. The HCN computer being designed is a message-passing multiple-instruction multiple-data (MIMD) computer, and offers many advantages over traditional uniprocessors and bus-based multiprocessors in price-performance ratio, reliability and availability, and manufacturing. The HCN operating system provides a uniquely flexible environment that combines both parallel processing and distributed processing. This programming paradigm can achieve a balance among the following competing factors: performance in processing and communications, user friendliness, and fault tolerance. The prototype is being designed to accommodate a maximum of 64 state-of-the-art microprocessors. The HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.

  14. Social networks and spreading of epidemics

    NASA Astrophysics Data System (ADS)

    Trimper, Steffen; Zheng, Dafang; Brandau, Marian

    2004-05-01

    Epidemiological processes are studied within a recently proposed social network model using the susceptible-infected-refractory (SIR) dynamics of an epidemic. Within the network model, a population of individuals may be characterized by H independent hierarchies or dimensions, each of which consists of groupings of individuals into layers of subgroups. Detailed numerical simulations reveal that for H > 1, global spreading results regardless of the degree of homophily α of the individuals forming a social circle. For H = 1, a transition from a global to a local spread occurs as the population becomes decomposed into increasingly homophilous groups. Multiple dimensions in classifying individuals (nodes) thus make a society (computer network) highly susceptible to large-scale outbreaks of infectious diseases (viruses). The SIR model can be extended by the inclusion of waiting times, resulting in a modified distribution function of the recovered individuals.
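
    A hedged sketch of SIR (susceptible-infected-refractory) dynamics on a generic graph is given below; it uses a small-world graph rather than the paper's H-dimensional hierarchical social network, and all parameters are illustrative:

        import numpy as np
        import networkx as nx

        def sir_on_graph(G, beta=0.2, n_seeds=1, seed=0):
            """Discrete-time SIR: each infected node transmits to each
            susceptible neighbour with probability beta, then recovers."""
            rng = np.random.default_rng(seed)
            status = {n: "S" for n in G}
            for s in rng.choice(list(G), size=n_seeds, replace=False):
                status[s] = "I"
            while any(v == "I" for v in status.values()):
                infected = [n for n, v in status.items() if v == "I"]
                for n in infected:
                    for nb in G.neighbors(n):
                        if status[nb] == "S" and rng.random() < beta:
                            status[nb] = "I"
                    status[n] = "R"   # refractory: plays no further role
            return sum(v == "R" for v in status.values()) / len(G)

        # Final epidemic size on a small-world graph (parameters illustrative).
        G = nx.watts_strogatz_graph(1000, k=6, p=0.1, seed=42)
        print("fraction ever infected:", sir_on_graph(G))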

  15. Quantifying social development in autism.

    PubMed

    Volkmar, F R; Carter, A; Sparrow, S S; Cicchetti, D V

    1993-05-01

    This study was concerned with the development of quantitative measures of social development in autism. Multiple regression equations predicting social, communicative, and daily living skills on the Vineland Adaptive Behavior Scales were derived from a large, normative sample and applied to groups of autistic and nonautistic, developmentally disordered children. Predictive models included either mental or chronological age and other relevant variables. Social skills in the autistic group were more than two standard deviations below those predicted by their mental age; an index derived from the ratio of actual to predicted social skills correctly classified 94% of the autistic and 92% of the nonautistic, developmentally disordered cases. The findings are consistent with the idea that social disturbance is central in the definition of autism. The approach used in this study has potential advantages for providing more precise measures of social development in autism.
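
    The quantitative approach described, regressing expected social skills on mental age in a normative sample and then classifying by the ratio of actual to predicted scores, can be sketched as follows; the normative data, coefficients and threshold are invented for illustration:

        import numpy as np
        from sklearn.linear_model import LinearRegression

        # Hypothetical normative sample: mental age (months) vs. a social score.
        rng = np.random.default_rng(2)
        mental_age = rng.uniform(24, 120, size=300).reshape(-1, 1)
        social = 0.8 * mental_age.ravel() + rng.normal(0, 6, size=300)

        norm_model = LinearRegression().fit(mental_age, social)

        def social_ratio(ma, observed_social):
            """Ratio of actual to predicted social skills; values well below
            1 flag a disproportionate social deficit (threshold illustrative)."""
            predicted = norm_model.predict(np.array([[ma]]))[0]
            return observed_social / predicted

        print(social_ratio(60, 20))   # markedly below expectation
        print(social_ratio(60, 50))   # near expectation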

  16. Mathematical foundations of hybrid data assimilation from a synchronization perspective.

    PubMed

    Penny, Stephen G

    2017-12-01

    The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.

  17. Demonstration of three gorges archaeological relics based on 3D-visualization technology

    NASA Astrophysics Data System (ADS)

    Xu, Wenli

    2015-12-01

    This paper focuses on the digital demonstration of Three Gorges archaeological relics to exhibit the achievements of the protective measures. A novel and effective method based on 3D-visualization technology, which includes large-scale landscape reconstruction, a virtual studio, and virtual panoramic roaming, is proposed to create a digitized interactive demonstration system. The method comprises three stages: pre-processing, 3D modeling and integration. Firstly, abundant archaeological information is classified according to its historical and geographical context. Secondly, a 3D-model library is built up using digital image processing and 3D modeling. Thirdly, virtual reality technology is used to display the archaeological scenes and cultural relics vividly and realistically. The present work promotes the application of virtual reality to digital projects and enriches the content of digital archaeology.

  18. Swift delineation of flood-prone areas over large European regions

    NASA Astrophysics Data System (ADS)

    Tavares da Costa, Ricardo; Castellarin, Attilio; Manfreda, Salvatore; Samela, Caterina; Domeneghetti, Alessio; Mazzoli, Paolo; Luzzi, Valerio; Bagli, Stefano

    2017-04-01

    According to the European Environment Agency (EEA Report No 1/2016), a significant share of the European population is estimated to be living on or near a floodplain, with Italy having the highest population density in flood-prone areas among the countries analysed. This tendency, tied to event frequency and magnitude (e.g. the 24/11/2016 floods in Italy) and to the fact that river floods may occur at large scales and at a transboundary level, where data are often sparse, presents a challenge in flood-risk management. The availability of consistent flood hazard and risk maps during the prevention, preparedness, response and recovery phases is a valuable and important step forward in improving the effectiveness, efficiency and robustness of evidence-based decision making. The present work aims at testing and discussing the usefulness of pattern recognition techniques based on geomorphologic indices (Manfreda et al., J. Hydrol. Eng., 2011; Degiorgis et al., J. Hydrol., 2012; Samela et al., J. Hydrol. Eng., 2015) for the simplified mapping of river flood-prone areas at large scales. The techniques are applied to 25 m Digital Elevation Models (DEMs) of the Danube, Po and Severn river watersheds, obtained from the Copernicus data and information funded by the European Union (EU-DEM layers). Results are compared to the Pan-European flood hazard maps derived by Alfieri et al. (Hydrol. Proc., 2013) using a set of distributed hydrological models (LISFLOOD, van der Knijff et al., Int. J. Geogr. Inf. Sci., 2010, employed within the European Flood Awareness System, www.efas.eu) and hydraulic models (LISFLOOD-FP, Bates and De Roo, J. Hydrol., 2000). Our study presents different calibration and cross-validation exercises of the DEM-based mapping algorithms to assess to what extent, and with what accuracy, they can be reproduced over different regions of Europe. This work is being developed under the System-Risk project (www.system-risk.eu), which received funding from the European Union's Framework Programme for Research and Innovation Horizon 2020 under the Marie Skłodowska-Curie Grant Agreement No. 676027. Keywords: flood hazard, data-scarce regions, large-scale studies, pattern recognition, linear binary classifiers, basin geomorphology, DEM.
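
    The linear binary classifiers referenced here reduce, in their simplest form, to thresholding a DEM-derived geomorphic index against a reference flood map. A hedged sketch with synthetic data follows; the calibration objective (maximising true positive rate minus false positive rate) follows the cited method family, but all values and names are invented:

        import numpy as np

        def calibrate_threshold(index, flooded):
            """Pick the index threshold that maximises (TPR - FPR) against a
            reference flood map."""
            best_tau, best_score = None, -np.inf
            for tau in np.unique(index):
                pred = index >= tau
                tpr = (pred & flooded).sum() / max(flooded.sum(), 1)
                fpr = (pred & ~flooded).sum() / max((~flooded).sum(), 1)
                if tpr - fpr > best_score:
                    best_tau, best_score = tau, tpr - fpr
            return best_tau

        # Synthetic stand-ins for a geomorphic index and a reference hazard map.
        rng = np.random.default_rng(3)
        index = rng.normal(size=10000)
        flooded = index + rng.normal(scale=0.8, size=10000) > 1.0
        tau = calibrate_threshold(index, flooded)
        flood_prone = index >= tau   # swift, DEM-only delineation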

  19. Surface-layer turbulence, energy balance and links to atmospheric circulations over a mountain glacier in the French Alps

    NASA Astrophysics Data System (ADS)

    Litt, Maxime; Sicart, Jean-Emmanuel; Six, Delphine; Wagnon, Patrick; Helgason, Warren D.

    2017-04-01

    Over Saint-Sorlin Glacier in the French Alps (45° N, 6.1° E; ~3 km²) in summer, we study the atmospheric surface-layer dynamics, turbulent fluxes, their uncertainties and their impact on surface energy balance (SEB) melt estimates. Results are classified with regard to large-scale forcing. We use high-frequency eddy-covariance data and mean air-temperature and wind-speed vertical profiles, collected in 2006 and 2009 in the glacier's atmospheric surface layer. We evaluate the turbulent fluxes with the eddy-covariance (sonic) and profile methods; random errors and parametric uncertainties are evaluated by including different stability corrections and assuming different values for the surface roughness lengths. Under weak synoptic forcing, local thermal effects dominate the wind circulation. On the glacier, weak katabatic flows with a wind-speed maximum at low height (2-3 m) are detected 71 % of the time and are generally associated with small turbulent kinetic energy (TKE) and small net turbulent fluxes; radiative fluxes dominate the SEB. When the large-scale forcing is strong, the wind in the valley aligns with the glacier flow, intense downslope flows are observed, no wind-speed maximum is visible below 5 m, and TKE and net turbulent fluxes are often intense; the net turbulent fluxes then contribute significantly to the SEB. The surface-layer turbulence production is probably not at equilibrium with dissipation, because of interactions of large-scale orographic disturbances with the flow when the forcing is strong, or low-frequency oscillations of the katabatic flow when the forcing is weak. Under weak forcing, when TKE is low, all turbulent-flux calculation methods provide similar fluxes. Under strong forcing, when TKE is large, the choice of roughness lengths strongly impacts the fluxes obtained with the profile method and their uncertainties. However, the uncertainty on the total SEB remains too high, relative to the observed melt, to recommend one turbulent-flux calculation method over another.
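
    As a hedged illustration of the profile method mentioned above, the sketch below estimates friction velocity and sensible heat flux from a neutral-stability logarithmic fit to measured wind and temperature profiles; real glacier applications add the stability corrections and roughness-length assumptions the abstract discusses, and all numbers here are invented:

        import numpy as np

        KAPPA = 0.4  # von Karman constant

        def profile_fluxes(z, u, T, rho=1.2, cp=1005.0):
            """Neutral-stability profile method: regress wind speed and air
            temperature on ln(z) to get u* and T*, then the sensible heat
            flux. Stability corrections are omitted in this sketch."""
            lnz = np.log(np.asarray(z))
            u_star = KAPPA * np.polyfit(lnz, u, 1)[0]
            t_star = KAPPA * np.polyfit(lnz, T, 1)[0]
            H = -rho * cp * u_star * t_star   # negative = toward the surface
            return u_star, H

        # Illustrative four-level profile over a melting surface.
        z = [0.5, 1.0, 2.0, 4.0]   # measurement heights (m)
        u = [2.1, 2.6, 3.1, 3.6]   # wind speed (m/s)
        T = [0.8, 1.4, 2.0, 2.6]   # air temperature (degC)
        print(profile_fluxes(z, u, T))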

  20. A Nasal Brush-based Classifier of Asthma Identified by Machine Learning Analysis of Nasal RNA Sequence Data.

    PubMed

    Pandey, Gaurav; Pandey, Om P; Rogers, Angela J; Ahsen, Mehmet E; Hoffman, Gabriel E; Raby, Benjamin A; Weiss, Scott T; Schadt, Eric E; Bunyavanich, Supinda

    2018-06-11

    Asthma is a common, under-diagnosed disease affecting all ages. We sought to identify a nasal brush-based classifier of mild/moderate asthma. A total of 190 subjects with mild/moderate asthma and controls underwent nasal brushing and RNA sequencing of nasal samples. A machine learning-based pipeline identified an asthma classifier consisting of 90 genes, interpreted via an L2-regularized logistic regression classification model. This classifier performed with strong predictive value and sensitivity across eight test sets, including (1) a test set of independent asthmatic and control subjects profiled by RNA sequencing (positive and negative predictive values of 1.00 and 0.96, respectively; AUC of 0.994), (2) two independent case-control cohorts of asthma profiled by microarray, and (3) five cohorts with other respiratory conditions (allergic rhinitis, upper respiratory infection, cystic fibrosis, smoking), where the classifier had a low to zero misclassification rate. Following validation in large, prospective cohorts, this classifier could be developed into a nasal biomarker of asthma.
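
    The published model class, an L2-regularized logistic regression over a 90-gene panel, is straightforward to reproduce in outline; the sketch below trains such a model on synthetic expression data (the data, effect sizes and split are invented, not the study's pipeline):

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import roc_auc_score

        # Synthetic stand-in for a 90-gene nasal expression panel, 190 subjects.
        rng = np.random.default_rng(4)
        X = rng.normal(size=(190, 90))
        y = rng.integers(0, 2, size=190)
        X[y == 1, :10] += 0.8   # a few genes shifted in the asthma "cases"

        Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3,
                                              random_state=0, stratify=y)
        clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(Xtr, ytr)
        print("AUC:", roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))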
