Sample records for evolving-classifier methods

  1. Texture segmentation by genetic programming.

    PubMed

    Song, Andy; Ciesielski, Vic

    2008-01-01

    This paper describes a texture segmentation method using genetic programming (GP), one of the most powerful evolutionary computation algorithms. By choosing an appropriate representation, texture classifiers can be evolved without computing texture features. Because time-consuming feature extraction is avoided, the evolved classifiers enable the proposed texture segmentation algorithm to achieve a segmentation speed significantly higher than that of conventional methods. The method does not require a human expert to manually construct models for texture feature extraction. Analysis of the evolved classifiers shows that they are not arbitrary: certain textural regularities are captured by these classifiers to discriminate different textures. This study shows GP to be a feasible and powerful approach to texture classification and segmentation, which are generally considered complex vision tasks.
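
The evolved classifiers described above can be pictured as expression trees that read raw pixel values directly, with no feature-extraction stage. The following is a minimal, hypothetical sketch (not the authors' system): terminals index into a pixel window, internal nodes apply arithmetic operators from an assumed function set, and the sign of the program's output gives the binary texture label.

```python
import random

# Operators available to evolved programs (an assumed, minimal function set).
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
       "max": max, "min": min}

def eval_tree(tree, pixels):
    """Evaluate an expression tree; integer leaves index raw pixel values."""
    if isinstance(tree, int):                 # terminal node: pixel index
        return pixels[tree]
    op, left, right = tree                    # internal node: (op, subtree, subtree)
    return OPS[op](eval_tree(left, pixels), eval_tree(right, pixels))

def random_tree(n_pixels, depth, rng):
    """Grow a random program tree, as GP initialisation would."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_pixels)
    return (rng.choice(list(OPS)),
            random_tree(n_pixels, depth - 1, rng),
            random_tree(n_pixels, depth - 1, rng))

def classify(tree, pixels):
    """Sign of the program output gives the binary texture label."""
    return 1 if eval_tree(tree, pixels) > 0 else 0
```

In a full GP run, a population of such trees would be evolved with crossover and mutation, with fitness measured by classification accuracy on labelled texture windows.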

  2. A Generic multi-dimensional feature extraction method using multiobjective genetic programming.

    PubMed

    Zhang, Yang; Rockett, Peter I

    2009-01-01

    In this paper, we present a generic feature extraction method for pattern classification using multiobjective genetic programming. This not only evolves the (near-)optimal set of mappings from a pattern space to a multi-dimensional decision space, but also simultaneously optimizes the dimensionality of that decision space. The presented framework evolves vector-to-vector feature extractors that maximize class separability. We demonstrate the efficacy of our approach by making statistically-founded comparisons with a wide variety of established classifier paradigms over a range of datasets and find that for most of the pairwise comparisons, our evolutionary method delivers statistically smaller misclassification errors. At very worst, our method displays no statistical difference in a few pairwise comparisons with established classifier/dataset combinations; crucially, none of the misclassification results produced by our method is worse than any comparator classifier. Although principally focused on feature extraction, feature selection is also performed as an implicit side effect; we show that both feature extraction and selection are important to the success of our technique. The presented method has the practical consequence of obviating the need to exhaustively evaluate a large family of conventional classifiers when faced with a new pattern recognition problem in order to attain a good classification accuracy.

  3. A method of evolving novel feature extraction algorithms for detecting buried objects in FLIR imagery using genetic programming

    NASA Astrophysics Data System (ADS)

    Paino, A.; Keller, J.; Popescu, M.; Stone, K.

    2014-06-01

    In this paper, we present an approach that uses Genetic Programming (GP) to evolve novel feature extraction algorithms for greyscale images. Our motivation is to create an automated method of building new feature extraction algorithms for images that are competitive with commonly used human-engineered features, such as Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG). The evolved feature extraction algorithms are functions defined over the image space, and each produces a real-valued feature vector of variable length. Each evolved feature extractor breaks up the given image into a set of cells centered on every pixel, performs evolved operations on each cell, and then combines the results of those operations for every cell using an evolved operator. Using this method, the algorithm is flexible enough to reproduce both LBP and HOG features. The dataset we use to train and test our approach consists of a large number of pre-segmented image "chips" taken from a forward-looking infrared (FLIR) camera mounted on the hood of a moving vehicle. The goal is to classify each image chip as either containing or not containing a buried object. To this end, we define the fitness of a candidate solution as the cross-fold validation accuracy of the features generated by said candidate solution when used in conjunction with a Support Vector Machine (SVM) classifier. In order to validate our approach, we compare the classification accuracy of an SVM trained using our evolved features with the accuracy of an SVM trained using mainstream feature extraction algorithms, including LBP and HOG.
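
The cell-based scheme above is flexible enough to reproduce LBP, and a plain LBP computation illustrates what one such extractor amounts to: per-pixel cells, a per-cell operation, and a combining operator (here a histogram). This is a standard 3x3 LBP sketch on a row-major greyscale image stored as a flat list, not the paper's evolved code.

```python
def lbp_code(cell):
    """Local Binary Pattern for a 3x3 cell (list of 9 grey values,
    row-major, centre at index 4): one bit per neighbour >= centre."""
    centre = cell[4]
    neighbours = [cell[i] for i in (0, 1, 2, 5, 8, 7, 6, 3)]  # clockwise
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(image, width, height):
    """Feature vector: histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            cell = [image[(y + dy) * width + (x + dx)]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            hist[lbp_code(cell)] += 1
    return hist
```

In the paper's setting, the comparison against centre and the histogram would both be evolved operators rather than fixed choices.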

  4. Evolving fuzzy rules in a learning classifier system

    NASA Technical Reports Server (NTRS)

    Valenzuela-Rendon, Manuel

    1993-01-01

    The fuzzy classifier system (FCS) combines the ideas of fuzzy logic controllers (FLC's) and learning classifier systems (LCS's). It brings together the expressive powers of fuzzy logic as it has been applied in fuzzy controllers to express relations between continuous variables, and the ability of LCS's to evolve co-adapted sets of rules. The goal of the FCS is to develop a rule-based system capable of learning in a reinforcement regime, and that can potentially be used for process control.
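
As an illustration of how fuzzy rules relate continuous variables, here is a minimal fuzzy inference sketch with triangular membership functions and weighted-average defuzzification; the three-rule base is hypothetical and stands in for rules an FCS would evolve.

```python
def triangular(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical three-rule base over a normalised input in [0, 1]:
# "low" -> 0.0, "mid" -> 0.5, "high" -> 1.0.
RULES = [
    ((-0.5, 0.0, 0.5), 0.0),
    (( 0.0, 0.5, 1.0), 0.5),
    (( 0.5, 1.0, 1.5), 1.0),
]

def fuzzy_inference(x):
    """Fire all rules and defuzzify by the weighted average of consequents."""
    num = den = 0.0
    for (a, b, c), out in RULES:
        w = triangular(x, a, b, c)    # firing strength of the rule
        num += w * out
        den += w
    return num / den if den else 0.0
```

In an FCS, a reinforcement-driven credit-assignment scheme would adjust which rules are in the rule base, while the inference step stays as above.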

  5. Delineating slowly and rapidly evolving fractions of the Drosophila genome.

    PubMed

    Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S

    2008-05-01

    Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results are available at www.maths.qut.edu.au/~keithj/. Genomic segments comprising the conservation classes are available in BED format.
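
The information criterion for choosing the number of classes can be sketched as follows. This is the generic AIC trade-off (parameter penalty versus fit); the per-class parameter count is an illustrative assumption, not the paper's exact criterion.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L; lower is better."""
    return 2 * n_params - 2 * log_likelihood

def choose_n_classes(log_likelihoods, params_per_class):
    """Pick the class count minimising AIC.  log_likelihoods[k] is the
    maximised log-likelihood of the model fitted with k + 1 classes."""
    scores = [aic(ll, (k + 1) * params_per_class)
              for k, ll in enumerate(log_likelihoods)]
    return scores.index(min(scores)) + 1
```

For example, if adding a third class raises the log-likelihood by less than the parameter penalty, the two-class model is selected.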

  6. Binary Image Classification: A Genetic Programming Approach to the Problem of Limited Training Instances.

    PubMed

    Al-Sahaf, Harith; Zhang, Mengjie; Johnston, Mark

    2016-01-01

    In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.

  7. Analyzing Large Gene Expression and Methylation Data Profiles Using StatBicRM: Statistical Biclustering-Based Rule Mining

    PubMed Central

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rule and potential biomarkers from biological datasets using integrated statistical and binary inclusion-maximal biclustering techniques. First, a novel statistical strategy is used to eliminate insignificant, low-significance, or redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than other rule mining algorithms because it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets; it therefore saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we classify the data to assess how accurately the evolved rules describe the remaining test (unknown) data, and compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts from the same post-discretized data matrix. Finally, we include an integrated analysis of gene expression and methylation to determine the epigenetic effect (viz., the effect of methylation) on gene expression level. PMID:25830807

  8. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    PubMed

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rule and potential biomarkers from biological datasets using integrated statistical and binary inclusion-maximal biclustering techniques. First, a novel statistical strategy is used to eliminate insignificant, low-significance, or redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than other rule mining algorithms because it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets; it therefore saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we classify the data to assess how accurately the evolved rules describe the remaining test (unknown) data, and compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts from the same post-discretized data matrix. Finally, we include an integrated analysis of gene expression and methylation to determine the epigenetic effect (viz., the effect of methylation) on gene expression level.

  9. Concurrent approach for evolving compact decision rule sets

    NASA Astrophysics Data System (ADS)

    Marmelstein, Robert E.; Hammack, Lonnie P.; Lamont, Gary B.

    1999-02-01

    The induction of decision rules from data is important to many disciplines, including artificial intelligence and pattern recognition. To improve the state of the art in this area, we introduced the genetic rule and classifier construction environment (GRaCCE). It was previously shown that GRaCCE consistently evolved decision rule sets from data that were significantly more compact than those produced by other methods (such as decision tree algorithms). The primary disadvantage of GRaCCE, however, is its relatively poor run-time execution performance. In this paper, a concurrent version of the GRaCCE architecture is introduced, which improves the efficiency of the original algorithm. A prototype of the algorithm is tested on an in-house parallel processor configuration and the results are discussed.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mason, J.

    CCHDT constructs and classifies various arrangements of hard disks of a single radius placed on the unit square with periodic boundary conditions. Specifically, a given configuration is evolved to the nearest critical point on a smoothed hard disk energy function, and is classified by the adjacency matrix of the canonically labelled contact graph.

  11. Adaptive Framework for Classification and Novel Class Detection over Evolving Data Streams with Limited Labeled Data.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haque, Ahsanul; Khan, Latifur; Baron, Michael

    2015-09-01

    Most approaches to classifying evolving data streams either divide the stream of data into fixed-size chunks or use gradual forgetting to address the problems of infinite length and concept drift. Finding the fixed size of the chunks or choosing a forgetting rate without prior knowledge about time-scale of change is not a trivial task. As a result, these approaches suffer from a trade-off between performance and sensitivity. To address this problem, we present a framework which uses change detection techniques on the classifier performance to determine chunk boundaries dynamically. Though this framework exhibits good performance, it is heavily dependent on the availability of true labels of data instances. However, labeled data instances are scarce in realistic settings and not readily available. Therefore, we present a second framework which is unsupervised in nature, and exploits change detection on classifier confidence values to determine chunk boundaries dynamically. In this way, it avoids the use of labeled data while still addressing the problems of infinite length and concept drift. Moreover, both of our proposed frameworks address the concept evolution problem by detecting outliers having similar values for the attributes. We provide theoretical proof that our change detection method works better than other state-of-the-art approaches in this particular scenario. Results from experiments on various benchmark and synthetic data sets also show the efficiency of our proposed frameworks.
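
The idea of triggering chunk boundaries from change detection on classifier confidence can be sketched with a one-sided CUSUM detector: when confidence drops persistently below its baseline, a boundary is declared. The drift and threshold constants here are illustrative, not the paper's tuned values.

```python
def detect_change(values, drift=0.05, threshold=0.5):
    """One-sided CUSUM on classifier confidence: returns the index at which
    the cumulative drop below the initial baseline exceeds `threshold`,
    or None if no change is detected.  Parameters are illustrative."""
    baseline = values[0]
    cusum = 0.0
    for i, v in enumerate(values[1:], start=1):
        # Accumulate evidence of a drop, discounted by the allowed drift.
        cusum = max(0.0, cusum + (baseline - v) - drift)
        if cusum > threshold:
            return i          # dynamic chunk boundary
    return None
```

A stream whose confidence falls from 0.9 to 0.6 triggers a boundary a few instances after the drop, while a stable stream never triggers one.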

  12. Evolving forecasting classifications and applications in health forecasting

    PubMed Central

    Soyiri, Ireneous N; Reidpath, Daniel D

    2012-01-01

    Health forecasting forewarns the health community about future health situations and disease episodes so that health systems can better allocate resources and manage demand. The tools used for developing health forecasts and measuring their accuracy and validity are commonly not well defined, although they are usually adapted forms of statistical procedures. This review identifies previous typologies used in classifying the forecasting methods commonly used in forecasting health conditions or situations. It then discusses the strengths and weaknesses of these methods and presents the choices available for measuring the accuracy of health-forecasting models, including a note on the discrepancies in the modes of validation. PMID:22615533

  13. Classifying the evolutionary and ecological features of neoplasms.

    PubMed

    Maley, Carlo C; Aktipis, Athena; Graham, Trevor A; Sottoriva, Andrea; Boddy, Amy M; Janiszewska, Michalina; Silva, Ariosto S; Gerlinger, Marco; Yuan, Yinyin; Pienta, Kenneth J; Anderson, Karen S; Gatenby, Robert; Swanton, Charles; Posada, David; Wu, Chung-I; Schiffman, Joshua D; Hwang, E Shelley; Polyak, Kornelia; Anderson, Alexander R A; Brown, Joel S; Greaves, Mel; Shibata, Darryl

    2017-10-01

    Neoplasms change over time through a process of cell-level evolution, driven by genetic and epigenetic alterations. However, the ecology of the microenvironment of a neoplastic cell determines which changes provide adaptive benefits. There is widespread recognition of the importance of these evolutionary and ecological processes in cancer, but to date, no system has been proposed for drawing clinically relevant distinctions between how different tumours are evolving. On the basis of a consensus conference of experts in the fields of cancer evolution and cancer ecology, we propose a framework for classifying tumours that is based on four relevant components. These are the diversity of neoplastic cells (intratumoural heterogeneity) and changes over time in that diversity, which make up an evolutionary index (Evo-index), as well as the hazards to neoplastic cell survival and the resources available to neoplastic cells, which make up an ecological index (Eco-index). We review evidence demonstrating the importance of each of these factors and describe multiple methods that can be used to measure them. Development of this classification system holds promise for enabling clinicians to personalize optimal interventions based on the evolvability of the patient's tumour. The Evo- and Eco-indices provide a common lexicon for communicating about how neoplasms change in response to interventions, with potential implications for clinical trials, personalized medicine and basic cancer research.

  14. Instruction-matrix-based genetic programming.

    PubMed

    Li, Gang; Wang, Jin Feng; Lee, Kin Hong; Leung, Kwong-Sak

    2008-08-01

    In genetic programming (GP), evolving tree nodes separately would reduce the huge solution space. However, tree nodes are highly interdependent with respect to their fitness. In this paper, we propose a new GP framework, namely, instruction-matrix (IM)-based GP (IMGP), to handle their interactions. IMGP maintains an IM to evolve tree nodes and subtrees separately. IMGP extracts program trees from an IM and updates the IM with the information of the extracted program trees. As the IM actually keeps most of the information of the schemata of GP and evolves the schemata directly, IMGP is effective and efficient. Our experimental results on benchmark problems have verified that IMGP is not only better than canonical GP in terms of the quality of the solutions and the number of program evaluations, but also better than some related GP algorithms. IMGP can also be used to evolve programs for classification problems. The classifiers obtained have higher classification accuracies than four other GP classification algorithms on four benchmark classification problems. The testing errors are also comparable to or better than those obtained with well-known classifiers. Furthermore, an extended version, called condition matrix for rule learning, has been used successfully to handle multiclass classification problems.

  15. Improving Naive Bayes with Online Feature Selection for Quick Adaptation to Evolving Feature Usefulness

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pon, R K; Cardenas, A F; Buttler, D J

    The definition of what makes an article interesting varies from user to user and continually evolves even for a single user. As a result, for news recommendation systems, useless document features cannot be determined a priori and all features are usually considered for interestingness classification. Consequently, the presence of currently useless features degrades classification performance [1], particularly over the initial set of news articles being classified. The initial set of documents is critical for a user when considering which particular news recommendation system to adopt. To address these problems, we introduce an improved version of the naive Bayes classifier with online feature selection. We use correlation to determine the utility of each feature and take advantage of the conditional independence assumption used by naive Bayes for online feature selection and classification. The augmented naive Bayes classifier performs 28% better than the traditional naive Bayes classifier in recommending news articles from the Yahoo! RSS feeds.
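
A minimal sketch of the augmented classifier, assuming binary features and using the Pearson correlation between each feature and the label as the utility score; the paper's exact utility measure and threshold may differ. Counts update online, and only features whose current utility clears the threshold contribute to the posterior.

```python
import math

class OnlineSelectiveNB:
    """Naive Bayes over binary features with online feature selection:
    a feature votes only if its |correlation| with the label exceeds
    `min_utility`.  Laplace smoothing keeps counts well-defined."""
    def __init__(self, n_features, min_utility=0.1):
        self.n = 0
        self.min_utility = min_utility
        self.class_count = [1, 1]                       # smoothed class counts
        self.feat_count = [[[1, 1] for _ in range(n_features)] for _ in (0, 1)]
        self.sum_x = [0.0] * n_features                 # running sums for correlation
        self.sum_y = 0.0
        self.sum_xy = [0.0] * n_features

    def update(self, x, y):
        self.n += 1
        self.class_count[y] += 1
        self.sum_y += y
        for j, v in enumerate(x):
            self.feat_count[y][j][v] += 1
            self.sum_x[j] += v
            self.sum_xy[j] += v * y

    def utility(self, j):
        """|Pearson correlation| between binary feature j and the label."""
        if self.n == 0:
            return 0.0
        mx, my = self.sum_x[j] / self.n, self.sum_y / self.n
        cov = self.sum_xy[j] / self.n - mx * my
        vx, vy = mx * (1 - mx), my * (1 - my)
        return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

    def predict(self, x):
        scores = []
        for c in (0, 1):
            s = math.log(self.class_count[c])
            for j, v in enumerate(x):
                if self.utility(j) >= self.min_utility:   # online selection
                    total = sum(self.feat_count[c][j])
                    s += math.log(self.feat_count[c][j][v] / total)
            scores.append(s)
        return 0 if scores[0] >= scores[1] else 1
```

Because selection happens at prediction time, a feature that becomes useful later in the stream starts voting as soon as its correlation rises, without retraining.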

  16. Classifying the embedded young stellar population in Perseus and Taurus and the LOMASS database

    NASA Astrophysics Data System (ADS)

    Carney, M. T.; Yıldız, U. A.; Mottram, J. C.; van Dishoeck, E. F.; Ramchandani, J.; Jørgensen, J. K.

    2016-02-01

    Context. The classification of young stellar objects (YSOs) is typically done using the infrared spectral slope or bolometric temperature, but either can result in contamination of samples. More accurate methods to determine the evolutionary stage of YSOs will improve the reliability of statistics for the embedded YSO population and provide more robust stage lifetimes. Aims: We aim to separate the truly embedded YSOs from more evolved sources. Methods: Maps of HCO+J = 4-3 and C18O J = 3-2 were observed with HARP on the James Clerk Maxwell Telescope (JCMT) for a sample of 56 candidate YSOs in Perseus and Taurus in order to characterize the presence and morphology of emission from high density (ncrit > 106 cm-3) and high column density gas, respectively. These are supplemented with archival dust continuum maps observed with SCUBA on the JCMT and Herschel PACS to compare the morphology of the gas and dust in the protostellar envelopes. The spatial concentration of HCO+J = 4-3 and 850 μm dust emission are used to classify the embedded nature of YSOs. Results: Approximately 30% of Class 0+I sources in Perseus and Taurus are not Stage I, but are likely to be more evolved Stage II pre-main sequence (PMS) stars with disks. An additional 16% are confused sources with an uncertain evolutionary stage. Outflows are found to make a negligible contribution to the integrated HCO+ intensity for the majority of sources in this study. Conclusions: Separating classifications by cloud reveals that a high percentage of the Class 0+I sources in the Perseus star forming region are truly embedded Stage I sources (71%), while the Taurus cloud hosts a majority of evolved PMS stars with disks (68%). The concentration factor method is useful to correct misidentified embedded YSOs, yielding higher accuracy for YSO population statistics and Stage timescales. Current estimates (0.54 Myr) may overpredict the Stage I lifetime on the order of 30%, resulting in timescales down to 0.38 Myr for the embedded phase.

  17. Just-in-time classifiers for recurrent concepts.

    PubMed

    Alippi, Cesare; Boracchi, Giacomo; Roveri, Manuel

    2013-04-01

    Just-in-time (JIT) classifiers operate in evolving environments by classifying instances and reacting to concept drift. In stationary conditions, a JIT classifier improves its accuracy over time by exploiting additional supervised information coming from the field. In nonstationary conditions, however, the classifier reacts as soon as concept drift is detected; the current classification setup is discarded and a suitable one activated to keep the accuracy high. We present a novel generation of JIT classifiers able to deal with recurrent concept drift by means of a practical formalization of the concept representation and the definition of a set of operators working on such representations. The concept-drift detection activity, which is crucial in promptly reacting to changes exactly when needed, is advanced by considering change-detection tests monitoring both inputs and classes distributions.

  18. Random forest classification of stars in the Galactic Centre

    NASA Astrophysics Data System (ADS)

    Plewa, P. M.

    2018-05-01

    Near-infrared high-angular resolution imaging observations of the Milky Way's nuclear star cluster have revealed all luminous members of the existing stellar population within the central parsec. Generally, these stars are either evolved late-type giants or massive young, early-type stars. We revisit the problem of stellar classification based on intermediate-band photometry in the K band, with the primary aim of identifying faint early-type candidate stars in the extended vicinity of the central massive black hole. A random forest classifier, trained on a subsample of spectroscopically identified stars, performs comparably to competing methods (F1 = 0.85), without involving any model of stellar spectral energy distributions. Advantages of using such a machine-trained classifier are a minimum of required calibration effort, a predictive accuracy expected to improve as more training data become available, and the ease of application to future, larger data sets. By applying this classifier to archive data, we are also able to reproduce the results of previous studies of the spatial distribution and the K-band luminosity function of both the early- and late-type stars.

  19. Classifying Imbalanced Data Streams via Dynamic Feature Group Weighting with Importance Sampling.

    PubMed

    Wu, Ke; Edwards, Andrea; Fan, Wei; Gao, Jing; Zhang, Kun

    2014-04-01

    Data stream classification and imbalanced data learning are two important areas of data mining research. Each has been well studied to date with many interesting algorithms developed. However, only a few approaches reported in the literature address the intersection of these two fields due to their complex interplay. In this work, we proposed an importance sampling driven, dynamic feature group weighting framework (DFGW-IS) for classifying data streams of imbalanced distribution. Two components are tightly incorporated into the proposed approach to address the intrinsic characteristics of concept-drifting, imbalanced streaming data. Specifically, the ever-evolving concepts are tackled by a weighted ensemble trained on a set of feature groups with each sub-classifier (i.e. a single classifier or an ensemble) weighted by its discriminative power and stability level. The uneven class distribution, on the other hand, is typically battled by the sub-classifier built in a specific feature group with the underlying distribution rebalanced by the importance sampling technique. We derived the theoretical upper bound for the generalization error of the proposed algorithm. We also studied the empirical performance of our method on a set of benchmark synthetic and real world data, and significant improvement has been achieved over the competing algorithms in terms of standard evaluation metrics and parallel running time. Algorithm implementations and datasets are available upon request.
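
The weighted-ensemble component can be sketched as follows. The weights here are stand-ins for the discriminative-power and stability measures the paper derives, and the importance-sampling rebalancing step inside each sub-classifier is omitted.

```python
def ensemble_predict(sub_classifiers, x):
    """Weighted vote over per-feature-group sub-classifiers.  Each entry is
    (predict_fn, weight, feature_indices): the sub-classifier sees only its
    own feature group, and its weight reflects discriminative power and
    stability measured on recent stream data (assumed given here)."""
    votes = {}
    for predict, weight, group in sub_classifiers:
        label = predict([x[i] for i in group])
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)
```

Under concept drift, the weights would be re-estimated on each new chunk, so a feature group whose concept has drifted away loses influence without the whole ensemble being retrained.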

  20. Reconstructing the origin and elaboration of insect-trapping inflorescences in the Araceae

    PubMed Central

    Bröderbauer, David; Diaz, Anita; Weber, Anton

    2016-01-01

    Premise of the study: Floral traps are among the most sophisticated devices that have evolved in angiosperms in the context of pollination, but the evolution of trap pollination has not yet been studied in a phylogenetic context. We aim to determine the evolutionary history of morphological traits that facilitate trap pollination and to elucidate the impact of pollinators on the evolution of inflorescence traps in the family Araceae. Methods: Inflorescence morphology was investigated to determine the presence of trapping devices and to classify functional types of traps. We inferred phylogenetic relationships in the family using maximum likelihood and Bayesian methods. Character evolution of trapping devices, trap types, and pollinator types was then assessed with maximum parsimony and Bayesian methods. We also tested for an association of trap pollination with specific pollinator types. Key results: Inflorescence traps have evolved independently at least 10 times within the Araceae. Trapping devices were found in 27 genera. On the basis of different combinations of trapping devices, six functional types of traps were identified. Trap pollination in Araceae is correlated with pollination by flies. Conclusions: Trap pollination in the Araceae is more common than was previously thought. Preadaptations such as papillate cells or elongated sterile flowers facilitated the evolution of inflorescence traps. In some clades, imperfect traps served as a precursor for the evolution of more elaborate traps. Traps that evolved in association with fly pollination were most probably derived from mutualistic ancestors, offering a brood-site to their pollinators. PMID:22965851

  21. Classifying the evolutionary and ecological features of neoplasms

    PubMed Central

    Maley, Carlo C.; Aktipis, Athena; Graham, Trevor A.; Sottoriva, Andrea; Boddy, Amy M.; Janiszewska, Michalina; Silva, Ariosto S.; Gerlinger, Marco; Yuan, Yinyin; Pienta, Kenneth J.; Anderson, Karen S.; Gatenby, Robert; Swanton, Charles; Posada, David; Wu, Chung-I; Schiffman, Joshua D.; Hwang, E. Shelley; Polyak, Kornelia; Anderson, Alexander R. A.; Brown, Joel S.; Greaves, Mel; Shibata, Darryl

    2018-01-01

    Neoplasms change over time through a process of cell-level evolution, driven by genetic and epigenetic alterations. However, the ecology of the microenvironment of a neoplastic cell determines which changes provide adaptive benefits. There is widespread recognition of the importance of these evolutionary and ecological processes in cancer, but to date, no system has been proposed for drawing clinically relevant distinctions between how different tumours are evolving. On the basis of a consensus conference of experts in the fields of cancer evolution and cancer ecology, we propose a framework for classifying tumours that is based on four relevant components. These are the diversity of neoplastic cells (intratumoural heterogeneity) and changes over time in that diversity, which make up an evolutionary index (Evo-index), as well as the hazards to neoplastic cell survival and the resources available to neoplastic cells, which make up an ecological index (Eco-index). We review evidence demonstrating the importance of each of these factors and describe multiple methods that can be used to measure them. Development of this classification system holds promise for enabling clinicians to personalize optimal interventions based on the evolvability of the patient’s tumour. The Evo- and Eco-indices provide a common lexicon for communicating about how neoplasms change in response to interventions, with potential implications for clinical trials, personalized medicine and basic cancer research. PMID:28912577

  2. Prediction of cancer class with majority voting genetic programming classifier using gene expression data.

    PubMed

    Paul, Topon Kumar; Iba, Hitoshi

    2009-01-01

    In order to gain a better understanding of different types of cancers and to find possible biomarkers for diseases, many researchers have recently been analyzing gene expression data using various machine learning techniques. However, due to the very small number of training samples compared to the huge number of genes, and due to class imbalance, most of these methods suffer from overfitting. In this paper, we present a majority voting genetic programming classifier (MVGPC) for the classification of microarray data. Instead of a single rule or a single set of rules, we evolve multiple rules with genetic programming (GP) and then apply those rules to test samples, determining their labels by majority voting. By performing experiments on four different public cancer data sets, including multiclass data sets, we have found that the test accuracies of MVGPC are better than those of other methods, including AdaBoost with GP. Moreover, some of the genes that occur most frequently in the classification rules are known to be associated with the types of cancers studied in this paper.
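    For illustration, the majority-voting step described above can be sketched in a few lines of Python (a hypothetical toy, not the authors' implementation: the lambda "rules" stand in for evolved GP rule trees, and the labels and thresholds are invented):

```python
from collections import Counter

def majority_vote(rules, sample):
    """Apply each evolved rule to the sample and return the majority label.

    `rules` is a list of callables mapping a feature vector to a class
    label, standing in for the rule trees evolved by GP.
    """
    votes = Counter(rule(sample) for rule in rules)
    return votes.most_common(1)[0][0]

# Three toy "rules" voting on a two-gene expression profile.
rules = [
    lambda x: "ALL" if x[0] > 0.5 else "AML",
    lambda x: "ALL" if x[1] < 0.2 else "AML",
    lambda x: "ALL" if x[0] + x[1] > 0.6 else "AML",
]
print(majority_vote(rules, (0.7, 0.1)))  # all three rules agree: ALL
```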

  3. Evolving binary classifiers through parallel computation of multiple fitness cases.

    PubMed

    Cagnoni, Stefano; Bergenti, Federico; Mordonini, Monica; Adorni, Giovanni

    2005-06-01

    This paper describes two versions of a novel approach to developing binary classifiers, based on two evolutionary computation paradigms: cellular programming and genetic programming. The approach achieves high computational efficiency both during evolution and at runtime. Evolution speed is optimized by allowing multiple solutions to be computed in parallel. Runtime performance is optimized either explicitly, using parallel computation in the case of cellular programming, or implicitly, by exploiting the intrinsic parallelism of bitwise operators on standard sequential architectures in the case of genetic programming. The approach was tested on a digit recognition problem and compared with a reference classifier.
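    The implicit bitwise parallelism mentioned for the genetic programming version can be illustrated with a small sketch (an assumed toy example, not the paper's code): packing one fitness case per bit lets a single bitwise expression evaluate many fitness cases at once on a sequential CPU.

```python
def popcount(x):
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")

# Pack 8 fitness cases per variable: bit i holds the input value of case i.
a = 0b10110100       # feature A across 8 cases
b = 0b11010010       # feature B across 8 cases
target = 0b01100110  # desired classifier output for each case

# A candidate classifier built from bitwise operators evaluates all
# 8 cases in one pass (this expression is equivalent to a XOR b).
prediction = ((a & ~b) | (~a & b)) & 0xFF

# Fitness = number of cases classified correctly.
correct = ~(prediction ^ target) & 0xFF
print(popcount(correct))  # all 8 cases correct
```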

  4. Maximizing the Adjacent Possible in Automata Chemistries.

    PubMed

    Hickinbotham, Simon; Clark, Edward; Nellis, Adam; Stepney, Susan; Clarke, Tim; Young, Peter

    2016-01-01

    Automata chemistries are good vehicles for experimentation in open-ended evolution, but they are by necessity complex systems whose low-level properties require careful design. To aid the process of designing automata chemistries, we develop an abstract model that classifies the features of a chemistry from a physical (bottom up) perspective and from a biological (top down) perspective. There are two levels: things that can evolve, and things that cannot. We equate the evolving level with biology and the non-evolving level with physics. We design our initial organisms in the biology, so they can evolve. We design the physics to facilitate evolvable biologies. This architecture leads to a set of design principles that should be observed when creating an instantiation of the architecture. These principles are Everything Evolves, Everything's Soft, and Everything Dies. To evaluate these ideas, we present experiments in the recently developed Stringmol automata chemistry. We examine the properties of Stringmol with respect to the principles, and so demonstrate the usefulness of the principles in designing automata chemistries.

  5. Recognising promoter sequences using an artificial immune system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cooke, D.E.; Hunt, J.E.

    1995-12-31

    We have developed an artificial immune system (AIS) which is based on the human immune system. The AIS possesses an adaptive learning mechanism which enables antibodies to emerge that can be used for classification tasks. In this paper, we describe how the AIS has been used to evolve antibodies which can classify promoter-containing and promoter-negative DNA sequences. The DNA sequences used for teaching were 57 nucleotides in length and contained prokaryotic promoters. The system classified previously unseen DNA sequences with an accuracy of approximately 90%.

  6. Evolution of cellular automata with memory: The Density Classification Task.

    PubMed

    Stone, Christopher; Bull, Larry

    2009-08-01

    The Density Classification Task is a well-known test problem for two-state discrete dynamical systems. For many years researchers have used a variety of evolutionary computation approaches to evolve solutions to this problem. In this paper, we investigate the evolvability of solutions when the underlying cellular automaton is augmented with a type of memory based on the Least Mean Square algorithm. To obtain high-performance solutions using a simple non-hybrid genetic algorithm, we design a novel representation based on the ternary representation used for Learning Classifier Systems. The new representation is found to produce performance superior to that of the bit string traditionally used for representing cellular automata. Moreover, memory is shown to improve the evolvability of solutions, and appropriate memory settings can be evolved as a component part of these solutions.
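    A minimal sketch of the Density Classification Task setup (illustrative only: real studies use larger neighbourhood radii, 149-cell lattices, and evolved rule tables rather than the plain local-majority baseline shown here):

```python
def step(cells, rule):
    """One synchronous update of a 1-D binary CA on a ring, radius 1."""
    n = len(cells)
    return [rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
            for i in range(n)]

def local_majority(left, centre, right):
    """Each cell adopts the majority state of its 3-cell neighbourhood."""
    return 1 if left + centre + right >= 2 else 0

# The task: from a random initial configuration, the CA should settle to
# all-1s iff 1s were initially in the majority. A plain local-majority
# rule is a weak baseline: it freezes into blocks instead of reaching
# global consensus, which is why evolved rules are needed.
cells = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]  # 6 ones out of 10
for _ in range(20):
    cells = step(cells, local_majority)
print(cells)  # freezes into [1, 1, 1, 0, 0, 0, 1, 1, 1, 1]
```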

  7. The Evolution of Sonic Ecosystems

    NASA Astrophysics Data System (ADS)

    McCormack, Jon

    This chapter describes a novel type of artistic artificial life software environment. Agents that have the ability to make and listen to sound populate a synthetic world. An evolvable, rule-based classifier system drives agent behavior. Agents compete for limited resources in a virtual environment that is influenced by the presence and movement of people observing the system. Electronic sensors create a link between the real and virtual spaces; virtual agents evolve implicitly to try to maintain the interest of the human audience, whose presence provides them with life-sustaining food.

  8. Learning Spatio-Temporal Representations for Action Recognition: A Genetic Programming Approach.

    PubMed

    Liu, Li; Shao, Ling; Li, Xuelong; Lu, Ke

    2016-01-01

    Extracting discriminative and robust features from video sequences is the first and most critical step in human action recognition. In this paper, instead of using handcrafted features, we automatically learn spatio-temporal motion features for action recognition. This is achieved via an evolutionary method, i.e., genetic programming (GP), which evolves the motion feature descriptor on a population of primitive 3D operators (e.g., 3D-Gabor and wavelet). In this way, scale- and shift-invariant features can be effectively extracted from both color and optical flow sequences. We intend to learn data-adaptive descriptors for different datasets with multiple layers, which makes full use of prior knowledge to mimic the physical structure of the human visual cortex for action recognition and simultaneously reduces the GP search space to accelerate the convergence of optimal solutions. In our evolutionary architecture, the average cross-validation classification error, calculated by a support-vector-machine classifier on the training set, is adopted as the evaluation criterion for the GP fitness function. After the entire evolution procedure finishes, the best-so-far solution selected by GP is regarded as the (near-)optimal action descriptor. The GP-evolved feature extraction method is evaluated on four popular action datasets, namely KTH, HMDB51, UCF YouTube, and Hollywood2. Experimental results show that our method significantly outperforms other types of features, either hand-designed or machine-learned.

  9. Ethnic Settlement in a Metropolitan Area: A Typology of Communities.

    ERIC Educational Resources Information Center

    Agocs, Carol

    1981-01-01

    Presents a comparative analysis of changing ethnic residential distributions from 1940-1970 to identify recently evolved forms of ethnic settlement in the Detroit (Michigan) metropolitan area. Identifies and classifies contemporary types of ethnic communities to expand the knowledge of ethnic settlement. (MK)

  10. Ordering the discipline: classification in the history of science. Introduction.

    PubMed

    Weldon, Stephen P

    2013-09-01

    Classification of the history of science has a long history, and the essays in this Focus section explore that history and its consequences from several different angles. Two of the papers deal with how classifying schemes in bibliography have evolved. A third looks at the way archival organization has changed over the years. Finally, the last essay explores the intersection of human and machine classifying systems. All four contributions look closely at the ramifications of the digital revolution for the way we organize the knowledge of the discipline.

  11. Phylogeny of the Genus Flavivirus

    PubMed Central

    Kuno, Goro; Chang, Gwong-Jen J.; Tsuchiya, K. Richard; Karabatsos, Nick; Cropp, C. Bruce

    1998-01-01

    We undertook a comprehensive phylogenetic study to establish the genetic relationship among the viruses of the genus Flavivirus and to compare the classification based on molecular phylogeny with the existing serologic method. By using a combination of quantitative definitions (bootstrap support level and pairwise nucleotide sequence identity), the viruses could be classified into clusters, clades, and species. Our phylogenetic study revealed for the first time that two branches, the non-vector and vector-borne virus clusters, evolved from the putative ancestor, and that the tick-borne and mosquito-borne virus clusters emerged from the latter. Provided that the theory that arthropod association is an acquired trait is correct, the pairwise nucleotide sequence identities among these three clusters support the possibility that the non-vector cluster evolved first, followed by the separation of the tick-borne and mosquito-borne virus clusters, in that order. The clades established in our study correlated significantly with existing antigenic complexes. We also resolved many past taxonomic problems by establishing phylogenetic relationships between the antigenically unclassified viruses and the well-established viruses and by identifying synonymous viruses. PMID:9420202
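    The pairwise nucleotide sequence identity used as a quantitative criterion here can be sketched as follows (a simplified illustration assuming pre-aligned sequences of equal length; it is not the authors' pipeline, which works on full genome alignments):

```python
def pairwise_identity(seq1, seq2):
    """Fraction of identical positions between two aligned sequences.

    Assumes the sequences are already aligned to equal length; real
    phylogenetic pipelines compute this after multiple sequence alignment.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return matches / len(seq1)

# Two toy 10-nucleotide fragments differing at two positions.
print(pairwise_identity("ACGTACGTAC", "ACGTTCGTAA"))  # 0.8
```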

  12. Phylogeny of the genus Flavivirus.

    PubMed

    Kuno, G; Chang, G J; Tsuchiya, K R; Karabatsos, N; Cropp, C B

    1998-01-01

    We undertook a comprehensive phylogenetic study to establish the genetic relationship among the viruses of the genus Flavivirus and to compare the classification based on molecular phylogeny with the existing serologic method. By using a combination of quantitative definitions (bootstrap support level and pairwise nucleotide sequence identity), the viruses could be classified into clusters, clades, and species. Our phylogenetic study revealed for the first time that two branches, the non-vector and vector-borne virus clusters, evolved from the putative ancestor, and that the tick-borne and mosquito-borne virus clusters emerged from the latter. Provided that the theory that arthropod association is an acquired trait is correct, the pairwise nucleotide sequence identities among these three clusters support the possibility that the non-vector cluster evolved first, followed by the separation of the tick-borne and mosquito-borne virus clusters, in that order. The clades established in our study correlated significantly with existing antigenic complexes. We also resolved many past taxonomic problems by establishing phylogenetic relationships between the antigenically unclassified viruses and the well-established viruses and by identifying synonymous viruses.

  13. 78 FR 49878 - Endangered and Threatened Wildlife and Plants; Endangered Status for the Florida Leafwing and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-15

    ... frequent plaintiff group and further strengthened the workplan, which will allow the agency to focus its... butterfly in the West Indies has evolved as many distinct island subspecies as S. acis. Each group of... classified into two...

  14. MICROARRAY ANALYSIS OF PM-INDUCED GENEEXPRESSION IN HUMAN BRONCHIAL EPITHELIAL CELLS

    EPA Science Inventory

    Ambient air particles (PM) are generally classified into three sizes: coarse (2.5–10 µm), fine (0.1–2.5 µm), and ultrafine (<0.1 µm). Each particle size evolves from different sources and transformation processes (e.g., combustion vs. mechanical abrasion, and atmospheric conversion ...

  15. Accounting for Cheating: An Evolving Theory and Emergent Themes

    ERIC Educational Resources Information Center

    Brent, Edward; Atkisson, Curtis

    2011-01-01

    This study examines student responses to the question, "What circumstances, if any, could make cheating justified?" It then assesses how well those responses can be classified by existing theories and categories that emerge from a qualitative analysis of the data. Results show considerable support for techniques of neutralization, partial support…

  16. Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science

    NASA Technical Reports Server (NTRS)

    Zevin, M.; Coughlin, S.; Bahaadini, S.; Besler, E.; Rohani, N.; Allen, S.; Cabero, M.; Crowston, K.; Katsaggelos, A. K.; Littenberg, T. B.

    2017-01-01

    With the first direct detection of gravitational waves, the advanced laser interferometer gravitational-wave observatory (LIGO) has initiated a new field of astronomy by providing an alternative means of sensing the universe. The extreme sensitivity required to make such detections is achieved through exquisite isolation of all sensitive components of LIGO from non-gravitational-wave disturbances. Nonetheless, LIGO is still susceptible to a variety of instrumental and environmental sources of noise that contaminate the data. Of particular concern are noise features known as glitches, which are transient and non-Gaussian in their nature, and occur at a high enough rate so that accidental coincidence between the two LIGO detectors is non-negligible. Glitches come in a wide range of time-frequency-amplitude morphologies, with new morphologies appearing as the detector evolves. Since they can obscure or mimic true gravitational-wave signals, a robust characterization of glitches is paramount in the effort to achieve the gravitational-wave detection rates that are predicted by the design sensitivity of LIGO. This proves a daunting task for members of the LIGO Scientific Collaboration alone due to the sheer amount of data. In this paper we describe an innovative project that combines crowdsourcing with machine learning to aid in the challenging task of categorizing all of the glitches recorded by the LIGO detectors. Through the Zooniverse platform, we engage and recruit volunteers from the public to categorize images of time-frequency representations of glitches into pre-identified morphological classes and to discover new classes that appear as the detectors evolve. In addition, machine learning algorithms are used to categorize images after being trained on human-classified examples of the morphological classes. 
Leveraging the strengths of both classification methods, we create a combined method with the aim of improving the efficiency and accuracy of each individual classifier. The resulting classification and characterization should help LIGO scientists to identify causes of glitches and subsequently eliminate them from the data or the detector entirely, thereby improving the rate and accuracy of gravitational-wave observations. We demonstrate these methods using a small subset of data from LIGO's first observing run.

  17. Gravity Spy: integrating advanced LIGO detector characterization, machine learning, and citizen science.

    PubMed

    Zevin, M; Coughlin, S; Bahaadini, S; Besler, E; Rohani, N; Allen, S; Cabero, M; Crowston, K; Katsaggelos, A K; Larson, S L; Lee, T K; Lintott, C; Littenberg, T B; Lundgren, A; Østerlund, C; Smith, J R; Trouille, L; Kalogera, V

    2017-01-01

    With the first direct detection of gravitational waves, the advanced laser interferometer gravitational-wave observatory (LIGO) has initiated a new field of astronomy by providing an alternative means of sensing the universe. The extreme sensitivity required to make such detections is achieved through exquisite isolation of all sensitive components of LIGO from non-gravitational-wave disturbances. Nonetheless, LIGO is still susceptible to a variety of instrumental and environmental sources of noise that contaminate the data. Of particular concern are noise features known as glitches, which are transient and non-Gaussian in their nature, and occur at a high enough rate so that accidental coincidence between the two LIGO detectors is non-negligible. Glitches come in a wide range of time-frequency-amplitude morphologies, with new morphologies appearing as the detector evolves. Since they can obscure or mimic true gravitational-wave signals, a robust characterization of glitches is paramount in the effort to achieve the gravitational-wave detection rates that are predicted by the design sensitivity of LIGO. This proves a daunting task for members of the LIGO Scientific Collaboration alone due to the sheer amount of data. In this paper we describe an innovative project that combines crowdsourcing with machine learning to aid in the challenging task of categorizing all of the glitches recorded by the LIGO detectors. Through the Zooniverse platform, we engage and recruit volunteers from the public to categorize images of time-frequency representations of glitches into pre-identified morphological classes and to discover new classes that appear as the detectors evolve. In addition, machine learning algorithms are used to categorize images after being trained on human-classified examples of the morphological classes. 
Leveraging the strengths of both classification methods, we create a combined method with the aim of improving the efficiency and accuracy of each individual classifier. The resulting classification and characterization should help LIGO scientists to identify causes of glitches and subsequently eliminate them from the data or the detector entirely, thereby improving the rate and accuracy of gravitational-wave observations. We demonstrate these methods using a small subset of data from LIGO's first observing run.

  18. Gravity Spy: integrating advanced LIGO detector characterization, machine learning, and citizen science

    NASA Astrophysics Data System (ADS)

    Zevin, M.; Coughlin, S.; Bahaadini, S.; Besler, E.; Rohani, N.; Allen, S.; Cabero, M.; Crowston, K.; Katsaggelos, A. K.; Larson, S. L.; Lee, T. K.; Lintott, C.; Littenberg, T. B.; Lundgren, A.; Østerlund, C.; Smith, J. R.; Trouille, L.; Kalogera, V.

    2017-03-01

    With the first direct detection of gravitational waves, the advanced laser interferometer gravitational-wave observatory (LIGO) has initiated a new field of astronomy by providing an alternative means of sensing the universe. The extreme sensitivity required to make such detections is achieved through exquisite isolation of all sensitive components of LIGO from non-gravitational-wave disturbances. Nonetheless, LIGO is still susceptible to a variety of instrumental and environmental sources of noise that contaminate the data. Of particular concern are noise features known as glitches, which are transient and non-Gaussian in their nature, and occur at a high enough rate so that accidental coincidence between the two LIGO detectors is non-negligible. Glitches come in a wide range of time-frequency-amplitude morphologies, with new morphologies appearing as the detector evolves. Since they can obscure or mimic true gravitational-wave signals, a robust characterization of glitches is paramount in the effort to achieve the gravitational-wave detection rates that are predicted by the design sensitivity of LIGO. This proves a daunting task for members of the LIGO Scientific Collaboration alone due to the sheer amount of data. In this paper we describe an innovative project that combines crowdsourcing with machine learning to aid in the challenging task of categorizing all of the glitches recorded by the LIGO detectors. Through the Zooniverse platform, we engage and recruit volunteers from the public to categorize images of time-frequency representations of glitches into pre-identified morphological classes and to discover new classes that appear as the detectors evolve. In addition, machine learning algorithms are used to categorize images after being trained on human-classified examples of the morphological classes. 
Leveraging the strengths of both classification methods, we create a combined method with the aim of improving the efficiency and accuracy of each individual classifier. The resulting classification and characterization should help LIGO scientists to identify causes of glitches and subsequently eliminate them from the data or the detector entirely, thereby improving the rate and accuracy of gravitational-wave observations. We demonstrate these methods using a small subset of data from LIGO’s first observing run.

  19. Gravity Spy: integrating advanced LIGO detector characterization, machine learning, and citizen science

    PubMed Central

    Zevin, M; Coughlin, S; Bahaadini, S; Besler, E; Rohani, N; Allen, S; Cabero, M; Crowston, K; Katsaggelos, A K; Larson, S L; Lee, T K; Lintott, C; Littenberg, T B; Lundgren, A; Østerlund, C; Smith, J R; Trouille, L; Kalogera, V

    2018-01-01

    With the first direct detection of gravitational waves, the advanced laser interferometer gravitational-wave observatory (LIGO) has initiated a new field of astronomy by providing an alternative means of sensing the universe. The extreme sensitivity required to make such detections is achieved through exquisite isolation of all sensitive components of LIGO from non-gravitational-wave disturbances. Nonetheless, LIGO is still susceptible to a variety of instrumental and environmental sources of noise that contaminate the data. Of particular concern are noise features known as glitches, which are transient and non-Gaussian in their nature, and occur at a high enough rate so that accidental coincidence between the two LIGO detectors is non-negligible. Glitches come in a wide range of time-frequency-amplitude morphologies, with new morphologies appearing as the detector evolves. Since they can obscure or mimic true gravitational-wave signals, a robust characterization of glitches is paramount in the effort to achieve the gravitational-wave detection rates that are predicted by the design sensitivity of LIGO. This proves a daunting task for members of the LIGO Scientific Collaboration alone due to the sheer amount of data. In this paper we describe an innovative project that combines crowdsourcing with machine learning to aid in the challenging task of categorizing all of the glitches recorded by the LIGO detectors. Through the Zooniverse platform, we engage and recruit volunteers from the public to categorize images of time-frequency representations of glitches into pre-identified morphological classes and to discover new classes that appear as the detectors evolve. In addition, machine learning algorithms are used to categorize images after being trained on human-classified examples of the morphological classes. 
Leveraging the strengths of both classification methods, we create a combined method with the aim of improving the efficiency and accuracy of each individual classifier. The resulting classification and characterization should help LIGO scientists to identify causes of glitches and subsequently eliminate them from the data or the detector entirely, thereby improving the rate and accuracy of gravitational-wave observations. We demonstrate these methods using a small subset of data from LIGO’s first observing run. PMID:29722360

  20. Evolutionary history of human disease genes reveals phenotypic connections and comorbidity among genetic diseases

    NASA Astrophysics Data System (ADS)

    Park, Solip; Yang, Jae-Seong; Kim, Jinho; Shin, Young-Eun; Hwang, Jihye; Park, Juyong; Jang, Sung Key; Kim, Sanguk

    2012-10-01

    The extent to which evolutionary changes have impacted the phenotypic relationships among human diseases remains unclear. In this work, we report that phenotypically similar diseases are connected by the evolutionary constraints on human disease genes. Human disease groups can be classified into slowly or rapidly evolving classes, where the diseases in the slowly evolving class are enriched with morphological phenotypes and those in the rapidly evolving class are enriched with physiological phenotypes. Our findings establish a clear evolutionary connection between disease classes and disease phenotypes for the first time. Furthermore, the high comorbidity found between diseases connected by similar evolutionary constraints enables us to improve the predictability of the relative risk of human diseases. We find that the evolutionary constraints on disease genes constitute a new layer of molecular connection in the network-based exploration of human diseases.

  1. Evolutionary history of human disease genes reveals phenotypic connections and comorbidity among genetic diseases.

    PubMed

    Park, Solip; Yang, Jae-Seong; Kim, Jinho; Shin, Young-Eun; Hwang, Jihye; Park, Juyong; Jang, Sung Key; Kim, Sanguk

    2012-01-01

    The extent to which evolutionary changes have impacted the phenotypic relationships among human diseases remains unclear. In this work, we report that phenotypically similar diseases are connected by the evolutionary constraints on human disease genes. Human disease groups can be classified into slowly or rapidly evolving classes, where the diseases in the slowly evolving class are enriched with morphological phenotypes and those in the rapidly evolving class are enriched with physiological phenotypes. Our findings establish a clear evolutionary connection between disease classes and disease phenotypes for the first time. Furthermore, the high comorbidity found between diseases connected by similar evolutionary constraints enables us to improve the predictability of the relative risk of human diseases. We find that the evolutionary constraints on disease genes constitute a new layer of molecular connection in the network-based exploration of human diseases.

  2. Cleaning up That Mess: A Framework for Classifying Educational Apps

    ERIC Educational Resources Information Center

    Cherner, Todd; Dix, Judy; Lee, Corey

    2014-01-01

    As tablet technologies continue to evolve, the emergence of educational applications (apps) is impacting the work of teacher educators. Beyond online lists of best apps for education and recommendations from colleagues, teacher educators have few resources available to support their teaching of how to select educational apps. In response, this…

  3. Actors, Observers, and the Attribution of Intent in Conversation.

    ERIC Educational Resources Information Center

    Ehrenhaus, Peter C.

    A study examined the manner in which conversants and observers of conversants attribute intent to messages in ongoing information-seeking conversations. College students were used to evolve and test three scenarios, in which evasion was more or less likely, and a system of classifying intention in information seeking conversations. Fifty-four…

  4. Link prediction in multiplex online social networks

    NASA Astrophysics Data System (ADS)

    Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž

    2017-02-01

    Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.

  5. Link prediction in multiplex online social networks.

    PubMed

    Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž

    2017-02-01

    Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.
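    The cross-layer feature idea described in these two records can be sketched in a few lines (a hypothetical simplification: common-neighbour counts per layer stand in for the paper's meta-path-based features, and the toy networks are invented):

```python
def common_neighbors(adj, u, v):
    """Number of shared neighbours of u and v in one network layer."""
    return len(adj.get(u, set()) & adj.get(v, set()))

def cross_layer_features(layer_a, layer_b, u, v):
    """Feature vector for a candidate link (u, v): one similarity score
    per layer. In a multiplex setting, features drawn from both layers
    feed a standard classifier (e.g. SVM or naive Bayes)."""
    return [common_neighbors(layer_a, u, v),
            common_neighbors(layer_b, u, v)]

# Toy two-layer multiplex network over the same (hypothetical) users.
twitter = {"u1": {"u2", "u3"}, "u2": {"u1", "u3"},
           "u3": {"u1", "u2"}, "u4": {"u3"}}
foursquare = {"u1": {"u3"}, "u2": {"u3"}, "u3": {"u1", "u2"}}

print(cross_layer_features(twitter, foursquare, "u1", "u2"))  # [1, 1]
```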

  6. Machine learning in infrared object classification - an all-sky selection of YSO candidates

    NASA Astrophysics Data System (ADS)

    Marton, Gabor; Zahorecz, Sarolta; Toth, L. Viktor; Magnus McGehee, Peregrine; Kun, Maria

    2015-08-01

    Object classification is a fundamental and challenging problem in the era of big data. I will discuss up-to-date methods and their application to classifying infrared point sources. We analysed the ALLWISE catalogue, the most recent public source catalogue of the Wide-field Infrared Survey Explorer (WISE), to compile a reliable list of Young Stellar Object (YSO) candidates. We tested and compared classical as well as up-to-date statistical methods to discriminate source types such as extragalactic objects, evolved stars, main sequence stars, objects related to the interstellar medium, and YSO candidates, using their mid-IR WISE properties and associated near-IR 2MASS data. In this particular classification problem, Support Vector Machines (SVM), a class of supervised learning algorithms, turned out to be the best tool. As a result, we classify Class I and II YSOs with >90% accuracy, while the fraction of contaminating extragalactic objects remains well below 1%, based on the number of known objects listed in the SIMBAD and VizieR databases. We compare our results to other classification schemes from the literature and show that the SVM outperforms methods that apply linear cuts in colour-colour and colour-magnitude space. Our homogeneous YSO candidate catalogue can serve as an excellent pathfinder for future detailed observations of individual objects and a starting point for statistical studies that aim to add pieces to the big picture of star formation theory.

  7. Standalone medical device software: The evolving regulatory framework.

    PubMed

    McCarthy, Avril D; Lawford, Patricia V

    2014-01-01

    The paper provides an introduction to the regulatory landscape affecting a particular category of medical technology, namely standalone software, sometimes referred to as 'software as a medical device'. To aid the reader's comprehension of an often complex area, six case studies are outlined and discussed before the paper goes on to detail how software with a medical purpose in its own right can potentially be classified as a medical device. The reader is given an appreciation of how to go about classifying such software, together with references to support the developer new to the field in locating detailed regulatory documents and contact points for advice.

  8. Feminisms and Educational Research. Philosophy, Theory, and Educational Research Series

    ERIC Educational Resources Information Center

    Kohli, Wendy R.; Burbules, Nicholas C.

    2011-01-01

    Feminist theory has come a long way from its nascent beginnings--no longer can it be classified as "liberal," "socialist," or "radical." It has shaped and evolved to take on multiple meanings and forms, each distinct in its own perspective and theory. In "Feminisms and Educational Research," the authors explore the various forms of feminisms,…

  9. FORUM: Affective Learning. Affective Learning: Evolving from Values and Planned Behaviors to Internalization and Pervasive Behavioral Change

    ERIC Educational Resources Information Center

    Thweatt, Katherine S.; Wrench, Jason S.

    2015-01-01

    The mission of "Communication Education" is to publish the best research on communication and learning. Researchers study the communication-learning interface in many ways, but a common approach is to explore how instructor and student communication can lead to better learning outcomes. Although scholars have long classified learning…

  10. Disadvantaged Young People Accessing the New Urban Economies of the Post-Industrial City

    ERIC Educational Resources Information Center

    Raffo, Carlo

    2006-01-01

    The aim of the paper is to examine current and evolving supply side transition policy initiatives in the light of (a) particular demand side needs of urban young people classified as those most disadvantaged and potentially marginalized; and (b) the emerging realities of accessing and operating within particular examples of high value-added…

  11. Highly Interactive WWW Services: A New Type of Information Sources.

    ERIC Educational Resources Information Center

    Vanouplines, Patrick; Nieuwenhuysen, P.

    The World Wide Web is evolving from a collection of texts linked by hypertext and hypermedia toward services that operate interactively with the information user, and which offer results through use of a broad spectrum of tools. This paper presents a collection of interactive WWW services. The services are classified on the basis of the client…

  12. Protein interface classification by evolutionary analysis

    PubMed Central

    2012-01-01

    Background Distinguishing biologically relevant interfaces from lattice contacts in protein crystals is a fundamental problem in structural biology. Despite efforts towards the computational prediction of interface character, many issues are still unresolved. Results We present here a protein-protein interface classifier that relies on evolutionary data to detect the biological character of interfaces. The classifier uses a simple geometric measure, the number of core residues, and two evolutionary indicators based on the sequence entropy of homolog sequences. Both indicators aim to detect differential selection pressure between the interface core and the rim or the rest of the surface. The core residues, defined as fully buried residues (>95% burial), appear to be fundamental determinants of biological interfaces: their number is in itself a powerful discriminator of interface character, and together with the evolutionary measures it is able to clearly distinguish evolved biological contacts from crystal ones. We demonstrate that this definition of core residues leads to distinctively better results than earlier definitions from the literature. The stringent selection and quality filtering of structural and sequence data was key to the success of the method. Most importantly, we demonstrate that a more conservative selection of homolog sequences - with relatively high sequence identities to the query - is able to produce a clearer signal than previous attempts. Conclusions An evolutionary approach like the one presented here is key to the advancement of the field, which has so far lacked an effective method exploiting the evolutionary character of protein interfaces. Its coverage and performance will only improve over time thanks to the incessant growth of sequence databases. Currently our method reaches an accuracy of 89% in classifying the interfaces of the Ponstingl 2003 datasets, and it lends itself to a variety of useful applications in structural biology and bioinformatics. We made the corresponding software implementation available to the community as an easy-to-use graphical web interface at http://www.eppic-web.org. PMID:23259833

  13. Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers.

    PubMed

    Yu, Hualong; Hong, Shufang; Yang, Xibei; Ni, Jun; Dan, Yuanyuan; Qin, Bin

    2013-01-01

    DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they merely focus on the binary-class case. In this paper, we dealt with the multiclass imbalanced classification problem, as encountered in cancer DNA microarrays, by using ensemble learning. We utilized a one-against-all coding strategy to transform the multiclass problem into multiple binary ones, applying to each of them feature subspace, an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely decision threshold adjustment or random undersampling, into each training subset to mitigate the damage of class imbalance. Specifically, a support vector machine was used as the base classifier, and a novel voting rule called counter voting was presented for making the final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that, unlike many traditional classification approaches, our methods are insensitive to class imbalance.
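The decomposition pipeline described above can be sketched with stdlib Python. This is a simplified, hypothetical illustration of one-against-all coding, random feature subspaces, and random undersampling; the SVM base classifier and the counter-voting rule are not reproduced.

```python
import random

def one_vs_all_labels(y, positive_class):
    """Recode a multiclass label vector as binary for one class."""
    return [1 if label == positive_class else 0 for label in y]

def random_subspace(n_features, k, rng):
    """Pick k feature indices to form one diverse training subset."""
    return sorted(rng.sample(range(n_features), k))

def undersample(X, y, rng):
    """Randomly drop majority-class samples to match the minority count."""
    pos = [i for i, lbl in enumerate(y) if lbl == 1]
    neg = [i for i, lbl in enumerate(y) if lbl == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in keep], [y[i] for i in keep]

rng = random.Random(0)
y = ["A", "B", "B", "C", "B", "B"]          # imbalanced: one "A" vs four "B"
y_bin = one_vs_all_labels(y, "A")           # [1, 0, 0, 0, 0, 0]
X = [[float(i)] * 4 for i in range(6)]      # toy 4-feature expression matrix
subspace = random_subspace(4, 2, rng)       # feature indices for one subset
Xs, ys = undersample(X, y_bin, rng)
print(len(subspace), len(Xs), sorted(ys))   # 2 2 [0, 1]
```

Each binary subproblem would then train its own base classifier on such a balanced, subspace-projected subset before the votes are combined.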

  14. Spider Neurotoxins, Short Linear Cationic Peptides and Venom Protein Classification Improved by an Automated Competition between Exhaustive Profile HMM Classifiers

    PubMed Central

    Koua, Dominique; Kuhn-Nentwig, Lucia

    2017-01-01

    Spider venoms are rich cocktails of bioactive peptides, proteins, and enzymes that have been intensively investigated over the years. In order to provide a better comprehension of this richness, we propose a three-level family classification system for spider venom components. This classification is supported by an exhaustive set of 219 new profile hidden Markov models (HMMs) able to attribute a given peptide to its precise peptide type, family, and group. The proposed classification has the advantage of being totally independent of variable spider taxonomic names and can easily evolve. In addition to the new classifiers, we introduce and demonstrate the efficiency of hmmcompete, a new standalone tool that monitors HMM-based family classification and, after post-processing the results, reports the best classifier when multiple models produce significant scores for a given peptide query. The combined use of hmmcompete and the new spider venom component-specific classifiers demonstrated 96% sensitivity in properly classifying all known spider toxins from the UniProtKB database. These tools are timely given the pressing classification needs caused by the increasing number of peptides and proteins generated by transcriptomic projects. PMID:28786958

  15. Evolvable social agents for bacterial systems modeling.

    PubMed

    Paton, Ray; Gregory, Richard; Vlachos, Costas; Saunders, Jon; Wu, Henry

    2004-09-01

    We present two approaches to the individual-based modeling (IbM) of bacterial ecologies and evolution using computational tools. The IbM approach is introduced, and its important complementary role to biosystems modeling is discussed. A fine-grained model of bacterial evolution is then presented that is based on networks of interactivity between computational objects representing genes and proteins. This is followed by a coarser grained agent-based model, which is designed to explore the evolvability of adaptive behavioral strategies in artificial bacteria represented by learning classifier systems. The structure and implementation of the two proposed individual-based bacterial models are discussed, and some results from simulation experiments are presented, illustrating their adaptive properties.

  16. Quantum adiabatic machine learning

    NASA Astrophysics Data System (ADS)

    Pudenz, Kristen L.; Lidar, Daniel A.

    2013-05-01

    We develop an approach to machine learning and anomaly detection via quantum adiabatic evolution. This approach consists of two quantum phases, with some amount of classical preprocessing to set up the quantum problems. In the training phase we identify an optimal set of weak classifiers, to form a single strong classifier. In the testing phase we adiabatically evolve one or more strong classifiers on a superposition of inputs in order to find certain anomalous elements in the classification space. Both the training and testing phases are executed via quantum adiabatic evolution. All quantum processing is strictly limited to two-qubit interactions so as to ensure physical feasibility. We apply and illustrate this approach in detail to the problem of software verification and validation, with a specific example of the learning phase applied to a problem of interest in flight control systems. Beyond this example, the algorithm can be used to attack a broad class of anomaly detection problems.
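The training phase above selects a subset of weak classifiers to form a strong one, which in the adiabatic setting is posed as a binary (QUBO-style) optimization. A minimal classical sketch follows; the toy data and the brute-force search are my own stand-ins for the quantum adiabatic evolution.

```python
from itertools import product

def strong_output(weights, weak_outputs):
    """Sign of the weighted sum of weak classifier outputs (+/-1)."""
    s = sum(w * o for w, o in zip(weights, weak_outputs))
    return 1 if s >= 0 else -1

def training_error(weights, samples):
    """Number of training samples the strong classifier gets wrong."""
    return sum(strong_output(weights, x) != y for x, y in samples)

# Toy data: each sample is (weak classifier outputs, true label in {-1, +1}).
samples = [([1, -1, 1], 1), ([-1, -1, 1], -1),
           ([1, 1, -1], 1), ([-1, 1, -1], -1)]

# Binary inclusion weights, as in a QUBO formulation; exhaustive search
# over all 2^3 assignments replaces the adiabatic optimizer.
best = min(product([0, 1], repeat=3),
           key=lambda w: training_error(w, samples))
print(training_error(best, samples))  # 0
```

Here the first weak classifier agrees with every label, so the search finds a zero-error strong classifier; on real data the objective would also penalize the number of included weak classifiers.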

  17. Recent Advances in the Classification of Low-grade Papillary-like Thyroid Neoplasms and Aggressive Papillary Thyroid Carcinomas: Evolution of Diagnostic Criteria.

    PubMed

    Guo, Zhenying; Ge, Minghua; Chu, Ying-Hsia; Asioli, Sofia; Lloyd, Ricardo V

    2018-07-01

    Papillary thyroid carcinomas account for ∼80% of well-differentiated thyroid tumors. During the past decade, several new variants of papillary-like thyroid neoplasms and papillary thyroid carcinomas have been recognized. Some of these neoplasms that were previously classified as malignant have been reclassified as low-grade neoplasms, as the diagnostic criteria have evolved. Similarly, some of the papillary thyroid carcinomas that were previously classified as conventional or classic papillary thyroid carcinomas have now been recognized as more aggressive variants of papillary thyroid carcinomas. Recognizing these differences becomes more important for the proper medical, surgical, and radiotherapeutic management of patients with these neoplasms.

  18. Surgical treatment of avulsion fractures at the tibial insertion of the posterior cruciate ligament: functional result

    PubMed Central

    Barros, Marcos Alexandre; Cervone, Gabriel Lopes de Faria; Costa, André Luis Serigatti

    2015-01-01

    Objective To objectively and subjectively evaluate the functional result from before to after surgery among patients with a diagnosis of an isolated avulsion fracture of the posterior cruciate ligament who were treated surgically. Method Five patients were evaluated by means of reviewing the medical files, applying the Lysholm questionnaire, physical examination and radiological examination. For the statistical analysis, a significance level of 0.10 and 95% confidence interval were used. Results According to the Lysholm criteria, all the patients were classified as poor (<64 points) before the operation and evolved to a mean of 96 points six months after the operation. We observed that 100% of the posterior drawer cases became negative, taking values less than 5 mm to be negative. Conclusion Surgical methods with stable fixation for treating avulsion fractures at the tibial insertion of the posterior cruciate ligament produce acceptable functional results from the surgical and radiological points of view, with a significance level of 0.042. PMID:27218073

  19. Constraints on submicrojansky radio number counts based on evolving VLA-COSMOS luminosity functions

    NASA Astrophysics Data System (ADS)

    Novak, M.; Smolčić, V.; Schinnerer, E.; Zamorani, G.; Delvecchio, I.; Bondi, M.; Delhaize, J.

    2018-06-01

    We present an investigation of radio luminosity functions (LFs) and number counts based on the Karl G. Jansky Very Large Array-COSMOS 3 GHz Large Project. The radio-selected sample of 7826 galaxies with robust optical/near-infrared counterparts with excellent photometric coverage allows us to construct the total radio LF since z ∼ 5.7. Using the Markov chain Monte Carlo algorithm, we fit the redshift-dependent pure luminosity evolution model to the data and compare it with previously published VLA-COSMOS LFs obtained on individual populations of radio-selected star-forming galaxies and galaxies hosting active galactic nuclei, classified on the basis of presence or absence of a radio excess with respect to the star-formation rates derived from the infrared emission. We find they are in excellent agreement, thus showing the reliability of the radio excess method in selecting these two galaxy populations at radio wavelengths. We study radio number counts down to submicrojansky levels drawn from different models of evolving LFs. We show that our evolving LFs are able to reproduce the observed radio sky brightness, even though we rely on extrapolations toward the faint end. Our results also imply that no new radio-emitting galaxy population is present below 1 μJy. Our work suggests that selecting galaxies with radio flux densities between 0.1 and 10 μJy will yield a star-forming galaxy in 90-95% of the cases, with a high percentage of these galaxies existing around a redshift of z ∼ 2, thus providing useful constraints for planned surveys with the Square Kilometer Array and its precursors.

  20. The Evolution of Vicia ramuliflora (Fabaceae) at Tetraploid and Diploid Levels Revealed with FISH and RAPD

    PubMed Central

    Han, Ying; Liu, Yuan; Wang, Haoyou; Liu, Xiangjun

    2017-01-01

    Vicia ramuliflora L. is a widely distributed species in Eurasia with high economic value. Over the past 200 years, it has evolved a tetraploid cytotype and new subspecies at the diploid level. Based on taxonomy, cytogeography and other lines of evidence, previous studies have provided valuable information about the evolution of V. ramuliflora ploidy levels, but due to the limited resolution of traditional methods, important questions remain. In this study, fluorescence in situ hybridization (FISH) and random amplified polymorphic DNA (RAPD) were used to analyze the evolution of V. ramuliflora at the diploid and tetraploid levels. Our aim was to reveal the genomic constitution and parentage of the tetraploid V. ramuliflora and the relationships among diploid V. ramuliflora populations. Our study showed that the tetraploid cytotype of V. ramuliflora at the Changbai Mountains (M) has 18S and 5S rDNA distribution patterns identical to those of the diploid Hengdaohezi population (B) and the diploid Dailing population (H). However, UPGMA clustering, Neighbor-Joining clustering and principal coordinates analysis based on RAPD showed that the tetraploid cytotype (M) is more closely related to the Qianshan diploid population (T). Based on our results and the fact that interspecific hybridization among Vicia species is very difficult, we think that the tetraploid V. ramuliflora is an autotetraploid whose genomic origin still needs further study. In addition, our study found that the Qianshan diploid population (T) has evolved distinct new traits compared with other diploid populations, which hints that V. ramuliflora has evolved further at the diploid level. We suggest that diploid population T be reclassified as a new subspecies. PMID:28135314

  2. Efficient quantitative assessment of facial paralysis using iris segmentation and active contour-based key points detection with hybrid classifier.

    PubMed

    Barbosa, Jocelyn; Lee, Kyubum; Lee, Sunwon; Lodhi, Bilal; Cho, Jae-Gu; Seo, Woo-Keun; Kang, Jaewoo

    2016-03-12

    Facial palsy or paralysis (FP) is a symptom in which voluntary muscle movement is lost on one side of the face, which can be devastating for patients. Traditional assessment methods rely solely on the clinician's judgment and are therefore time-consuming and subjective. Hence, a quantitative assessment system is invaluable for physicians beginning the rehabilitation process, yet producing a reliable and robust method is challenging and still underway. We introduce a novel approach to the quantitative assessment of facial paralysis that tackles the classification problem of FP type and degree of severity. Specifically, a novel method of quantitative assessment is presented: an algorithm that extracts the human iris and detects facial landmarks, and a hybrid approach combining rule-based and machine learning algorithms to analyze and prognosticate facial paralysis using the captured images. A method combining the optimized Daugman's algorithm and a Localized Active Contour (LAC) model is proposed to efficiently extract the iris and the facial landmarks or key points. To improve the performance of the LAC model, appropriate parameters of the initial evolving curve for facial feature segmentation are automatically selected. The symmetry score is measured by the ratio between features extracted from the two sides of the face. Hybrid classifiers (i.e. rule-based with regularized logistic regression) were employed for discriminating healthy and unhealthy subjects, for FP type classification, and for facial paralysis grading based on the House-Brackmann (H-B) scale. Quantitative analysis was performed to evaluate the performance of the proposed approach, and experiments demonstrate its efficiency. Facial movement feature extraction based on iris segmentation and LAC-based key point detection, along with a hybrid classifier, provides a more efficient way of addressing the classification problem of facial palsy type and degree of severity. Combining iris segmentation with a key point-based method has several merits that are essential for our real application. Aside from the facial key points, iris segmentation provides a significant contribution as it describes the changes in iris exposure while performing facial expressions. It reveals the significant difference between the healthy side and the severe palsy side when raising the eyebrows with both eyes directed upward, and can model the typical changes in the iris region.

  3. Death of an order: a comprehensive molecular phylogenetic study confirms that termites are eusocial cockroaches.

    PubMed

    Inward, Daegan; Beccaloni, George; Eggleton, Paul

    2007-06-22

    Termites are instantly recognizable mound-builders and house-eaters: their complex social lifestyles have made them incredibly successful throughout the tropics. Although known as 'white ants', they are not ants and their relationships with other insects remain unclear. Our molecular phylogenetic analyses, the most comprehensive yet attempted, show that termites are social cockroaches, no longer meriting being classified as a separate order (Isoptera) from the cockroaches (Blattodea). Instead, we propose that they should be treated as a family (Termitidae) of cockroaches. It is surprising to find that a group of wood-feeding cockroaches has evolved full sociality, as other ecologically dominant fully social insects (e.g. ants, social bees and social wasps) have evolved from solitary predatory wasps.

  4. Bacteriophage Taxonomy: An Evolving Discipline.

    PubMed

    Tolstoy, Igor; Kropinski, Andrew M; Brister, J Rodney

    2018-01-01

    While taxonomy is an often-unappreciated branch of science it serves very important roles. Bacteriophage taxonomy has evolved from a mainly morphology-based discipline, characterized by the work of David Bradley and Hans-Wolfgang Ackermann, to the holistic approach that is taken today. The Bacterial and Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV) takes a comprehensive approach to classifying prokaryote viruses measuring overall DNA and protein identity and phylogeny before making decisions about the taxonomic position of a new virus. The huge number of complete genomes being deposited with NCBI and other public databases has resulted in a reassessment of the taxonomy of many viruses, and the future will see the introduction of new viral families and higher orders.

  5. Evolving neural networks with genetic algorithms to study the string landscape

    NASA Astrophysics Data System (ADS)

    Ruehle, Fabian

    2017-08-01

    We study possible applications of artificial neural networks to the examination of the string landscape. Since the field of application is rather versatile, we propose to dynamically evolve these networks via genetic algorithms. This means that we start from basic building blocks and combine them such that the neural network performs best for the application we are interested in. We study three areas in which neural networks can be applied: to classify models according to a fixed set of (physically) appealing features, to find a concrete realization for a computation for which the precise algorithm is known in principle but very tedious to actually implement, and to predict or approximate the outcome of some involved mathematical computation that is too inefficient to apply directly, e.g. in model scans within the string landscape. We present simple examples that arise in string phenomenology for all three types of problems and discuss how they can be addressed by neural networks evolved with genetic algorithms.

  6. Comparing ensemble learning methods based on decision tree classifiers for protein fold recognition.

    PubMed

    Bardsiri, Mahshid Khatibi; Eftekhari, Mahdi

    2014-01-01

    In this paper, several methods for ensemble learning of protein fold recognition based on decision trees (DT) are compared over three datasets taken from the literature. Following previously reported studies, the features of the datasets are divided into groups. For each of these groups, three ensemble classifiers, namely random forest, rotation forest and AdaBoost.M1, are employed. Fusion methods are then introduced for combining the ensemble classifiers obtained in the previous step, producing three classifiers based on the combination of classifiers of the random forest, rotation forest and AdaBoost.M1 types. Finally, the three resulting classifiers are combined into an overall classifier. Experimental results show that the overall classifier obtained by the genetic algorithm (GA) weighting fusion method is the best one in terms of classification accuracy, in comparison to previously applied methods.
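The final fusion step can be sketched as weighted voting over the three ensembles' predictions. The toy predictions below are hypothetical, and an exhaustive grid search stands in for the GA that optimizes the weights.

```python
from itertools import product

def weighted_vote(predictions, weights):
    """Combine per-classifier class predictions by weighted voting."""
    scores = {}
    for pred, w in zip(predictions, weights):
        scores[pred] = scores.get(pred, 0.0) + w
    return max(scores, key=scores.get)

def accuracy(all_preds, truth, weights):
    """Fraction of samples the fused classifier labels correctly."""
    fused = [weighted_vote(p, weights) for p in all_preds]
    return sum(f == t for f, t in zip(fused, truth)) / len(truth)

# Toy validation set: rows hold (rf, rot, ada) predictions per sample.
preds = [("a", "a", "b"), ("b", "a", "b"), ("a", "b", "b"), ("b", "b", "a")]
truth = ["a", "b", "b", "b"]

# Grid search over candidate weights as a stand-in for the GA.
grid = [w for w in product([0.0, 0.5, 1.0], repeat=3) if sum(w) > 0]
best = max(grid, key=lambda w: accuracy(preds, truth, w))
print(accuracy(preds, truth, best))  # 1.0
```

No single toy classifier is perfect here (each errs on one sample), yet a suitable weighting of their votes recovers every label, which is the motivation for searching the weight space with a GA.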

  7. Using Supervised Machine Learning to Classify Real Alerts and Artifact in Online Multi-signal Vital Sign Monitoring Data

    PubMed Central

    Chen, Lujie; Dubrawski, Artur; Wang, Donghan; Fiterau, Madalina; Guillame-Bert, Mathieu; Bose, Eliezer; Kaynar, Ata M.; Wallace, David J.; Guttendorf, Jane; Clermont, Gilles; Pinsky, Michael R.; Hravnak, Marilyn

    2015-01-01

    OBJECTIVE Use machine-learning (ML) algorithms to classify alerts as real or artifacts in online noninvasive vital sign (VS) data streams to reduce alarm fatigue and missed true instability. METHODS Using a 24-bed trauma step-down unit’s non-invasive VS monitoring data (heart rate [HR], respiratory rate [RR], peripheral oximetry [SpO2]) recorded at 1/20 Hz, and noninvasive oscillometric blood pressure [BP] recorded less frequently, we partitioned data into training/validation (294 admissions; 22,980 monitoring hours) and test sets (2,057 admissions; 156,177 monitoring hours). Alerts were VS deviations beyond stability thresholds. A four-member expert committee annotated a subset of alerts (576 in the training/validation set, 397 in the test set), selected by active learning, as real or artifact, upon which we trained ML algorithms. The best model was evaluated on alerts in the test set to enact online alert classification as signals evolve over time. MAIN RESULTS The Random Forest model discriminated between real and artifact as the alerts evolved online in the test set with area under the curve (AUC) performance of 0.79 (95% CI 0.67-0.93) for SpO2 at the instant the VS first crossed threshold, increasing to 0.87 (95% CI 0.71-0.95) at 3 minutes into the alerting period. BP AUC started at 0.77 (95% CI 0.64-0.95) and increased to 0.87 (95% CI 0.71-0.98), while RR AUC started at 0.85 (95% CI 0.77-0.95) and increased to 0.97 (95% CI 0.94-1.00). HR alerts were too few for model development. CONCLUSIONS ML models can discern clinically relevant SpO2, BP and RR alerts from artifacts in an online monitoring dataset (AUC>0.87). PMID:26992068
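The AUC figures reported above have a simple probabilistic reading: the chance that a randomly chosen real alert scores higher than a randomly chosen artifact. A minimal sketch with hypothetical scores (not the study's data) computes it via the rank-sum identity.

```python
# Sketch: AUC for a real-vs-artifact alert classifier, computed as the
# probability that a random real alert outranks a random artifact
# (ties count half). Scores below are invented for illustration.

def auc(scores_real, scores_artifact):
    wins = 0.0
    for r in scores_real:
        for a in scores_artifact:
            if r > a:
                wins += 1.0
            elif r == a:
                wins += 0.5
    return wins / (len(scores_real) * len(scores_artifact))

# Hypothetical classifier scores as an alert evolves: discrimination
# improves with minutes elapsed, mirroring the paper's trend.
at_onset = auc([0.6, 0.7, 0.5, 0.9], [0.4, 0.6, 0.3, 0.5])
at_3min = auc([0.8, 0.9, 0.7, 0.95], [0.4, 0.5, 0.3, 0.6])
print(round(at_onset, 3), round(at_3min, 3))  # 0.875 1.0
```

Evaluating the same model at successive instants of an alert, as the study does, simply repeats this computation on the scores available at each time point.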

  8. Method and System for Hydrogen Evolution and Storage

    DOEpatents

    Thorn, David L.; Tumas, William; Hay, P. Jeffrey; Schwarz, Daniel E.; Cameron, Thomas M.

    2008-10-21

    A method and system for storing and evolving hydrogen employ chemical compounds that can be hydrogenated to store hydrogen and dehydrogenated to evolve hydrogen. A catalyst lowers the energy required for storing and evolving hydrogen. The method and system can provide hydrogen for devices that consume hydrogen as fuel.

  9. Method and system for hydrogen evolution and storage

    DOEpatents

    Thorn, David L.; Tumas, William; Hay, P. Jeffrey; Schwarz, Daniel E.; Cameron, Thomas M.

    2012-12-11

    A method and system for storing and evolving hydrogen (H₂) employ chemical compounds that can be hydrogenated to store hydrogen and dehydrogenated to evolve hydrogen. A catalyst lowers the energy required for storing and evolving hydrogen. The method and system can provide hydrogen for devices that consume hydrogen as fuel.

  10. Inferring the lithology of borehole rocks by applying neural network classifiers to downhole logs: an example from the Ocean Drilling Program

    NASA Astrophysics Data System (ADS)

    Benaouda, D.; Wadge, G.; Whitmarsh, R. B.; Rothwell, R. G.; MacLeod, C.

    1999-02-01

    In boreholes with partial or no core recovery, interpretations of lithology in the remainder of the hole are routinely attempted using data from downhole geophysical sensors. We present a practical neural net-based technique that greatly enhances lithological interpretation in holes with partial core recovery by using downhole data to train classifiers to give a global classification scheme for those parts of the borehole for which no core was retrieved. We describe the system and its underlying methods of data exploration, selection and classification, and present a typical example of the system in use. Although the technique is equally applicable to oil industry boreholes, we apply it here to an Ocean Drilling Program (ODP) borehole (Hole 792E, Izu-Bonin forearc, a mixture of volcaniclastic sandstones, conglomerates and claystones). The quantitative benefits of quality-control measures and different subsampling strategies are shown. Direct comparisons between a number of discriminant analysis methods and the use of neural networks with back-propagation of error are presented. The neural networks perform better than the discriminant analysis techniques both in terms of performance rates with test data sets (2-3 per cent better) and in qualitative correlation with non-depth-matched core. We illustrate with the Hole 792E data how vital it is to have a system that permits the number and membership of training classes to be changed as analysis proceeds. The initial classification for Hole 792E evolved from a five-class to a three-class and then to a four-class scheme with resultant classification performance rates for the back-propagation neural network method of 83, 84 and 93 per cent respectively.

  11. The Phylogeny of Little Red Riding Hood

    PubMed Central

    Tehrani, Jamshid J.

    2013-01-01

    Researchers have long been fascinated by the strong continuities evident in the oral traditions associated with different cultures. According to the ‘historic-geographic’ school, it is possible to classify similar tales into “international types” and trace them back to their original archetypes. However, critics argue that folktale traditions are fundamentally fluid, and that most international types are artificial constructs. Here, these issues are addressed using phylogenetic methods that were originally developed to reconstruct evolutionary relationships among biological species, and which have been recently applied to a range of cultural phenomena. The study focuses on one of the most debated international types in the literature: ATU 333, ‘Little Red Riding Hood’. A number of variants of ATU 333 have been recorded in European oral traditions, and it has been suggested that the group may include tales from other regions, including Africa and East Asia. However, in many of these cases, it is difficult to differentiate ATU 333 from another widespread international folktale, ATU 123, ‘The Wolf and the Kids’. To shed more light on these relationships, data on 58 folktales were analysed using cladistic, Bayesian and phylogenetic network-based methods. The results demonstrate that, contrary to the claims made by critics of the historic-geographic approach, it is possible to identify ATU 333 and ATU 123 as distinct international types. They further suggest that most of the African tales can be classified as variants of ATU 123, while the East Asian tales probably evolved by blending together elements of both ATU 333 and ATU 123. These findings demonstrate that phylogenetic methods provide a powerful set of tools for testing hypotheses about cross-cultural relationships among folktales, and point towards exciting new directions for research into the transmission and evolution of oral narratives. PMID:24236061

  12. Pairwise Classifier Ensemble with Adaptive Sub-Classifiers for fMRI Pattern Analysis.

    PubMed

    Kim, Eunwoo; Park, HyunWook

    2017-02-01

    The multi-voxel pattern analysis technique is applied to fMRI data for classification of high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification to a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class-pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
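
    The pairwise decomposition underlying this kind of ensemble can be sketched generically. The fragment below is an illustration only, not the authors' method: each sub-classifier here is a plain nearest-centroid rule, whereas the paper uses multiple sub-classifiers with adaptive feature sets per class pair.

```python
import numpy as np
from itertools import combinations

def train_pairwise(X, y):
    """Train one sub-classifier per class pair; here each sub-classifier
    is just the pair of class centroids (a nearest-centroid rule)."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        models[(a, b)] = (X[y == a].mean(axis=0), X[y == b].mean(axis=0))
    return models

def predict_pairwise(models, x):
    """Every pairwise classifier casts a vote; the majority class wins."""
    votes = {}
    for (a, b), (ca, cb) in models.items():
        winner = a if np.linalg.norm(x - ca) <= np.linalg.norm(x - cb) else b
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)
```

    A K-class problem thus becomes K(K-1)/2 binary problems, each of which can use its own feature subset, which is the property the proposed method exploits.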

  13. Between-Region Genetic Divergence Reflects the Mode and Tempo of Tumor Evolution

    PubMed Central

    Sun, Ruping; Hu, Zheng; Sottoriva, Andrea; Graham, Trevor A.; Harpak, Arbel; Ma, Zhicheng; Fischer, Jared M.; Shibata, Darryl; Curtis, Christina

    2017-01-01

    Given the implications of tumor dynamics for precision medicine, there is a need to systematically characterize the mode of evolution across diverse solid tumor types. In particular, methods to infer the role of natural selection within established human tumors are lacking. By simulating spatial tumor growth under different evolutionary modes and examining patterns of between-region subclonal genetic divergence from multi-region sequencing (MRS) data, we demonstrate that it is feasible to distinguish tumors driven by strong positive subclonal selection from those evolving neutrally or under weak selection, as the latter fail to dramatically alter subclonal composition. We developed a classifier based on measures of between-region subclonal genetic divergence and projected patient data into model space, revealing different modes of evolution both within and between solid tumor types. Our findings have broad implications for how human tumors progress, accumulate intra-tumor heterogeneity, and ultimately how they may be more effectively treated. PMID:28581503

  14. Problems in Classifying Mild Cognitive Impairment (MCI): One or Multiple Syndromes?

    PubMed Central

    Díaz-Mardomingo, María del Carmen; García-Herranz, Sara; Rodríguez-Fernández, Raquel; Venero, César; Peraita, Herminia

    2017-01-01

    As the conceptual, methodological, and technological advances applied to dementias have evolved the construct of mild cognitive impairment (MCI), one problem encountered has been its classification into subtypes. Here, we aim to revise the concept of MCI and its subtypes, addressing the problems of classification not only from the psychometric point of view or by using alternative methods, such as latent class analysis, but also considering the absence of normative data. In addition to the well-known influence of certain factors on cognitive function, such as educational level and cultural traits, recent studies highlight the relevance of other factors that may significantly affect the genesis and evolution of MCI: subjective memory complaints, loneliness, social isolation, etc. The present work will contemplate the most relevant attempts to clarify the issue of MCI categorization and classification, combining our own data with that from recent studies which suggest the role of relevant psychosocial factors in MCI. PMID:28862676

  15. New CO and HCN sources associated with IRAS carbon stars

    NASA Technical Reports Server (NTRS)

    NGUYEN-Q-RIEU; Epchtein, N.; TRUONG-BACH; Cohen, M.

    1987-01-01

    Emission of CO and HCN was detected in 22 out of a sample of 53 IRAS sources classified as unidentified carbon-rich objects. The sample was selected according to the presence of the silicon carbide feature as revealed by low-resolution spectra. The molecular line widths indicate that the CO and HCN emission arises from the circumstellar envelopes of very highly evolved stars undergoing mass loss. The visible stars tend to be deficient in CO as compared with the unidentified sources. Most of the detected CO and HCN IRAS stars are distinct, thick-shelled objects, but their infrared and CO luminosities are similar to those of IRC + 102156 AFGL and IRC-CO evolved stars. The 12 micron flux seems to be a good indicator of the distance, and hence a guide for molecular searches.

  16. New daily persistent headache: An evolving entity.

    PubMed

    Uniyal, Ravi; Paliwal, Vimal Kumar; Anand, Sucharita; Ambesh, Paurush

    2018-01-01

    New daily persistent headache (NDPH) is characterized by an abrupt onset of headache that becomes a daily entity, is unremitting and continuous from the onset, and lasts for more than 3 months. Dr Walter Vanast first described NDPH in the year 1986. Originally, it was proposed as a chronic daily headache but it was placed under "other primary headaches" in the International Classification of Headache Disorder Second Edition (ICHD 2nd edition). However, with evolving literature and better understanding of its clinical characteristics, it was classified as a "chronic daily headache" in the ICHD 3 rd edition beta. There are still many knowledge-gaps regarding the underlying cause, pathophysiology, natural history and treatment of NDPH. This review tries to revisit the entity and discusses the current status of understanding regarding NDPH.

  17. Classification of ipsilateral breast tumor recurrences after breast conservation therapy can predict patient prognosis and facilitate treatment planning

    PubMed Central

    Yi, Min; Buchholz, Thomas A.; Meric-Bernstam, Funda; Bedrosian, Isabelle; Hwang, Rosa F.; Ross, Merrick I.; Kuerer, Henry M.; Luo, Sheng; Gonzalez-Angulo, Ana M.; Buzdar, Aman U.; Symmans, W. Fraser; Feig, Barry W.; Lucci, Anthony; Huang, Eugene H.; Hunt, Kelly K.

    2015-01-01

    Objective To classify ipsilateral breast tumor recurrences (IBTR) as either new primary tumors (NP) or true local recurrence (TR). We utilized two different methods and compared sensitivities and specificities between them. Our goal was to determine whether distinguishing NP from TR had prognostic value. Summary Background Data After breast-conservation therapy (BCT), IBTR may be classified into two distinct types (NP and TR). Studies have attempted to classify IBTR by using tumor location, histologic subtype, DNA flow cytometry data, or gene-expression profiling data. Methods 447 (7.9%) of 5660 patients undergoing BCT from 1970 to 2005 experienced IBTR. Clinical data from 397 patients were available for review. We classified IBTRs as NP or TR on the basis of either tumor location and histologic subtype (method 1) or tumor location, histologic subtype, estrogen receptor (ER) status and human epidermal growth factor receptor 2 (HER-2) status (method 2). Kaplan-Meier curves and log-rank tests were used to evaluate overall and disease-specific survival (DSS) differences between the two groups. Classification methods were validated by calculating sensitivity and specificity values using a Bayesian method. Results Of 397 patients, 196 (49.4%) were classified as NP by method 1 and 212 (53.4%) were classified as NP by method 2. The sensitivity and specificity values were 0.812 and 0.867 for method 1 and 0.870 and 0.800 for method 2, respectively. Regardless of method used, patients classified as NP developed contralateral breast carcinoma more often but had better 10-year overall and DSS rates than those classified as TR. Patients with TR were more likely to develop metastatic disease after IBTR. Conclusion IBTR classified as TR and NP had clinically different features, suggesting that classifying IBTR may provide clinically significant data for the management of IBTR. PMID:21209588

  18. Methods for data classification

    DOEpatents

    Garrity, George [Okemos, MI; Lilburn, Timothy G [Front Royal, VA

    2011-10-11

    The present invention provides methods for classifying data and uncovering and correcting annotation errors. In particular, the present invention provides a self-organizing, self-correcting algorithm for use in classifying data. Additionally, the present invention provides a method for classifying biological taxa.

  19. Classification of early-stage non-small cell lung cancer by weighing gene expression profiles with connectivity information.

    PubMed

    Zhang, Ao; Tian, Suyan

    2018-05-01

    Pathway-based feature selection algorithms, which utilize biological information contained in pathways to guide which features/genes should be selected, have evolved quickly and become widespread in the field of bioinformatics. Based on how the pathway information is incorporated, we classify pathway-based feature selection algorithms into three major categories-penalty, stepwise forward, and weighting. Compared to the first two categories, the weighting methods have been underutilized even though they are usually the simplest ones. In this article, we constructed three different connectivity-based weights for each gene and then conducted feature selection on the resulting weighted gene expression profiles. Using both simulations and a real-world application, we have demonstrated that when the data-driven connectivity information constructed from the data of the specific disease under study is considered, the resulting weighted gene expression profiles slightly outperform the original expression profiles. In summary, a big challenge faced by the weighting method is how to estimate pathway knowledge-based weights more accurately and precisely. Only when this issue is successfully resolved will wide utilization of the weighting methods be possible. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
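
    The weighting idea amounts to rescaling each gene's expression values by a gene-level weight before feature selection. A minimal sketch, assuming a simple degree-based weight derived from a gene-gene connectivity matrix (the paper's actual weight constructions differ):

```python
import numpy as np

def connectivity_weights(adj):
    """Illustrative weights: each gene's degree in the connectivity
    (adjacency) matrix, normalised so the average weight is 1."""
    degree = adj.sum(axis=1).astype(float)
    return degree / degree.mean()

def weight_expression(X, adj):
    """Rescale each gene (column) of the expression matrix X by its
    weight; feature selection is then run on the weighted profiles."""
    return X * connectivity_weights(adj)
```

    Genes that are well connected in the pathway graph are thereby amplified relative to isolated genes before any ranking or selection takes place.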

  20. Defining terms for proactive management of resistance to Bt crops and pesticides.

    PubMed

    Tabashnik, Bruce E; Mota-Sanchez, David; Whalon, Mark E; Hollingworth, Robert M; Carrière, Yves

    2014-04-01

    Evolution of pest resistance to pesticides is an urgent global problem with resistance recorded in at least 954 species of pests, including 546 arthropods, 218 weeds, and 190 plant pathogens. To facilitate understanding and management of resistance, we provide definitions of 50 key terms related to resistance. We confirm the broad, long-standing definition of resistance, which is a genetically based decrease in susceptibility to a pesticide, and the definition of "field-evolved resistance," which is a genetically based decrease in susceptibility to a pesticide in a population caused by exposure to the pesticide in the field. The impact of field-evolved resistance on pest control can vary from none to severe. We define "practical resistance" as field-evolved resistance that reduces pesticide efficacy and has practical consequences for pest control. Recognizing that resistance is not "all or none" and that intermediate levels of resistance can have a continuum of effects on pest control, we describe five categories of field-evolved resistance and use them to classify 13 cases of field-evolved resistance to five Bacillus thuringiensis (Bt) toxins in transgenic corn and cotton based on monitoring data from five continents for nine major pest species. We urge researchers to publish and analyze their resistance monitoring data in conjunction with data on management practices to accelerate progress in determining which actions will be most useful in response to specific data on the magnitude, distribution, and impact of resistance.

  1. Possible mechanisms for four regimes associated with cold events over East Asia

    NASA Astrophysics Data System (ADS)

    Yang, Zifan; Huang, Wenyu; Wang, Bin; Chen, Ruyan; Wright, Jonathon S.; Ma, Wenqian

    2017-09-01

    Circulation patterns associated with cold events over East Asia during the winter months of 1948-2014 are classified into four regimes by applying a k-means clustering method based on the area-weighted pattern correlation. The earliest precursor signals for two regimes are anticyclonic anomalies, which evolve into Ural and central Siberian blocking-like circulation patterns. The earliest precursor signals for the other two regimes are cyclonic anomalies, both of which evolve to amplify the East Asian trough (EAT). Both the blocking-like circulation patterns and amplified EAT favor the initialization of cold events. On average, the blocking-related regimes tend to last longer. The lead time of the earliest precursor signal for the central Siberian blocking-related regime is only 4 days, while those for the other regimes range from 16 to 18 days. The North Atlantic Oscillation plays essential roles both in triggering the precursor for the Ural blocking-related regime and in amplifying the precursors for all regimes. All regimes preferentially occur during the positive phase of the Eurasian teleconnection pattern and the negative phase of the El Niño-Southern Oscillation. For three regimes, surface cooling is primarily due to reduced downward infrared radiation and enhanced cold advection. For the remaining regime, which is associated with the southernmost cooling center, sensible and latent heat release and horizontal cold advection dominate the East Asian cooling.
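
    The clustering step can be sketched as k-means under a correlation-based distance. The toy version below uses plain Pearson pattern correlation on flattened anomaly maps and a deterministic initialization; the study's area (latitude) weighting is omitted for brevity.

```python
import numpy as np

def corr_dist(a, b):
    """1 minus the Pearson pattern correlation of two flattened maps."""
    a, b = a - a.mean(), b - b.mean()
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def kmeans_corr(X, k, n_iter=20):
    """Lloyd-style k-means under the correlation distance,
    initialised deterministically from the first k samples."""
    centers = X[:k].copy()
    for _ in range(n_iter):
        labels = np.array([min(range(k), key=lambda j: corr_dist(x, centers[j]))
                           for x in X])
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels
```

    Because the distance ignores amplitude and offset, two maps with the same spatial pattern but different magnitudes fall into the same regime, which is the desired behavior for circulation classification.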

  2. Evaluation of the ACR and SLICC classification criteria in juvenile-onset systemic lupus erythematosus: a longitudinal analysis.

    PubMed

    Lythgoe, H; Morgan, T; Heaf, E; Lloyd, O; Al-Abadi, E; Armon, K; Bailey, K; Davidson, J; Friswell, M; Gardner-Medwin, J; Haslam, K; Ioannou, Y; Leahy, A; Leone, V; Pilkington, C; Rangaraj, S; Riley, P; Tizard, E J; Wilkinson, N; Beresford, M W

    2017-10-01

    Objectives The Systemic Lupus International Collaborating Clinics (SLICC) group proposed revised classification criteria for systemic lupus erythematosus (SLICC-2012 criteria). This study aimed to compare these criteria with the well-established American College of Rheumatology classification criteria (ACR-1997 criteria) in a national cohort of juvenile-onset systemic lupus erythematosus (JSLE) patients and evaluate how patients' classification criteria evolved over time. Methods Data from patients in the UK JSLE Cohort Study with a senior clinician diagnosis of probable evolving, or definite JSLE, were analyzed. Patients were assessed using both classification criteria within 1 year of diagnosis and at latest follow up (following a minimum 12-month follow-up period). Results A total of 226 patients were included. The SLICC-2012 was more sensitive than ACR-1997 at diagnosis (92.9% versus 84.1% p < 0.001) and after follow up (100% versus 92.0% p < 0.001). Most patients meeting the SLICC-2012 criteria and not the ACR-1997 met more than one additional criterion on the SLICC-2012. Conclusions The SLICC-2012 was better able to classify patients with JSLE than the ACR-1997 and did so at an earlier stage in their disease course. SLICC-2012 should be considered for classification of JSLE patients in observational studies and clinical trial eligibility.

  3. Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences.

    PubMed

    Ali, Safdar; Majid, Abdul

    2015-04-01

    The diagnosis of human breast cancer is an intricate process, and specific indicators may produce negative results. To avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for the prediction of breast cancer. To this end, we developed a novel classifier-stacking-based evolutionary ensemble system, "Can-Evo-Ens", for predicting amino acid sequences associated with breast cancer. In this paper, we first selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. To exploit the decision spaces, the preliminary predictions of the base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system achieved the highest Area Under the Curve (AUC) of the ROC curve, 99.95%, for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and the conventional ensemble approaches AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system will have a major impact on the fields of biomedicine, genomics, proteomics, bioinformatics, and drug development. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Current use was established and Cochrane guidance on selection of social theories for systematic reviews of complex interventions was developed.

    PubMed

    Noyes, Jane; Hendry, Maggie; Booth, Andrew; Chandler, Jackie; Lewin, Simon; Glenton, Claire; Garside, Ruth

    2016-07-01

    To identify examples of how social theories are used in systematic reviews of complex interventions to inform production of Cochrane guidance. Secondary analysis of published/unpublished examples of theories of social phenomena for use in reviews of complex interventions identified through scoping searches, engagement with key authors and methodologists supplemented by snowballing and reference searching. Theories were classified (low-level, mid-range, grand). Over 100 theories were identified with evidence of proliferation over the last 5 years. New low-level theories (tools, taxonomies, etc) have been developed for classifying and reporting complex interventions. Numerous mid-range theories are used; one example demonstrated how control theory had changed the review's findings. Review-specific logic models are increasingly used, but these can be challenging to develop. New low-level and mid-range psychological theories of behavior change are evolving. No reviews using grand theory (e.g., feminist theory) were identified. We produced a searchable Wiki, Mendeley Inventory, and Cochrane guidance. Use of low-level theory is common and evolving; incorporation of mid-range theory is still the exception rather than the norm. Methodological work is needed to evaluate the contribution of theory. Choice of theory reflects personal preference; application of theory is a skilled endeavor. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.

  5. SMARTbot: A Behavioral Analysis Framework Augmented with Machine Learning to Identify Mobile Botnet Applications

    PubMed Central

    Karim, Ahmad; Salleh, Rosli; Khan, Muhammad Khurram

    2016-01-01

    The botnet phenomenon is evolving in smartphones with the proliferation of mobile phone technologies, after leaving a substantial impact on personal computers. A botnet refers to a network of computers, laptops, mobile devices or tablets that is remotely controlled by cybercriminals to initiate various distributed coordinated attacks, including spam emails, ad-click fraud, Bitcoin mining, Distributed Denial of Service (DDoS), dissemination of other malware, and much more. Like traditional PC-based botnets, mobile botnets have the same operational impact, except that the target audience is specific to smartphone users. Therefore, it is important to uncover this security issue prior to its widespread adoption. We propose SMARTbot, a novel dynamic analysis framework augmented with machine learning techniques to automatically detect botnet binaries from a malicious corpus. SMARTbot is a component-based, off-device behavioral analysis framework that can generate a mobile botnet learning model by inducing Artificial Neural Networks’ back-propagation method. Moreover, this framework can detect mobile botnet binaries with remarkable accuracy even in the case of obfuscated program code. The results conclude that a classifier model based on simple logistic regression outperforms other machine learning classifiers for botnet app detection, achieving 99.49% accuracy. Further, from manual inspection of the botnet dataset we have extracted interesting trends in those applications. As an outcome of this research, a mobile botnet dataset is devised, which will become a benchmark for future studies. PMID:26978523

  6. A catalogue of AKARI FIS BSC extragalactic objects

    NASA Astrophysics Data System (ADS)

    Marton, Gabor; Toth, L. Viktor; Gyorgy Balazs, Lajos

    2015-08-01

    We combined photometric data of about 70 thousand point sources from the AKARI Far-Infrared Surveyor Bright Source Catalogue with AllWISE catalogue data to identify galaxies. We used Quadratic Discriminant Analysis (QDA) to classify our sources. The classification was based on a 6D parameter space that contained the AKARI [F65/F90], [F90/F140], [F140/F160] and WISE W1-W2 colours along with WISE W1 magnitudes and AKARI [F140] flux values. Sources were classified into three main object types: YSO candidates, evolved stars, and galaxies. The training samples were SIMBAD entries of the input point sources wherever an associated SIMBAD object was found within a 30 arcsecond search radius. The QDA yielded more than 5000 AKARI galaxy candidate sources. The selection was tested by cross-correlating our AKARI extragalactic catalogue with the Revised IRAS-FSC Redshift Catalogue (RIFSCz); a very good match was found. A further classification attempt was also made to differentiate between extragalactic subtypes using Support Vector Machines (SVMs). The results of the various methods showed that we can confidently separate cirrus-dominated objects (type 1 of RIFSCz). Some of our “galaxy candidate” sources are associated with 2MASS extended objects and are listed in the NASA Extragalactic Database, so far without clear proof of their extragalactic nature. Examples will be presented in our poster. Finally, other AKARI extragalactic catalogues will also be compared to our statistical selection.
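
    Quadratic Discriminant Analysis of this kind fits one Gaussian per class and assigns each source to the class with the highest log-posterior. A minimal two-class, two-feature sketch (the catalogue work itself used a 6D colour/magnitude/flux space):

```python
import numpy as np

def fit_qda(X, y):
    """Per-class Gaussian parameters: mean vector, covariance matrix, prior."""
    return {c: (X[y == c].mean(axis=0),
                np.cov(X[y == c], rowvar=False),
                np.mean(y == c))
            for c in np.unique(y)}

def predict_qda(params, x):
    """Pick the class with the largest Gaussian log-posterior; because each
    class keeps its own covariance, the decision boundary is quadratic."""
    def score(mu, cov, prior):
        diff = x - mu
        return (np.log(prior) - 0.5 * np.log(np.linalg.det(cov))
                - 0.5 * diff @ np.linalg.solve(cov, diff))
    return max(params, key=lambda c: score(*params[c]))
```

    Training on SIMBAD-matched sources and predicting on the remainder, as the abstract describes, corresponds to calling fit_qda on the labelled subset and predict_qda on every unlabelled source.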

  8. 3D level set methods for evolving fronts on tetrahedral meshes with adaptive mesh refinement

    DOE PAGES

    Morgan, Nathaniel Ray; Waltz, Jacob I.

    2017-03-02

    The level set method is commonly used to model dynamically evolving fronts and interfaces. In this work, we present new methods for evolving fronts with a specified velocity field or in the surface normal direction on 3D unstructured tetrahedral meshes with adaptive mesh refinement (AMR). The level set field is located at the nodes of the tetrahedral cells and is evolved using new upwind discretizations of Hamilton–Jacobi equations combined with a Runge–Kutta method for temporal integration. The level set field is periodically reinitialized to a signed distance function using an iterative approach with a new upwind gradient. We discuss the details of these level set and reinitialization methods. Results from a range of numerical test problems are presented.
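
    The core of such a scheme, reduced to one dimension, is solving phi_t + V |phi_x| = 0 with an upwind (Godunov) gradient so that information flows from the correct side of the front. A minimal 1D sketch on a uniform grid (the paper works on 3D unstructured tetrahedral meshes with AMR, which this does not attempt to reproduce):

```python
import numpy as np

def evolve_normal(phi, dx, speed, dt, steps):
    """Advance phi_t + speed * |phi_x| = 0 with a first-order Godunov
    upwind scheme; the zero level set moves at `speed` along its normal."""
    phi = phi.copy()
    for _ in range(steps):
        dminus = np.diff(phi, prepend=phi[0]) / dx  # backward differences
        dplus = np.diff(phi, append=phi[-1]) / dx   # forward differences
        if speed > 0:  # Godunov selection for an outward-moving front
            grad = np.sqrt(np.maximum(np.maximum(dminus, 0.0) ** 2,
                                      np.minimum(dplus, 0.0) ** 2))
        else:
            grad = np.sqrt(np.maximum(np.minimum(dminus, 0.0) ** 2,
                                      np.maximum(dplus, 0.0) ** 2))
        phi -= dt * speed * grad
    return phi
```

    Starting from a signed distance function phi = x - 0.3 with speed 1, the zero crossing advances at unit speed, so after time 0.2 the front sits near x = 0.5.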

  9. Improved adaptive splitting and selection: the hybrid training method of a classifier based on a feature space partitioning.

    PubMed

    Jackowski, Konrad; Krawczyk, Bartosz; Woźniak, Michał

    2014-05-01

    Currently, methods of combined classification are the focus of intense research. A properly designed group of combined classifiers exploiting knowledge gathered in a pool of elementary classifiers can successfully outperform a single classifier. There are two essential issues to consider when creating combined classifiers: how to establish the most comprehensive pool and how to design a fusion model that allows for taking full advantage of the collected knowledge. In this work, we address these issues and propose AdaSS+, a training algorithm dedicated to compound classifier systems that effectively exploits the local specialization of the elementary classifiers. The training procedure consists of two phases. The first phase detects the classifier competencies and adjusts the respective fusion parameters. The second phase boosts classification accuracy by elevating the degree of local specialization. The quality of the proposed algorithm is evaluated on the basis of a wide range of computer experiments, which show that AdaSS+ can outperform the original method and several reference classifiers.

  10. Evaluation of AMOEBA: a spectral-spatial classification method

    USGS Publications Warehouse

    Jenson, Susan K.; Loveland, Thomas R.; Bryant, J.

    1982-01-01

    Multispectral remotely sensed images have been treated as arbitrary multivariate spectral data for purposes of clustering and classifying. However, the spatial properties of image data can also be exploited. AMOEBA is a clustering and classification method that is based on a spatially derived model for image data. In an evaluation test, Landsat data were classified with both AMOEBA and a widely used spectral classifier. The test showed that irrigated crop types can be classified as accurately with the AMOEBA method as with the generally used spectral method ISOCLS; the AMOEBA method, however, requires less computer time.

  11. A fuzzy integral method based on the ensemble of neural networks to analyze fMRI data for cognitive state classification across multiple subjects.

    PubMed

    Cacha, L A; Parida, S; Dehuri, S; Cho, S-B; Poznanski, R R

    2016-12-01

    The huge number of voxels in fMRI over time poses a major challenge for effective analysis. Fast, accurate, and reliable classifiers are required for estimating the decoding accuracy of brain activities. Although machine-learning classifiers seem promising, individual classifiers have their own limitations. To address this limitation, the present paper proposes a method based on an ensemble of neural networks to analyze fMRI data for cognitive state classification across multiple subjects. The fuzzy integral (FI) approach is employed as an efficient tool for combining the different classifiers. The FI approach led to the development of a classifier ensemble technique that performs better than any single classifier by reducing misclassification, bias, and variance. The proposed method successfully classified the different cognitive states for multiple subjects with high classification accuracy. Comparison of the performance improvement of the ensemble neural networks method against that of an individual neural network strongly points toward the usefulness of the proposed method.
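
    A fuzzy-integral combiner aggregates per-classifier confidences with a fuzzy measure that can reward coalitions of classifiers non-additively. Below is a generic discrete Choquet integral, one common choice of fuzzy integral; the measure used here is a made-up illustration, not the paper's:

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of per-classifier confidences `scores`
    with respect to `measure`: a dict mapping frozensets of classifier
    indices to values in [0, 1], with measure[all indices] == 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total, prev = 0.0, 0.0
    for rank, i in enumerate(order):
        coalition = frozenset(order[rank:])  # classifiers scoring >= scores[i]
        total += (scores[i] - prev) * measure[coalition]
        prev = scores[i]
    return total
```

    To classify, one computes the integral of each class's confidences across the ensemble and picks the class with the largest value; with an additive measure the integral reduces to a weighted mean.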

  12. Selecting a Classification Ensemble and Detecting Process Drift in an Evolving Data Stream

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heredia-Langner, Alejandro; Rodriguez, Luke R.; Lin, Andy

    2015-09-30

    We characterize the commercial behavior of a group of companies in a common line of business using a small ensemble of classifiers on a stream of records containing commercial activity information. This approach is able to effectively find a subset of classifiers that can be used to predict company labels with reasonable accuracy. Performance of the ensemble, its error rate under stable conditions, can be characterized using an exponentially weighted moving average (EWMA) statistic. The behavior of the EWMA statistic can be used to monitor a record stream from the commercial network and determine when significant changes have occurred. Results indicate that larger classification ensembles may not necessarily be optimal, pointing to the need to search the combinatorial classifier space in a systematic way. Results also show that current and past performance of an ensemble can be used to detect when statistically significant changes in the activity of the network have occurred. The dataset used in this work contains tens of thousands of high-level commercial activity records with continuous and categorical variables and hundreds of labels, making classification challenging.
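
    The EWMA monitoring described here can be sketched as a control chart on the stream of 0/1 misclassification outcomes: the statistic tracks a smoothed error rate, and an alarm is raised when it leaves the L-sigma limits. The constants below (lambda, L, the in-control rate) are illustrative, not the study's:

```python
import math

def ewma_monitor(errors, lam=0.2, L=3.0, p0=0.1):
    """EWMA chart over a 0/1 error stream with in-control error rate p0.
    Returns the 1-based index of the first out-of-control point, else None."""
    z = p0
    sigma = math.sqrt(p0 * (1 - p0))
    var_factor = lam / (2 - lam)
    for t, e in enumerate(errors, start=1):
        z = lam * e + (1 - lam) * z
        # exact (time-varying) control limit for the EWMA statistic
        limit = L * sigma * math.sqrt(var_factor * (1 - (1 - lam) ** (2 * t)))
        if abs(z - p0) > limit:
            return t
    return None
```

    A stream whose long-run error rate matches p0 stays inside the limits, while a sustained jump in the error rate trips the chart within a few observations, signalling process drift.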

  13. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    PubMed

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types, but owing to the large number of uncharacterized protein sequences in databases, such methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced and large datasets are often handled well by decision tree classifiers. Because the datasets used here are imbalanced, the performance of various decision tree classifiers, such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree and REP (Reduced Error Pruning) tree, and of ensemble methods, such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time, with a good accuracy of 96.35%. Another finding is that the RUS boost decision tree classifier is able to classify one or two samples in classes with very few samples, while the other classifiers, such as DT, Adaboost, Rotation forest and Random forest, are not sensitive to classes with fewer samples. The performance of the decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.
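
    The random undersampling step at the heart of RUSBoost can be sketched on its own: before each boosting round, the larger classes are randomly thinned to the size of the smallest class. A minimal, generic version (the boosting loop around it is omitted):

```python
import random
from collections import Counter

def random_undersample(X, y, seed=0):
    """Randomly drop samples so every class is reduced to the size of
    the smallest class, yielding a balanced subset."""
    rng = random.Random(seed)
    target = min(Counter(y).values())
    kept = []
    for c in set(y):
        idx = [i for i, label in enumerate(y) if label == c]
        kept.extend(rng.sample(idx, target))
    kept.sort()
    return [X[i] for i in kept], [y[i] for i in kept]
```

    Balancing the training set this way is what lets a booster pay attention to classes with only a handful of samples, the behavior the abstract reports for RUS boost.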

  14. Evolution of EF-hand calcium-modulated proteins. II. Domains of several subfamilies have diverse evolutionary histories

    NASA Technical Reports Server (NTRS)

    Nakayama, S.; Moncrief, N. D.; Kretsinger, R. H.

    1992-01-01

    In the first report in this series we described the relationships and evolution of 152 individual proteins of the EF-hand subfamilies. Here we add 66 additional proteins and define eight (CDC, TPNV, CLNB, LPS, DGK, 1F8, VIS, TCBP) new subfamilies and seven (CAL, SQUD, CDPK, EFH5, TPP, LAV, CRGP) new unique proteins, which we assume represent new subfamilies. The main focus of this study is the classification of individual EF-hand domains. Five subfamilies--calmodulin, troponin C, essential light chain, regulatory light chain, CDC31/caltractin--and three uniques--call, squidulin, and calcium-dependent protein kinase--are congruent in that all evolved from a common four-domain precursor. In contrast, calpain and sarcoplasmic calcium-binding protein (SARC) each evolved from its own one-domain precursor. The remaining 19 subfamilies and uniques appear to have evolved by translocation and splicing of genes encoding the EF-hand domains that were precursors to the congruent eight and to calpain and to SARC. The rates of evolution of the EF-hand domains are slower following formation of the subfamilies and establishment of their functions. Subfamilies are not readily classified by patterns of calcium coordination, interdomain linker stability, and glycine and proline distribution. There are many homoplasies, indicating that similar variants of the EF-hand evolved by independent pathways.

  15. Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies

    PubMed Central

    Theis, Fabian J.

    2017-01-01

    Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on nonstratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different behaviors between the random forest and other classifiers. In conclusion, we provide guidance for choosing correction methods when training classifiers on biased samples. For random forests, our method outperforms state-of-the-art procedures if distribution assumptions are roughly fulfilled. We provide our implementation in the R package sambia. PMID:29312464
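The inverse-probability idea behind the proposed corrections can be illustrated as follows: records from the stratified sample are redrawn with weights proportional to the inverse of their inclusion probabilities, so the resampled set approximates the unstratified population. This is a deliberately simplified sketch (no parametric noise model, no bagging loop), not the sambia implementation.

```python
# Hedged sketch of inverse-probability resampling for a two-phase
# case-control design: each record's draw weight is 1 / P(record was sampled).
import random

def ip_oversample(records, incl_prob, n_out, seed=0):
    """Draw n_out records with replacement, weighted by inverse inclusion probability.

    records: observations from the biased (stratified) sample
    incl_prob: parallel list of each record's probability of having been sampled
    """
    rng = random.Random(seed)
    weights = [1.0 / p for p in incl_prob]
    return rng.choices(records, weights=weights, k=n_out)

# Cases were fully sampled (p = 1.0), controls at p = 0.1, so each control
# stands in for ~10 population members and gets ~10x the draw weight:
records = ["case"] * 50 + ["control"] * 50
probs = [1.0] * 50 + [0.1] * 50
resampled = ip_oversample(records, probs, n_out=10000)
```

The resampled control fraction should approach the population value of 500/550 ≈ 0.91, rather than the 0.5 seen in the biased sample.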

  16. Method of generating features optimal to a dataset and classifier

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bruillard, Paul J.; Gosink, Luke J.; Jarman, Kenneth D.

    A method of generating features optimal to a particular dataset and classifier is disclosed. A dataset of messages is inputted and a classifier is selected. An algebra of features is encoded. Computable features that are capable of describing the dataset from the algebra of features are selected. Irredundant features that are optimal for the classifier and the dataset are selected.

  17. An Automatic Diagnosis Method of Facial Acne Vulgaris Based on Convolutional Neural Network.

    PubMed

    Shen, Xiaolei; Zhang, Jiachi; Yan, Chenjun; Zhou, Hong

    2018-04-11

    In this paper, we present a new automatic diagnosis method for facial acne vulgaris based on convolutional neural networks (CNNs), designed to overcome a shortcoming of previous methods: their inability to classify enough types of acne vulgaris. The core of our method is to extract image features with CNNs and perform classification with dedicated classifiers. A binary classifier for skin and non-skin is used to detect skin areas, and a seven-class classifier handles the classification of facial acne vulgaris and healthy skin. In the experiments, we compare the effectiveness of our CNN with the VGG16 neural network pre-trained on the ImageNet data set. We use a ROC curve to evaluate the performance of the binary classifier and a normalized confusion matrix to evaluate the performance of the seven-class classifier. The results of our experiments show that the pre-trained VGG16 neural network is effective in extracting features from facial acne vulgaris images, and that these features are very useful for the follow-up classifiers. Finally, we apply both classifiers, based on the pre-trained VGG16 neural network, to assist doctors in facial acne vulgaris diagnosis.

  18. The evolving understanding of the construct of intellectual disability.

    PubMed

    Schalock, Robert L

    2011-12-01

    This article addresses two major areas concerned with the evolving understanding of the construct of intellectual disability. The first part of the article discusses current answers to five critical questions that have revolved around the general question, "What is Intellectual Disability?" These five are what to call the phenomenon, how to explain the phenomenon, how to define the phenomenon and determine who is a member of the class, how to classify persons so defined and identified, and how to establish public policy regarding such persons. The second part of the article discusses four critical issues that will impact both our future understanding of the construct and the approach taken to persons with intellectual disability. These four critical issues relate to the conceptualisation and measurement of intellectual functioning, the constitutive definition of intellectual disability, the alignment of clinical functions related to diagnosis, classification, and planning supports, and how the field resolves a number of emerging epistemological issues.

  19. Estimating structure quality trends in the Protein Data Bank by equivalent resolution.

    PubMed

    Bagaria, Anurag; Jaravine, Victor; Güntert, Peter

    2013-10-01

    The quality of protein structures obtained by different experimental and ab-initio calculation methods varies considerably. The methods have been evolving over time by improving both experimental designs and computational techniques, and since the primary aim of these developments is the procurement of reliable and high-quality data, better techniques resulted on average in an evolution toward higher quality structures in the Protein Data Bank (PDB). Each method leaves a specific quantitative and qualitative "trace" in the PDB entry. Certain information relevant to one method (e.g. dynamics for NMR) may be lacking for another method. Furthermore, some standard measures of quality for one method cannot be calculated for other experimental methods, e.g. crystal resolution or NMR bundle RMSD. Consequently, structures are classified in the PDB by the method used. Here we introduce a method to estimate a measure of equivalent X-ray resolution (e-resolution), expressed in units of Å, to assess the quality of any type of monomeric, single-chain protein structure, irrespective of the experimental structure determination method. We showed and compared the trends in the quality of structures in the Protein Data Bank over the last two decades for five different experimental techniques, excluding theoretical structure predictions. We observed that as new methods are introduced, they undergo a rapid method development evolution: within several years the e-resolution score becomes similar for structures obtained from the five methods and they improve from initially poor performance to acceptable quality, comparable with previously established methods, the performance of which is essentially stable. Copyright © 2013 Elsevier Ltd. All rights reserved.

  20. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA

    PubMed Central

    Ma, Xiaoqi

    2015-01-01

    A novel method is proposed to establish a pancreatic cancer classifier. First, the concepts of quantum encoding and the fruit fly optimization algorithm (FOA) are introduced. FOA is then improved by quantum coding and quantum operations, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of a support vector machine (SVM), and the classifier is established from the optimized SVM. In order to verify the effectiveness of the proposed method, SVM and other classification methods were chosen for comparison. The experimental results show that the proposed method improves classifier performance and costs less time. PMID:26543867
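For orientation, a minimal version of the basic (non-quantum) FOA loop is sketched below: random flights around the swarm location, a smell concentration derived from the distance to the origin, and relocation of the swarm to the best-smelling fly. The toy fitness function stands in for the SVM cross-validation objective; the paper's quantum coding and its new smell-concentration function are not reproduced.

```python
# Hedged, minimal sketch of the basic fruit fly optimization algorithm (FOA).
# All parameters (iters, flies, step) are illustrative.
import math
import random

def foa_minimize(f, iters=200, flies=20, step=1.0, seed=0):
    rng = random.Random(seed)
    x_axis, y_axis = rng.uniform(0, 10), rng.uniform(0, 10)  # swarm location
    best_s, best_val = None, float("inf")
    for _ in range(iters):
        for _ in range(flies):
            x = x_axis + rng.uniform(-step, step)  # random flight
            y = y_axis + rng.uniform(-step, step)
            dist = math.hypot(x, y) or 1e-12
            s = 1.0 / dist                         # smell concentration
            val = f(s)
            if val < best_val:
                best_val, best_s = val, s
                x_axis, y_axis = x, y              # swarm follows the best smell
    return best_s, best_val

# Toy fitness with its minimum at s = 0.5 (i.e. distance 2 from the origin):
s, val = foa_minimize(lambda s: (s - 0.5) ** 2)
```

In the real method, `f` would be the SVM's cross-validated error for the hyperparameters decoded from the smell concentration.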

  1. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

    PubMed

    Bokulich, Nicholas A; Kaehler, Benjamin D; Rideout, Jai Ram; Dillon, Matthew; Bolyen, Evan; Knight, Rob; Huttley, Gavin A; Gregory Caporaso, J

    2018-05-17

    Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit (https://github.com/caporaso-lab/tax-credit-data). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
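To make the naive Bayes approach concrete, the sketch below classifies sequences from Laplace-smoothed k-mer counts, the general representation such classifiers use. The toy sequences, k=4, and the hand-rolled model are illustrative assumptions, not the scikit-learn pipeline shipped with q2-feature-classifier.

```python
# Hedged sketch: multinomial naive Bayes over k-mer counts for taxonomy
# assignment. Toy two-taxon reference; uniform class priors assumed.
import math
from collections import Counter

def kmers(seq, k=4):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def train(labeled_seqs, k=4):
    """Per-taxon k-mer counts plus the vocabulary, for Laplace-smoothed scoring."""
    counts, vocab = {}, set()
    for taxon, seq in labeled_seqs:
        ks = kmers(seq, k)
        counts.setdefault(taxon, Counter()).update(ks)
        vocab.update(ks)
    return counts, vocab, k

def classify(model, seq):
    counts, vocab, k = model
    best, best_lp = None, -math.inf
    for taxon, c in counts.items():
        total = sum(c.values())
        # log-likelihood with add-one smoothing over the shared vocabulary
        lp = sum(math.log((c[w] + 1) / (total + len(vocab)))
                 for w in kmers(seq, k))
        if lp > best_lp:
            best, best_lp = taxon, lp
    return best

model = train([("TaxonA", "ACGTACGTACGTACGT"),
               ("TaxonB", "TTGGCCAATTGGCCAA")])
```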

  2. Latent information in fluency lists predicts functional decline in persons at risk for Alzheimer disease.

    PubMed

    Clark, D G; Kapur, P; Geldmacher, D S; Brockington, J C; Harrell, L; DeRamus, T P; Blanton, P D; Lokken, K; Nicholas, A P; Marson, D C

    2014-06-01

    We constructed random forest classifiers employing either the traditional method of scoring semantic fluency word lists or new methods. These classifiers were then compared in terms of their ability to diagnose Alzheimer disease (AD) or to prognosticate among individuals along the continuum from cognitively normal (CN) through mild cognitive impairment (MCI) to AD. Semantic fluency lists from 44 cognitively normal elderly individuals, 80 MCI patients, and 41 AD patients were transcribed into electronic text files and scored by four methods: traditional raw scores, clustering and switching scores, "generalized" versions of clustering and switching, and a method based on independent components analysis (ICA). Random forest classifiers based on raw scores were compared to "augmented" classifiers that incorporated newer scoring methods. Outcome variables included AD diagnosis at baseline, MCI conversion, increase in Clinical Dementia Rating-Sum of Boxes (CDR-SOB) score, or decrease in Financial Capacity Instrument (FCI) score. Receiver operating characteristic (ROC) curves were constructed for each classifier and the area under the curve (AUC) was calculated. We compared AUC between raw and augmented classifiers using DeLong's test and assessed the validity and reliability of the augmented classifier. Augmented classifiers outperformed classifiers based on raw scores for the outcome measures AD diagnosis (AUC .97 vs. .95), MCI conversion (AUC .91 vs. .77), CDR-SOB increase (AUC .90 vs. .79), and FCI decrease (AUC .89 vs. .72). Measures of validity and stability over time support the use of the method. Latent information in semantic fluency word lists is useful for predicting cognitive and functional decline among elderly individuals at increased risk for developing AD. Modern machine learning methods may incorporate latent information to enhance the diagnostic value of semantic fluency raw scores. These methods could yield information valuable for patient care and clinical trial design with a relatively small investment of time and money. Published by Elsevier Ltd.
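The AUC comparisons reported above rest on the rank interpretation of the ROC area: the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. A minimal computation, with invented scores standing in for the fluency-list classifiers:

```python
# Hedged sketch: rank-based AUC (the Mann-Whitney U form of the ROC area),
# with ties counted as half-wins. Scores and labels are made up.

def auc(scores, labels):
    """AUC for binary labels (1 = positive)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

raw_scores = [0.2, 0.6, 0.5, 0.4, 0.3, 0.7]   # e.g. a raw-score classifier
aug_scores = [0.1, 0.3, 0.8, 0.9, 0.2, 0.95]  # e.g. an augmented classifier
labels     = [0,   0,   1,   1,   0,   1]
```

Here the "augmented" scores rank every positive above every negative (AUC 1.0), while the "raw" scores do not; DeLong's test, not shown, is what turns such an AUC gap into a significance statement.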

  3. Detection and Classification of Transformer Winding Mechanical Faults Using UWB Sensors and Bayesian Classifier

    NASA Astrophysics Data System (ADS)

    Alehosseini, Ali; Hejazi, Maryam A.; Mokhtari, Ghassem; Gharehpetian, Gevork B.; Mohammadi, Mohammad

    2015-06-01

    In this paper, the Bayesian classifier is used to detect and classify the radial deformation and axial displacement of transformer windings. The proposed method is tested on a model of transformer for different volumes of radial deformation and axial displacement. In this method, ultra-wideband (UWB) signal is sent to the simplified model of the transformer winding. The received signal from the winding model is recorded and used for training and testing of Bayesian classifier in different axial displacement and radial deformation states of the winding. It is shown that the proposed method has a good accuracy to detect and classify the axial displacement and radial deformation of the winding.

  4. Accurate Traffic Flow Prediction in Heterogeneous Vehicular Networks in an Intelligent Transport System Using a Supervised Non-Parametric Classifier.

    PubMed

    El-Sayed, Hesham; Sankar, Sharmi; Daraghmi, Yousef-Awwad; Tiwari, Prayag; Rattagan, Ekarat; Mohanty, Manoranjan; Puthal, Deepak; Prasad, Mukesh

    2018-05-24

    Heterogeneous vehicular networks (HETVNETs) evolve from vehicular ad hoc networks (VANETs), which allow vehicles to always be connected so as to obtain safety services within intelligent transportation systems (ITSs). The services and data provided by HETVNETs should be neither interrupted nor delayed. Therefore, Quality of Service (QoS) improvement of HETVNETs is one of the topics attracting the attention of researchers and the manufacturing community. Several methodologies and frameworks have been devised by researchers to address QoS-prediction service issues. In this paper, to improve QoS, we evaluate various traffic characteristics of HETVNETs and propose a new supervised learning model to capture knowledge on all possible traffic patterns. This model is a refinement of support vector machine (SVM) kernels with a radial basis function (RBF). The proposed model produces better results than SVMs, and outperforms other prediction methods used in a traffic context, as it has lower computational complexity and higher prediction accuracy.
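The RBF kernel at the core of the refined SVM model maps a pair of feature vectors to a similarity in (0, 1]. A minimal sketch with invented traffic-feature vectors and an illustrative gamma:

```python
# Hedged sketch of the RBF kernel used with SVMs: nearby points in feature
# space get a value near 1, distant points near 0. gamma is illustrative,
# not the tuned parameter from the paper.
import math

def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical (speed, occupancy) traffic features:
same    = rbf_kernel([60.0, 0.2], [60.0, 0.2])
nearby  = rbf_kernel([60.0, 0.2], [58.0, 0.25])
distant = rbf_kernel([60.0, 0.2], [30.0, 0.8])
```

In practice the raw features would be scaled before kernel evaluation, since the squared distance is dominated by whichever feature has the largest range.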

  5. Punctuated Copy Number Evolution and Clonal Stasis in Triple-Negative Breast Cancer

    PubMed Central

    Gao, Ruli; Davis, Alexander; McDonald, Thomas O.; Sei, Emi; Shi, Xiuqing; Wang, Yong; Tsai, Pei-Ching; Casasent, Anna; Waters, Jill; Zhang, Hong; Meric-Bernstam, Funda; Michor, Franziska; Navin, Nicholas E.

    2016-01-01

    Aneuploidy is a hallmark of breast cancer; however, our knowledge of how these complex genomic rearrangements evolve during tumorigenesis is limited. In this study we developed a highly multiplexed single-nucleus-sequencing method to investigate copy number evolution in triple-negative breast cancer patients. We sequenced 1000 single cells from 12 patients and identified 1–3 major clonal subpopulations in each tumor that shared a common evolutionary lineage. We also identified a minor subpopulation of non-clonal cells that were classified as: 1) metastable, 2) pseudo-diploid, or 3) chromazemic. Phylogenetic analysis and mathematical modeling suggest that these data are unlikely to be explained by the gradual accumulation of copy number events over time. In contrast, our data challenge the paradigm of gradual evolution, showing that the majority of copy number aberrations are acquired at the earliest stages of tumor evolution, in short punctuated bursts, followed by stable clonal expansions that form the tumor mass. PMID:27526321

  6. Progress in scaffold-free bioprinting for cardiovascular medicine.

    PubMed

    Moldovan, Nicanor I

    2018-06-01

    Biofabrication of tissue analogues is aspiring to become a disruptive technology capable to solve standing biomedical problems, from generation of improved tissue models for drug testing to alleviation of the shortage of organs for transplantation. Arguably, the most powerful tool of this revolution is bioprinting, understood as the assembling of cells with biomaterials in three-dimensional structures. It is less appreciated, however, that bioprinting is not a uniform methodology, but comprises a variety of approaches. These can be broadly classified in two categories, based on the use or not of supporting biomaterials (known as "scaffolds," usually printable hydrogels also called "bioinks"). Importantly, several limitations of scaffold-dependent bioprinting can be avoided by the "scaffold-free" methods. In this overview, we comparatively present these approaches and highlight the rapidly evolving scaffold-free bioprinting, as applied to cardiovascular tissue engineering. © 2018 The Author. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

  7. Current and evolving approaches for improving the oral permeability of BCS Class III or analogous molecules.

    PubMed

    Dave, Vivek S; Gupta, Deepak; Yu, Monica; Nguyen, Phuong; Varghese Gupta, Sheeba

    2017-02-01

    The Biopharmaceutics Classification System (BCS) classifies pharmaceutical compounds based on their aqueous solubility and intestinal permeability. The BCS Class III compounds are hydrophilic molecules (high aqueous solubility) with low permeability across the biological membranes. While these compounds are pharmacologically effective, poor absorption due to low permeability becomes the rate-limiting step in achieving adequate bioavailability. Several approaches have been explored and utilized for improving the permeability profiles of these compounds. The approaches include traditional methods such as prodrugs, permeation enhancers, ion-pairing, etc., as well as relatively modern approaches such as nanoencapsulation and nanosizing. The most recent approaches include a combination/hybridization of one or more traditional approaches to improve drug permeability. While some of these approaches have been extremely successful, i.e. drug products utilizing the approach have progressed through the USFDA approval for marketing; others require further investigation to be applicable. This article discusses the commonly studied approaches for improving the permeability of BCS Class III compounds.

  8. Evaluation of Burning Test Rate Method for Flammable Solids to Increase air-Cargo Safety.

    PubMed

    Lukežič, Marjan; Marinšek, Marjan; Faganeli, Jadran

    2010-03-01

    This paper deals with a standard classification procedure for readily combustible solids and their assignment to the relevant packing groups according to international air-cargo legislation and regulations. The current International Air Transport Association and United Nations Orange Book regulations were used on chemically similar substances: hexamethylenetetramine and Dancook ignition briquettes, which are both assigned into the same Packing Group III. To critically evaluate the degree of hazard both chemicals present, a standard burning test rate as well as thermogravimetry, differential scanning calorimetry and evolved gas analysis measurements were performed. It was shown that relatively small changes in the chemical composition of the material may have essential influence on the package group determination. Taking into account all the facts collected in the experimental work, it was concluded that ignition briquettes will undergo spontaneous combustion if exposed to elevated temperatures and, from this point of view, represent higher risk than hexamethylenetetramine during air transportation. Therefore, ignition briquettes should be classified into Packing Group II.

  9. Molecular activity prediction by means of supervised subspace projection based ensembles of classifiers.

    PubMed

    Cerruela García, G; García-Pedrajas, N; Luque Ruiz, I; Gómez-Nieto, M Á

    2018-03-01

    This paper proposes a method for molecular activity prediction in QSAR studies using ensembles of classifiers constructed by means of two supervised subspace projection methods, namely nonparametric discriminant analysis (NDA) and hybrid discriminant analysis (HDA). We studied the performance of the proposed ensembles compared to classical ensemble methods using four molecular datasets and eight different models for the representation of the molecular structure. Using several measures and statistical tests for classifier comparison, we observe that our proposal improves the classification results with respect to classical ensemble methods. Therefore, we show that ensembles constructed using supervised subspace projections offer an effective way of creating classifiers in cheminformatics.

  10. Evolvable rough-block-based neural network and its biomedical application to hypoglycemia detection system.

    PubMed

    San, Phyo Phyo; Ling, Sai Ho; Nuryani; Nguyen, Hung

    2014-08-01

    This paper focuses on the hybridization technology using rough sets concepts and neural computing for decision and classification purposes. Based on the rough set properties, the lower region and boundary region are defined to partition the input signal to a consistent (predictable) part and an inconsistent (random) part. In this way, the neural network is designed to deal only with the boundary region, which mainly consists of an inconsistent part of applied input signal causing inaccurate modeling of the data set. Owing to different characteristics of neural network (NN) applications, the same structure of conventional NN might not give the optimal solution. Based on the knowledge of application in this paper, a block-based neural network (BBNN) is selected as a suitable classifier due to its ability to evolve internal structures and adaptability in dynamic environments. This architecture will systematically incorporate the characteristics of application to the structure of hybrid rough-block-based neural network (R-BBNN). A global training algorithm, hybrid particle swarm optimization with wavelet mutation is introduced for parameter optimization of proposed R-BBNN. The performance of the proposed R-BBNN algorithm was evaluated by an application to the field of medical diagnosis using real hypoglycemia episodes in patients with Type 1 diabetes mellitus. The performance of the proposed hybrid system has been compared with some of the existing neural networks. The comparison results indicated that the proposed method has improved classification performance and results in early convergence of the network.

  11. An Incremental Type-2 Meta-Cognitive Extreme Learning Machine.

    PubMed

    Pratama, Mahardhika; Zhang, Guangquan; Er, Meng Joo; Anavatti, Sreenatha

    2017-02-01

    Existing extreme learning algorithms have not taken into account four issues: 1) complexity; 2) uncertainty; 3) concept drift; and 4) high dimensionality. In this paper, a novel incremental type-2 meta-cognitive extreme learning machine (ELM), called evolving type-2 ELM (eT2ELM), is proposed to cope with these four issues. The eT2ELM presents three main pillars of human meta-cognition: 1) what-to-learn; 2) how-to-learn; and 3) when-to-learn. The what-to-learn component selects important training samples for model updates by virtue of the online certainty-based active learning method, which renders eT2ELM a semi-supervised classifier. The how-to-learn element develops a synergy between extreme learning theory and the evolving concept, whereby the hidden nodes can be generated and pruned automatically from data streams with no tuning of hidden nodes. The when-to-learn constituent makes use of the standard sample reserved strategy. A generalized interval type-2 fuzzy neural network is also put forward as a cognitive component, in which a hidden node is built upon the interval type-2 multivariate Gaussian function while exploiting a subset of Chebyshev series in the output node. The efficacy of the proposed eT2ELM is numerically validated on 12 data streams containing various concept drifts. The numerical results are confirmed by thorough statistical tests, where the eT2ELM demonstrates the most encouraging numerical results in delivering reliable prediction, while sustaining low complexity.

  12. Medical students’ change in learning styles during the course of the undergraduate program: from ‘thinking and watching’ to ‘thinking and doing’

    PubMed Central

    Bitran, Marcela; Zúñiga, Denisse; Pedrals, Nuria; Padilla, Oslando; Mena, Beltrán

    2012-01-01

    Background Most students admitted to medical school are abstract-passive learners. However, as they progress through the program, active learning and concrete interpersonal interactions become crucial for the acquisition of professional competencies. The purpose of this study was to determine if and how medical students’ learning styles change during the course of their undergraduate program. Methods All students admitted to the Pontificia Universidad Católica de Chile (PUC) medical school between 2000 and 2011 (n = 1,290) took the Kolb’s Learning Style Inventory at school entrance. Two years later 627 students took it again, and in the seventh and last year of the program 104 students took it for a third time. The distribution of styles at years 1, 3 and 7, and the mobility of students between styles were analyzed with Bayesian models. Results Most freshmen (54%) were classified as assimilators (abstract-passive learners); convergers (abstract-active) followed with 26%, whereas divergers (concrete-passive) and accommodators (concrete-active) accounted for 11% and 9%, respectively. By year 3, the styles’ distribution remained unchanged but in year 7 convergers outnumbered assimilators (49% vs. 33%). In general, there were no gender-related differences. Discussion Medical students change their preferred way of learning: they evolve from an abstract-reflexive style to an abstract-active one. This change might represent an adaptation to the curriculum, which evolves from a lecture-based teacher-centered to a problem-based student–centered model. PMID:26451190

  13. Performance analysis of a Principal Component Analysis ensemble classifier for Emotiv headset P300 spellers.

    PubMed

    Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M

    2014-01-01

    The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.

  14. Counting motifs in dynamic networks.

    PubMed

    Mukherjee, Kingshuk; Hasan, Md Mahmudul; Boucher, Christina; Kahveci, Tamer

    2018-04-11

    A network motif is a sub-network that occurs frequently in a given network. Detection of such motifs is important since they uncover functions and local properties of the given biological network. Finding motifs is however a computationally challenging task as it requires solving the costly subgraph isomorphism problem. Moreover, the topology of biological networks changes over time. These changing networks are called dynamic biological networks. As the network evolves, the frequency of each motif in the network also changes. Computing the frequency of a given motif from scratch in a dynamic network as the network topology evolves is infeasible, particularly for large and fast evolving networks. In this article, we design and develop a scalable method for counting the number of motifs in a dynamic biological network. Our method incrementally updates the frequency of each motif as the underlying network's topology evolves. Our experiments demonstrate that our method can update the frequency of each motif orders of magnitude faster than counting the motif embeddings every time the network changes. If the network evolves more frequently, the margin with which our method outperforms the existing static methods increases. We evaluated our method extensively using synthetic and real datasets, and show that our method is highly accurate (≥96%) and that it can be scaled to large dense networks. The results on real data demonstrate the utility of our method in revealing interesting insights on the evolution of biological processes.
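The incremental principle can be shown with the simplest motif, the triangle: when an edge (u, v) is added or removed, the triangle count changes by exactly the number of common neighbors of u and v, so no recount from scratch is needed. This toy counter illustrates only that principle; the paper's method handles general motifs.

```python
# Hedged sketch: incremental triangle counting in a dynamic undirected graph.
# Each edge update costs one neighbor-set intersection instead of a full
# recount over the whole network.

class DynamicTriangleCounter:
    def __init__(self):
        self.adj = {}        # node -> set of neighbors
        self.triangles = 0

    def _nbrs(self, u):
        return self.adj.setdefault(u, set())

    def add_edge(self, u, v):
        if v in self._nbrs(u):
            return
        # every common neighbor of u and v closes one new triangle
        self.triangles += len(self._nbrs(u) & self._nbrs(v))
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        if v not in self._nbrs(u):
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # every remaining common neighbor had formed a triangle with (u, v)
        self.triangles -= len(self._nbrs(u) & self._nbrs(v))

g = DynamicTriangleCounter()
for e in [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]:
    g.add_edge(*e)   # builds triangles {1,2,3} and {2,3,4}
```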

  15. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection

    PubMed Central

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-01-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices. PMID:25177107
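A minimal sketch of SVC ranking as described above: score each feature by the predictive performance of a classifier trained on that feature alone, then rank features by that score. A best-threshold decision stump scored by accuracy stands in for the classifier, and the data are invented.

```python
# Hedged sketch of Single Variable Classifier (SVC) feature ranking using a
# one-feature threshold stump. Any single-feature classifier and any score
# (e.g. AUC) could be substituted.

def stump_accuracy(values, labels):
    """Best accuracy of a one-feature threshold classifier (either polarity)."""
    best = 0.0
    for t in set(values):
        for polarity in (True, False):
            pred = [(v >= t) == polarity for v in values]
            acc = sum(p == bool(l) for p, l in zip(pred, labels)) / len(labels)
            best = max(best, acc)
    return best

def svc_rank(X, labels):
    """X: list of feature columns. Returns (indices best-first, scores)."""
    scores = [stump_accuracy(col, labels) for col in X]
    order = sorted(range(len(X)), key=lambda i: -scores[i])
    return order, scores

labels = [0, 0, 0, 1, 1, 1]
X = [
    [1, 2, 3, 7, 8, 9],   # cleanly separable: strong single-feature signal
    [5, 1, 8, 2, 9, 3],   # mixed: weak single-feature signal
]
order, scores = svc_rank(X, labels)
```

The paper's bias question then amounts to asking how much `order` changes when the stump is swapped for a different base classifier.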

  16. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection.

    PubMed

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-11-01

    Feature rankings are often used for supervised dimension reduction, especially when the discriminating power of each feature is of interest, the dimensionality of the dataset is extremely high, or computational power is too limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from the capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such a feature ranking method. We study whether the classifiers influence the SVC rankings or whether the discriminative power of the features themselves has a dominant impact on the final rankings. We show that the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study whether heterogeneous classifier ensemble approaches provide more unbiased rankings and whether they improve final classification performance. Furthermore, we calculate the empirical prediction performance loss incurred by using the same classifier in SVC feature ranking and final classification rather than the optimal choices.
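
    The SVC ranking idea is simple enough to sketch: score each feature by the accuracy of a classifier trained on that feature alone, then sort features by that score. The one-feature "decision stump" below is a stand-in assumption; the study evaluates several real classifiers in this role.

```python
# Hedged sketch of Single Variable Classifier (SVC) ranking. Each feature is
# scored by the training accuracy of a trivial single-feature classifier
# (here a best-threshold decision stump, an assumption), and features are
# ranked by that score.

def stump_accuracy(values, labels):
    """Best training accuracy of a single-feature threshold classifier."""
    best = 0.0
    for t in values:
        for sign in (1, -1):                 # try both inequality directions
            pred = [1 if sign * v >= sign * t else 0 for v in values]
            acc = sum(p == y for p, y in zip(pred, labels)) / len(labels)
            best = max(best, acc)
    return best

def svc_rank(X, y):
    """Rank feature indices by single-variable classifier accuracy."""
    scores = [stump_accuracy([row[j] for row in X], y)
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j]), scores

# Feature 0 separates the classes perfectly; feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [0.8, 2.0]]
y = [0, 0, 1, 1]
ranking, scores = svc_rank(X, y)
print(ranking[0])  # -> 0 (the informative feature ranks first)
```

    The stability and bias questions in the abstract amount to asking how much this ranking changes when the stump is replaced by a different single-feature classifier.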

  17. Building Diversified Multiple Trees for classification in high dimensional noisy biomedical data.

    PubMed

    Li, Jiuyong; Liu, Lin; Liu, Jixue; Green, Ryan

    2017-12-01

    It is common for a trained classification model to be applied to operating data that deviates from the training data because of noise. This paper tests an ensemble method, Diversified Multiple Tree (DMT), on its capability to classify instances from one laboratory using a classifier built on instances from another laboratory. DMT is tested on three real-world biomedical data sets from different laboratories, in comparison with four benchmark ensemble methods: AdaBoost, Bagging, Random Forests, and Random Trees. Experiments have also been conducted to study the limitations of DMT and its possible variations. Experimental results show that DMT is significantly more accurate than the other benchmark ensemble classifiers at classifying new instances from a laboratory different from the one whose instances were used to build the classifier. This paper demonstrates that an ensemble classifier, DMT, is more robust in classifying noisy data than other widely used ensemble methods. DMT works on data sets that support multiple simple trees.

  18. Behavior analytic approaches to problem behavior in intellectual disabilities.

    PubMed

    Hagopian, Louis P; Gregory, Meagan K

    2016-03-01

    The purpose of the current review is to summarize recent behavior analytic research on problem behavior in individuals with intellectual disabilities. We have focused our review on studies published from 2013 to 2015, but also included earlier studies that were relevant. Behavior analytic research on problem behavior continues to focus on the use and refinement of functional behavioral assessment procedures and function-based interventions. During the review period, a number of studies reported on procedures aimed at making functional analysis procedures more time efficient. Behavioral interventions continue to evolve, and there were several larger scale clinical studies reporting on multiple individuals. There was increased attention on the part of behavioral researchers to develop statistical methods for analysis of within subject data and continued efforts to aggregate findings across studies through evaluative reviews and meta-analyses. Findings support continued utility of functional analysis for guiding individualized interventions and for classifying problem behavior. Modifications designed to make functional analysis more efficient relative to the standard method of functional analysis were reported; however, these require further validation. Larger scale studies on behavioral assessment and treatment procedures provided additional empirical support for effectiveness of these approaches and their sustainability outside controlled clinical settings.

  19. Systematic technology transfer from biology to engineering.

    PubMed

    Vincent, Julian F V; Mann, Darrell L

    2002-02-15

    Solutions to problems move only very slowly between different disciplines. Transfer can be greatly speeded up with suitable abstraction and classification of problems. Russian researchers working on the TRIZ (Teoriya Resheniya Izobretatelskikh Zadatch) method for inventive problem solving have identified systematic means of transferring knowledge between different scientific and engineering disciplines. With over 1500 person-years of effort behind it, TRIZ represents the biggest study of human creativity ever conducted, whose aim has been to establish a system into which all known solutions can be placed, classified in terms of function. At present, the functional classification structure covers nearly 3,000,000 of the world's successful patents and large proportions of the known physical, chemical and mathematical knowledge base. Additional tools are the identification of factors which prevent the attainment of new technology, leading directly to a system of inventive principles which will resolve the impasse, a series of evolutionary trends of development, and a system of methods for effecting change in a system (Su-fields). As yet, the database contains little biological knowledge despite early recognition by the instigator of TRIZ (Genrich Altshuller) that one day it should. This is illustrated by natural systems evolved for thermal stability and the maintenance of cleanliness.

  20. Waterpipe industry products and marketing strategies: analysis of an industry trade exhibition

    PubMed Central

    Jawad, Mohammed; Nakkash, Rima T; Hawkins, Ben; Akl, Elie A

    2016-01-01

    Introduction Understanding the product development and marketing strategies of transnational tobacco companies (TTCs) has been of vital importance in developing effective tobacco control policy. However, comparatively little is known of the waterpipe tobacco industry, which TTCs have recently entered. This study aimed to gain an understanding of waterpipe tobacco products and marketing strategies by visiting a waterpipe trade exhibition. Methods In April 2014 the first author attended an international waterpipe trade exhibition, recording descriptions of products and collecting all marketing items available. We described the purpose and function of all products, and performed a thematic analysis of messages in the marketing material. Results We classified the waterpipe products into seven categories and noted product variation within categories. Electronic waterpipe products (which mimic electronic cigarettes) rarely appeared in waterpipe tobacco marketing material, but were displayed just as widely. Claims of reduced harm, safety and quality were paramount in marketing materials, regardless of whether they promoted waterpipe tobacco, waterpipe tobacco-substitutes, electronic waterpipes or charcoal. Conclusions Waterpipe products are diverse in nature and are marketed as healthy and safe. Furthermore, the development of electronic waterpipe products appears to be closely connected with the electronic cigarette industry rather than with waterpipe tobacco manufacturers. Tobacco control policy must evolve to take account of the vast and expanding array of waterpipe products, and potentially also charcoal products developed for waterpipe smokers. We recommend tobacco-substitutes be classified as tobacco products. Continued surveillance of the waterpipe industry is warranted. PMID:26149455

  1. ITS2 data corroborate a monophyletic chlorophycean DO-group (Sphaeropleales)

    PubMed Central

    2008-01-01

    Background Within Chlorophyceae the ITS2 secondary structure shows an unbranched helix I, except for the 'Hydrodictyon' and the 'Scenedesmus' clade having a ramified first helix. The latter two are classified within the Sphaeropleales, characterised by directly opposed basal bodies in their flagellar apparatuses (DO-group). Previous studies could not resolve the taxonomic position of the 'Sphaeroplea' clade within the Chlorophyceae without ambiguity and two pivotal questions remain open: (1) Is the DO-group monophyletic and (2) is a branched helix I an apomorphic feature of the DO-group? In the present study we analysed the secondary structure of three newly obtained ITS2 sequences classified within the 'Sphaeroplea' clade and resolved sphaeroplealean relationships by applying different phylogenetic approaches based on a combined sequence-structure alignment. Results The newly obtained ITS2 sequences of Ankyra judayi, Atractomorpha porcata and Sphaeroplea annulina of the 'Sphaeroplea' clade do not show any branching in the secondary structure of their helix I. All applied phylogenetic methods highly support the 'Sphaeroplea' clade as a sister group to the 'core Sphaeropleales'. Thus, the DO-group is monophyletic. Furthermore, based on characteristics in the sequence-structure alignment one is able to distinguish distinct lineages within the green algae. Conclusion In green algae, a branched helix I in the secondary structure of the ITS2 evolves past the 'Sphaeroplea' clade. A branched helix I is an apomorph characteristic within the monophyletic DO-group. Our results corroborate the fundamental relevance of including the secondary structure in sequence analysis and phylogenetics. PMID:18655698

  2. Improving Classification Performance through an Advanced Ensemble Based Heterogeneous Extreme Learning Machines.

    PubMed

    Abuassba, Adnan O M; Zhang, Dezheng; Luo, Xiong; Shaheryar, Ahmad; Ali, Hazrat

    2017-01-01

    Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using a majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets.

  3. Improving Classification Performance through an Advanced Ensemble Based Heterogeneous Extreme Learning Machines

    PubMed Central

    Abuassba, Adnan O. M.; Ali, Hazrat

    2017-01-01

    Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using a majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets. PMID:28546808

  4. The evolution of resource adaptation: how generalist and specialist consumers evolve.

    PubMed

    Ma, Junling; Levin, Simon A

    2006-07-01

    Why and how specialist and generalist strategies evolve are important questions in evolutionary ecology. In this paper, with the method of adaptive dynamics and evolutionary branching, we identify conditions that select for specialist and generalist strategies. Generally, generalist strategies evolve if there is a switching benefit; specialists evolve if there is a switching cost. If the switching cost is large, specialists always evolve. If the switching cost is small, even though the consumer will first evolve toward a generalist strategy, it will eventually branch into two specialists.

  5. Comparison Analysis of Recognition Algorithms of Forest-Cover Objects on Hyperspectral Air-Borne and Space-Borne Images

    NASA Astrophysics Data System (ADS)

    Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.

    2017-12-01

    The basic model for the recognition of natural and anthropogenic objects using their spectral and textural features is described in the context of hyperspectral air-borne and space-borne imagery processing. The model is based on improvements of the Bayesian classifier, a computational procedure for statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on a basis of empirical orthogonal functions. Application examples of various modifications of the Bayesian classifier and of the Support Vector Machine method are shown. Examples are provided comparing these classifiers with a metrical classifier that operates by finding the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the "K-weighted neighbors" method, which is close to the nonparametric Bayesian classifier.

  6. Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.

    PubMed

    Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G

    2017-09-01

    To investigate whether the use of ensemble learning algorithms improves physical activity recognition accuracy compared to single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k-nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one-subject-out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition accuracy; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
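
    Of the three fusion rules, weighted majority voting is the easiest to sketch: each base classifier's vote is weighted and the class with the largest total weight wins. Weighting by a per-classifier F1 score is an illustrative assumption here, not necessarily the paper's exact choice.

```python
# Minimal sketch of weighted majority vote fusion (an illustration, not the
# authors' exact rule). Each classifier's vote for a class is weighted,
# e.g. by its cross-validated F1 score, and the heaviest class wins.

from collections import defaultdict

def weighted_majority_vote(votes, weights):
    """votes: predicted class labels, one per classifier.
    weights: matching list of per-classifier weights."""
    tally = defaultdict(float)
    for label, w in zip(votes, weights):
        tally[label] += w
    return max(tally, key=tally.get)

# Three of four classifiers say "walking", but the strongest one disagrees.
votes = ["walking", "walking", "running", "walking"]
weights = [0.2, 0.2, 0.9, 0.2]   # e.g. per-classifier F1 scores (made up)
print(weighted_majority_vote(votes, weights))  # -> "running"
```

    With equal weights this reduces to a plain majority vote, which is why the weighted variant can only help when the weights genuinely reflect classifier reliability.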

  7. Using multiple classifiers for predicting the risk of endovascular aortic aneurysm repair re-intervention through hybrid feature selection.

    PubMed

    Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter Je; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong

    2017-11-01

    Feature selection is essential in the medical domain; however, the process becomes complicated in the presence of censoring, the unique characteristic of survival analysis. Most survival feature selection methods are based on Cox's proportional hazards model, though machine learning classifiers are preferred. The latter are less employed in survival analysis because censoring prevents them from being applied directly to survival data. Among the few works that employed machine learning classifiers, the partial logistic artificial neural network with automatic relevance determination is a well-known method that deals with censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data. Other methods cannot deal with high censoring. Therefore, in this article, a new hybrid feature selection method is proposed that presents a solution to high-level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The new hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce features. Two endovascular aortic repair datasets containing 91% censored patients, collected from two centers, were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed that the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention, enabling doctors to select patients' future follow-up plans.

  8. Boomerang: A method for recursive reclassification.

    PubMed

    Devlin, Sean M; Ostrovnaya, Irina; Gönen, Mithat

    2016-09-01

    While there are many validated prognostic classifiers used in practice, often their accuracy is modest and heterogeneity in clinical outcomes exists in one or more risk subgroups. Newly available markers, such as genomic mutations, may be used to improve the accuracy of an existing classifier by reclassifying patients from a heterogeneous group into a higher or lower risk category. The statistical tools typically applied to develop the initial classifiers are not easily adapted toward this reclassification goal. In this article, we develop a new method designed to refine an existing prognostic classifier by incorporating new markers. The two-stage algorithm called Boomerang first searches for modifications of the existing classifier that increase the overall predictive accuracy and then merges to a prespecified number of risk groups. Resampling techniques are proposed to assess the improvement in predictive accuracy when an independent validation data set is not available. The performance of the algorithm is assessed under various simulation scenarios where the marker frequency, degree of censoring, and total sample size are varied. The results suggest that the method selects few false positive markers and is able to improve the predictive accuracy of the classifier in many settings. Lastly, the method is illustrated on an acute myeloid leukemia data set where a new refined classifier incorporates four new mutations into the existing three category classifier and is validated on an independent data set. © 2016, The International Biometric Society.

  9. Boomerang: A Method for Recursive Reclassification

    PubMed Central

    Devlin, Sean M.; Ostrovnaya, Irina; Gönen, Mithat

    2016-01-01

    Summary While there are many validated prognostic classifiers used in practice, often their accuracy is modest and heterogeneity in clinical outcomes exists in one or more risk subgroups. Newly available markers, such as genomic mutations, may be used to improve the accuracy of an existing classifier by reclassifying patients from a heterogeneous group into a higher or lower risk category. The statistical tools typically applied to develop the initial classifiers are not easily adapted towards this reclassification goal. In this paper, we develop a new method designed to refine an existing prognostic classifier by incorporating new markers. The two-stage algorithm called Boomerang first searches for modifications of the existing classifier that increase the overall predictive accuracy and then merges to a pre-specified number of risk groups. Resampling techniques are proposed to assess the improvement in predictive accuracy when an independent validation data set is not available. The performance of the algorithm is assessed under various simulation scenarios where the marker frequency, degree of censoring, and total sample size are varied. The results suggest that the method selects few false positive markers and is able to improve the predictive accuracy of the classifier in many settings. Lastly, the method is illustrated on an acute myeloid leukemia dataset where a new refined classifier incorporates four new mutations into the existing three category classifier and is validated on an independent dataset. PMID:26754051

  10. Currency crisis indication by using ensembles of support vector machine classifiers

    NASA Astrophysics Data System (ADS)

    Ramli, Nor Azuana; Ismail, Mohd Tahir; Wooi, Hooy Chee

    2014-07-01

    Many methods have been tested in the analysis of currency crises; however, not all of them provide accurate indications. This paper introduces an ensemble of Support Vector Machine classifiers, which has not previously been applied to currency crisis analysis, with the aim of increasing indication accuracy. The proposed ensemble classifiers' performance is measured using percentage accuracy, root mean squared error (RMSE), area under the Receiver Operating Characteristic (ROC) curve, and Type II error. The performance of the ensemble of Support Vector Machine classifiers is compared with that of a single Support Vector Machine classifier, and both are tested on a data set covering 27 countries with 12 macroeconomic indicators for each country. Our analyses show that the ensemble of Support Vector Machine classifiers outperforms the single Support Vector Machine classifier on the problem of indicating a currency crisis across a range of standard measures for comparing classifier performance.

  11. Consensus Classification Using Non-Optimized Classifiers.

    PubMed

    Brownfield, Brett; Lemos, Tony; Kalivas, John H

    2018-04-03

    Classifying samples into categories is a common problem in analytical chemistry and other fields. Classification is usually based on only one method, but numerous classifiers are available, some complex, such as neural networks, and others simple, such as k-nearest neighbors. Regardless, most classification schemes require optimization of one or more tuning parameters for the best classification accuracy, sensitivity, and specificity. A process not requiring exact selection of tuning parameter values would be useful. To improve classification, several ensemble approaches have been used in past work to combine classification results from multiple optimized single classifiers. The collection of classifications for a particular sample is then combined by a fusion process, such as majority vote, to form the final classification. Presented in this Article is a method to classify a sample by combining multiple classification methods without specifically optimizing each method, that is, the classification methods are not individually tuned. The approach is demonstrated on three analytical data sets. The first is a beer authentication set with samples measured on five instruments, allowing fusion of multiple instruments in three ways. The second data set is composed of textile samples from three classes based on Raman spectra. This data set is used to demonstrate the ability to classify simultaneously with different data preprocessing strategies, thereby reducing the need to determine the ideal preprocessing method, a common prerequisite for accurate classification. The third data set contains three wine cultivars (three classes) measured on 13 unique chemical and physical variables. In all cases, fusion of nonoptimized classifiers improves classification. Also presented are atypical uses of Procrustes analysis and extended inverted signal correction (EISC) for distinguishing sample similarities to respective classes.

  12. Vascularized Composite Allografts: Procurement, Allocation, and Implementation.

    PubMed

    Rahmel, Axel

    Vascularized composite allotransplantation is a continuously evolving area of modern transplant medicine. Recently, vascularized composite allografts (VCAs) have been formally classified as 'organs'. In this review, key aspects of VCA procurement are discussed, with a special focus on the interaction with the procurement of classical solid organs. In addition, options for a matching and allocation system that ensures VCA donor organs are allocated to the best-suited recipients are examined. Finally, the different steps needed to promote VCA transplantation in society in general and in the medical community in particular are highlighted.

  13. Brain tissue segmentation in 4D CT using voxel classification

    NASA Astrophysics Data System (ADS)

    van den Boom, R.; Oei, M. T. H.; Lafebre, S.; Oostveen, L. J.; Meijer, F. J. A.; Steens, S. C. A.; Prokop, M.; van Ginneken, B.; Manniesing, R.

    2012-02-01

    A method is proposed to segment anatomical regions of the brain from 4D computed tomography (CT) patient data. The method consists of a three-step voxel classification scheme, each step focusing on structures that are increasingly difficult to segment. The first step classifies air and bone, the second step classifies vessels, and the third step classifies white matter, gray matter, and cerebrospinal fluid. As features, the time-averaged intensity value and the temporal intensity change value were used. In each step, a k-Nearest-Neighbor classifier was used to classify the voxels. Training data was obtained by placing regions of interest in reconstructed 3D image data. The method has been applied to ten 4D CT cerebral patient data sets. A leave-one-out experiment showed consistent and accurate segmentation results.
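
    The per-voxel k-NN step can be sketched with the two features named in the abstract, time-averaged intensity and temporal intensity change; the class labels and feature values below are invented for illustration, not taken from the study.

```python
# Toy sketch of the per-voxel k-NN classification step. Each voxel is
# described by two features (time-averaged intensity, temporal intensity
# change) and assigned the majority class of its k nearest labelled
# training voxels. All numbers here are fabricated for illustration.

from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of ((avg_intensity, temporal_change), label) pairs."""
    nearest = sorted(train, key=lambda s: (s[0][0] - query[0]) ** 2
                                          + (s[0][1] - query[1]) ** 2)
    labels = [label for _, label in nearest[:k]]
    return Counter(labels).most_common(1)[0][0]

# Bone: very high intensity; vessels: strong temporal change from contrast;
# white matter: low intensity, little change (values are made up).
train = [((1000.0, 0.5), "bone"), ((980.0, 0.4), "bone"),
         ((40.0, 30.0), "vessel"), ((35.0, 28.0), "vessel"),
         ((30.0, 0.8), "white_matter"), ((32.0, 0.6), "white_matter")]
print(knn_classify(train, (37.0, 29.0)))  # -> "vessel"
```

    In practice the features would be normalized before computing distances; the sketch omits that step for brevity.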

  14. Computational structural mechanics methods research using an evolving framework

    NASA Technical Reports Server (NTRS)

    Knight, N. F., Jr.; Lotts, C. G.; Gillian, R. E.

    1990-01-01

    Advanced structural analysis and computational methods that exploit high-performance computers are being developed in a computational structural mechanics research activity sponsored by the NASA Langley Research Center. These new methods are developed in an evolving framework and applied to representative complex structural analysis problems from the aerospace industry. An overview of the methods development environment is presented, and methods research areas are described. Selected application studies are also summarized.

  15. The Dynamical Classification of Centaurs which Evolve into Comets

    NASA Astrophysics Data System (ADS)

    Wood, Jeremy R.; Horner, Jonathan; Hinse, Tobias; Marsden, Stephen; Swinburne University of Technology

    2016-10-01

    Centaurs are small Solar system bodies with semi-major axes between those of Jupiter and Neptune and perihelia beyond Jupiter. Centaurs can be further subclassified into two dynamical categories: random walk and resonance hopping. Random walk Centaurs have mean square semi-major axes ⟨a²⟩ which vary in time according to a generalized diffusion equation, ⟨a²⟩ ∝ t^(2H), where H is the Hurst exponent with 0 < H < 1 and t is time. The behavior of ⟨a²⟩ for resonance hopping Centaurs is not well described by generalized diffusion. The aim of this study is to determine which dynamical type of Centaur is most likely to evolve into each class of comet. 31,722 fictitious massless test particles were integrated for 3 Myr in the 6-body problem (Sun, Jovian planets, test particle). Initially each test particle was a member of one of four groups; within a group, the semi-major axes of all test particles were clustered within 0.27 au of a first-order interior mean motion resonance of Neptune. The resonances were centered at 18.94 au, 22.95 au, 24.82 au, and 28.37 au. If the perihelion of a test particle reached less than 4 au, the test particle was considered to be a comet and classified as either a random walk or a resonance hopping Centaur. The results showed that over 4,000 test particles evolved into comets within 3 Myr; 59% of these were random walk and 41% resonance hopping. The behavior of the semi-major axis in time was usually well described by generalized diffusion for random walk Centaurs (r_avg = 0.98) and poorly described for resonance hopping Centaurs (r_avg = 0.52). The average Hurst exponent was 0.48 for random walk Centaurs and 0.20 for resonance hopping Centaurs. Random walk Centaurs were more likely to evolve into short period comets, while resonance hopping Centaurs were more likely to evolve into long period comets. For each initial cluster, resonance hopping Centaurs took longer to evolve into comets than random walk Centaurs. Overall, the population of random walk Centaurs averaged 143 kyr to evolve into comets, and the population of resonance hopping Centaurs averaged 164 kyr.
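
    The Hurst exponent H in the relation ⟨a²⟩ ∝ t^(2H) can be estimated with an ordinary least-squares fit in log-log space, since log⟨a²⟩ = c + 2H log t. The sketch below runs on synthetic data and is an illustration of the fit, not the study's pipeline.

```python
# Illustrative sketch: estimating the Hurst exponent H from <a^2>(t) ~ t^(2H)
# by least-squares regression of log<a^2> against log t. The synthetic data
# below are an assumption, not the study's output.

import math

def hurst_exponent(times, msd):
    """Fit log(msd) = c + 2H*log(t) by least squares; return H."""
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope / 2.0   # the fitted slope equals 2H

# Synthetic random-walk-like spreading with H = 0.5: <a^2> grows linearly.
times = [1.0, 2.0, 4.0, 8.0, 16.0]
msd = [t ** 1.0 for t in times]   # exponent 2H = 1
print(round(hurst_exponent(times, msd), 3))  # -> 0.5
```

    A resonance hopping Centaur would show a poor fit (low correlation) in this regression, which is exactly how the study distinguishes the two dynamical categories.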

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morgan, Nathaniel Ray; Waltz, Jacob I.

    The level set method is commonly used to model dynamically evolving fronts and interfaces. In this work, we present new methods for evolving fronts with a specified velocity field or in the surface normal direction on 3D unstructured tetrahedral meshes with adaptive mesh refinement (AMR). The level set field is located at the nodes of the tetrahedral cells and is evolved using new upwind discretizations of Hamilton–Jacobi equations combined with a Runge–Kutta method for temporal integration. The level set field is periodically reinitialized to a signed distance function using an iterative approach with a new upwind gradient. We discuss the details of these level set and reinitialization methods. Results from a range of numerical test problems are presented.
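
    The upwind principle behind such discretizations, taking the one-sided spatial difference from the direction the information comes from, can be shown in one dimension. This is a generic illustration of upwind level set advection, not the paper's unstructured-mesh AMR scheme, which is considerably more involved.

```python
# One-dimensional sketch of upwind level set advection for phi_t + v*phi_x = 0.
# The spatial derivative is taken from the upwind side of the velocity v,
# which keeps the explicit update stable (subject to the CFL condition).

def advect_level_set(phi, v, dx, dt, steps):
    n = len(phi)
    for _ in range(steps):
        new = phi[:]
        for i in range(1, n - 1):
            if v > 0:   # information arrives from the left
                dphi = (phi[i] - phi[i - 1]) / dx
            else:       # information arrives from the right
                dphi = (phi[i + 1] - phi[i]) / dx
            new[i] = phi[i] - dt * v * dphi
        phi = new
    return phi

# Signed-distance-like front initially at x = 0.5 on [0, 1], moving right.
dx, dt, v = 0.1, 0.05, 1.0
phi = [i * dx - 0.5 for i in range(11)]
phi = advect_level_set(phi, v, dx, dt, steps=4)
# The zero crossing has moved right by roughly v * 4 * dt = 0.2, to x = 0.7.
```

    Periodic reinitialization, mentioned in the abstract, would restore the signed-distance property |∇φ| = 1 that advection gradually destroys; the 1D sketch skips that step.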

  17. A Quantitative Approach to Assessing System Evolvability

    NASA Technical Reports Server (NTRS)

    Christian, John A., III

    2004-01-01

    When selecting a system from multiple candidates, the customer seeks the one that best meets his or her needs. Recently the desire for evolvable systems has become more important and engineers are striving to develop systems that accommodate this need. In response to this search for evolvability, we present a historical perspective on evolvability, propose a refined definition of evolvability, and develop a quantitative method for measuring this property. We address this quantitative methodology from both a theoretical and practical perspective. This quantitative model is then applied to the problem of evolving a lunar mission to a Mars mission as a case study.

  18. Ensemble of classifiers for ontology enrichment

    NASA Astrophysics Data System (ADS)

    Semenova, A. V.; Kureichik, V. M.

    2018-05-01

    A classifier is the basis of ontology learning systems. Classification of text documents is used in many applications, such as information retrieval, information extraction, and spam detection. A new ensemble of classifiers based on SVM (a support vector method), LSTM (a neural network) and word embeddings is suggested. An experiment was conducted on open data, which allows us to conclude that the proposed classification method is promising. The implementation of the proposed classifier is performed in Matlab using the functions of the Text Analytics Toolbox. The principal advantage of the proposed ensemble of classifiers is the high quality of data classification at acceptable time costs.

  19. Robust Framework to Combine Diverse Classifiers Assigning Distributed Confidence to Individual Classifiers at Class Level

    PubMed Central

    Arshad, Sannia; Rho, Seungmin

    2014-01-01

    We have presented a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modeling is presented that generates models of various classes whilst identifying and filtering noisy training data. This noise-free data is further used to learn models for other classifiers such as GMM and SVM. A weight learning method is then introduced to learn weights on each class for different classifiers to construct an ensemble. For this purpose, we applied a genetic algorithm to search for an optimal weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets. It is also compared with existing standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class label noise and imbalanced classes. PMID:25295302

  20. Robust framework to combine diverse classifiers assigning distributed confidence to individual classifiers at class level.

    PubMed

    Khalid, Shehzad; Arshad, Sannia; Jabbar, Sohail; Rho, Seungmin

    2014-01-01

    We have presented a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modeling is presented that generates models of various classes whilst identifying and filtering noisy training data. This noise-free data is further used to learn models for other classifiers such as GMM and SVM. A weight learning method is then introduced to learn weights on each class for different classifiers to construct an ensemble. For this purpose, we applied a genetic algorithm to search for an optimal weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets. It is also compared with existing standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class label noise and imbalanced classes.
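
    The genetic-algorithm weight search described above can be sketched as follows; this is a toy, stdlib-only reconstruction, not the authors' implementation. The predictions, population size and genetic operators are illustrative assumptions; fitness is the accuracy of the per-class weighted vote.

```python
import random

random.seed(0)

# Toy setup: three classifiers' hard predictions on eight samples
# (two classes) and the true labels. All values are illustrative.
preds = [
    [0, 0, 1, 1, 0, 1, 0, 0],   # classifier A (wrong on sample 7)
    [0, 1, 1, 0, 0, 1, 0, 1],   # classifier B (wrong on samples 1, 3)
    [1, 0, 1, 1, 1, 1, 0, 1],   # classifier C (wrong on samples 0, 4)
]
truth = [0, 0, 1, 1, 0, 1, 0, 1]
K, C = len(preds), 2

def accuracy(w):
    """Accuracy of the weighted vote; w[k][c] is classifier k's weight for class c."""
    correct = 0
    for i, t in enumerate(truth):
        scores = [sum(w[k][c] for k in range(K) if preds[k][i] == c)
                  for c in range(C)]
        if max(range(C), key=scores.__getitem__) == t:
            correct += 1
    return correct / len(truth)

def ga_weights(pop_size=20, gens=30, mut=0.2):
    """Tiny genetic algorithm over per-class weight matrices."""
    pop = [[[random.random() for _ in range(C)] for _ in range(K)]
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=accuracy, reverse=True)
        elite = pop[:pop_size // 2]                       # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            child = [[random.choice((a[k][c], b[k][c]))   # uniform crossover
                      for c in range(C)] for k in range(K)]
            if random.random() < mut:                     # random-reset mutation
                child[random.randrange(K)][random.randrange(C)] = random.random()
            children.append(child)
        pop = elite + children
    return max(pop, key=accuracy)

best = ga_weights()
```

    With these toy predictions an equal-weight majority vote is already perfect, so the search problem is easy; the point is only the mechanics of evolving a per-class weight vector.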

  1. Methods Evolved by Observation

    ERIC Educational Resources Information Center

    Montessori, Maria

    2016-01-01

    Montessori's idea of the child's nature and the teacher's perceptiveness begins with amazing simplicity, and when she speaks of "methods evolved," she is unveiling a methodological system for observation. She begins with the early childhood explosion into writing, which is a familiar child phenomenon that Montessori has written about…

  2. Use of the parameterised finite element method to robustly and efficiently evolve the edge of a moving cell.

    PubMed

    Neilson, Matthew P; Mackenzie, John A; Webb, Steven D; Insall, Robert H

    2010-11-01

    In this paper we present a computational tool that enables the simulation of mathematical models of cell migration and chemotaxis on an evolving cell membrane. Recent models require the numerical solution of systems of reaction-diffusion equations on the evolving cell membrane and then the solution state is used to drive the evolution of the cell edge. Previous work involved moving the cell edge using a level set method (LSM). However, the LSM is computationally very expensive, which severely limits the practical usefulness of the algorithm. To address this issue, we have employed the parameterised finite element method (PFEM) as an alternative method for evolving a cell boundary. We show that the PFEM is far more efficient and robust than the LSM. We therefore suggest that the PFEM potentially has an essential role to play in computational modelling efforts towards the understanding of many of the complex issues related to chemotaxis.

  3. Application of machine learning on brain cancer multiclass classification

    NASA Astrophysics Data System (ADS)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a small number of samples. The application of machine learning on a microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, which is improved to solve multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the features selected on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimal hyperplane, the main objective of TWSVM is to find two non-parallel optimal hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
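
    The recursive-elimination idea can be sketched independently of the SVM machinery. In this hedged toy version, a class-centroid difference stands in for the SVM weight vector used to rank features, and the weakest feature is dropped until the requested number remains; data and names are illustrative.

```python
def centroid_weights(X, y):
    """|mean(class 1) - mean(class 0)| per feature: a linear relevance score
    standing in here for the magnitude of an SVM weight vector."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    d = len(X[0])
    return [abs(sum(r[j] for r in pos) / len(pos) -
                sum(r[j] for r in neg) / len(neg)) for j in range(d)]

def rfe(X, y, n_keep):
    """Return the indices of the n_keep features surviving elimination."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        Xr = [[row[j] for j in remaining] for row in X]
        w = centroid_weights(Xr, y)
        weakest = min(range(len(w)), key=w.__getitem__)
        del remaining[weakest]          # drop the lowest-ranked feature
    return remaining

# Feature 0 separates the classes; features 1-2 are noise.
X = [[0.0, 0.5, 0.4], [0.1, 0.6, 0.5], [0.9, 0.5, 0.5], [1.0, 0.4, 0.4]]
y = [0, 0, 1, 1]
print(rfe(X, y, 1))  # -> [0]
```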

  4. A Neural Network-Based Gait Phase Classification Method Using Sensors Equipped on Lower Limb Exoskeleton Robots

    PubMed Central

    Jung, Jun-Young; Heo, Wonho; Yang, Hyundae; Park, Hyunsub

    2015-01-01

    An exact classification of different gait phases is essential to enable the control of exoskeleton robots and detect the intentions of users. We propose a gait phase classification method based on neural networks using sensor signals from lower limb exoskeleton robots. In such robots, foot sensors with force-sensing resistors are commonly used to classify gait phases. We describe classifiers that use the orientation of each lower limb segment and the angular velocities of the joints to output the current gait phase. Experiments to obtain the input signals and desired outputs for the learning and validation process are conducted, and two neural network methods (a multilayer perceptron and nonlinear autoregressive with external inputs (NARX)) are used to develop an optimal classifier. Offline and online evaluations using four criteria are used to compare the performance of the classifiers. The proposed NARX-based method exhibits sufficiently good performance to replace foot sensors as a means of classifying gait phases. PMID:26528986

  5. A Neural Network-Based Gait Phase Classification Method Using Sensors Equipped on Lower Limb Exoskeleton Robots.

    PubMed

    Jung, Jun-Young; Heo, Wonho; Yang, Hyundae; Park, Hyunsub

    2015-10-30

    An exact classification of different gait phases is essential to enable the control of exoskeleton robots and detect the intentions of users. We propose a gait phase classification method based on neural networks using sensor signals from lower limb exoskeleton robots. In such robots, foot sensors with force-sensing resistors are commonly used to classify gait phases. We describe classifiers that use the orientation of each lower limb segment and the angular velocities of the joints to output the current gait phase. Experiments to obtain the input signals and desired outputs for the learning and validation process are conducted, and two neural network methods (a multilayer perceptron and nonlinear autoregressive with external inputs (NARX)) are used to develop an optimal classifier. Offline and online evaluations using four criteria are used to compare the performance of the classifiers. The proposed NARX-based method exhibits sufficiently good performance to replace foot sensors as a means of classifying gait phases.

  6. Classifying Higher Education Institutions in Korea: A Performance-Based Approach

    ERIC Educational Resources Information Center

    Shin, Jung Cheol

    2009-01-01

    The purpose of this study was to classify higher education institutions according to institutional performance rather than predetermined benchmarks. Institutional performance was defined as research performance and classified using Hierarchical Cluster Analysis, a statistical method that classifies objects according to specified classification…

  7. Probabilistic classifiers with high-dimensional data

    PubMed Central

    Kim, Kyung In; Simon, Richard

    2011-01-01

    For medical classification problems, it is often desirable to have a probability associated with each class. Probabilistic classifiers have received relatively little attention for small-n, large-p classification problems despite their importance in medical decision making. In this paper, we introduce two criteria for the assessment of probabilistic classifiers, well-calibratedness and refinement, and develop corresponding evaluation measures. We evaluated several published high-dimensional probabilistic classifiers and developed two extensions of the Bayesian compound covariate classifier. Based on simulation studies and analysis of gene expression microarray data, we found that proper probabilistic classification is more difficult than deterministic classification. It is important to ensure that a probabilistic classifier is well calibrated, or at least not “anticonservative”, using the methods developed here. We provide this evaluation for several probabilistic classifiers and also evaluate their refinement as a function of sample size under weak and strong signal conditions. We also present a cross-validation method for evaluating the calibration and refinement of any probabilistic classifier on any data set. PMID:21087946
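
    The well-calibratedness criterion can be illustrated with a simple binned reliability check (an expected-calibration-error sketch, not the authors' exact measure): predictions are grouped into probability bins, and each bin's mean predicted probability is compared with the observed event frequency. The toy data are illustrative.

```python
def calibration_error(probs, outcomes, n_bins=5):
    """Weighted mean |mean predicted probability - observed frequency|
    over equal-width probability bins (an ECE-style summary)."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)   # p = 1.0 goes in the top bin
        bins[idx].append((p, o))
    err, n = 0.0, len(probs)
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            freq = sum(o for _, o in b) / len(b)
            err += (len(b) / n) * abs(mean_p - freq)
    return err

# Perfectly calibrated toy data: of ten cases given p = 0.8, eight are events.
probs = [0.8] * 10 + [0.2] * 10
outcomes = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
print(round(calibration_error(probs, outcomes), 3))  # -> 0.0
```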

  8. Histology, Fusion Status, and Outcome in Alveolar Rhabdomyosarcoma With Low-Risk Clinical Features: A Report From the Children's Oncology Group.

    PubMed

    Arnold, Michael A; Anderson, James R; Gastier-Foster, Julie M; Barr, Frederic G; Skapek, Stephen X; Hawkins, Douglas S; Raney, R Beverly; Parham, David M; Teot, Lisa A; Rudzinski, Erin R; Walterhouse, David O

    2016-04-01

    Distinguishing alveolar rhabdomyosarcoma (ARMS) from embryonal rhabdomyosarcoma (ERMS) is of prognostic and therapeutic importance. Criteria for classifying these entities evolved significantly from 1995 to 2013. ARMS is associated with inferior outcome; therefore, patients with alveolar histology have generally been excluded from low-risk therapy. However, patients with ARMS and low-risk stage and group (Stage 1, Group I/II/orbit III; or Stage 2/3, Group I/II) were eligible for the Children's Oncology Group (COG) low-risk rhabdomyosarcoma (RMS) study D9602 from 1997 to 1999. The characteristics and outcomes of these patients have not been previously reported, and the histology of these cases has not been reviewed using current criteria. We re-reviewed cases that were classified as ARMS on D9602 using current histologic criteria, determined PAX3/PAX7-FOXO1 fusion status, and compared these data with outcome for this unique group of patients. Thirty-eight patients with ARMS were enrolled onto D9602. Only one-third of cases with slides available for re-review (11/33) remained classified as ARMS by current histologic criteria. Most cases were reclassified as ERMS (17/33, 51.5%). Cases that remained classified as ARMS were typically fusion-positive (8/11, 73%), therefore current classification results in a similar rate of fusion-positive ARMS for all clinical risk groups. In conjunction with data from COG intermediate-risk treatment protocol D9803, our data demonstrate excellent outcomes for fusion-negative ARMS with otherwise low-risk clinical features. Patients with fusion-positive RMS with low-risk clinical features should be classified and treated as intermediate risk, while patients with fusion-negative ARMS could be appropriately treated with reduced intensity therapy. © 2016 Wiley Periodicals, Inc.

  9. Concerted and nonconcerted evolution of the Hsp70 gene superfamily in two sibling species of nematodes.

    PubMed

    Nikolaidis, Nikolas; Nei, Masatoshi

    2004-03-01

    We have identified the Hsp70 gene superfamily of the nematode Caenorhabditis briggsae and investigated the evolution of these genes in comparison with Hsp70 genes from C. elegans, Drosophila, and yeast. The Hsp70 genes are classified into three monophyletic groups according to their subcellular localization, namely, cytoplasm (CYT), endoplasmic reticulum (ER), and mitochondria (MT). The Hsp110 genes can be classified into the polyphyletic CYT group and the monophyletic ER group. The different Hsp70 and Hsp110 groups appeared to evolve following the model of divergent evolution. This model can also explain the evolution of the ER and MT genes. On the other hand, the CYT genes are divided into heat-inducible and constitutively expressed genes. The constitutively expressed genes have evolved more or less following the birth-and-death process, and the rates of gene birth and gene death are different between the two nematode species. By contrast, some heat-inducible genes show an intraspecies phylogenetic clustering. This suggests that they are subject to sequence homogenization resulting from gene conversion-like events. In addition, the heat-inducible genes show high levels of sequence conservation in both intra-species and inter-species comparisons, and in most cases, amino acid sequence similarity is higher than nucleotide sequence similarity. This indicates that purifying selection also plays an important role in maintaining high sequence similarity among paralogous Hsp70 genes. Therefore, we suggest that the CYT heat-inducible genes have been subjected to a combination of purifying selection, birth-and-death process, and gene conversion-like events.

  10. Thin Cloud Detection Method by Linear Combination Model of Cloud Image

    NASA Astrophysics Data System (ADS)

    Liu, L.; Li, J.; Wang, Y.; Xiao, Y.; Zhang, W.; Zhang, S.

    2018-04-01

    The existing cloud detection methods in photogrammetry often extract image features from remote sensing images directly and then use them to classify images as cloud or other things. But when the cloud is thin and small, these methods are inaccurate. In this paper, a linear combination model of cloud images is proposed; by using this model, the underlying surface information of remote sensing images can be removed, so the cloud detection result becomes more accurate. Firstly, the automatic cloud detection program in this paper uses the linear combination model to split the cloud information and surface information in transparent cloud images, then uses different image features to recognize the cloud parts. In consideration of computational efficiency, an AdaBoost classifier was introduced to combine the different features into a cloud classifier. The AdaBoost classifier can select the most effective features from many ordinary features, so the calculation time is largely reduced. Finally, we selected a cloud detection method based on a tree structure and a multiple-feature detection method using an SVM classifier to compare with the proposed method; the experimental data show that the proposed cloud detection program has high accuracy and fast calculation speed.
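
    A minimal, self-contained sketch of the AdaBoost idea used above, with one-feature threshold stumps as weak learners; the two-feature toy data are illustrative and unrelated to the paper's cloud features.

```python
import math

def stump_predict(x, feat, thresh, polarity):
    """Weak learner: threshold a single feature, predicting +1 or -1."""
    return polarity if x[feat] > thresh else -polarity

def train_adaboost(X, y, rounds=5):
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n                 # sample weights, initially uniform
    ensemble = []
    for _ in range(rounds):
        best = None
        # exhaustively pick the stump with the lowest weighted error
        for feat in range(d):
            for thresh in sorted({row[feat] for row in X}):
                for pol in (1, -1):
                    err = sum(wi for wi, row, yi in zip(w, X, y)
                              if stump_predict(row, feat, thresh, pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, pol)
        err, feat, thresh, pol = best
        err = max(err, 1e-10)         # avoid log(0) for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feat, thresh, pol))
        # up-weight the samples this stump got wrong
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feat, thresh, pol))
             for wi, row, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    total = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if total > 0 else -1

# Toy "cloud vs. surface" samples: two features, labels +1 (cloud) / -1.
X = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.9], [0.2, 0.8], [0.1, 0.7], [0.3, 0.1]]
y = [1, 1, 1, -1, -1, -1]
model = train_adaboost(X, y)
print([predict(model, row) for row in X])  # -> [1, 1, 1, -1, -1, -1]
```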

  11. TYC 3159-6-1: a runaway blue supergiant

    NASA Astrophysics Data System (ADS)

    Gvaramadze, V. V.; Miroshnichenko, A. S.; Castro, N.; Langer, N.; Zharikov, S. V.

    2014-01-01

    We report the results of optical spectroscopy of a candidate evolved massive star in the Cygnus-X region, TYC 3159-6-1, revealed via detection of its curious circumstellar nebula in archival data of the Spitzer Space Telescope. We classify TYC 3159-6-1 as an O9.5-O9.7 Ib star and derive its fundamental parameters by using the stellar atmosphere code FASTWIND. The He and CNO abundances in the photosphere of TYC 3159-6-1 are consistent with the solar abundances, suggesting that the star only recently evolved off the main sequence. Proper motion and radial velocity measurements for TYC 3159-6-1 show that it is a runaway star. We propose that Dolidze 7 is its parent cluster. We discuss the origin of the nebula around TYC 3159-6-1 and suggest that it might be produced in several successive episodes of enhanced mass-loss rate (outbursts) caused by rotation of the star near the critical Ω limit.

  12. Selective Transfer Machine for Personalized Facial Expression Analysis

    PubMed Central

    Chu, Wen-Sheng; De la Torre, Fernando; Cohn, Jeffrey F.

    2017-01-01

    Automatic facial action unit (AU) and expression detection from videos is a long-standing problem. The problem is challenging in part because classifiers must generalize to previously unknown subjects that differ markedly in behavior and facial morphology (e.g., heavy versus delicate brows, smooth versus deeply etched wrinkles) from those on which the classifiers are trained. While some progress has been achieved through improvements in choices of features and classifiers, the challenge occasioned by individual differences among people remains. Person-specific classifiers would be a possible solution but for a paucity of training data: sufficient training data for person-specific classifiers is typically unavailable. This paper addresses the problem of how to personalize a generic classifier without additional labels from the test subject. We propose a transductive learning method, which we refer to as a Selective Transfer Machine (STM), to personalize a generic classifier by attenuating person-specific mismatches. STM achieves this effect by simultaneously learning a classifier and re-weighting the training samples that are most relevant to the test subject. We compared STM to both generic classifiers and cross-domain learning methods on four benchmarks: CK+ [44], GEMEP-FERA [67], RU-FACS [4] and GFT [57]. STM outperformed generic classifiers in all. PMID:28113267

  13. The Upper and Lower Bounds of the Prediction Accuracies of Ensemble Methods for Binary Classification

    PubMed Central

    Wang, Xueyi; Davidson, Nicholas J.

    2011-01-01

    Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we establish several results about the prediction accuracies of ensemble methods for binary classification that have been missed or misinterpreted in the previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy while the individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper- and lower-bound accuracies with random individual classifiers, so better algorithms need to be developed. PMID:21853162
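
    The claim that an ensemble can exceed 0.5 accuracy while every individual classifier is below 0.5 admits a small worked instance (a constructed example, not taken from the paper):

```python
# Three classifiers, each only 40% accurate individually, whose majority
# vote is 60% accurate. Rows are classifiers, columns are five samples;
# 1 marks a correct prediction on that sample.
correct = [
    [1, 1, 0, 0, 0],   # classifier 1 correct on samples 0, 1
    [0, 1, 1, 0, 0],   # classifier 2 correct on samples 1, 2
    [1, 0, 1, 0, 0],   # classifier 3 correct on samples 0, 2
]

individual = [sum(row) / 5 for row in correct]
# the majority vote is correct on a sample iff at least 2 of 3 are correct
majority = sum(1 for j in range(5)
               if sum(row[j] for row in correct) >= 2) / 5

print(individual, majority)  # -> [0.4, 0.4, 0.4] 0.6
```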

  14. Knowledge extraction from evolving spiking neural networks with rank order population coding.

    PubMed

    Soltic, Snjezana; Kasabov, Nikola

    2010-12-01

    This paper demonstrates how knowledge can be extracted from evolving spiking neural networks with rank order population coding. Knowledge discovery is a very important feature of intelligent systems. Yet, a disproportionately small amount of research is centered on the issue of knowledge extraction from spiking neural networks, which are considered to be the third generation of artificial neural networks. The lack of knowledge representation compatibility is becoming a major detriment to end users of these networks. We show that high-level knowledge can be obtained from evolving spiking neural networks. More specifically, we propose a method for fuzzy rule extraction from an evolving spiking network with rank order population coding. The proposed method was used for knowledge discovery on two benchmark taste recognition problems, where the knowledge learnt by an evolving spiking neural network was extracted in the form of zero-order Takagi-Sugeno fuzzy IF-THEN rules.

  15. Intelligent query by humming system based on score level fusion of multiple classifiers

    NASA Astrophysics Data System (ADS)

    Pyo Nam, Gi; Thu Trang Luong, Thi; Ha Nam, Hyun; Ryoung Park, Kang; Park, Sung-Joo

    2011-12-01

    Recently, the necessity for content-based music retrieval that can return results even if a user does not know information such as the title or singer has increased. Query-by-humming (QBH) systems have been introduced to address this need, as they allow the user to simply hum snatches of the tune to find the right song. Even though there have been many studies on QBH, few have combined multiple classifiers based on various fusion methods. Here we propose a new QBH system based on the score level fusion of multiple classifiers. This research is novel in the following three respects: three local classifiers [quantized binary (QB) code-based linear scaling (LS), pitch-based dynamic time warping (DTW), and LS] are employed; local maximum and minimum point-based LS and pitch distribution feature-based LS are used as global classifiers; and the combination of local and global classifiers based on the score level fusion by the PRODUCT rule is used to achieve enhanced matching accuracy. Experimental results with the 2006 MIREX QBSH and 2009 MIR-QBSH corpus databases show that the performance of the proposed method is better than that of single classifier and other fusion methods.
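
    The PRODUCT-rule score-level fusion can be sketched as follows; the min-max normalization and the toy scores are illustrative assumptions, not the paper's exact normalization.

```python
def minmax(scores):
    """Rescale one classifier's per-song scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def product_fusion(score_lists):
    """score_lists[k][i] is classifier k's matching score for song i.
    Normalize per classifier, multiply across classifiers, pick the argmax."""
    normed = [minmax(s) for s in score_lists]
    n = len(score_lists[0])
    fused = []
    for i in range(n):
        p = 1.0
        for s in normed:
            p *= s[i]
        fused.append(p)
    return max(range(n), key=fused.__getitem__)

dtw_scores = [0.2, 0.9, 0.4]   # e.g. pitch-based DTW similarity per song
ls_scores = [0.3, 0.8, 0.7]    # e.g. linear-scaling similarity per song
print(product_fusion([dtw_scores, ls_scores]))  # -> 1
```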

  16. Gas chimney detection based on improving the performance of combined multilayer perceptron and support vector classifier

    NASA Astrophysics Data System (ADS)

    Hashemi, H.; Tax, D. M. J.; Duin, R. P. W.; Javaherian, A.; de Groot, P.

    2008-11-01

    Seismic object detection is a relatively new field in which 3-D bodies are visualized and spatial relationships between objects of different origins are studied in order to extract geologic information. In this paper, we propose a method for finding an optimal classifier with the help of a statistical feature ranking technique and by combining different classifiers. The method, which has general applicability, is demonstrated here on a gas chimney detection problem. First, we evaluate a set of input seismic attributes extracted at locations labeled by a human expert using regularized discriminant analysis (RDA). In order to find the RDA score for each seismic attribute, forward and backward search strategies are used. Subsequently, two non-linear classifiers, a multilayer perceptron (MLP) and a support vector classifier (SVC), are run on the ranked seismic attributes. Finally, to capitalize on the intrinsic differences between the two classifiers, the MLP and SVC results are combined using logical rules of maximum, minimum and mean. The proposed method optimizes the ranked feature space size and yields the lowest classification error in the final combined result. We show that the logical minimum reveals gas chimneys that exhibit both the softness of the MLP and the resolution of the SVC classifier.
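
    The logical maximum, minimum and mean rules are simple element-wise combinations of the two classifiers' outputs; a hedged sketch with illustrative per-voxel probabilities:

```python
def combine(p_mlp, p_svc, rule):
    """Element-wise combination of two classifiers' probability outputs."""
    ops = {"min": min, "max": max, "mean": lambda a, b: (a + b) / 2}
    return [ops[rule](a, b) for a, b in zip(p_mlp, p_svc)]

p_mlp = [0.9, 0.6, 0.2, 0.8]   # per-voxel chimney probability from the MLP
p_svc = [0.7, 0.3, 0.4, 0.9]   # per-voxel chimney probability from the SVC
print(combine(p_mlp, p_svc, "min"))  # -> [0.7, 0.3, 0.2, 0.8]
print([round(v, 2) for v in combine(p_mlp, p_svc, "mean")])  # -> [0.8, 0.45, 0.3, 0.85]
```

    The logical minimum flags a voxel only when both classifiers agree it is likely a chimney, which is why it tends to suppress the classifiers' individual false positives.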

  17. Local-global classifier fusion for screening chest radiographs

    NASA Astrophysics Data System (ADS)

    Ding, Meng; Antani, Sameer; Jaeger, Stefan; Xue, Zhiyun; Candemir, Sema; Kohli, Marc; Thoma, George

    2017-03-01

    Tuberculosis (TB) is a severe comorbidity of HIV and chest x-ray (CXR) analysis is a necessary step in screening for the infective disease. Automatic analysis of digital CXR images for detecting pulmonary abnormalities is critical for population screening, especially in medical resource constrained developing regions. In this article, we describe steps that improve previously reported performance of NLM's CXR screening algorithms and help advance the state of the art in the field. We propose a local-global classifier fusion method where two complementary classification systems are combined. The local classifier focuses on subtle and partial presentation of the disease leveraging information in radiology reports that roughly indicates locations of the abnormalities. In addition, the global classifier models the dominant spatial structure in the gestalt image using GIST descriptor for the semantic differentiation. Finally, the two complementary classifiers are combined using linear fusion, where the weight of each decision is calculated by the confidence probabilities from the two classifiers. We evaluated our method on three datasets in terms of the area under the Receiver Operating Characteristic (ROC) curve, sensitivity, specificity and accuracy. The evaluation demonstrates the superiority of our proposed local-global fusion method over any single classifier.

  18. Guidelines 13 and 14—Prediction uncertainty

    USGS Publications Warehouse

    Hill, Mary C.; Tiedeman, Claire

    2005-01-01

    An advantage of using optimization for model development and calibration is that optimization provides methods for evaluating and quantifying prediction uncertainty. Both deterministic and statistical methods can be used. Guideline 13 discusses using regression and post-audits, which we classify as deterministic methods. Guideline 14 discusses inferential statistics and Monte Carlo methods, which we classify as statistical methods.

  19. A two-dimensional matrix image based feature extraction method for classification of sEMG: A comparative analysis based on SVM, KNN and RBF-NN.

    PubMed

    Wen, Tingxi; Zhang, Zhongnan; Qiu, Ming; Zeng, Ming; Luo, Weizhen

    2017-01-01

    The computer mouse is an important human-computer interaction device, but patients with physical finger disability are unable to operate it. Surface EMG (sEMG) can be monitored by electrodes on the skin surface and is a reflection of neuromuscular activities. Therefore, we can control limb auxiliary equipment by utilizing sEMG classification in order to help physically disabled patients operate the mouse. Our objective was to develop a new method to extract sEMG generated by finger motion and to apply novel features to classify sEMG. A window-based data acquisition method was presented to extract signal samples from sEMG electrodes. Afterwards, a two-dimensional matrix image based feature extraction method, which differs from the classical methods based on the time domain or frequency domain, was employed to transform signal samples into feature maps used for classification. In the experiments, sEMG data samples produced by the index and middle fingers at the click of a mouse button were separately acquired. Then, characteristics of the samples were analyzed to generate a feature map for each sample. Finally, machine learning classification algorithms (SVM, KNN, RBF-NN) were employed to classify these feature maps on a GPU. The study demonstrated that all classifiers can identify and classify sEMG samples effectively. In particular, the accuracy of the SVM classifier reached up to 100%. The signal separation method is a convenient, efficient and quick method, which can effectively extract the sEMG samples produced by fingers. In addition, unlike the classical methods, the new method enables feature extraction by appropriately enlarging the sample signals' energy. The classical machine learning classifiers all performed well by using these features.
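
    The window-based acquisition step can be sketched as slicing the 1-D signal into fixed-length windows stacked as rows of a matrix image; the window length and the synthetic stream below are illustrative assumptions, not the paper's parameters.

```python
def to_feature_map(signal, window):
    """Split `signal` into non-overlapping windows; one window per row
    of the resulting 2-D matrix image."""
    rows = len(signal) // window
    return [signal[r * window:(r + 1) * window] for r in range(rows)]

# Synthetic 12-sample "sEMG" stream -> a 3 x 4 matrix image
stream = [0.1, 0.4, 0.2, 0.9, 0.3, 0.8, 0.1, 0.5, 0.7, 0.2, 0.6, 0.4]
fmap = to_feature_map(stream, window=4)
print(len(fmap), len(fmap[0]))  # -> 3 4
```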

  20. Enhancing atlas based segmentation with multiclass linear classifiers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sdika, Michaël, E-mail: michael.sdika@creatis.insa-lyon.fr

    Purpose: To present a method to enrich atlases for atlas based segmentation. Such enriched atlases can then be used as a single atlas or within a multiatlas framework. Methods: In this paper, machine learning techniques have been used to enhance the atlas based segmentation approach. The enhanced atlas defined in this work is a pair composed of a gray level image alongside an image of multiclass classifiers with one classifier per voxel. Each classifier embeds local information from the whole training dataset that allows for the correction of some systematic errors in the segmentation and accounts for possible local registration errors. The authors also propose to use these images of classifiers within a multiatlas framework: results produced by a set of such local classifier atlases can be combined using a label fusion method. Results: Experiments have been made on the in vivo images of the IBSR dataset and a comparison has been made with several state-of-the-art methods such as FreeSurfer and the multiatlas nonlocal patch based method of Coupé or Rousseau. These experiments show that their method is competitive with state-of-the-art methods while having a low computational cost. Further enhancement has also been obtained with a multiatlas version of their method. It is also shown that, in this case, nonlocal fusion is unnecessary. The multiatlas fusion can therefore be done efficiently. Conclusions: The single atlas version has similar quality to state-of-the-art multiatlas methods but with the computational cost of a naive single atlas segmentation. The multiatlas version offers an improvement in quality and can be done efficiently without a nonlocal strategy.

  1. LocFuse: human protein-protein interaction prediction via classifier fusion using protein localization information.

    PubMed

    Zahiri, Javad; Mohammad-Noori, Morteza; Ebrahimpour, Reza; Saadat, Samaneh; Bozorgmehr, Joseph H; Goldberg, Tatyana; Masoudi-Nejad, Ali

    2014-12-01

    Protein-protein interaction (PPI) detection is one of the central goals of functional genomics and systems biology. Knowledge about the nature of PPIs can help fill the widening gap between sequence information and functional annotations. Although experimental methods have produced valuable PPI data, they also suffer from significant limitations. Computational PPI prediction methods have attracted tremendous attention. Despite considerable efforts, PPI prediction is still in its infancy in complex multicellular organisms such as humans. Here, we propose a novel ensemble learning method, LocFuse, which is useful in human PPI prediction. This method uses eight different genomic and proteomic features along with four types of different classifiers. The prediction performance of this classifier selection method was found to be considerably better than methods employed hitherto. This confirms the complex nature of the PPI prediction problem and also the necessity of using biological information for classifier fusion. The LocFuse is available at: http://lbb.ut.ac.ir/Download/LBBsoft/LocFuse. The results revealed that if we divide proteome space according to the cellular localization of proteins, then the utility of some classifiers in PPI prediction can be improved. Therefore, to predict the interaction for any given protein pair, we can select the most accurate classifier with regard to the cellular localization information. Based on the results, we can say that the importance of different features for PPI prediction varies between differently localized proteins; however in general, our novel features, which were extracted from position-specific scoring matrices (PSSMs), are the most important ones and the Random Forest (RF) classifier performs best in most cases. LocFuse was developed with a user-friendly graphic interface and it is freely available for Linux, Mac OSX and MS Windows operating systems. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Brain medical image diagnosis based on corners with importance-values.

    PubMed

    Gao, Linlin; Pan, Haiwei; Li, Qing; Xie, Xiaoqin; Zhang, Zhiqiang; Han, Jinming; Zhai, Xiao

    2017-11-21

    Brain disorders are one of the top causes of human death. Generally, neurologists analyze brain medical images for diagnosis. In the image analysis field, corners are one of the most important features, which makes corner detection and matching studies essential. However, existing corner detection studies do not consider the domain information of the brain. This leads to many useless corners and the loss of significant information. Regarding corner matching, the uncertainty and structure of the brain are not employed in existing methods. Moreover, most corner matching studies are used for 3D image registration. They are inapplicable for 2D brain image diagnosis because of the different mechanisms. To address these problems, we propose a novel corner-based brain medical image classification method. Specifically, we automatically extract multilayer texture images (MTIs) which embody diagnostic information from neurologists. Moreover, we present a corner matching method utilizing the uncertainty and structure of brain medical images and a bipartite graph model. Finally, we propose a similarity calculation method for diagnosis. Brain CT and MRI image sets are utilized to evaluate the proposed method. First, classifiers are trained in N-fold cross-validation analysis to produce the best θ and K. Then independent brain image sets are tested to evaluate the classifiers. Moreover, the classifiers are also compared with advanced brain image classification studies. For the brain CT image set, the proposed classifier outperforms the comparison methods by at least 8% on accuracy and 2.4% on F1-score. Regarding the brain MRI image set, the proposed classifier is superior to the comparison methods by more than 7.3% on accuracy and 4.9% on F1-score. Results also demonstrate that the proposed method is robust to different intensity ranges of brain medical images. In this study, we develop a robust corner-based brain medical image classifier.
Specifically, we propose a corner detection method utilizing the diagnostic information from neurologists and a corner matching method based on the uncertainty and structure of brain medical images. Additionally, we present a similarity calculation method for brain image classification. Experimental results on two brain image sets show the proposed corner-based brain medical image classifier outperforms the state-of-the-art studies.

  3. Ensemble Sparse Classification of Alzheimer’s Disease

    PubMed Central

    Liu, Manhua; Zhang, Daoqiang; Shen, Dinggang

    2012-01-01

    The high-dimensional pattern classification methods, e.g., support vector machines (SVM), have been widely investigated for analysis of structural and functional brain images (such as magnetic resonance imaging (MRI)) to assist the diagnosis of Alzheimer’s disease (AD) including its prodromal stage, i.e., mild cognitive impairment (MCI). Most existing classification methods extract features from neuroimaging data and then construct a single classifier to perform classification. However, due to noise and the small sample size of neuroimaging data, it is challenging to train a single global classifier robust enough to achieve good classification performance. In this paper, instead of building a single global classifier, we propose a local patch-based subspace ensemble method which builds multiple individual classifiers based on different subsets of local patches and then combines them for more accurate and robust classification. Specifically, to capture the local spatial consistency, each brain image is partitioned into a number of local patches and a subset of patches is randomly selected from the patch pool to build a weak classifier. Here, the sparse representation-based classification (SRC) method, which has been shown to be effective for classification of image data (e.g., faces), is used to construct each weak classifier. Then, multiple weak classifiers are combined to make the final decision. We evaluate our method on 652 subjects (including 198 AD patients, 225 MCI and 229 normal controls) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using MR images. The experimental results show that our method achieves an accuracy of 90.8% and an area under the ROC curve (AUC) of 94.86% for AD classification and an accuracy of 87.85% and an AUC of 92.90% for MCI classification, respectively, demonstrating a very promising performance of our method compared with the state-of-the-art methods for AD/MCI classification using MR images. PMID:22270352
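    A minimal sketch of the patch-subset ensemble idea: random feature subsets stand in for randomly selected patch subsets, and a nearest-class-mean rule stands in for the sparse representation-based weak classifiers (SRC itself solves an l1-minimization problem and is omitted here). All data and sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_weak(X, y, feat_idx):
    # Nearest-class-mean classifier on one feature subset
    # (a simple stand-in for the paper's SRC weak classifiers).
    classes = np.unique(y)
    means = {c: X[y == c][:, feat_idx].mean(axis=0) for c in classes}
    return feat_idx, means

def predict_weak(model, X):
    feat_idx, means = model
    Xs = X[:, feat_idx]
    classes = sorted(means)
    d = np.stack([np.linalg.norm(Xs - means[c], axis=1) for c in classes], axis=1)
    return np.array(classes)[d.argmin(axis=1)]

def ensemble_predict(models, X):
    # Majority vote over the weak classifiers.
    votes = np.stack([predict_weak(m, X) for m in models], axis=1)
    return np.array([np.bincount(row).argmax() for row in votes])

# Toy data: 20 features standing in for patch intensities, 2 classes.
X = np.vstack([rng.normal(0.0, 1.0, (50, 20)), rng.normal(2.0, 1.0, (50, 20))])
y = np.array([0] * 50 + [1] * 50)

models = []
for _ in range(11):                                   # 11 weak classifiers
    idx = rng.choice(20, size=6, replace=False)       # random "patch" subset
    models.append(train_weak(X, y, idx))

pred = ensemble_predict(models, X)
acc = (pred == y).mean()
```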

  4. Evolving optimised decision rules for intrusion detection using particle swarm paradigm

    NASA Astrophysics Data System (ADS)

    Sivatha Sindhu, Siva S.; Geetha, S.; Kannan, A.

    2012-12-01

    The aim of this article is to construct a practical intrusion detection system (IDS) that properly analyses the statistics of network traffic patterns and classifies them as normal or anomalous. The objective is to show that the choice of effective network traffic features and a proficient machine-learning paradigm enhances the detection accuracy of an IDS. In this article, a rule-based approach with a family of six decision tree classifiers, namely Decision Stump, C4.5, Naive Bayes Tree, Random Forest, Random Tree and Representative Tree, is introduced to perform the detection of anomalous network patterns. In particular, the proposed swarm optimisation-based approach selects the instances that compose the training set, and an optimised decision tree operating over this training set produces classification rules with improved coverage, classification capability and generalisation ability. Experiments with the Knowledge Discovery and Data mining (KDD) data set, which contains information on traffic patterns during normal and intrusive behaviour, show that the proposed algorithm produces optimised decision rules and outperforms other machine-learning algorithms.

  5. MT Ser, a binary blue subdwarf

    NASA Astrophysics Data System (ADS)

    Shimanskii, V. V.; Borisov, N. V.; Sakhibullin, N. A.; Sheveleva, D. V.

    2008-06-01

    We have classified and determined the parameters of the evolved close binary MT Ser. Our moderate-resolution spectra covering various phases of the orbital period were taken with the 6-m telescope of the Special Astrophysical Observatory. The spectra of MT Ser freed from the contribution of the surrounding nebula Abell 41 contained no emission lines due to the reflection effect. The radial velocities measured from lines of different elements showed them to be constant on a time scale corresponding to the orbital period. At the same time, we find effects of broadening for the HeII absorption lines, due to the orbital motion of two hot stars of similar types. As a result, we classify MT Ser as a system with two blue subdwarfs after the common-envelope stage. We estimate the component masses and the distance to the object from the Doppler broadening of the HeII lines. We demonstrate that the age of the ambient nebula, Abell 41, is about 35 000 years.

  6. Efficacy Evaluation of Different Wavelet Feature Extraction Methods on Brain MRI Tumor Detection

    NASA Astrophysics Data System (ADS)

    Nabizadeh, Nooshin; John, Nigel; Kubat, Miroslav

    2014-03-01

    Automated Magnetic Resonance Imaging brain tumor detection and segmentation is a challenging task. Among the different available methods, feature-based methods are dominant. While many feature extraction techniques have been employed, it is still not clear which feature extraction method should be preferred. To help improve the situation, we present the results of a study evaluating the efficiency of different wavelet transform feature extraction methods in brain MRI abnormality detection. Using T1-weighted brain images, Discrete Wavelet Transform (DWT), Discrete Wavelet Packet Transform (DWPT), Dual Tree Complex Wavelet Transform (DTCWT), and Complex Morlet Wavelet Transform (CMWT) methods are applied to construct the feature pool. Three different classifiers, namely Support Vector Machine, K-Nearest Neighbor, and a Sparse Representation-Based Classifier, are applied and compared for classifying the selected features. The results show that DTCWT and CMWT features classified with SVM yield the highest classification accuracy, demonstrating that wavelet transform features are informative in this application.
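    As a small illustration of the simplest member of this family, one level of the 1-D Haar discrete wavelet transform can be written directly in NumPy; the paper's 2-D pipelines (DWPT, DTCWT, CMWT) are considerably more involved and are not reproduced here.

```python
import numpy as np

def haar_dwt_1d(x):
    """One level of the 1-D Haar wavelet transform: returns the
    (approximation, detail) coefficient pair. Assumes len(x) is even."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass (detail)
    return a, d

def wavelet_features(x):
    """Toy feature vector: energy of the approximation and detail bands.
    With this orthonormal scaling, the two energies sum to the signal energy."""
    a, d = haar_dwt_1d(x)
    return np.array([np.sum(a**2), np.sum(d**2)])
```

A piecewise-constant signal puts all its energy in the approximation band, which is what makes such coefficients discriminative texture features.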

  7. A visual tracking method based on improved online multiple instance learning

    NASA Astrophysics Data System (ADS)

    He, Xianhui; Wei, Yuxing

    2016-09-01

    Visual tracking is an active research topic in the field of computer vision and has been well studied in the last decades. The method based on multiple instance learning (MIL) was recently introduced into the tracking task, and it handles the template drift problem well. However, the MIL method has relatively poor running efficiency and accuracy, because its strong-classifier update strategy is complicated and the speed of the classifier update does not always match the change of the target's appearance. In this paper, we present a novel online effective MIL (EMIL) tracker. A new update strategy for the strong classifier is proposed to improve the running efficiency of the MIL method. In addition, to improve the tracking accuracy and stability of the MIL method, a new dynamic mechanism for renewing the learning rate of the classifier and a variable search window are proposed. Experimental results show that our method performs well under complex scenes, with strong stability and high efficiency.

  8. New Data Pre-processing on Assessing of Obstructive Sleep Apnea Syndrome: Line Based Normalization Method (LBNM)

    NASA Astrophysics Data System (ADS)

    Akdemir, Bayram; Güneş, Salih; Yosunkaya, Şebnem

    Sleep disorders are very common among the public but often go unrecognized. Obstructive Sleep Apnea Syndrome (OSAS) is characterized by a decreased oxygen saturation level and repetitive upper respiratory tract obstruction episodes during full night sleep. In the present study, we propose a novel data normalization method called the Line Based Normalization Method (LBNM) to evaluate OSAS, using a real data set obtained from a Polysomnography device as a diagnostic tool in patients clinically suspected of suffering from OSAS. Here, we combine LBNM with classification methods comprising the C4.5 decision tree classifier and an Artificial Neural Network (ANN) to diagnose OSAS. Firstly, each clinical feature in the OSAS dataset is scaled by the LBNM method into the range [0,1]. Secondly, the normalized OSAS dataset is classified using different classifier algorithms, including the C4.5 decision tree classifier and the ANN, respectively. The proposed normalization method was compared with the min-max normalization, z-score normalization, and decimal scaling methods existing in the literature on the diagnosis of OSAS. LBNM has produced very promising results on the assessment of OSAS. Moreover, this method could be applied to other biomedical datasets.
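    The abstract does not define LBNM itself, so it is not reproduced here; the three comparison baselines it names are standard, however, and can be sketched directly (the decimal-scaling variant below assumes the common textbook definition):

```python
import numpy as np

def min_max(x):
    # Rescale to [0, 1]; assumes x is not constant.
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def z_score(x):
    # Zero mean, unit standard deviation.
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def decimal_scaling(x):
    # Divide by 10**j, where j is the smallest integer with max|x| / 10**j < 1.
    x = np.asarray(x, dtype=float)
    j = int(np.floor(np.log10(np.abs(x).max()))) + 1
    return x / (10 ** j)
```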

  9. Multiclass cancer classification using a feature subset-based ensemble from microRNA expression profiles.

    PubMed

    Piao, Yongjun; Piao, Minghao; Ryu, Keun Ho

    2017-01-01

    Cancer classification has been a crucial topic of research in cancer treatment. In the last decade, messenger RNA (mRNA) expression profiles have been widely used to classify different types of cancers. With the discovery of a new class of small non-coding RNAs, known as microRNAs (miRNAs), various studies have shown that the expression patterns of miRNA can also accurately classify human cancers. Therefore, there is a great demand for the development of machine learning approaches to accurately classify various types of cancers using miRNA expression data. In this article, we propose a feature subset-based ensemble method in which each model is learned from a different projection of the original feature space to classify multiple cancers. In our method, the feature relevance and redundancy are considered to generate multiple feature subsets, the base classifiers are learned from each independent miRNA subset, and the average posterior probability is used to combine the base classifiers. To test the performance of our method, we used bead-based and sequence-based miRNA expression datasets and conducted 10-fold and leave-one-out cross validations. The experimental results show that the proposed method yields good results and has higher prediction accuracy than popular ensemble methods. The Java program and source code of the proposed method and the datasets in the experiments are freely available at https://sourceforge.net/projects/mirna-ensemble/. Copyright © 2016 Elsevier Ltd. All rights reserved.
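    A minimal sketch of the feature subset-based ensemble: each base classifier is trained on its own feature subset, produces class posteriors, and the averaged posterior decides the label. A nearest-centroid-with-softmax base model is used purely as a hypothetical stand-in for the paper's base classifiers, and the subsets here are random rather than relevance/redundancy-driven; data and sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_base(X, y, idx):
    # Nearest-centroid base classifier on one feature subset.
    classes = np.unique(y)
    cent = np.stack([X[y == c][:, idx].mean(axis=0) for c in classes])
    return idx, classes, cent

def posterior(model, X):
    idx, classes, cent = model
    d = np.linalg.norm(X[:, idx][:, None, :] - cent[None, :, :], axis=2)
    return softmax(-d)                      # closer centroid -> higher posterior

# Toy miRNA-like data: 30 features, 3 cancer classes.
X = np.vstack([rng.normal(m, 1.0, (40, 30)) for m in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 40)

subsets = [rng.choice(30, size=10, replace=False) for _ in range(7)]
models = [fit_base(X, y, s) for s in subsets]

# Combine base classifiers by averaging their posterior probabilities.
avg_post = np.mean([posterior(m, X) for m in models], axis=0)
pred = avg_post.argmax(axis=1)
acc = (pred == y).mean()
```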

  10. Liquid-Based Medium Used to Prepare Cytological Breast Nipple Fluid Improves the Quality of Cellular Samples Automatic Collection

    PubMed Central

    Zonta, Marco Antonio; Velame, Fernanda; Gema, Samara; Filassi, Jose Roberto; Longatto-Filho, Adhemar

    2014-01-01

    Background Breast cancer is the second leading cause of death in women worldwide. The spontaneous breast nipple discharge may contain cells that can be analyzed for malignancy. Halo® Mamo Cyto Test (HMCT) was recently developed as an automated system indicated to aspirate cells from the breast ducts. The objective of this study was to standardize the methodology of sampling and sample preparation of nipple discharge obtained by the automated method Halo breast test and perform cytological evaluation in samples preserved in liquid medium (SurePath™). Methods We analyzed 564 nipple fluid samples, from women between 20 and 85 years old, without history of breast disease and neoplasia, no pregnancy, and without gynecologic medical history, collected by the HMCT method and preserved in two different vials with solutions for transport. Results From 306 nipple fluid samples from method 1, 199 (65%) were classified as unsatisfactory (class 0), 104 (34%) samples were classified as benign findings (class II), and three (1%) were classified as undetermined to neoplastic cells (class III). From 258 samples analyzed in method 2, 127 (49%) were classified as class 0, 124 (48%) were classified as class II, and seven (2%) were classified as class III. Conclusion Our study suggests an improvement in the quality and quantity of cellular samples when the association of the two methodologies is performed, Halo breast test and the method in liquid medium. PMID:29147397

  11. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation.

    PubMed

    Azmi, Reza; Pishgoo, Boshra; Norozi, Narges; Yeganeh, Samira

    2013-04-01

    Brain magnetic resonance image (MRI) tissue segmentation is one of the most important parts of clinical diagnostic tools. Pixel classification methods have frequently been used for image segmentation, with both supervised and unsupervised approaches, up to now. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain. Moreover, they cannot use unlabeled data to train classifiers. On the other hand, unsupervised segmentation methods have no prior knowledge and lead to a low level of performance. However, semi-supervised learning, which uses a few labeled data together with a large amount of unlabeled data, achieves higher accuracy with less trouble. In this paper, we propose an ensemble semi-supervised frame-work for segmenting brain magnetic resonance imaging (MRI) tissues that uses the results of several semi-supervised classifiers simultaneously. Selecting appropriate classifiers has a significant role in the performance of this frame-work. Hence, in this paper, we present two semi-supervised algorithms, expectation filtering maximization and MCo_Training, which are improved versions of the semi-supervised methods expectation maximization and Co_Training and increase segmentation accuracy. Afterward, we use these improved classifiers together with a graph-based semi-supervised classifier as components of the ensemble frame-work. Experimental results show that the performance of segmentation in this approach is higher than that of both supervised methods and the individual semi-supervised classifiers.

  12. A complete electrical shock hazard classification system and its application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gordon, Lloyd; Cartelli, Laura; Graham, Nicole

    Current electrical safety standards evolved to address the hazards of 60-Hz power that are faced primarily by electricians, linemen, and others performing facility and utility work. As a result, a substantial gap remains in the management of electrical hazards in Research and Development (R&D) and specialized high voltage and high power equipment. We find substantial use of direct current (dc) electrical energy, and the use of capacitors, inductors, batteries, and radiofrequency (RF) power. The electrical hazards of these forms of electricity and their systems are different than for 50/60 Hz power. This paper proposes a method of classifying all of the electrical shock hazards found in all types of R&D and utilization equipment. Examples of the variation of these hazards from NFPA 70E include (a) high voltage can be harmless, if the available current is sufficiently low, (b) low voltage can be harmful if the available current/power is high, (c) high voltage capacitor hazards are unique and include severe reflex action, effects on the heart, and tissue damage, and (d) arc flash hazard analysis for dc and capacitor systems is not provided in existing standards. This work has led to a comprehensive electrical hazard classification system that is based on various research conducted over the past 100 years, on analysis of such systems in R&D, and on decades of experience. Lastly, the new comprehensive electrical shock hazard classification system uses a combination of voltage, shock current available, fault current available, power, energy, and waveform to classify all forms of electrical hazards.

  13. A complete electrical shock hazard classification system and its application

    DOE PAGES

    Gordon, Lloyd; Cartelli, Laura; Graham, Nicole

    2018-02-08

    Current electrical safety standards evolved to address the hazards of 60-Hz power that are faced primarily by electricians, linemen, and others performing facility and utility work. As a result, a substantial gap remains in the management of electrical hazards in Research and Development (R&D) and specialized high voltage and high power equipment. We find substantial use of direct current (dc) electrical energy, and the use of capacitors, inductors, batteries, and radiofrequency (RF) power. The electrical hazards of these forms of electricity and their systems are different than for 50/60 Hz power. This paper proposes a method of classifying all of the electrical shock hazards found in all types of R&D and utilization equipment. Examples of the variation of these hazards from NFPA 70E include (a) high voltage can be harmless, if the available current is sufficiently low, (b) low voltage can be harmful if the available current/power is high, (c) high voltage capacitor hazards are unique and include severe reflex action, effects on the heart, and tissue damage, and (d) arc flash hazard analysis for dc and capacitor systems is not provided in existing standards. This work has led to a comprehensive electrical hazard classification system that is based on various research conducted over the past 100 years, on analysis of such systems in R&D, and on decades of experience. Lastly, the new comprehensive electrical shock hazard classification system uses a combination of voltage, shock current available, fault current available, power, energy, and waveform to classify all forms of electrical hazards.

  14. Feature weighting using particle swarm optimization for learning vector quantization classifier

    NASA Astrophysics Data System (ADS)

    Dongoran, A.; Rahmadani, S.; Zarlis, M.; Zakarias

    2018-03-01

    This paper discusses and proposes a method of feature weighting for classification tasks on the competitive-learning artificial neural network LVQ. The feature weighting method searches for attribute weights using PSO so as to influence the resulting output. This method is then applied to the LVQ classifier and tested on three datasets obtained from the UCI Machine Learning repository. An accuracy analysis is then generated for two approaches: the first, using LVQ1, is referred to as LVQ-Classifier, and the second, referred to as PSOFW-LVQ, is the proposed model. The result shows that the PSO algorithm is capable of finding attribute weights that increase LVQ-Classifier accuracy.
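    A compact sketch of PSO-based feature weighting, with a nearest-class-centroid rule standing in for the LVQ1 classifier and training-set accuracy as the (admittedly optimistic) fitness; all hyperparameters and the toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def fitness(w, X, y):
    # Accuracy of a nearest-class-centroid rule on features scaled by w
    # (a simple stand-in for LVQ1 prototypes).
    Xw = X * w
    classes = np.unique(y)
    cent = np.stack([Xw[y == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(Xw[:, None, :] - cent[None, :, :], axis=2)
    return (classes[d.argmin(axis=1)] == y).mean()

def pso_feature_weights(X, y, n_particles=10, iters=30):
    dim = X.shape[1]
    pos = rng.uniform(0, 1, (n_particles, dim))      # candidate weight vectors
    vel = np.zeros((n_particles, dim))
    pbest, pbest_fit = pos.copy(), np.array([fitness(p, X, y) for p in pos])
    g, g_fit = pbest[pbest_fit.argmax()].copy(), pbest_fit.max()
    for _ in range(iters):
        r1, r2 = rng.uniform(size=(2, n_particles, dim))
        # Standard PSO velocity update: inertia + cognitive + social terms.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (g - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        fit = np.array([fitness(p, X, y) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        if fit.max() > g_fit:
            g, g_fit = pos[fit.argmax()].copy(), fit.max()
    return g, g_fit

# Toy data: feature 0 is informative, features 1-3 are noise.
X = rng.normal(0, 1, (80, 4))
X[40:, 0] += 3.0
y = np.repeat([0, 1], 40)

w, acc = pso_feature_weights(X, y)
```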

  15. ClearTK 2.0: Design Patterns for Machine Learning in UIMA

    PubMed Central

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-01-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework. PMID:29104966

  16. ClearTK 2.0: Design Patterns for Machine Learning in UIMA.

    PubMed

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-05-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework.

  17. Mid-Infrared Interferometric Monitoring of Evolved Stars: The Dust Shell Around the Mira Variable RR Aquilae at 13 Epochs

    DTIC Science & Technology

    2011-01-01

    photometric and interferometric data. ... (λ = 2.2 μm, Δλ = 0.4 μm) angular size with the Infrared Optical Telescope Array (IOTA). The uniform disk diameter (UD) of θUD = 10.73 ± 0.66 mas at ... with IOTA in the H-band, and classified RR Aql as a target with no detectable asymmetries. The IRAS flux at 12 μm is 332 Jy. The light curve in the V

  18. Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces

    PubMed Central

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
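    A minimal sketch of the overlapped-partitioning idea: train one two-class Fisher/LDA discriminant per overlapping window of the training data and average the discriminant scores. The stepwise feature selection and P300-specific details of the paper are omitted; the data, window sizes, and overlap below are toy values.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_lda(X, y):
    # Two-class Fisher discriminant with pooled covariance (ridge-stabilized).
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    S = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(S + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    b = -w @ (m0 + m1) / 2.0
    return w, b

def score(model, X):
    w, b = model
    return X @ w + b                         # > 0 means class 1

def overlapped_partitions(n, n_parts, overlap):
    # Consecutive index windows, each sharing roughly `overlap` samples
    # with its neighbours (naive partitioning would use overlap = 0).
    size = n // n_parts + overlap
    starts = np.linspace(0, n - size, n_parts).astype(int)
    return [np.arange(s, s + size) for s in starts]

# Toy P300-like data: two classes, shuffled so each window sees both.
X = np.vstack([rng.normal(0, 1, (60, 5)), rng.normal(1.5, 1, (60, 5))])
y = np.repeat([0, 1], 60)
order = rng.permutation(120)
X, y = X[order], y[order]

models = [fit_lda(X[p], y[p]) for p in overlapped_partitions(120, 4, 10)]
avg = np.mean([score(m, X) for m in models], axis=0)   # ensemble score
pred = (avg > 0).astype(int)
acc = (pred == y).mean()
```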

  19. Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.

    PubMed

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.

  20. Assessing Hospital Performance After Percutaneous Coronary Intervention Using Big Data.

    PubMed

    Spertus, Jacob V; T Normand, Sharon-Lise; Wolf, Robert; Cioffi, Matt; Lovett, Ann; Rose, Sherri

    2016-11-01

    Although risk adjustment remains a cornerstone for comparing outcomes across hospitals, optimal strategies continue to evolve in the presence of many confounders. We compared a conventional regression-based model with approaches particularly suited to leveraging big data. We assessed hospital all-cause 30-day excess mortality risk among 8952 adults undergoing percutaneous coronary intervention between October 1, 2011, and September 30, 2012, in 24 Massachusetts hospitals using clinical registry data linked with billing data. We compared conventional logistic regression models with augmented inverse probability weighted estimators and targeted maximum likelihood estimators to generate more efficient and unbiased estimates of hospital effects. We also compared a clinically informed and a machine-learning approach to confounder selection, using elastic net penalized regression in the latter case. Hospital excess risk estimates range from -1.4% to 2.0% across methods and confounder sets. Some hospitals were consistently classified as low or as high excess mortality outliers; others changed classification depending on the method and confounder set used. Switching from the clinically selected list of 11 confounders to a full set of 225 confounders increased the estimation uncertainty by an average of 62% across methods as measured by confidence interval length. Agreement among methods ranged from fair, with a κ statistic of 0.39 (SE: 0.16), to perfect, with a κ of 1 (SE: 0.0). Modern causal inference techniques should be more frequently adopted to leverage big data while minimizing bias in hospital performance assessments. © 2016 American Heart Association, Inc.

  1. Comparative multi-goal tradeoffs in systems engineering of microbial metabolism

    PubMed Central

    2012-01-01

    Background Metabolic engineering design methodology has evolved from using pathway-centric, random and empirical-based methods to using systems-wide, rational and integrated computational and experimental approaches. Persistent during these advances has been the desire to develop design strategies that address multiple simultaneous engineering goals, such as maximizing productivity, while minimizing raw material costs. Results Here, we use constraint-based modeling to systematically design multiple combinations of medium compositions and gene-deletion strains for three microorganisms (Escherichia coli, Saccharomyces cerevisiae, and Shewanella oneidensis) and six industrially important byproducts (acetate, D-lactate, hydrogen, ethanol, formate, and succinate). We evaluated over 435 million simulated conditions and 36 engineering metabolic traits, including product rates, costs, yields and purity. Conclusions The resulting metabolic phenotypes can be classified into dominant clusters (meta-phenotypes) for each organism. These meta-phenotypes illustrate global phenotypic variation and sensitivities, trade-offs associated with multiple engineering goals, and fundamental differences in organism-specific capabilities. Given the increasing number of sequenced genomes and corresponding stoichiometric models, we envisage that the proposed strategy could be extended to address a growing range of biological questions and engineering applications. PMID:23009214

  2. Development of archetypes for non-ranking classification and comparison of European National Health Technology Assessment systems.

    PubMed

    Allen, Nicola; Pichler, Franz; Wang, Tina; Patel, Sundip; Salek, Sam

    2013-12-01

    European countries are increasingly utilising health technology assessment (HTA) to inform reimbursement decision-making. However, the current European HTA environment is very diverse, and projects are already underway to initiate a more efficient and aligned HTA practice within Europe. This study aims to identify a non-ranking method for classifying the diversity of European HTA agencies' processes and the organisational architecture of the national regulatory review to reimbursement systems. Using a previously developed mapping methodology, this research created process maps to describe national processes for regulatory review to reimbursement for 33 European jurisdictions. These process maps enabled the creation of 2 HTA taxonomic sets. The confluence of the two taxonomic sets was subsequently cross-referenced to identify 10 HTA archetype groups. HTA is a young, rapidly evolving field and it can be argued that optimal practices for performing HTA are yet to emerge. Therefore, a non-ranking classification approach could objectively characterise and compare the diversity observed in the current European HTA environment. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  3. Toward automated classification of consumers' cancer-related questions with a new taxonomy of expected answer types.

    PubMed

    McRoy, Susan; Jones, Sean; Kurmally, Adam

    2016-09-01

    This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. © The Author(s) 2015.
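    The resampling step that proved decisive can be sketched as plain random oversampling of minority classes up to the majority-class size; the paper's exact resampling scheme may differ, so treat this as an illustrative baseline for smoothing a skewed natural class distribution.

```python
import numpy as np

rng = np.random.default_rng(4)

def oversample(X, y):
    """Randomly resample each class (with replacement) up to the size of
    the largest class, smoothing a skewed natural class distribution."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c in classes:
        members = np.flatnonzero(y == c)
        idx.append(rng.choice(members, size=target, replace=True))
    idx = np.concatenate(idx)
    return X[idx], y[idx]

# Skewed toy set: 50 questions of one type vs only 5 of another.
X = np.arange(55).reshape(-1, 1)
y = np.array([0] * 50 + [1] * 5)
Xb, yb = oversample(X, y)
```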

  4. Metal Oxide Gas Sensor Drift Compensation Using a Two-Dimensional Classifier Ensemble

    PubMed Central

    Liu, Hang; Chu, Renzhi; Tang, Zhenan

    2015-01-01

    Sensor drift is the most challenging problem in gas sensing at present. We propose a novel two-dimensional classifier ensemble strategy to solve the gas discrimination problem, regardless of the gas concentration, with high accuracy over extended periods of time. This strategy is appropriate for multi-class classifiers that consist of combinations of pairwise classifiers, such as support vector machines. We compare the performance of the strategy with those of competing methods in an experiment based on a public dataset that was compiled over a period of three years. The experimental results demonstrate that the two-dimensional ensemble outperforms the other methods considered. Furthermore, we propose a pre-aging process inspired by that applied to the sensors to improve the stability of the classifier ensemble. The experimental results demonstrate that the weight of each multi-class classifier model in the ensemble remains fairly static before and after the addition of new classifier models to the ensemble, when a pre-aging procedure is applied. PMID:25942640

  5. Dealing with contaminated datasets: An approach to classifier training

    NASA Astrophysics Data System (ADS)

    Homenda, Wladyslaw; Jastrzebska, Agnieszka; Rybnik, Mariusz

    2016-06-01

    The paper presents a novel approach to classification reinforced with a rejection mechanism. The method is based on a two-tier set of classifiers: the first layer classifies elements, and the second layer separates native elements from foreign ones within each distinguished class. The key novelty presented here is the rejection mechanism's training scheme, which follows the philosophy of "one-against-all-other-classes". The proposed method was tested in an empirical study of handwritten digit recognition.
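    The two-tier structure can be illustrated with a minimal sketch. The concrete classifiers here are stand-ins, not the paper's: tier 1 is a nearest-centroid classifier, and tier 2 rejects an element as foreign if it falls outside a per-class radius learned from that class's own training points, a crude analogue of the per-class rejection training described above.

    ```python
    import math

    def centroid(points):
        n = len(points)
        return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    class ClassifyThenReject:
        """Tier 1 assigns the nearest-centroid class; tier 2 rejects the
        element as foreign if it lies outside that class's trained radius."""
        def __init__(self, margin=1.5):
            self.margin = margin

        def fit(self, X, y):
            self.centroids, self.radius = {}, {}
            for label in set(y):
                pts = [x for x, l in zip(X, y) if l == label]
                c = centroid(pts)
                self.centroids[label] = c
                # Per-class rejection threshold: farthest native training
                # point from the centroid, inflated by a safety margin.
                self.radius[label] = self.margin * max(dist(p, c) for p in pts)
            return self

        def predict(self, x):
            label = min(self.centroids, key=lambda l: dist(x, self.centroids[l]))
            accepted = dist(x, self.centroids[label]) <= self.radius[label]
            return label, accepted

    X = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    y = ["zero", "zero", "zero", "one", "one", "one"]
    clf = ClassifyThenReject().fit(X, y)
    print(clf.predict((0.5, 0.5)))  # native element: ('zero', True)
    print(clf.predict((5, 5)))      # foreign element: rejected
    ```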

  6. On Algorithms for Generating Computationally Simple Piecewise Linear Classifiers

    DTIC Science & Technology

    1989-05-01

    suffers. - Waveform classification, e.g. speech recognition, seismic analysis (i.e. discrimination between earthquakes and nuclear explosions), target...assuming Gaussian distributions (B-G) d) Bayes classifier with probability densities estimated with the k-N-N method (B-kNN) e) The nearest neighbour...range of classifiers are chosen including a fast, easily computable and often used classifier (B-G), reliable and complex classifiers (B-kNN and NNR

  7. Genetic programming for evolving due-date assignment models in job shop environments.

    PubMed

    Nguyen, Su; Zhang, Mengjie; Johnston, Mark; Tan, Kay Chen

    2014-01-01

    Due-date assignment plays an important role in scheduling systems and strongly influences the delivery performance of job shops. Because of the stochastic and dynamic nature of job shops, the development of general due-date assignment models (DDAMs) is complicated. In this study, two genetic programming (GP) methods are proposed to evolve DDAMs for job shop environments. The experimental results show that the evolved DDAMs can make more accurate estimates than other existing dynamic DDAMs with promising reusability. In addition, the evolved operation-based DDAMs show better performance than the evolved DDAMs employing aggregate information of jobs and machines.

  8. Using an object-based grid system to evaluate a newly developed EP approach to formulate SVMs as applied to the classification of organophosphate nerve agents

    NASA Astrophysics Data System (ADS)

    Land, Walker H., Jr.; Lewis, Michael; Sadik, Omowunmi; Wong, Lut; Wanekaya, Adam; Gonzalez, Richard J.; Balan, Arun

    2004-04-01

    This paper extends the classification approaches described in reference [1] in the following ways: (1) developing and evaluating a new method for evolving organophosphate nerve agent Support Vector Machine (SVM) classifiers using Evolutionary Programming (EP), (2) conducting research experiments using a larger database of organophosphate nerve agents, and (3) upgrading the architecture to an object-based grid system for evaluating the classification of EP-derived SVMs. Due to the increased threat of chemical and biological weapons of mass destruction (WMD) posed by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphate nerve agents using a grid computing system called Legion. Grid computing is the use of large collections of heterogeneous, distributed resources (including machines, databases, devices, and users) to support large-scale computations and wide-area data access. Preliminary results using EP-derived support vector machines designed to operate on distributed systems have provided accurate classification results. In addition, the distributed training architecture is 50 times faster than standard iterative training methods.

  9. Automated grouping of action potentials of human embryonic stem cell-derived cardiomyocytes.

    PubMed

    Gorospe, Giann; Zhu, Renjun; Millrod, Michal A; Zambidis, Elias T; Tung, Leslie; Vidal, Rene

    2014-09-01

    Methods for obtaining cardiomyocytes from human embryonic stem cells (hESCs) are improving at a significant rate. However, the characterization of these cardiomyocytes (CMs) is evolving at a relatively slower rate. In particular, there is still uncertainty in classifying the phenotype (ventricular-like, atrial-like, nodal-like, etc.) of an hESC-derived cardiomyocyte (hESC-CM). While previous studies identified the phenotype of a CM based on electrophysiological features of its action potential, the criteria for classification were typically subjective and differed across studies. In this paper, we use techniques from signal processing and machine learning to develop an automated approach to discriminate the electrophysiological differences between hESC-CMs. Specifically, we propose a spectral grouping-based algorithm to separate a population of CMs into distinct groups based on the similarity of their action potential shapes. We applied this method to a dataset of optical maps of cardiac cell clusters dissected from human embryoid bodies. While some of the nine cell clusters in the dataset presented with just one phenotype, the majority of the cell clusters presented with multiple phenotypes. The proposed algorithm is generally applicable to other action potential datasets and could prove useful in investigating the purification of specific types of CMs from an electrophysiological perspective.

  10. Automated Grouping of Action Potentials of Human Embryonic Stem Cell-Derived Cardiomyocytes

    PubMed Central

    Gorospe, Giann; Zhu, Renjun; Millrod, Michal A.; Zambidis, Elias T.; Tung, Leslie; Vidal, René

    2015-01-01

    Methods for obtaining cardiomyocytes from human embryonic stem cells (hESCs) are improving at a significant rate. However, the characterization of these cardiomyocytes is evolving at a relatively slower rate. In particular, there is still uncertainty in classifying the phenotype (ventricular-like, atrial-like, nodal-like, etc.) of an hESC-derived cardiomyocyte (hESC-CM). While previous studies identified the phenotype of a cardiomyocyte based on electrophysiological features of its action potential, the criteria for classification were typically subjective and differed across studies. In this paper, we use techniques from signal processing and machine learning to develop an automated approach to discriminate the electrophysiological differences between hESC-CMs. Specifically, we propose a spectral grouping-based algorithm to separate a population of cardiomyocytes into distinct groups based on the similarity of their action potential shapes. We applied this method to a dataset of optical maps of cardiac cell clusters dissected from human embryoid bodies (hEBs). While some of the 9 cell clusters in the dataset presented with just one phenotype, the majority of the cell clusters presented with multiple phenotypes. The proposed algorithm is generally applicable to other action potential datasets and could prove useful in investigating the purification of specific types of cardiomyocytes from an electrophysiological perspective. PMID:25148658

  11. Computational Identification of Novel Genes: Current and Future Perspectives.

    PubMed

    Klasberg, Steffen; Bitard-Feildel, Tristan; Mallet, Ludovic

    2016-01-01

    While it has long been thought that all genomic novelties are derived from existing material, many genes lacking homology to known genes were found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, i.e., out of noncoding sequences, whereas some have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Besides the theoretical breakthrough, increasing evidence has accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Their classification as novel is usually based on their similarity to known genes, or lack thereof, detected by comparative genomics or searches against databases. Computational approaches are further prime methods, which can be based on existing models or leverage biological evidence from experiments. Identification of novel genes nevertheless remains a challenging task. With constant software and technology updates, no gold standard, and no available benchmark, the evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented; the methodological strategies and their limits are discussed, along with perspective approaches for further studies.

  12. Characterization of radiation belt electron energy spectra from CRRES observations

    NASA Astrophysics Data System (ADS)

    Johnston, W. R.; Lindstrom, C. D.; Ginet, G. P.

    2010-12-01

    Energetic electrons in the outer radiation belt and the slot region exhibit a wide variety of energy spectral forms, more so than radiation belt protons. We characterize the spatial and temporal dependence of these forms using observations from the CRRES satellite Medium Electron Sensor A (MEA) and High-Energy Electron Fluxmeter (HEEF) instruments, together covering an energy range 0.15-8 MeV. Spectra were classified with two independent methods, data clustering and curve-fitting analyses, in each case defining categories represented by power law, exponential, and bump-on-tail (BOT) or other complex shapes. Both methods yielded similar results, with BOT, exponential, and power law spectra respectively dominating in the slot region, outer belt, and regions just beyond the outer belt. The transition from exponential to power law spectra occurs at higher L for lower magnetic latitude. The location of the transition from exponential to BOT spectra is highly correlated with the location of the plasmapause. In the slot region during the days following storm events, electron spectra were observed to evolve from exponential to BOT yielding differential flux minima at 350-650 keV and maxima at 1.5-2 MeV; such evolution has been attributed to energy-dependent losses from scattering by whistler hiss.
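    The curve-fitting side of the classification above can be sketched for the two simple shapes: an exponential spectrum is linear in ln(flux) versus energy, a power law is linear in ln(flux) versus ln(energy), so fitting both and comparing residuals labels a spectrum. This is an illustrative reduction only: the energies and fluxes below are synthetic, and the bump-on-tail and other complex categories from the CRRES analysis are not handled.

    ```python
    import math

    def linfit(xs, ys):
        """Ordinary least-squares line fit; returns (slope, intercept, sse)."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        a = sxy / sxx
        b = my - a * mx
        sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
        return a, b, sse

    def classify_spectrum(energies_mev, fluxes):
        """Label a spectrum 'exponential' if ln(flux) is more linear in E,
        'power law' if it is more linear in ln(E)."""
        lnf = [math.log(f) for f in fluxes]
        _, _, sse_exp = linfit(energies_mev, lnf)
        _, _, sse_pow = linfit([math.log(e) for e in energies_mev], lnf)
        return "power law" if sse_pow < sse_exp else "exponential"

    E = [0.2, 0.5, 1.0, 2.0, 4.0, 8.0]
    print(classify_spectrum(E, [e ** -3.0 for e in E]))           # power law
    print(classify_spectrum(E, [math.exp(-e / 0.5) for e in E]))  # exponential
    ```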

  13. Input Decimated Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)

    2001-01-01

    Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore input decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses them to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Proben1/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles (IDEs) outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains.

  14. Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor

    PubMed Central

    Alamedine, D.; Khalil, M.; Marque, C.

    2013-01-01

    Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The last two methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536
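    Of the three methods above, SFS is the easiest to sketch: greedily grow the subset, at each step adding the feature that most improves a wrapper score. The score used here (leave-one-out 1-nearest-neighbour accuracy) and the toy data are assumptions for illustration; the paper pairs its wrapper methods with three different classifier types and real EHG features.

    ```python
    def loo_1nn_accuracy(X, y, feats):
        """Leave-one-out accuracy of a 1-nearest-neighbour classifier
        restricted to the candidate feature subset."""
        correct = 0
        for i in range(len(X)):
            best, best_d = None, float("inf")
            for j in range(len(X)):
                if i == j:
                    continue
                d = sum((X[i][f] - X[j][f]) ** 2 for f in feats)
                if d < best_d:
                    best_d, best = d, y[j]
            correct += best == y[i]
        return correct / len(X)

    def sequential_forward_selection(X, y, k):
        """Greedily grow the feature subset, adding whichever remaining
        feature most improves the wrapper score."""
        selected, remaining = [], list(range(len(X[0])))
        while len(selected) < k:
            score, best = max((loo_1nn_accuracy(X, y, selected + [f]), f)
                              for f in remaining)
            selected.append(best)
            remaining.remove(best)
        return selected

    # Feature 0 separates the classes; feature 1 is pure noise.
    X = [(0.1, 5.0), (0.2, 1.0), (0.15, 3.0), (0.9, 4.9), (0.8, 1.1), (0.95, 3.2)]
    y = ["pregnancy", "pregnancy", "pregnancy", "labor", "labor", "labor"]
    print(sequential_forward_selection(X, y, 1))  # [0]
    ```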

  15. An information-based network approach for protein classification

    PubMed Central

    Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.

    2017-01-01

    Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and polygenetic-tree to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835

  16. A Feature-Free 30-Disease Pathological Brain Detection System by Linear Regression Classifier.

    PubMed

    Chen, Yi; Shao, Ying; Yan, Jie; Yuan, Ti-Fei; Qu, Yanwen; Lee, Elizabeth; Wang, Shuihua

    2017-01-01

    The number of Alzheimer's disease patients is increasing rapidly every year, and scholars increasingly use computer vision methods to develop automatic diagnosis systems. In 2015, Gorji et al. proposed a novel method using the pseudo Zernike moment. They tested four classifiers: a learning vector quantization neural network and pattern recognition neural networks trained by Levenberg-Marquardt, by resilient backpropagation, and by scaled conjugate gradient. This study presents an improved method by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from the 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches; therefore, it can be used to detect Alzheimer's disease. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
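    Linear regression classification assigns a test vector to the class whose training subspace reconstructs it with the smallest least-squares residual. A minimal sketch with one prototype vector per class (so the subspace is a line and the regression has a single coefficient) follows; the 4-dimensional toy vectors stand in for the 256 pseudo-Zernike features in the record above.

    ```python
    import math

    def project_residual(y, v):
        """Distance from y to its least-squares projection onto span{v},
        i.e. the residual of the one-regressor model y ~ beta * v."""
        beta = sum(a * b for a, b in zip(y, v)) / sum(b * b for b in v)
        return math.sqrt(sum((a - beta * b) ** 2 for a, b in zip(y, v)))

    def lrc_predict(y, prototypes):
        """Linear regression classification with one prototype per class:
        pick the class whose subspace leaves the smallest residual."""
        return min(prototypes, key=lambda c: project_residual(y, prototypes[c]))

    # Hypothetical feature prototypes for two classes of brain slices.
    prototypes = {
        "healthy":      (1.0, 0.2, 0.1, 0.0),
        "pathological": (0.1, 0.9, 0.8, 0.5),
    }
    # A scaled copy of a prototype lies exactly in that class's subspace.
    print(lrc_predict((2.0, 0.4, 0.2, 0.0), prototypes))  # healthy
    ```

    With several training vectors per class, the projection generalizes to a multi-column least-squares fit, but the decision rule (minimum residual) is unchanged.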

  17. A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions.

    PubMed

    Gao, Xiang; Lin, Huaiying; Dong, Qunfeng

    2017-01-01

    Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets based on maximum likelihood. The posterior probability of a microbiome sample belonging to a disease or healthy category is calculated based on Bayes' theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distribution, as well as a prior probability estimated from the training microbiome data set or previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than the only existing Bayesian microbiome classifier based on a Dirichlet-multinomial mixture model and the popular random forest method. The advantage of DMBC is its built-in automatic feature selection, capable of identifying a subset of microbial taxa with the best classification accuracy between different classes of samples based on cross-validation. This unique ability enables DMBC to maintain and even improve its accuracy at modeling species-level taxa. The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC. IMPORTANCE By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. 
Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis.
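    The core of the Bayes rule described above can be sketched directly. The Dirichlet-multinomial log-likelihood below drops the multinomial coefficient, which is identical across classes and cancels in the argmax; the alpha parameters are taken as given here (the paper estimates them by maximum likelihood from training data), and the taxon counts and prevalence prior are made-up numbers.

    ```python
    from math import lgamma, log

    def dm_log_likelihood(counts, alpha):
        """Log Dirichlet-multinomial likelihood of taxon counts, dropping
        the multinomial coefficient (identical for every class)."""
        A, N = sum(alpha), sum(counts)
        ll = lgamma(A) - lgamma(N + A)
        for x, a in zip(counts, alpha):
            ll += lgamma(x + a) - lgamma(a)
        return ll

    def dm_bayes_classify(counts, class_params, log_priors):
        """Bayes' theorem: argmax over classes of log prior + log likelihood."""
        return max(class_params,
                   key=lambda c: log_priors[c] + dm_log_likelihood(counts, class_params[c]))

    # Hypothetical per-class Dirichlet parameters over three taxa.
    params = {"healthy": (10.0, 1.0, 1.0), "disease": (1.0, 10.0, 1.0)}
    priors = {"healthy": log(0.7), "disease": log(0.3)}  # e.g. known prevalence
    print(dm_bayes_classify((8, 1, 1), params, priors))  # healthy
    ```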

  18. A Hyper-Heuristic Ensemble Method for Static Job-Shop Scheduling.

    PubMed

    Hart, Emma; Sim, Kevin

    2016-01-01

    We describe a new hyper-heuristic method NELLI-GP for solving job-shop scheduling problems (JSSP) that evolves an ensemble of heuristics. The ensemble adopts a divide-and-conquer approach in which each heuristic solves a unique subset of the instance set considered. NELLI-GP extends an existing ensemble method called NELLI by introducing a novel heuristic generator that evolves heuristics composed of linear sequences of dispatching rules: each rule is represented using a tree structure and is itself evolved. Following a training period, the ensemble is shown to outperform both existing dispatching rules and a standard genetic programming algorithm on a large set of new test instances. In addition, it obtains superior results on a set of 210 benchmark problems from the literature when compared to two state-of-the-art hyper-heuristic approaches. Further analysis of the relationship between heuristics in the evolved ensemble and the instances each solves provides new insights into features that might describe similar instances.

  19. New approach to information fusion for Lipschitz classifiers ensembles: Application in multi-channel C-OTDR-monitoring systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Timofeev, Andrey V.; Egorov, Dmitry V.

    This paper presents new results concerning the selection of an optimal information fusion formula for an ensemble of Lipschitz classifiers. The goal of information fusion is to create an integral classifier which could provide better generalization ability of the ensemble while achieving a practically acceptable level of effectiveness. The problem of information fusion is very relevant for data processing in multi-channel C-OTDR monitoring systems, where we have to effectively classify targeted events which appear in the vicinity of the monitored object. Solution of this problem is based on the usage of an ensemble of Lipschitz classifiers, each of which corresponds to a respective channel. We suggest a brand new method for information fusion in the case of an ensemble of Lipschitz classifiers, called "The Weighing of Inversely as Lipschitz Constants" (WILC). Results of the practical usage of the WILC method in multi-channel C-OTDR monitoring systems are presented.
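    The record names the fusion rule but not its exact form, so the following is only a plausible sketch suggested by the name: each channel's class scores are weighted in proportion to the inverse Lipschitz constant of that channel's classifier, and the fused scores are argmaxed. Event names, scores, and constants are all hypothetical.

    ```python
    def wilc_fuse(channel_scores, lipschitz):
        """Fuse per-channel class scores with weights proportional to the
        inverse Lipschitz constant of each channel's classifier."""
        weights = [1.0 / L for L in lipschitz]
        total = sum(weights)
        classes = channel_scores[0].keys()
        fused = {c: sum(w * s[c] for w, s in zip(weights, channel_scores)) / total
                 for c in classes}
        return max(fused, key=fused.get), fused

    # Three C-OTDR channels scoring two target events; channel 2 has a
    # large Lipschitz constant, so its (dissenting) vote counts for less.
    scores = [{"digging": 0.8, "walking": 0.2},
              {"digging": 0.7, "walking": 0.3},
              {"digging": 0.1, "walking": 0.9}]
    label, fused = wilc_fuse(scores, lipschitz=[1.0, 1.0, 10.0])
    print(label)  # digging
    ```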

  20. An Improved Ensemble Learning Method for Classifying High-Dimensional and Imbalanced Biomedicine Data.

    PubMed

    Yu, Hualong; Ni, Jun

    2014-01-01

    Training classifiers on skewed data is technically challenging, and the task becomes more difficult when the data is also high-dimensional. Skewed data appears often in the biomedical field. In this study, we address this problem by combining the asymmetric bagging ensemble classifier (asBagging) presented in previous work with an improved random subspace (RS) generation strategy called feature subspace (FSS). Specifically, FSS is a novel method to promote the balance between accuracy and diversity of the base classifiers in asBagging. In view of the strong generalization capability of support vector machines (SVMs), we adopt them as base classifiers. Extensive experiments on four benchmark biomedical data sets indicate that the proposed ensemble learning method outperforms many baseline approaches in terms of the Accuracy, F-measure, G-mean, and AUC evaluation criteria; thus, it can be regarded as an effective and efficient tool for dealing with high-dimensional and imbalanced biomedical data.

  1. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation

    PubMed Central

    Azmi, Reza; Pishgoo, Boshra; Norozi, Narges; Yeganeh, Samira

    2013-01-01

    Brain magnetic resonance image (MRI) tissue segmentation is one of the most important parts of the clinical diagnostic toolkit. Pixel classification methods have frequently been used in image segmentation, with both supervised and unsupervised approaches. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain; moreover, they cannot use unlabeled data to train classifiers. On the other hand, unsupervised segmentation methods have no prior knowledge and lead to a low level of performance. However, semi-supervised learning, which uses a few labeled data together with a large amount of unlabeled data, achieves higher accuracy with less trouble. In this paper, we propose an ensemble semi-supervised framework for segmenting brain MRI tissues which uses the results of several semi-supervised classifiers simultaneously. Selecting appropriate classifiers plays a significant role in the performance of this framework. Hence, we present two semi-supervised algorithms, expectation filtering maximization and MCo_Training, which are improved versions of the semi-supervised methods expectation maximization and Co_Training and which increase segmentation accuracy. Afterward, we use these improved classifiers together with a graph-based semi-supervised classifier as components of the ensemble framework. Experimental results show that the segmentation performance of this approach is higher than that of both supervised methods and the individual semi-supervised classifiers. PMID:24098863

  2. Multi-feature classifiers for burst detection in single EEG channels from preterm infants

    NASA Astrophysics Data System (ADS)

    Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.

    2017-08-01

    Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works have proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post-menstrual age (PMA). However, as brain activity evolves rapidly during postnatal life, these solutions may under-perform with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾ 36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained on visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The best-performing classifiers reached about 95% accuracy (kNN, SVM and LR), whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen's kappa = 0.71) using only three EEG features. Applying this classifier to an unlabeled database of 21 infants ⩾ 36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
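    The simplest detector in the comparison above, thresholding (Th), can be sketched on a synthetic signal: smooth the rectified amplitude with a moving average and mark intervals where the envelope exceeds a threshold. The window, threshold, and signal below are illustrative assumptions, not the authors' implementation.

    ```python
    def detect_bursts(signal, window=5, threshold=1.0):
        """Mark bursts where a causal moving average of |signal| exceeds a
        threshold; returns (start, end) index pairs, end exclusive."""
        env = []
        for i in range(len(signal)):
            chunk = signal[max(0, i - window + 1): i + 1]
            env.append(sum(abs(v) for v in chunk) / len(chunk))
        bursts, start = [], None
        for i, e in enumerate(env):
            if e > threshold and start is None:
                start = i
            elif e <= threshold and start is not None:
                bursts.append((start, i))
                start = None
        if start is not None:
            bursts.append((start, len(env)))
        return bursts

    # Synthetic rectified EEG amplitude: quiet / burst / quiet.
    signal = [0.1] * 20 + [2.0] * 30 + [0.1] * 20
    print(detect_bursts(signal))
    ```

    The smoothing explains the slight lag of the detected onset and offset relative to the true burst edges, one reason trained classifiers on richer features outperform plain thresholding in the study.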

  3. Purification and characterization of an oxygen-evolving photosystem II from Leptolyngbya sp. strain O-77.

    PubMed

    Nakamori, Harutaka; Yatabe, Takeshi; Yoon, Ki-Seok; Ogo, Seiji

    2014-08-01

    A new cyanobacterium, strain O-77, was isolated from a hot spring at Aso-Kuju National Park, Kumamoto, Japan. According to phylogenetic analysis of its 16S rRNA gene sequence, strain O-77 belongs to the genus Leptolyngbya and is classified among the filamentous non-heterocystous cyanobacteria. Strain O-77 showed thermophilic behavior, with an optimal growth temperature of 55°C. Moreover, we have purified and characterized the oxygen-evolving photosystem II (PSII) from the strain. The O2-evolving activity of the purified PSII from strain O-77 (PSIIO77) was 1275 ± 255 μmol O2 (mg Chl a)(-1) h(-1). Based on the results of MALDI-TOF mass spectrometry and urea-SDS-PAGE analysis, the purified PSIIO77 was composed of the typical PSII components CP47, CP43, PsbO, D2, D1, PsbV, PsbQ, PsbU, and several low-molecular-mass subunits. Visible absorption and 77 K fluorescence spectra of the purified PSIIO77 were almost identical to those of other purified PSIIs from cyanobacteria. This report provides a successful example of the purification and characterization of an active PSII from a thermophilic, filamentous non-heterocystous cyanobacterium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  4. Methods, systems and devices for detecting threatening objects and for classifying magnetic data

    DOEpatents

    Kotter, Dale K [Shelley, ID; Roybal, Lyle G [Idaho Falls, ID; Rohrbaugh, David T [Idaho Falls, ID; Spencer, David F [Idaho Falls, ID

    2012-01-24

    A method for detecting threatening objects in a security screening system. The method includes a step of classifying unique features of magnetic data as representing a threatening object. Another step includes acquiring magnetic data. Another step includes determining if the acquired magnetic data comprises a unique feature.

  5. Listening to galaxies tuning at z ~ 2.5-3.0: The first strikes of the Hubble fork

    NASA Astrophysics Data System (ADS)

    Talia, M.; Cimatti, A.; Mignoli, M.; Pozzetti, L.; Renzini, A.; Kurk, J.; Halliday, C.

    2014-02-01

    Aims: We investigate the morphological properties of 494 galaxies selected from the Galaxy Mass Assembly ultra-deep Spectroscopic Survey (GMASS) at z > 1, primarily in their optical rest frame, using Hubble Space Telescope (HST) infrared images from the Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS). Methods: The morphological analysis of Wide Field Camera 3 (WFC3) H160-band images was performed using two different methods: a visual classification identifying traditional Hubble types, and a quantitative analysis using parameters that describe structural properties, such as the concentration of light and the rotational asymmetry. The two classifications are compared. We then analysed how apparent morphologies correlate with the physical properties of galaxies. Results: The fractions of both elliptical and disk galaxies decrease between redshifts z ~ 1 to z ~ 3, while at z > 3 the galaxy population is dominated by irregular galaxies. The quantitative morphological analysis shows that, at 1 < z < 3, morphological parameters are not as effective in distinguishing the different morphological Hubble types as they are at low redshift. No significant morphological k-correction was found to be required for the Hubble type classification, with some exceptions. In general, different morphological types occupy the two peaks of the (U - B)rest colour bimodality of galaxies: most irregulars occupy the blue peak, while ellipticals are mainly found in the red peak, though with some level of contamination. Disks are more evenly distributed than either irregulars or ellipticals. We find that the position of a galaxy in a UVJ diagram is related to its morphological type: the "quiescent" region of the plot is mainly occupied by ellipticals and, to a lesser extent, by disks. We find that only ~33% of all morphological ellipticals in our sample are red and passively evolving galaxies, a percentage that is consistent with previous results obtained at z < 1. 
Blue galaxies morphologically classified as ellipticals show a remarkable structural similarity to red ones. We search for correlations between our morphological and spectroscopic galaxy classifications. Almost all irregulars have a star-forming galaxy spectrum. In addition, the majority of disks show some sign of star-formation activity in their spectra, though in some cases their red continuum is indicative of old stellar populations. Finally, an elliptical morphology may be associated with either passively evolving or strongly star-forming galaxies. Conclusions: We propose that the Hubble sequence of galaxy morphologies takes shape at redshift 2.5 < z < 3. The fractions of both ellipticals and disks decrease with increasing lookback time at z > 1, such that at redshifts z = 2.5-2.7 and above, the Hubble types cannot be identified, and most galaxies are classified as irregular. Appendix A is available in electronic form at http://www.aanda.org

  6. Recognition of medication information from discharge summaries using ensembles of classifiers.

    PubMed

    Doan, Son; Collier, Nigel; Xu, Hua; Pham, Hoang Duy; Tu, Minh Phuong

    2012-05-07

    Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective in clinical NLP as well. However, combining different classifiers to further improve the performance of clinical entity recognition systems has not been investigated extensively. Combining classifiers into an ensemble classifier presents both challenges and opportunities to improve performance in such NLP tasks. We investigated ensemble classifiers that used different voting strategies to combine outputs from three individual classifiers: a rule-based system, a support vector machine (SVM) based system, and a conditional random field (CRF) based system. Three voting methods were proposed and evaluated using the annotated data sets from the 2009 i2b2 NLP challenge: simple majority, local SVM-based voting, and local CRF-based voting. Evaluation on 268 manually annotated discharge summaries from the i2b2 challenge showed that the local CRF-based voting method achieved the best F-score of 90.84% (94.11% Precision, 87.81% Recall) for 10-fold cross-validation. We then compared our systems with the first-ranked system in the challenge by using the same training and test sets. Our system based on majority voting achieved a better F-score of 89.65% (93.91% Precision, 85.76% Recall) than the previously reported F-score of 89.19% (93.78% Precision, 85.03% Recall) by the first-ranked system in the challenge. Our experimental results using the 2009 i2b2 challenge datasets showed that ensemble classifiers that combine individual classifiers into a voting system could achieve better performance than a single classifier in recognizing medication information from clinical text. 
This suggests that easily implemented strategies such as majority voting have the potential to significantly improve clinical entity recognition.
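The simple-majority strategy described above can be sketched as a token-level vote over the three classifiers' label sequences. This is an illustrative toy, not the i2b2 system; the `B-MED`/`I-MED` label scheme and the classifier outputs below are invented:

```python
from collections import Counter

def majority_vote(label_seqs):
    """Combine per-token label sequences from several classifiers by
    simple majority; on a tie, the first classifier's label wins
    (Counter.most_common preserves insertion order on Python >= 3.7)."""
    combined = []
    for labels in zip(*label_seqs):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# Three hypothetical classifiers tagging the same five tokens.
rule_based = ["O", "B-MED", "I-MED", "O", "O"]
svm_based  = ["O", "B-MED", "O",     "O", "B-MED"]
crf_based  = ["O", "B-MED", "I-MED", "O", "O"]

print(majority_vote([rule_based, svm_based, crf_based]))
# → ['O', 'B-MED', 'I-MED', 'O', 'O']
```

Listing the rule-based system first makes it the tie-breaker, which is one simple way to resolve three-way disagreements.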

  7. Image Change Detection via Ensemble Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martin, Benjamin W; Vatsavai, Raju

    2013-01-01

    The concept of geographic change detection is relevant in many areas. Changes in geography can reveal much information about a particular location. For example, analysis of changes in geography can identify regions of population growth, change in land use, and potential environmental disturbance. A common way to perform change detection is to use a simple method such as differencing to detect regions of change. Though these techniques are simple, their applicability is often very limited. Recently, machine learning methods such as neural networks have been explored for change detection with great success. In this work, we explore the use of ensemble learning methodologies for detecting changes in bitemporal synthetic aperture radar (SAR) images. Ensemble learning uses a collection of weak machine learning classifiers to create a stronger classifier with higher accuracy than the individual classifiers in the ensemble. The strength of the ensemble lies in the fact that the individual classifiers form a mixture of experts in which the final classification made by the ensemble classifier is calculated from the outputs of the individual classifiers. Our methodology leverages this aspect of ensemble learning by training collections of weak decision-tree-based classifiers to identify regions of change in SAR images of a region of Staten Island, New York, collected during Hurricane Sandy. Preliminary studies show that the ensemble method has approximately 11.5% higher change detection accuracy than an individual classifier.
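The "simple differencing" baseline mentioned above can be sketched in a few lines: threshold the absolute intensity difference between the two acquisition dates to obtain a binary change map. This is a hedged illustration of the baseline only, not the paper's ensemble method; the tiny images and the threshold are invented:

```python
def difference_change_map(img_t1, img_t2, threshold):
    """Flag pixels whose absolute intensity difference between the
    two acquisition dates exceeds a threshold (1 = change)."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row1, row2)]
        for row1, row2 in zip(img_t1, img_t2)
    ]

# Two tiny synthetic "images"; intensity values are arbitrary.
before = [[10, 12, 11],
          [10, 50, 11]]
after_ = [[11, 12, 90],
          [10, 10, 11]]

print(difference_change_map(before, after_, threshold=20))
# → [[0, 0, 1], [0, 1, 0]]
```

For SAR specifically, a ratio or log-ratio operator is often preferred over a plain difference because speckle noise is multiplicative, which is one reason such simple techniques have limited applicability.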

  8. Thinking Through Computational Exposure as an Evolving Paradigm Shift for Exposure Science: Development and Application of Predictive Models from Big Data

    EPA Science Inventory

    Symposium Abstract: Exposure science has evolved from a time when the primary focus was on measurements of environmental and biological media and the development of enabling field and laboratory methods. The Total Exposure Assessment Methodology (TEAM) studies of the 1980s were class...

  9. Dynamical Bayesian inference of time-evolving interactions: from a pair of coupled oscillators to networks of oscillators.

    PubMed

    Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V E; Stefanovska, Aneta

    2012-12-01

    Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski et al. [Phys. Rev. Lett. 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.

  10. Vision based nutrient deficiency classification in maize plants using multi class support vector machines

    NASA Astrophysics Data System (ADS)

    Leena, N.; Saju, K. K.

    2018-04-01

    Nutritional deficiencies in plants are a major concern for farmers as they affect productivity and thus profit. This work aims to classify nutritional deficiencies in maize plants in a non-destructive manner using image processing and machine learning techniques. The colored images of the leaves are analyzed and classified with a multi-class support vector machine (SVM) method. Several images of maize leaves with known deficiencies of nitrogen, phosphorus, and potassium (NPK) are used to train the SVM classifier prior to the classification of test images. The results show that the method was able to classify and identify nutritional deficiencies.

  11. Polar cloud and surface classification using AVHRR imagery - An intercomparison of methods

    NASA Technical Reports Server (NTRS)

    Welch, R. M.; Sengupta, S. K.; Goroch, A. K.; Rabindra, P.; Rangaraj, N.; Navar, M. S.

    1992-01-01

    Six Advanced Very High-Resolution Radiometer local area coverage (AVHRR LAC) arctic scenes are classified into ten classes. Three different classifiers are examined: (1) the traditional stepwise discriminant analysis (SDA) method; (2) the feed-forward back-propagation (FFBP) neural network; and (3) the probabilistic neural network (PNN). More than 200 spectral and textural measures are computed. These are reduced to 20 features using sequential forward selection. Theoretical accuracy of the classifiers is determined using the bootstrap approach. Overall accuracy is 85.6 percent, 87.6 percent, and 87.0 percent for the SDA, FFBP, and PNN classifiers, respectively, with standard deviations of approximately 1 percent.

  12. Identifying Degenerative Brain Disease Using Rough Set Classifier Based on Wavelet Packet Method.

    PubMed

    Cheng, Ching-Hsue; Liu, Wei-Xiang

    2018-05-28

    Population aging has become a worldwide phenomenon that causes many serious problems, and the medical issues related to degenerative brain disease have gradually become a concern. Magnetic resonance imaging is one of the most advanced methods for medical imaging and is especially suitable for brain scans. From the literature, although automatic segmentation methods are less laborious and time-consuming, they are restricted to several specific types of images; hybrid segmentation techniques, in contrast, address the shortcomings of single segmentation methods. Therefore, this study proposed a hybrid segmentation method combined with a rough set classifier and the wavelet packet method to identify degenerative brain disease. The proposed method is a three-stage image processing method that enhances the accuracy of brain disease classification. In the first stage, this study used the proposed hybrid segmentation algorithms to segment the brain ROI (region of interest). In the second stage, the wavelet packet method was used to conduct the image decomposition and calculate the feature values. In the final stage, the rough set classifier was utilized to identify the degenerative brain disease. For verification and comparison, two experiments were employed to verify the effectiveness of the proposed method and compare it with the TV-seg (total variation segmentation) algorithm, the discrete cosine transform, and the listed classifiers. Overall, the results indicated that the proposed method outperforms the listed methods.

  13. Selective Transfer Machine for Personalized Facial Action Unit Detection

    PubMed Central

    Chu, Wen-Sheng; De la Torre, Fernando; Cohn, Jeffrey F.

    2014-01-01

    Automatic facial action unit (AFA) detection from video is a long-standing problem in facial expression analysis. Most approaches emphasize choices of features and classifiers; they neglect individual differences in target persons. People vary markedly in facial morphology (e.g., heavy versus delicate brows, smooth versus deeply etched wrinkles) and behavior. Individual differences can dramatically influence how well generic classifiers generalize to previously unseen persons. While a possible solution would be to train person-specific classifiers, that often is neither feasible nor theoretically compelling. The alternative that we propose is to personalize a generic classifier in an unsupervised manner (no additional labels for the test subjects are required). We introduce a transductive learning method, which we refer to as a Selective Transfer Machine (STM), to personalize a generic classifier by attenuating person-specific biases. STM achieves this effect by simultaneously learning a classifier and re-weighting the training samples that are most relevant to the test subject. To evaluate the effectiveness of STM, we compared STM to generic classifiers and to cross-domain learning methods on three major databases: CK+ [20], GEMEP-FERA [32] and RU-FACS [2]. STM outperformed generic classifiers in all. PMID:25242877

  14. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.

    PubMed

    Li, Qiang; Gu, Yu; Jia, Jing

    2017-01-30

    Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  15. PPCM: Combing multiple classifiers to improve protein-protein interaction prediction

    DOE PAGES

    Yao, Jianzhuang; Guo, Hong; Yang, Xiaohan

    2015-08-01

    Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), which combines output from two PPI prediction tools, GO2PPI and Phyloprof, using the Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross-species PPCM could achieve competitive and even better prediction accuracy compared to the single-species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using the Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.

  16. OpenCL based machine learning labeling of biomedical datasets

    NASA Astrophysics Data System (ADS)

    Amoros, Oscar; Escalera, Sergio; Puig, Anna

    2011-03-01

    In this paper, we propose a two-stage labeling method for large biomedical datasets using a parallel approach on a single GPU. Diagnostic methods, structure volume measurements, and visualization systems are of major importance for surgery planning, intra-operative imaging, and image-guided surgery. In all cases, providing an automatic and interactive method to label or tag the different structures contained in the input data becomes imperative. Several approaches to labeling or segmenting biomedical datasets have been proposed to discriminate different anatomical structures in an output tagged dataset. Among existing methods, supervised learning methods for segmentation have been devised to let non-expert users easily analyze biomedical datasets. However, they still have practical problems, such as slow learning and testing speeds. In addition, recent technological developments have led to widespread availability of multi-core CPUs and GPUs, as well as new software languages such as NVIDIA's CUDA and OpenCL, allowing parallel programming paradigms to be applied on conventional personal computers. The Adaboost classifier is one of the most widely applied labeling methods in the machine learning community. In a first stage, Adaboost trains a binary classifier from a set of pre-labeled samples described by a set of features. This binary classifier is defined as a weighted combination of weak classifiers, each a simple decision function estimated on a single feature value. Then, at the testing stage, each weak classifier is independently applied to the features of a set of unlabeled samples. In this work, we propose an alternative representation of the Adaboost binary classifier and use it to define a new GPU-based parallelized Adaboost testing stage using OpenCL.
We provide numerical experiments based on large available data sets and we compare our results to CPU-based strategies in terms of time and labeling speeds.
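The weighted combination of single-feature weak classifiers that Adaboost builds can be sketched as follows. This is a generic, minimal AdaBoost with one-feature decision stumps and assumed ±1 labels on invented data, not the paper's OpenCL representation:

```python
import math

def train_stump(xs, ys, w):
    """Best threshold/polarity decision stump on one feature
    under sample weights w (labels are +1/-1)."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if (pol if xi >= thr else -pol) != yi)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(xs, ys, rounds=5):
    """Train a weighted ensemble of decision stumps."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = train_stump(xs, ys, w)
        err = max(err, 1e-10)  # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Re-weight: boost the misclassified samples.
        w = [wi * math.exp(-alpha * yi * (pol if xi >= thr else -pol))
             for xi, yi, wi in zip(xs, ys, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the weighted vote of the weak classifiers."""
    score = sum(a * (pol if x >= thr else -pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, -1, 1, 1, 1]
model = adaboost(xs, ys)
print([predict(model, x) for x in xs])
# → [-1, -1, -1, 1, 1, 1]
```

The testing stage (the `predict` loop over independent weak classifiers) is exactly the part the paper parallelizes on the GPU, since each stump can be evaluated on each sample independently.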

  17. Detection of inter-patient left and right bundle branch block heartbeats in ECG using ensemble classifiers

    PubMed Central

    2014-01-01

    Background Left bundle branch block (LBBB) and right bundle branch block (RBBB) not only mask electrocardiogram (ECG) changes that reflect diseases but also indicate important underlying pathology. The timely detection of LBBB and RBBB is critical in the treatment of cardiac diseases. Inter-patient heartbeat classification uses independent training and testing sets to construct and evaluate a heartbeat classification system; a system that evaluates well under this scheme therefore possesses strong predictive capability for unknown data. The aim of this study was to propose a method for inter-patient classification of heartbeats to accurately distinguish LBBB and RBBB from the normal beat (NORM). Methods This study proposed a heartbeat classification method through a combination of three different types of classifiers: a minimum distance classifier constructed between NORM and LBBB; a weighted linear discriminant classifier between NORM and RBBB based on Bayesian decision making using posterior probabilities; and a linear support vector machine (SVM) between LBBB and RBBB. Each classifier was used with matching features to obtain better classification performance. The final types of the test heartbeats were determined using a majority voting strategy through the combination of class labels from the three classifiers. The optimal parameters for the classifiers were selected using cross-validation on the training set. The effects of different lead configurations on the classification results were assessed, and the performance of these three classifiers was compared for the detection of each pair of heartbeat types. Results The study results showed that a two-lead configuration exhibited better classification results compared with a single-lead configuration. The construction of a classifier with good performance between each pair of heartbeat types significantly improved the heartbeat classification performance. 
The results showed a sensitivity of 91.4% and a positive predictive value of 37.3% for LBBB and a sensitivity of 92.8% and a positive predictive value of 88.8% for RBBB. Conclusions A multi-classifier ensemble method was proposed based on inter-patient data and demonstrated a satisfactory classification performance. This approach has the potential for application in clinical practice to distinguish LBBB and RBBB from NORM of unknown patients. PMID:24903422
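The sensitivity and positive predictive value reported above are one-vs-rest counts over the confusion table. A minimal sketch (the heartbeat label sequences below are invented, not data from the study):

```python
def sensitivity_ppv(y_true, y_pred, positive):
    """Sensitivity (recall) and positive predictive value (precision)
    for one heartbeat class, treated one-vs-rest."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    sens = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sens, ppv

# Invented ground-truth and predicted beat types for six heartbeats.
y_true = ["NORM", "LBBB", "RBBB", "LBBB", "NORM", "RBBB"]
y_pred = ["NORM", "LBBB", "RBBB", "NORM", "LBBB", "RBBB"]
print(sensitivity_ppv(y_true, y_pred, "LBBB"))
# → (0.5, 0.5)
```

In inter-patient evaluation, NORM beats vastly outnumber LBBB beats, which is why a high sensitivity can coexist with a low positive predictive value, as in the 91.4%/37.3% LBBB result above.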

  18. Multi-Pixel Simultaneous Classification of PolSAR Image Using Convolutional Neural Networks

    PubMed Central

    Xu, Xin; Gui, Rong; Pu, Fangling

    2018-01-01

    Convolutional neural networks (CNN) have achieved great success in the optical image processing field. Because of the excellent performance of CNN, more and more CNN-based methods are applied to polarimetric synthetic aperture radar (PolSAR) image classification. Most CNN-based PolSAR image classification methods can only classify one pixel at a time; because all the pixels of a PolSAR image are classified independently, the inherent interrelation of different land covers is ignored. We use a fixed-feature-size CNN (FFS-CNN) to classify all pixels in a patch simultaneously. The proposed method has several advantages. First, FFS-CNN classifies all the pixels in a small patch simultaneously, making it faster than common CNNs when classifying a whole PolSAR image. Second, FFS-CNN is trained to learn the interrelation of different land covers in a patch, so it can use this interrelation to improve the classification results. FFS-CNN is evaluated on a Chinese Gaofen-3 PolSAR image and two other real PolSAR images. Experimental results show that FFS-CNN is comparable with the state-of-the-art PolSAR image classification methods. PMID:29510499

  19. Transfer Learning for Class Imbalance Problems with Inadequate Data.

    PubMed

    Al-Stouhi, Samir; Reddy, Chandan K

    2016-07-01

    A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed-distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data are not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that takes advantage of auxiliary data using a transfer learning mechanism while simultaneously building a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient, and in this paper we develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting-based instance-transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply it to healthcare and text classification applications.

  20. Developing collaborative classifiers using an expert-based model

    USGS Publications Warehouse

    Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan

    2009-01-01

    This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. © 2009 American Society for Photogrammetry and Remote Sensing.

  1. Multi-Pixel Simultaneous Classification of PolSAR Image Using Convolutional Neural Networks.

    PubMed

    Wang, Lei; Xu, Xin; Dong, Hao; Gui, Rong; Pu, Fangling

    2018-03-03

    Convolutional neural networks (CNN) have achieved great success in the optical image processing field. Because of the excellent performance of CNN, more and more CNN-based methods are applied to polarimetric synthetic aperture radar (PolSAR) image classification. Most CNN-based PolSAR image classification methods can only classify one pixel at a time; because all the pixels of a PolSAR image are classified independently, the inherent interrelation of different land covers is ignored. We use a fixed-feature-size CNN (FFS-CNN) to classify all pixels in a patch simultaneously. The proposed method has several advantages. First, FFS-CNN classifies all the pixels in a small patch simultaneously, making it faster than common CNNs when classifying a whole PolSAR image. Second, FFS-CNN is trained to learn the interrelation of different land covers in a patch, so it can use this interrelation to improve the classification results. FFS-CNN is evaluated on a Chinese Gaofen-3 PolSAR image and two other real PolSAR images. Experimental results show that FFS-CNN is comparable with the state-of-the-art PolSAR image classification methods.

  2. Deep learning classification in asteroseismology using an improved neural network: results on 15 000 Kepler red giants and applications to K2 and TESS data

    NASA Astrophysics Data System (ADS)

    Hon, Marc; Stello, Dennis; Yu, Jie

    2018-05-01

    Deep learning in the form of 1D convolutional neural networks has previously been shown to be capable of efficiently classifying the evolutionary state of oscillating red giants into red giant branch stars and helium-core burning stars by recognizing visual features in their asteroseismic frequency spectra. We elaborate further on the deep learning method by developing an improved convolutional neural network classifier. To make our method useful for current and future space missions such as K2, TESS, and PLATO, we train classifiers that are able to classify the evolutionary states of the lower-frequency-resolution spectra expected from these missions. Additionally, we provide new classifications for 8633 Kepler red giants, of which 426 have not previously been classified using asteroseismology. This brings the total to 14,983 Kepler red giants classified with our new neural network. We also verify that our classifiers are remarkably robust to suboptimal data, including low signal-to-noise ratios and incorrect training truth labels.

  3. Repliscan: a tool for classifying replication timing regions.

    PubMed

    Zynda, Gregory J; Song, Jawon; Concia, Lorenzo; Wear, Emily E; Hanley-Bowdoin, Linda; Thompson, William F; Vaughn, Matthew W

    2017-08-07

    Replication timing experiments that use label incorporation and high throughput sequencing produce peaked data similar to ChIP-Seq experiments. However, the differences in experimental design, coverage density, and possible results make traditional ChIP-Seq analysis methods inappropriate for use with replication timing. To accurately detect and classify regions of replication across the genome, we present Repliscan. Repliscan robustly normalizes, automatically removes outlying and uninformative data points, and classifies Repli-seq signals into discrete combinations of replication signatures. The quality control steps and self-fitting methods make Repliscan generally applicable and more robust than previous methods that classify regions based on thresholds. Repliscan is simple and effective to use on organisms with different genome sizes. Even with analysis window sizes as small as 1 kilobase, reliable profiles can be generated with as little as 2.4x coverage.

  4. Assigning Polarity to Causal Information in Financial Articles on Business Performance of Companies

    NASA Astrophysics Data System (ADS)

    Sakai, Hiroyuki; Masuyama, Shigeru

    We propose a method for assigning polarity to causal information extracted from Japanese financial articles concerning the business performance of companies. Our method assigns polarity (positive or negative) to causal information in accordance with business performance, e.g., “zidousya no uriage ga koutyou” (“Sales of cars are good”), to which the polarity positive is assigned. Causal expressions assigned polarity by our method can be used, for example, to analyze the content of articles concerning business performance in detail. First, our method classifies articles concerning business performance into positive articles and negative articles. Using these, it assigns polarity (positive or negative) to causal information extracted from the set of articles concerning business performance. Although our method needs a training dataset for classifying articles into positive and negative ones, it does not need a training dataset for assigning polarity to causal information. Hence, even when causal information does not appear in the training dataset used for article classification, our method is able to assign it polarity using statistical information from the classified sets of articles. We evaluated our method and confirmed that it attained 74.4% precision and 50.4% recall in assigning positive polarity, and 76.8% precision and 61.5% recall in assigning negative polarity.
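The statistical idea of leaning on the two classified article sets can be caricatured as a frequency comparison. This toy sketch ignores the method's actual statistics and Japanese text processing; the phrases and articles are invented English stand-ins:

```python
def assign_polarity(phrase, positive_articles, negative_articles):
    """Assign polarity to a causal phrase by comparing how often it
    occurs in articles already classified as positive vs negative."""
    pos = sum(phrase in a for a in positive_articles)
    neg = sum(phrase in a for a in negative_articles)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "unknown"

# Invented article sets standing in for the classified corpora.
pos_articles = ["sales of cars are good", "profit rose on strong car sales"]
neg_articles = ["profit fell as material costs rose"]
print(assign_polarity("car sales", pos_articles, neg_articles))
# → positive
```

Because the decision depends only on the classified corpora, a phrase absent from the article-classification training data can still receive a polarity, which mirrors the property the abstract emphasizes.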

  5. Biomimetic molecular design tools that learn, evolve, and adapt.

    PubMed

    Winkler, David A

    2017-01-01

    A dominant hallmark of living systems is their ability to adapt to changes in the environment by learning and evolving. Nature does this so superbly that intensive research efforts are now attempting to mimic biological processes. Initially this biomimicry involved developing synthetic methods to generate complex bioactive natural products. Recent work is attempting to understand how molecular machines operate so their principles can be copied, and learning how to employ biomimetic evolution and learning methods to solve complex problems in science, medicine and engineering. Automation, robotics, artificial intelligence, and evolutionary algorithms are now converging to generate what might broadly be called in silico-based adaptive evolution of materials. These methods are being applied to organic chemistry to systematize reactions, create synthesis robots to carry out unit operations, and to devise closed loop flow self-optimizing chemical synthesis systems. Most scientific innovations and technologies pass through the well-known "S curve", with a slow beginning, an almost exponential growth in capability, and a stable applications period. Adaptive, evolving, machine learning-based molecular design and optimization methods are approaching the period of very rapid growth and their impact is already being described as potentially disruptive. This paper describes new developments in biomimetic adaptive, evolving, learning computational molecular design methods and their potential impacts in chemistry, engineering, and medicine.

  6. Biomimetic molecular design tools that learn, evolve, and adapt

    PubMed Central

    2017-01-01

    A dominant hallmark of living systems is their ability to adapt to changes in the environment by learning and evolving. Nature does this so superbly that intensive research efforts are now attempting to mimic biological processes. Initially this biomimicry involved developing synthetic methods to generate complex bioactive natural products. Recent work is attempting to understand how molecular machines operate so their principles can be copied, and learning how to employ biomimetic evolution and learning methods to solve complex problems in science, medicine and engineering. Automation, robotics, artificial intelligence, and evolutionary algorithms are now converging to generate what might broadly be called in silico-based adaptive evolution of materials. These methods are being applied to organic chemistry to systematize reactions, create synthesis robots to carry out unit operations, and to devise closed loop flow self-optimizing chemical synthesis systems. Most scientific innovations and technologies pass through the well-known “S curve”, with a slow beginning, an almost exponential growth in capability, and a stable applications period. Adaptive, evolving, machine learning-based molecular design and optimization methods are approaching the period of very rapid growth and their impact is already being described as potentially disruptive. This paper describes new developments in biomimetic adaptive, evolving, learning computational molecular design methods and their potential impacts in chemistry, engineering, and medicine. PMID:28694872

  7. Bayes Error Rate Estimation Using Classifier Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2003-01-01

    The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error generally yield rather weak results for small sample sizes unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information-theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that looks only at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
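In the spirit of the first, averaging-based scheme described above, a plug-in error estimate can be computed from ensemble-averaged posteriors: average each sample's class-probability vector across members, then take the mean of one minus the maximum averaged posterior. This is a simplified sketch of the idea, not the article's exact estimator; the posterior values are invented:

```python
def plug_in_error(posterior_sets):
    """Plug-in error estimate from ensemble-averaged posteriors.
    posterior_sets[m][i][c] is member m's posterior for class c
    on sample i."""
    n_members = len(posterior_sets)
    n_samples = len(posterior_sets[0])
    total = 0.0
    for i in range(n_samples):
        n_classes = len(posterior_sets[0][i])
        # Average the i-th sample's posterior vector over members.
        avg = [sum(member[i][c] for member in posterior_sets) / n_members
               for c in range(n_classes)]
        total += 1.0 - max(avg)
    return total / n_samples

# Two hypothetical ensemble members, three samples, two classes.
m1 = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
m2 = [[0.8, 0.2], [0.4, 0.6], [0.2, 0.8]]
print(round(plug_in_error([m1, m2]), 3))
# → 0.283
```

Averaging first tends to smooth out individual members' over-confident posteriors, which is one intuition for why combiner-based estimates can be more reliable than a single classifier's.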

  8. Evolving Systems: Adaptive Key Component Control and Inheritance of Passivity and Dissipativity

    NASA Technical Reports Server (NTRS)

    Frost, S. A.; Balas, M. J.

    2010-01-01

    We propose a new framework called Evolving Systems to describe the self-assembly, or autonomous assembly, of actively controlled dynamical subsystems into an Evolved System with a higher purpose. Autonomous assembly of large, complex flexible structures in space is a target application for Evolving Systems. A critical requirement for autonomous assembling structures is that they remain stable during and after assembly. The fundamental topic of inheritance of stability, dissipativity, and passivity in Evolving Systems is the primary focus of this research. In this paper, we develop an adaptive key component controller to restore stability in Nonlinear Evolving Systems that would otherwise fail to inherit the stability traits of their components. We provide sufficient conditions for the use of this novel control method and demonstrate its use on an illustrative example.

  9. A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers

    PubMed Central

    2012-01-01

    Background Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble? Results The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. 
One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity. Conclusion Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway. PMID:23216969
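    The two aggregation methods named above can be sketched directly; the 0.5 cutoff and single-vote threshold below are illustrative choices, not the values tuned in the study:

```python
def average_probability(probs, cutoff=0.5):
    """Call rejection if the mean per-classifier probability exceeds the cutoff."""
    return sum(probs) / len(probs) >= cutoff

def vote_threshold(probs, cutoff=0.5, min_votes=1):
    """Call rejection if at least min_votes classifiers individually exceed the cutoff."""
    return sum(p >= cutoff for p in probs) >= min_votes

probs = [0.7, 0.4, 0.3]  # hypothetical per-classifier P(acute rejection)
print(average_probability(probs))  # False (mean is about 0.467, below cutoff)
print(vote_threshold(probs))       # True (one classifier exceeds the cutoff)
```

    A low vote threshold makes the ensemble call rejection whenever any member does, which trades specificity for sensitivity, matching the behavior reported above for the Vote Threshold method.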

  10. A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers.

    PubMed

    Günther, Oliver P; Chen, Virginia; Freue, Gabriela Cohen; Balshaw, Robert F; Tebbutt, Scott J; Hollander, Zsuzsanna; Takhar, Mandeep; McMaster, W Robert; McManus, Bruce M; Keown, Paul A; Ng, Raymond T

    2012-12-08

    Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble? The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. 
The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity. Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway.

  11. Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.

    PubMed

    Agarwal, Shashank; Yu, Hong

    2009-12-01

    Biomedical texts can be typically represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then explored different approaches for automatically classifying these sentences into the IMRAD categories. Our results show an overall annotation agreement of 82.14% with a Kappa score of 0.756. The best classification system is a multinomial naïve Bayes classifier trained on manually annotated data that achieved 91.95% accuracy and an average F-score of 91.55%, which is significantly higher than baseline systems. A web version of this system is available online at http://wood.ims.uwm.edu/full_text_classifier/.
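    A minimal multinomial naïve Bayes sentence classifier of the kind described, with add-one smoothing over a bag of words, could look like this. The training sentences are invented stand-ins for annotated data, not the paper's corpus:

```python
import math
from collections import Counter, defaultdict

def train_nb(sentences, labels):
    """Fit per-class word counts for multinomial naive Bayes."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    vocab = set()
    for text, label in zip(sentences, labels):
        for w in text.lower().split():
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def classify_nb(model, sentence):
    """Pick the label with the highest smoothed log-posterior."""
    word_counts, label_counts, vocab = model
    n_docs = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for label in label_counts:
        lp = math.log(label_counts[label] / n_docs)   # class prior
        total = sum(word_counts[label].values())
        for w in sentence.lower().split():
            lp += math.log((word_counts[label][w] + 1) /
                           (total + len(vocab)))      # add-one smoothing
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train_nb(
    ["we aimed to investigate", "samples were centrifuged",
     "the mean accuracy was", "these findings suggest"],
    ["Introduction", "Methods", "Results", "Discussion"])
print(classify_nb(model, "samples were collected"))  # Methods
```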

  12. Classification of the Gabon SAR Mosaic Using a Wavelet Based Rule Classifier

    NASA Technical Reports Server (NTRS)

    Simard, Marc; Saatchi, Sasan; DeGrandi, Gianfranco

    2000-01-01

    A method is developed for semi-automated classification of SAR images of the tropical forest. Information is extracted using the wavelet transform (WT). The transform allows for extraction of structural information in the image as a function of scale. In order to classify the SAR image, a decision tree classifier is used. Pruning is used to optimize classification rate versus tree size. The results give explicit insight into the type of information useful for a given class.

  13. φ-evo: A program to evolve phenotypic models of biological networks.

    PubMed

    Henry, Adrien; Hemery, Mathieu; François, Paul

    2018-06-01

    Molecular networks are at the core of most cellular decisions, but are often difficult to comprehend. Reverse engineering of network architecture from their functions has proved fruitful to classify and predict the structure and function of molecular networks, suggesting new experimental tests and biological predictions. We present φ-evo, an open-source program to evolve in silico phenotypic networks performing a given biological function. We include implementations for evolution of biochemical adaptation, adaptive sorting for immune recognition, metazoan development (somitogenesis, hox patterning), as well as Pareto evolution. We detail the program architecture based on C, Python 3, and a Jupyter interface for project configuration and network analysis. We illustrate the predictive power of φ-evo by first recovering the asymmetrical structure of the lac operon regulation from an objective function with symmetrical constraints. Second, we use the problem of hox-like embryonic patterning to show how a single effective fitness can emerge from multi-objective (Pareto) evolution. φ-evo provides an efficient approach and user-friendly interface for the phenotypic prediction of networks and the numerical study of evolution itself.

  14. Tissue Engineering and Regenerative Medicine: Semantic Considerations for an Evolving Paradigm

    PubMed Central

    Katari, Ravi; Peloso, Andrea; Orlando, Giuseppe

    2015-01-01

    Tissue engineering (TE) and regenerative medicine (RM) are rapidly evolving fields that are often obscured by a dense cloud of hype and commercialization potential. We find, in the literature and general commentary, that several of the associated terms are casually referenced in varying contexts that ultimately result in the blurring of the distinguishing boundaries which define them. “TE” and “RM” are often used interchangeably, though some experts vehemently argue that they, in fact, represent different conceptual entities. Nevertheless, contemporary scientists have a general idea of the experiments and milestones that can be classified within either or both categories. Given the groundbreaking achievements reported within the past decade and consequent watershed potential of this field, we feel that it would be useful to properly contextualize these terms semantically and historically. In this concept paper, we explore the various definitions proposed in the literature and emphasize that ambiguous terminology can lead to misplaced apprehension. We assert that the central motifs of both concepts have existed within the surgical sciences long before their appearance as terms in the scientific literature. PMID:25629029

  15. Molecular Characterization of Wild Type Measles Virus from Adult Patients in Northern China, 2014.

    PubMed

    Xu, Wen; Zhang, Ming-Xiang; Qin, En-Qiang; Yan, Ying-Chun; Li, Feng-Yi; Xu, Zhe; Tian, Xia; Fan, Rong; Tu, Bo; Chen, Wei-Wei; Zhao, Min

    2016-04-01

    In this study, we analyzed the N and H genes from wild type measles viruses (MeVs) isolated during the 2013-2014 outbreak. Clinical samples were collected, and genotyping and phylogenetic analyses were performed. The vaccination rate of the study population was 4%. Genotype H1a was the predominant genotype. Wild type viruses were classified into clusters A, B, and C, which may have different origins. N-450 sequences from wild type viruses were highly homologous with, and likely evolved from, MeVs circulating in Tianjin and Henan in 2012. MVs/Shenyang.CHN/18.14/3 could have evolved from MeVs from Liaoning, Beijing, Hebei, Heilongjiang, Henan, Jilin, and Tianjin. Our data suggested that one or more of the same viruses circulated between Beijing, Shenyang, Hong Kong, Taiwan and Berlin. Important factors contributing to outbreaks could include weak vaccination coverage, poor vaccination strategies, and migration of adult workers between cities, between countries, and from rural to urban areas. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  16. Dynamical Bayesian inference of time-evolving interactions: From a pair of coupled oscillators to networks of oscillators

    NASA Astrophysics Data System (ADS)

    Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V. E.; Stefanovska, Aneta

    2012-12-01

    Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski et al. [Phys. Rev. Lett. 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.

  17. Exploiting Language Models to Classify Events from Twitter

    PubMed Central

    Vo, Duc-Thuan; Hai, Vo Thuan; Ock, Cheol-Young

    2015-01-01

    Classifying events is challenging in Twitter because tweet texts contain a large amount of temporal data with a lot of noise and various kinds of topics. In this paper, we propose a method to classify events from Twitter. We first find the distinguishing terms between tweets in events and measure their similarities with language models such as ConceptNet and a latent Dirichlet allocation method for selectional preferences (LDA-SP), which have been widely studied on large text corpora in computational linguistics. Relationships among term words in tweets are discovered by checking them under each model. We then propose a method to compute the similarity between tweets based on their features, including common term words and relationships among their distinguishing term words. This similarity is explicit and convenient to apply with k-nearest-neighbor techniques for classification. We performed experiments on the Edinburgh Twitter Corpus to show that our method achieves competitive results for classifying events. PMID:26451139

  18. Hierarchical ensemble of global and local classifiers for face recognition.

    PubMed

    Su, Yu; Shan, Shiguang; Chen, Xilin; Gao, Wen

    2009-08-01

    In the literature of psychophysics and neurophysiology, many studies have shown that both global and local features are crucial for face representation and recognition. This paper proposes a novel face recognition method which exploits both global and local discriminative features. In this method, global features are extracted from the whole face images by keeping the low-frequency coefficients of Fourier transform, which we believe encodes the holistic facial information, such as facial contour. For local feature extraction, Gabor wavelets are exploited considering their biological relevance. After that, Fisher's linear discriminant (FLD) is separately applied to the global Fourier features and each local patch of Gabor features. Thus, multiple FLD classifiers are obtained, each embodying different facial evidences for face recognition. Finally, all these classifiers are combined to form a hierarchical ensemble classifier. We evaluate the proposed method using two large-scale face databases: FERET and FRGC version 2.0. Experiments show that the results of our method are impressively better than the best known results with the same evaluation protocol.

  19. Effective Heart Disease Detection Based on Quantitative Computerized Traditional Chinese Medicine Using Representation Based Classifiers.

    PubMed

    Shu, Ting; Zhang, Bob; Tang, Yuan Yan

    2017-01-01

    At present, heart disease is the number one cause of death worldwide. Traditionally, heart disease is commonly detected using blood tests, electrocardiograms, cardiac computerized tomography scans, cardiac magnetic resonance imaging, and so on. However, these traditional diagnostic methods are time consuming and/or invasive. In this paper, we propose an effective noninvasive computerized method based on facial images to quantitatively detect heart disease. Specifically, facial key block color features are extracted from facial images and analyzed using the Probabilistic Collaborative Representation Based Classifier. The idea of facial key block color analysis is founded in Traditional Chinese Medicine. A new dataset consisting of 581 heart disease and 581 healthy samples was evaluated with the proposed method. In order to optimize the Probabilistic Collaborative Representation Based Classifier, an analysis of its parameters was performed. According to the experimental results, the proposed method obtains the highest accuracy compared with other classifiers and is shown to be effective at heart disease detection.

  20. A semi-automated method for bone age assessment using cervical vertebral maturation.

    PubMed

    Baptista, Roberto S; Quaglio, Camila L; Mourad, Laila M E H; Hummel, Anderson D; Caetano, Cesar Augusto C; Ortolani, Cristina Lúcia F; Pisa, Ivan T

    2012-07-01

    The aim was to propose a semi-automated method for pattern classification to predict individuals' stage of growth based on morphologic characteristics described in the modified cervical vertebral maturation (CVM) method of Baccetti et al. A total of 188 lateral cephalograms were collected, digitized, evaluated manually, and grouped into cervical stages by two expert examiners. Landmarks were located on each image and measured. Three pattern classifiers based on the Naïve Bayes algorithm were built and assessed using a software program. The classifier with the greatest accuracy according to the weighted kappa test was considered best. The classifier showed a weighted kappa coefficient of 0.861 ± 0.020. If an adjacent estimated pre-stage or post-stage value was taken to be acceptable, the classifier would show a weighted kappa coefficient of 0.992 ± 0.019. Results from this study show that the proposed semi-automated pattern classification method can help orthodontists identify the stage of CVM. However, additional studies are needed before this semi-automated classification method for CVM assessment can be implemented in clinical practice.
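    The linearly weighted kappa used to assess the classifier can be computed directly from two integer-coded ratings. A minimal sketch with invented stage codes (not the study's data):

```python
def linear_weighted_kappa(rater_a, rater_b):
    """Linearly weighted kappa: 1 minus the ratio of observed to
    chance-expected weighted disagreement, with weight |i - j|."""
    n = len(rater_a)
    observed = sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / n
    expected = sum(abs(a - b) for a in rater_a for b in rater_b) / (n * n)
    return 1.0 - observed / expected

# hypothetical cervical-stage codes from two raters
a = [1, 2, 3, 4, 5, 5]
b = [1, 2, 3, 4, 4, 5]
print(round(linear_weighted_kappa(a, b), 3))  # 0.897
```

    Because the weight grows with the distance between stages, a one-stage disagreement is penalized far less than a three-stage one, which is why the coefficient above rises so sharply when adjacent stages are taken as acceptable.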

  1. Employing Machine-Learning Methods to Study Young Stellar Objects

    NASA Astrophysics Data System (ADS)

    Moore, Nicholas

    2018-01-01

    Vast amounts of data exist in the astronomical data archives, and yet a large number of sources remain unclassified. We developed a multi-wavelength pipeline to classify infrared sources. The pipeline uses supervised machine learning methods to classify objects into the appropriate categories. The program is fed data that is already classified to train it, and is then applied to unknown catalogues. The primary use for such a pipeline is the rapid classification and cataloging of data that would take a much longer time to classify otherwise. While our primary goal is to study young stellar objects (YSOs), the applications extend beyond the scope of this project. We present preliminary results from our analysis and discuss future applications.

  2. Layered classification techniques for remote sensing applications

    NASA Technical Reports Server (NTRS)

    Swain, P. H.; Wu, C. L.; Landgrebe, D. A.; Hauska, H.

    1975-01-01

    The single-stage method of pattern classification utilizes all available features in a single test which assigns the unknown to a category according to a specific decision strategy (such as the maximum likelihood strategy). The layered classifier classifies the unknown through a sequence of tests, each of which may be dependent on the outcome of previous tests. Although the layered classifier was originally investigated as a means of improving classification accuracy and efficiency, it was found that in the context of remote sensing data analysis, other advantages also accrue due to many of the special characteristics of both the data and the applications pursued. The layered classifier method and several of the diverse applications of this approach are discussed.
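    The layered idea, a sequence of tests where each test either decides or defers to a later, more specific test, can be sketched generically. The spectral thresholds and class names below are invented for illustration:

```python
def layered_classify(sample, layers):
    """Apply a sequence of tests; each test returns a class label
    or None to defer to the next layer."""
    for test in layers:
        label = test(sample)
        if label is not None:
            return label
    return "unknown"

# hypothetical two-layer scheme on a (band1, band2) pixel
layers = [
    lambda px: "water" if px[0] < 0.2 else None,     # cheap spectral test
    lambda px: "forest" if px[1] > 0.6 else "soil",  # finer follow-up test
]
print(layered_classify((0.1, 0.9), layers))  # water
print(layered_classify((0.5, 0.9), layers))  # forest
```

    The efficiency gain comes from most samples being decided by the early, cheap tests, so the expensive tests run only on the residue.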

  3. Yellow evolved stars in open clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sowell, J.R.

    1987-05-01

    This paper describes a program in which Galactic cluster post-AGB candidates were first identified and then analyzed for cluster membership via radial velocities, monitored for possible photometric variations, examined for evidence of mass loss, and classified as completely as possible in terms of their basic stellar parameters. The intrinsically brightest supergiants are found in the youngest clusters. With increasing cluster age, the absolute luminosities attained by the supergiants decline. It appears that the evolutionary tracks of luminosity class II stars are more similar to those of class I than of class III. Only two superluminous giant star candidates are foundmore » in open clusters. 154 references.« less

  4. Reliable samples of quasars and hot stars from a spectrophotometric survey of the U.S. catalogs

    NASA Technical Reports Server (NTRS)

    Mitchell, Kenneth J.

    1987-01-01

    The U.S. survey for blue- and ultraviolet-excess starlike objects is reviewed, focusing on the features which have contributed to its accuracy. The spectrophotometric survey is described in terms of the observational setup and procedures. It is suggested that the survey has produced reliably classified samples of quasars and hot evolved stars and that the procedures used in the study provide a means of deriving distance and luminosity information about these objects. Several cumulative number counts and spectra of a DA white dwarf and a quasar with prominent C IV and C III emission are given as examples.

  5. A Novel Locally Linear KNN Method With Applications to Visual Recognition.

    PubMed

    Liu, Qingfeng; Liu, Chengjun

    2017-09-01

    A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.

  6. A proposed defect tracking model for classifying the inserted defect reports to enhance software quality control.

    PubMed

    Sultan, Torky; Khedr, Ayman E; Sayed, Mostafa

    2013-01-01

    Defect tracking systems play an important role in software development organizations, as they can store historical information about defects. There has been much research into defect tracking models and systems to enhance their capabilities, make their tracking more specific, and adapt them to new technology. Furthermore, different studies have classified bugs in a step-by-step method to obtain a clear perception of, and an applicable method for, detecting such bugs. This paper presents a new proposed defect tracking model for classifying inserted defect reports in a step-by-step method to further enhance software quality control.

  7. Classification of patients by severity grades during triage in the emergency department using data mining methods.

    PubMed

    Zmiri, Dror; Shahar, Yuval; Taieb-Maimon, Meirav

    2012-04-01

    The objective was to test the feasibility of classifying emergency department patients into severity grades using data mining methods. Emergency department records of 402 patients were classified into five severity grades by two expert physicians. The Naïve Bayes and C4.5 algorithms were applied to produce classifiers from patient data into severity grades. The classifiers' results over several subsets of the data were compared with the physicians' assessments, with a random classifier, and with a classifier that selects the maximal-prevalence class. Performance was measured by positive predictive value, multiple-class extensions of sensitivity and specificity combinations, and entropy change. The mean accuracy of the data mining classifiers was 52.94 ± 5.89%, significantly better (P < 0.05) than the mean accuracy of a random classifier (34.60 ± 2.40%). The entropy of the input data sets was reduced through classification by a mean of 10.1%. Allowing for classification deviations of one severity grade led to mean accuracy of 85.42 ± 1.42%. The classifiers' accuracy in that case was similar to the physicians' consensus rate. Learning from consensus records led to better performance. Reducing the number of severity grades improved results in certain cases. The performance of the Naïve Bayes and C4.5 algorithms was similar; in unbalanced data sets, Naïve Bayes performed better. It is possible to produce a computerized classification model for the severity grade of triage patients, using data mining methods. Learning from patient records regarding which there is a consensus of several physicians is preferable to learning from each physician's patients. Either Naïve Bayes or C4.5 can be used; Naïve Bayes is preferable for unbalanced data sets. An ambiguity in the intermediate severity grades seems to hamper both the physicians' agreement and the classifiers' accuracy. © 2010 Blackwell Publishing Ltd.
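    The deviation-tolerant accuracy reported above (exact grade versus within one grade) is straightforward to compute; a sketch with invented grade vectors, not the study's data:

```python
def accuracy_within(predicted, actual, tolerance=0):
    """Fraction of cases where |predicted - actual| <= tolerance grades."""
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(actual)

pred = [1, 2, 3, 3, 5, 4]
true = [1, 2, 2, 3, 4, 1]
print(accuracy_within(pred, true))               # exact match: 0.5
print(accuracy_within(pred, true, tolerance=1))  # within one grade: about 0.833
```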

  8. Racial Definition Handbook.

    ERIC Educational Resources Information Center

    Nelson, William J.

    Our culture draws lines between "races" in a variety of ways. Professional scholars and society in general each have their own set of aims and methods for dividing by race. The professional classifiers have been at times inconsistent and fallacious in their methods. The scientific classifiers (physical anthropologists, for example) assume an…

  9. Environmental monitoring: data trending using a frequency model.

    PubMed

    Caputo, Ross A; Huffman, Anne

    2004-01-01

    Environmental monitoring programs for the oversight of classified environments have used traditional statistical control charts to monitor trends in microbial recovery. These methodologies work well for environments that yield measurable microbial recoveries. However, today's increasingly successful control of microbial content yields numerous instances where the microbial recovery in a sample is zero. As a result, traditional control chart methods cannot be applied appropriately. Two methods to monitor the performance of a classified environment where microbial recovery is generally zero are presented. Both methods treat a non-zero microbial recovery as an event, so that the frequency of events is monitored rather than the microbial recovery count. Both methods are shown to be appropriate for use in the described circumstances.
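    The frequency-between-events idea, monitoring the run of zero-recovery samples between non-zero recoveries rather than the counts themselves, can be sketched as follows; the daily plate counts are invented:

```python
def gaps_between_events(recoveries):
    """Number of consecutive zero-count samples preceding each
    non-zero microbial recovery."""
    gaps, run = [], 0
    for count in recoveries:
        if count > 0:
            gaps.append(run)  # record the gap, then reset
            run = 0
        else:
            run += 1
    return gaps

# hypothetical daily plate counts from a classified area
counts = [0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 1]
print(gaps_between_events(counts))  # [3, 4, 1]
```

    A shrinking gap (here, 1 after gaps of 3 and 4) signals that non-zero recoveries are becoming more frequent, which is the trend such a chart is designed to flag.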

  10. Objects Classification by Learning-Based Visual Saliency Model and Convolutional Neural Network.

    PubMed

    Li, Na; Zhao, Xinbo; Yang, Yongjia; Zou, Xiaochun

    2016-01-01

    Humans can easily classify different kinds of objects, whereas this is quite difficult for computers. As a hot and difficult problem, object classification has been receiving extensive interest with broad prospects. Inspired by neuroscience, the concept of deep learning was proposed. A convolutional neural network (CNN), as one deep learning method, can be used to solve classification problems. But most deep learning methods, including CNN, ignore the human visual information processing mechanism that operates when a person classifies objects. Therefore, in this paper, inspired by the complete process by which humans classify different kinds of objects, we put forward a new classification method which combines a visual attention model and CNN. First, we use the visual attention model to simulate the human visual selection mechanism. Second, we use CNN to simulate how humans select features, extracting the local features of the selected areas. Finally, our classification method depends not only on those local features but also adds human semantic features to classify objects. Our classification method has apparent advantages from a biological standpoint. Experimental results demonstrate that our method significantly improves classification performance.

  11. Deep-cascade: Cascading 3D Deep Neural Networks for Fast Anomaly Detection and Localization in Crowded Scenes.

    PubMed

    Sabokrou, Mohammad; Fayyaz, Mohsen; Fathy, Mahmood; Klette, Reinhard

    2017-02-17

    This paper proposes a fast and reliable method for anomaly detection and localization in video data showing crowded scenes. Time-efficient anomaly localization is an ongoing challenge and the subject of this paper. We propose a cubic-patch-based method, characterised by a cascade of classifiers, which makes use of an advanced feature-learning approach. Our cascade of classifiers has two main stages. First, a light but deep 3D auto-encoder is used for early identification of "many" normal cubic patches. This deep network operates on small cubic patches as the first stage, before carefully resizing the remaining candidates of interest and evaluating those at the second stage using a more complex and deeper 3D convolutional neural network (CNN). We divide the deep auto-encoder and the CNN into multiple sub-stages which operate as cascaded classifiers. Shallow layers of the cascaded deep networks (designed as Gaussian classifiers, acting as weak single-class classifiers) detect "simple" normal patches, such as background patches, while more complex normal patches are detected at deeper layers. It is shown that the proposed novel technique (a cascade of two cascaded classifiers) performs comparably to current top-performing detection and localization methods on standard benchmarks, but generally outperforms them with respect to required computation time.
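    The cascade's early-exit logic, where cheap stages rule out obviously normal patches before expensive stages run, can be sketched generically. The two stage functions and thresholds below are toy stand-ins, not the paper's networks:

```python
def cascade_classify(patch, stages):
    """Run cheap-to-expensive anomaly stages; stop as soon as one stage
    is confident the patch is normal."""
    for score_fn, normal_threshold in stages:
        if score_fn(patch) < normal_threshold:
            return "normal"   # early exit: no deeper stage needed
    return "anomaly"          # survived every stage

# hypothetical stages: a cheap mean-intensity test, then a costlier range test
stages = [
    (lambda p: sum(p) / len(p), 0.3),
    (lambda p: max(p) - min(p), 0.5),
]
print(cascade_classify([0.1, 0.2, 0.1], stages))  # normal (exits at stage 1)
print(cascade_classify([0.9, 0.1, 0.8], stages))  # anomaly
```

    Because most patches in a scene are normal, the expensive second stage runs on only a small residue, which is where the reported speedup comes from.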

  12. Incorporating spatial context into statistical classification of multidimensional image data

    NASA Technical Reports Server (NTRS)

    Bauer, M. E. (Principal Investigator); Tilton, J. C.; Swain, P. H.

    1981-01-01

    Compound decision theory is employed to develop a general statistical model for classifying image data using spatial context. The classification algorithm developed from this model exploits the tendency of certain ground-cover classes to occur more frequently in some spatial contexts than in others. A key input to this contextual classifier is a quantitative characterization of this tendency: the context function. Several methods for estimating the context function are explored, and two complementary methods are recommended. The contextual classifier is shown to produce substantial improvements in classification accuracy compared to the accuracy produced by a non-contextual uniform-priors maximum likelihood classifier when these methods of estimating the context function are used. An approximate algorithm, which cuts computational requirements by over one-half, is presented. The search for an optimal implementation is furthered by an exploration of the relative merits of using spectral classes or information classes for classification and/or context function estimation.
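    The role of a context function can be illustrated with a minimal sketch: a unit-variance Gaussian maximum-likelihood classifier whose class priors are supplied by a (here hypothetical) neighbourhood label frequency. An ambiguous pixel that a uniform-priors classifier assigns to one class is tipped to the other when its spatial context favours that class.

```python
import numpy as np

def ml_classify(x, means, prior=None):
    """Gaussian (unit-variance) maximum-likelihood classification.
    `prior` plays the role of a context function: class probabilities
    conditioned on labels observed in the pixel's neighbourhood."""
    means = np.asarray(means, dtype=float)
    loglik = -0.5 * (x - means) ** 2
    if prior is not None:
        loglik = loglik + np.log(prior)
    return int(np.argmax(loglik))

# An ambiguous pixel value just below the midpoint of two class means...
means = [0.0, 1.0]
x = 0.45
uniform = ml_classify(x, means)                    # uniform priors -> class 0
# ...but its neighbours are mostly class 1, so the context function
# (a hypothetical neighbourhood label frequency) tips the decision.
context_prior = np.array([0.2, 0.8])
contextual = ml_classify(x, means, context_prior)  # -> class 1
```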

  13. Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.

    PubMed

    Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang

    2010-05-07

    Lysine acetylation is a reversible and highly regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this task. The experimentally determined lysine acetylation sites are extracted from the Swiss-Prot database and the scientific literature. Experimental results show that an ensemble of support vector machine classifiers outperforms a single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting lysine acetylation sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation site prediction available at http://www.aporc.org/EnsemblePail/.
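    The combination step of such an ensemble can be sketched independently of the base learners. The sketch below assumes simple majority voting over the 0/1 predictions of several already-trained classifiers; the paper's base learners are SVMs trained on sequence-derived features, and the votes shown here are invented.

```python
import numpy as np

def majority_vote(predictions):
    """Combine the 0/1 votes of several base classifiers (rows) into
    an ensemble prediction per sample (columns) by simple majority."""
    predictions = np.asarray(predictions)
    return (predictions.mean(axis=0) >= 0.5).astype(int)

# Three hypothetical base classifiers, each wrong on a different sample:
# the majority vote corrects every individual mistake.
votes = np.array([
    [1, 0, 1, 1, 0],   # classifier A
    [1, 1, 1, 0, 0],   # classifier B
    [0, 1, 1, 1, 0],   # classifier C
])
ensemble = majority_vote(votes)
```

This is why an ensemble can outperform its members: errors that are not shared across base classifiers are voted away.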

  14. Breast Cancer Recognition Using a Novel Hybrid Intelligent Method

    PubMed Central

    Addeh, Jalil; Ebrahimzadeh, Ata

    2012-01-01

    Breast cancer is the second leading cause of cancer deaths among women. At the same time, it is among the most curable cancer types if diagnosed early. This paper presents a novel hybrid intelligent method for the recognition of breast cancer tumors. The proposed method includes three main modules: the feature extraction module, the classifier module, and the optimization module. In the feature extraction module, fuzzy features are proposed as efficient characteristics of the patterns. In the classifier module, because of the promising generalization capability of support vector machines (SVM), an SVM-based classifier is proposed. In SVM training, the hyperparameters play a very important role in recognition accuracy. Therefore, in the optimization module, the bees algorithm (BA) is proposed for selecting appropriate classifier parameters. The proposed system is tested on the Wisconsin Breast Cancer database, and simulation results show that it achieves high accuracy. PMID:23626945

  15. A Cross-Classified CFA-MTMM Model for Structurally Different and Nonindependent Interchangeable Methods.

    PubMed

    Koch, Tobias; Schultze, Martin; Jeon, Minjeong; Nussbeck, Fridtjof W; Praetorius, Anna-Katharina; Eid, Michael

    2016-01-01

    Multirater (multimethod, multisource) studies are increasingly applied in psychology. Eid and colleagues (2008) proposed a multilevel confirmatory factor model for multitrait-multimethod (MTMM) data combining structurally different and multiple independent interchangeable methods (raters). In many studies, however, different interchangeable raters (e.g., peers, subordinates) are asked to rate different targets (students, supervisors), leading to violations of the independence assumption and to cross-classified data structures. In the present work, we extend the ML-CFA-MTMM model by Eid and colleagues (2008) to cross-classified multirater designs. The new C4 model (Cross-Classified CTC[M-1] Combination of Methods) accounts for nonindependent interchangeable raters and enables researchers to explicitly model the interaction between targets and raters as a latent variable. Using a real data application, it is shown how credibility intervals of model parameters and different variance components can be obtained using Bayesian estimation techniques.

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jing, Yaqi; Meng, Qinghao, E-mail: qh-meng@tju.edu.cn; Qi, Peifeng

    An electronic nose (e-nose) was designed to classify Chinese liquors of the same aroma style. A new method of feature reduction which combined feature selection with feature extraction was proposed. The feature selection method used 8 feature-selection algorithms based on information theory and reduced the dimension of the feature space to 41. Kernel entropy component analysis was introduced into the e-nose system as a feature extraction method, and the dimension of the feature space was reduced to 12. Classification of Chinese liquors was performed using a back propagation artificial neural network (BP-ANN), linear discriminant analysis (LDA), and a multi-linear classifier. The classification rate of the multi-linear classifier was 97.22%, higher than that of LDA and BP-ANN. Finally, the classification of Chinese liquors according to their raw materials and geographical origins was performed using the proposed multi-linear classifier; the classification rates were 98.75% and 100%, respectively.
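    Information-theoretic feature selection of the kind mentioned above typically ranks each sensor feature by a score such as its empirical mutual information with the class label. A minimal sketch (with toy discrete features, not e-nose data):

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information I(X;Y) in bits between two
    discrete sequences; information-theoretic feature-selection
    methods rank candidate features by scores like this one."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in pxy.items():
        mi += (c / n) * np.log2(c * n / (px[a] * py[b]))
    return mi

# A feature identical to the label is maximally informative; a feature
# independent of it carries zero information.
label   = [0, 0, 1, 1, 0, 1, 0, 1]
f_good  = [0, 0, 1, 1, 0, 1, 0, 1]
f_indep = [0, 0, 1, 1, 1, 0, 1, 0]
score_good = mutual_information(f_good, label)    # 1.0 bit
score_indep = mutual_information(f_indep, label)  # 0.0 bits
```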

  17. Solving a Higgs optimization problem with quantum annealing for machine learning.

    PubMed

    Mott, Alex; Job, Joshua; Vlimant, Jean-Roch; Lidar, Daniel; Spiropulu, Maria

    2017-10-18

    The discovery of Higgs-boson decays in a background of standard-model processes was assisted by machine learning methods. The classifiers used to separate signals such as these from background are trained using highly unerring but not completely perfect simulations of the physical processes involved, often resulting in incorrect labelling of background processes or signals (label noise) and systematic errors. Here we use quantum and classical annealing (probabilistic techniques for approximating the global maximum or minimum of a given function) to solve a Higgs-signal-versus-background machine learning optimization problem, mapped to a problem of finding the ground state of a corresponding Ising spin model. We build a set of weak classifiers based on the kinematic observables of the Higgs decay photons, which we then use to construct a strong classifier. This strong classifier is highly resilient against overtraining and against errors in the correlations of the physical observables in the training data. We show that the resulting quantum and classical annealing-based classifier systems perform comparably to the state-of-the-art machine learning methods that are currently used in particle physics. However, in contrast to these methods, the annealing-based classifiers are simple functions of directly interpretable experimental parameters with clear physical meaning. The annealer-trained classifiers use the excited states in the vicinity of the ground state and demonstrate some advantage over traditional machine learning methods for small training datasets. Given the relative simplicity of the algorithm and its robustness to error, this technique may find application in other areas of experimental particle physics, such as real-time decision making in event-selection problems and classification in neutrino physics.

  18. Solving a Higgs optimization problem with quantum annealing for machine learning

    NASA Astrophysics Data System (ADS)

    Mott, Alex; Job, Joshua; Vlimant, Jean-Roch; Lidar, Daniel; Spiropulu, Maria

    2017-10-01

    The discovery of Higgs-boson decays in a background of standard-model processes was assisted by machine learning methods. The classifiers used to separate signals such as these from background are trained using highly unerring but not completely perfect simulations of the physical processes involved, often resulting in incorrect labelling of background processes or signals (label noise) and systematic errors. Here we use quantum and classical annealing (probabilistic techniques for approximating the global maximum or minimum of a given function) to solve a Higgs-signal-versus-background machine learning optimization problem, mapped to a problem of finding the ground state of a corresponding Ising spin model. We build a set of weak classifiers based on the kinematic observables of the Higgs decay photons, which we then use to construct a strong classifier. This strong classifier is highly resilient against overtraining and against errors in the correlations of the physical observables in the training data. We show that the resulting quantum and classical annealing-based classifier systems perform comparably to the state-of-the-art machine learning methods that are currently used in particle physics. However, in contrast to these methods, the annealing-based classifiers are simple functions of directly interpretable experimental parameters with clear physical meaning. The annealer-trained classifiers use the excited states in the vicinity of the ground state and demonstrate some advantage over traditional machine learning methods for small training datasets. Given the relative simplicity of the algorithm and its robustness to error, this technique may find application in other areas of experimental particle physics, such as real-time decision making in event-selection problems and classification in neutrino physics.

  19. A multiple-point spatially weighted k-NN method for object-based classification

    NASA Astrophysics Data System (ADS)

    Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.

    2016-10-01

    Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.
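    The underlying weighted k-NN vote can be sketched as follows. The `class_weight` argument is a hypothetical stand-in for the multiple-point class probabilities that the MPk-NN method derives from its training image; the toy coordinates and land-cover labels are invented.

```python
import numpy as np

def weighted_knn(train_X, train_y, query, k=3, class_weight=None):
    """Distance-weighted k-NN vote.  `class_weight` stands in for the
    spatial (e.g. multiple-point) class probabilities that reweight
    the spectral vote in a spatially weighted k-NN classifier."""
    train_X = np.asarray(train_X, dtype=float)
    d = np.linalg.norm(train_X - query, axis=1)
    votes = {}
    for i in np.argsort(d)[:k]:
        w = 1.0 / (d[i] + 1e-9)             # inverse-distance weight
        if class_weight is not None:
            w *= class_weight.get(train_y[i], 1.0)
        votes[train_y[i]] = votes.get(train_y[i], 0.0) + w
    return max(votes, key=votes.get)

# Toy object centroids in a two-band feature space.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = ['water', 'water', 'water', 'urban', 'urban', 'urban']
label = weighted_knn(X, y, np.array([0.5, 0.5]), k=3)
```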

  20. Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains.

    PubMed

    Bulashevska, Alla; Eils, Roland

    2006-06-14

    The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location of a given protein when only its amino acid sequence is known. Although many efforts have been made to predict subcellular location from sequence information alone, further research is needed to improve prediction accuracy. A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six diverse datasets, including a Gram-negative bacteria dataset, a dataset for discriminating outer membrane proteins, and an apoptosis proteins dataset. We observed that our method can predict subcellular location with high accuracy. Another advantage of the proposed method is that it can improve prediction accuracy for classes with few training sequences and is therefore useful for datasets with an imbalanced distribution of classes. This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and, as empirical results indicate, competitive with previously reported approaches in terms of prediction accuracy. The code for the software is available upon request.
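    A Bayesian classifier built on Markov chain models, of the kind used inside HensBC, can be sketched as follows: one first-order transition model is estimated per class, and a query sequence is assigned to the class under which its transitions are most likely. The two-letter alphabet and toy sequences are illustrative only; real protein models would use the 20 amino acids.

```python
import numpy as np

def train_markov(seqs, alphabet):
    """Estimate first-order Markov transition probabilities (with
    add-one smoothing) from a list of training sequences."""
    k = len(alphabet)
    ix = {a: i for i, a in enumerate(alphabet)}
    counts = np.ones((k, k))                     # add-one smoothing
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[ix[a], ix[b]] += 1
    return ix, counts / counts.sum(axis=1, keepdims=True)

def loglik(seq, model):
    """Log-likelihood of a sequence under a trained Markov model."""
    ix, P = model
    return sum(np.log(P[ix[a], ix[b]]) for a, b in zip(seq, seq[1:]))

# Toy two-class problem: class A favours alternating 'ab' transitions,
# class B favours doubled letters.
alphabet = "ab"
mA = train_markov(["ababababab", "abababab"], alphabet)
mB = train_markov(["aabbaabbaa", "bbaabbaabb"], alphabet)
query = "abababab"
pred = "A" if loglik(query, mA) > loglik(query, mB) else "B"
```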

  1. Advanced Methods for Passive Acoustic Detection, Classification, and Localization of Marine Mammals

    DTIC Science & Technology

    2012-09-30

    A multiclass support vector machine (SVM) classifier was previously developed (Jarvis et al. 2008). This classifier both detects and classifies echolocation ... whales. Here Moretti's group, especially S. Jarvis, will improve the SVM classifier by resolving confusion between species whose clicks overlap in

  2. Documenting the Emergence of Electronic Nicotine Delivery Systems as a Disruptive Technology in Nicotine and Tobacco Science

    PubMed Central

    Correa, John B.; Ariel, Idan; Menzie, Nicole S.; Brandon, Thomas H.

    2016-01-01

    Background The emergence of the electronic nicotine delivery systems (ENDS, or “e-cigarettes”) has resulted in nicotine and tobacco scientists committing increased resources to studying these products. Despite this surge of research on various topics related to e-cigarettes, it is important to characterize the evolving e-cigarette research landscape as a way to identify important future research directions. The purpose of this review was to broadly categorize published scholarly work on e-cigarettes using a structured, multi-level coding scheme. Methods A systematic literature search was conducted to collect articles on e-cigarettes that were published in peer-reviewed journals from 2006 through 2014. Studies were classified through 3 coding waves. Articles were first divided into research reports, literature reviews and opinions/editorials. Research reports were further categorized to determine the proportion of these studies using human participants. Finally, human studies were classified based on their methodologies: descriptive, predictive, explanatory, and intervention. Results Research reports (n = 224) and opinions/editorials (n = 248) were published at similar rates during this time period. All types of articles showed exponential rates of increase in more recent years. 76.3% of human research studies tended to be descriptive in nature, with very little research employing experimental (6.8%) or intervention-based methodologies (5.4%). Conclusions This review reinforces the idea that e-cigarettes are a disruptive technology exerting substantial influence on nicotine and tobacco science. This review also suggests that opinions on e-cigarettes may be outpacing our scientific understanding of these devices. Our findings highlight the need for more e-cigarette research involving experimental, intervention, and longitudinal designs. PMID:27816664

  3. Prosthetists' perceptions and use of outcome measures in clinical practice: Long-term effects of focused continuing education.

    PubMed

    Hafner, Brian J; Spaulding, Susan E; Salem, Rana; Morgan, Sara J; Gaunaurd, Ignacio; Gailey, Robert

    2017-06-01

    Continuing education is intended to facilitate clinicians' skills and knowledge in areas of practice, such as administration and interpretation of outcome measures. To evaluate the long-term effect of continuing education on prosthetists' confidence in administering outcome measures and their perceptions of outcomes measurement in clinical practice. Pretest-posttest survey methods. A total of 66 prosthetists were surveyed before, immediately after, and 2 years after outcomes measurement education and training. Prosthetists were grouped as routine or non-routine outcome measures users, based on experience reported prior to training. On average, prosthetists were just as confident administering measures 1-2 years after continuing education as they were immediately after continuing education. In all, 20% of prosthetists, initially classified as non-routine users, were subsequently classified as routine users at follow-up. Routine and non-routine users' opinions differed on whether outcome measures contributed to efficient patient evaluations (79.3% and 32.4%, respectively). Both routine and non-routine users reported challenges integrating outcome measures into normal clinical routines (20.7% and 45.9%, respectively). Continuing education had a long-term impact on prosthetists' confidence in administering outcome measures and may influence their clinical practices. However, remaining barriers to using standardized measures need to be addressed to keep practitioners current with evolving practice expectations. Clinical relevance Continuing education (CE) had a significant long-term impact on prosthetists' confidence in administering outcome measures and influenced their clinical practices. In all, approximately 20% of prosthetists, who previously were non-routine outcome measure users, became routine users after CE. There remains a need to develop strategies to integrate outcome measurement into routine clinical practice.

  4. Consistency of Nutrition Recommendations for Foods Marketed to Children in the United States, 2009–2010

    PubMed Central

    Quilliam, Elizabeth Taylor; Paek, Hye-Jin; Kim, Sookyong; Venkatesh, Sumathi; Plasencia, Julie; Lee, Mira; Rifon, Nora J.

    2013-01-01

    Introduction Food marketing has emerged as an environmental factor that shapes children’s dietary behaviors. “Advergames,” or free online games designed to promote branded products, are an example of evolving food marketing tactics aimed at children. Our primary objective was to classify foods marketed to children (aged 2–11 y) in advergames as those meeting or not meeting nutrition recommendations of the US Department of Agriculture (USDA), Food & Drug Administration (FDA), Center for Science in the Public Interest (CSPI), and the Institute of Medicine (IOM). We document the consistency of classification of those foods across agency guidelines and offer policy recommendations. Methods We used comScore Media Builder Metrix to identify 143 websites that marketed foods (n = 439) to children aged 2 to 11 years through advergames. Foods were classified on the basis of each of the 4 agency criteria. Food nutrient labels provided information on serving size, calories, micronutrients, and macronutrients. Results The websites advertised 254 meals, 101 snacks, and 84 beverages. Proportions of meals and snacks meeting USDA and FDA recommendations were similarly low, with the exception of saturated fat in meals and sodium content in snacks. Inconsistency in recommendations was evidenced by only a small proportion of meals and fewer snacks meeting the recommendations of all the agencies per their guidelines. Beverage recommendations were also inconsistent across the 3 agencies that provide recommendations (USDA, IOM, and CSPI). Most (65%–95%) beverages advertised in advergames did not meet some of these recommendations. Conclusion Our findings indicate that a large number of foods with low nutritional value are being marketed to children via advergames. A standardized system of food marketing guidance is needed to better inform the public about healthfulness of foods advertised to children. PMID:24070037

  5. A review and experimental study on the application of classifiers and evolutionary algorithms in EEG-based brain-machine interface systems

    NASA Astrophysics Data System (ADS)

    Tahernezhad-Javazm, Farajollah; Azimirad, Vahid; Shoaran, Maryam

    2018-04-01

    Objective. Considering the importance and the near-future development of noninvasive brain-machine interface (BMI) systems, this paper presents a comprehensive theoretical-experimental survey on the classification and evolutionary methods for BMI-based systems in which EEG signals are used. Approach. The paper is divided into two main parts. In the first part, a wide range of different types of the base and combinatorial classifiers including boosting and bagging classifiers and evolutionary algorithms are reviewed and investigated. In the second part, these classifiers and evolutionary algorithms are assessed and compared based on two types of relatively widely used BMI systems, sensory motor rhythm-BMI and event-related potentials-BMI. Moreover, in the second part, some of the improved evolutionary algorithms as well as bi-objective algorithms are experimentally assessed and compared. Main results. In this study two databases are used, and cross-validation accuracy (CVA) and stability to data volume (SDV) are considered as the evaluation criteria for the classifiers. According to the experimental results on both databases, regarding the base classifiers, linear discriminant analysis and support vector machines with respect to the CVA evaluation metric, and naive Bayes with respect to SDV, demonstrated the best performances. Among the combinatorial classifiers, Bagg-DT (bagging decision tree), LogitBoost, and GentleBoost with respect to CVA, and Bagging-LR (bagging logistic regression) and AdaBoost (adaptive boosting) with respect to SDV, had the best performances. Finally, regarding the evolutionary algorithms, single-objective invasive weed optimization (IWO) and bi-objective nondominated sorting IWO algorithms demonstrated the best performances. Significance. We present a general survey on the base and the combinatorial classification methods for EEG signals (sensory motor rhythm and event-related potentials) as well as their optimization methods through the evolutionary algorithms. In addition, experimental and statistical significance tests are carried out to study the applicability and effectiveness of the reviewed methods.

  6. Genomic profiles of low-grade murine gliomas evolve during progression to glioblastoma. | Office of Cancer Genomics

    Cancer.gov

    Background: Gliomas are diverse neoplasms with multiple molecular subtypes. How tumor-initiating mutations relate to molecular subtypes as these tumors evolve during malignant progression remains unclear. Methods: We used genetically engineered mouse models, histopathology, genetic lineage tracing, expression profiling, and copy number analyses to examine how genomic tumor diversity evolves during the course of malignant progression from low- to high-grade disease.

  7. Sentiment analysis system for movie review in Bahasa Indonesia using naive bayes classifier method

    NASA Astrophysics Data System (ADS)

    Nurdiansyah, Yanuar; Bukhori, Saiful; Hidayat, Rahmad

    2018-04-01

    Sentiment found in documents can be used in many ways; one example is the sentiment expressed in product or service reviews. It is therefore important to be able to process and extract such textual data from documents. We propose a system that classifies review documents into two classes: positive sentiment and negative sentiment. We use the Naive Bayes Classifier method in this document classification system. We chose Movienthusiast, a website of movie reviews in Bahasa Indonesia, as the source of our review documents. From there, we collected 1201 movie reviews: 783 positive and 418 negative, which we use as the dataset for this machine learning classifier. The classification accuracy averaged 88.37% over five measurement attempts on this dataset.
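    A minimal sketch of the Naive Bayes classifier method used here, with multinomial word counts and add-one smoothing; the tiny training documents below are hypothetical tokens, not the Movienthusiast data.

```python
import math
from collections import Counter

def train_nb(docs):
    """Multinomial naive Bayes with add-one smoothing.
    `docs` maps a class label to a list of tokenised documents."""
    vocab = {w for texts in docs.values() for t in texts for w in t}
    model = {}
    for label, texts in docs.items():
        counts = Counter(w for t in texts for w in t)
        total = sum(counts.values())
        model[label] = (math.log(len(texts)),
                        {w: math.log((counts[w] + 1) / (total + len(vocab)))
                         for w in vocab})
    return model

def classify_nb(model, tokens):
    """Pick the class with the highest posterior log-score;
    out-of-vocabulary tokens are ignored."""
    def score(label):
        logprior, logcond = model[label]
        return logprior + sum(logcond.get(w, 0.0) for w in tokens)
    return max(model, key=score)

docs = {
    "positive": [["great", "acting", "loved", "it"],
                 ["great", "plot", "loved", "the", "film"]],
    "negative": [["boring", "plot", "hated", "it"],
                 ["terrible", "acting", "boring", "film"]],
}
model = train_nb(docs)
pred = classify_nb(model, ["loved", "the", "acting"])
```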

  8. Multiple Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2010-01-01

    A new multiple-classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies compared to previously proposed classification techniques.

  9. E-Nose Vapor Identification Based on Dempster-Shafer Fusion of Multiple Classifiers

    NASA Technical Reports Server (NTRS)

    Li, Winston; Leung, Henry; Kwan, Chiman; Linnell, Bruce R.

    2005-01-01

    Electronic nose (e-nose) vapor identification is an efficient approach to monitor air contaminants in space stations and shuttles in order to ensure the health and safety of astronauts. Data preprocessing (measurement denoising and feature extraction) and pattern classification are important components of an e-nose system. In this paper, a wavelet-based denoising method is applied to filter the noisy sensor measurements. Transient-state features are then extracted from the denoised sensor measurements, and are used to train multiple classifiers such as multi-layer perceptrons (MLP), support vector machines (SVM), k-nearest neighbor (KNN), and Parzen classifier. The Dempster-Shafer (DS) technique is used at the end to fuse the results of the multiple classifiers to get the final classification. Experimental analysis based on real vapor data shows that the wavelet denoising method can remove both random noise and outliers successfully, and the classification rate can be improved by using classifier fusion.
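    Dempster-Shafer fusion combines per-classifier beliefs with Dempster's rule of combination. A minimal sketch over a two-vapor frame of discernment follows; the mass values are invented, and in practice each classifier's output would first be converted into such masses.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination.  Masses are dicts mapping
    frozenset hypotheses to belief mass; conflicting mass (empty
    intersections) is renormalised away."""
    combined, conflict = {}, 0.0
    for (h1, w1), (h2, w2) in product(m1.items(), m2.items()):
        inter = h1 & h2
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    return {h: w / (1.0 - conflict) for h, w in combined.items()}

# Two classifiers both lean towards vapor 'A' with different confidence;
# their fused belief in 'A' exceeds either classifier's alone.
A, B = frozenset("A"), frozenset("B")
theta = A | B                      # ignorance: mass on the whole frame
m1 = {A: 0.6, theta: 0.4}
m2 = {A: 0.7, B: 0.1, theta: 0.2}
fused = dempster_combine(m1, m2)
```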

  10. Feature selection and classification of multiparametric medical images using bagging and SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Resnick, Susan M.; Davatzikos, Christos

    2008-03-01

    This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.
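    Bagging itself is independent of the base learner. The sketch below uses a deliberately weak one-feature threshold "stump" (not the SVMs of the paper) to show the bootstrap-resample-then-vote structure; all data are invented.

```python
import numpy as np

def bagged_predict(X, y, X_test, n_estimators=25, seed=0):
    """Bootstrap aggregating (bagging) with a trivial base learner:
    a one-feature mean-threshold 'stump'.  Each stump is fit on a
    bootstrap resample; predictions are combined by majority vote."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X_test = np.asarray(X_test, dtype=float)
    votes = np.zeros(len(X_test))
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), len(X))     # bootstrap sample
        thr = X[idx, 0].mean()                    # "fit" the stump
        # orient the stump so it agrees with the resampled labels
        resample_pred = (X[idx, 0] > thr).astype(int)
        flip = (resample_pred != y[idx]).mean() > 0.5
        stump = (X_test[:, 0] > thr).astype(int)
        votes += 1 - stump if flip else stump
    return (votes / n_estimators >= 0.5).astype(int)

X = [[0.1], [0.2], [0.3], [0.8], [0.9], [1.0]]
y = [0, 0, 0, 1, 1, 1]
pred = bagged_predict(X, y, [[0.15], [0.95]])
```

The paper additionally tunes the base classifiers on the left-out (out-of-bag) samples of each bootstrap draw, which this sketch omits.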

  11. Automated aural classification used for inter-species discrimination of cetaceans.

    PubMed

    Binder, Carolyn M; Hines, Paul C

    2014-04-01

    Passive acoustic methods are in widespread use to detect and classify cetacean species; however, passive acoustic systems often suffer from large false detection rates resulting from numerous transient sources. To reduce the acoustic analyst workload, automatic recognition methods may be implemented in a two-stage process. First, a general automatic detector is implemented that produces many detections to ensure cetacean presence is noted. Then an automatic classifier is used to significantly reduce the number of false detections and classify the cetacean species. This process requires development of a robust classifier capable of performing inter-species classification. Because human analysts can aurally discriminate species, an automated aural classifier that uses perceptual signal features was tested on a cetacean data set. The classifier successfully discriminated between four cetacean species (bowhead, humpback, North Atlantic right, and sperm whales) with 85% accuracy. It also performed well (100% accuracy) in discriminating sperm whale clicks from right whale gunshots. An accuracy of 92% and an area under the receiver operating characteristic curve of 0.97 were obtained for the relatively challenging bowhead and humpback recognition case. These results demonstrate that the perceptual features employed by the aural classifier provide powerful discrimination cues for inter-species classification of cetaceans.

  12. Minimum distance classification in remote sensing

    NASA Technical Reports Server (NTRS)

    Wacker, A. G.; Landgrebe, D. A.

    1972-01-01

    The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
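    A minimal sketch of the parametric minimum distance classifier: each class is summarised by its training mean, and a test vector is assigned to the class with the nearest mean. The two-band "spectral" values and crop labels below are invented.

```python
import numpy as np

def minimum_distance_classify(X_train, y_train, X_test):
    """Minimum-distance-to-class-means classifier: each test vector is
    assigned the class whose training mean is nearest (Euclidean)."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    classes = sorted(set(y_train))
    means = np.array([
        X_train[[i for i, c in enumerate(y_train) if c == k]].mean(axis=0)
        for k in classes])
    # pairwise distances: test vectors x class means
    d = np.linalg.norm(X_test[:, None, :] - means[None, :, :], axis=2)
    return [classes[j] for j in d.argmin(axis=1)]

# Toy two-band 'spectral' data for two crop classes.
X = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],    # wheat
     [3.0, 3.2], [2.9, 3.1], [3.1, 2.8]]    # corn
y = ["wheat"] * 3 + ["corn"] * 3
pred = minimum_distance_classify(X, y, [[1.0, 1.0], [3.0, 3.0]])
```

Its appeal in remote sensing is computational: classification costs one distance per class, versus a full covariance-weighted likelihood per class for maximum likelihood.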

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wurtz, R.; Kaplan, A.

    Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-realized statistical classifiers rely on a comprehensive set of tools for design, construction, and implementation. Advances in PSD rely on improvements to the implemented algorithm, and can be accelerated by adopting conventional statistical-classifier and machine learning methods. This paper provides the reader with a glossary of classifier-building elements and their functions in a fully designed and operational classifier framework, which can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier's receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant to realistic applications.
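
    Reporting a ROC curve, as the record recommends, reduces to ranking classifier scores: the area under the curve equals the probability that a randomly chosen signal event outscores a randomly chosen background event (the Mann-Whitney statistic). A minimal sketch with illustrative scores, not PSD data:

    ```python
    def roc_auc(scores, labels):
        """labels: 1 for signal (e.g. neutron), 0 for background (e.g. gamma).
        Rank-based AUC; ties between a signal and background score count 0.5."""
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
    labels = [1,   1,   0,   1,   0,   0]
    print(roc_auc(scores, labels))  # -> 0.888... (one background event outranks a signal event)
    ```

    The GRR operating point the paper mentions is then one point on this curve: fix the background (gamma) rejection level and read off the corresponding signal acceptance.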

  14. Applying under-sampling techniques and cost-sensitive learning methods on risk assessment of breast cancer.

    PubMed

    Hsu, Jia-Lien; Hung, Ping-Cheng; Lin, Hung-Yen; Hsieh, Chung-Ho

    2015-04-01

    Breast cancer is one of the most common causes of cancer mortality. Early detection through mammography screening could significantly reduce mortality from breast cancer. However, most screening methods consume large amounts of resources. We propose a computational model, based solely on personal health information, for breast cancer risk assessment; it can serve as a pre-screening program in a low-cost setting. In our study, the data set, consisting of 3976 records, was collected from Taipei City Hospital between January 1 and December 31, 2008. Based on this dataset, we first apply sampling techniques and a dimension-reduction method to preprocess the data. We then construct various kinds of classifiers (basic classifiers, ensemble methods, and cost-sensitive methods) to predict the risk. The cost-sensitive method with a random forest classifier achieves a recall (sensitivity) of 100%. At this recall, the precision (positive predictive value, PPV) and specificity of the cost-sensitive method with the random forest classifier were 2.9% and 14.87%, respectively. In our study, we build a breast cancer risk assessment model using data mining techniques. Our model has the potential to serve as an assisting tool in breast cancer screening.

  15. Improving lung cancer prognosis assessment by incorporating synthetic minority oversampling technique and score fusion method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, Shiju; Qian, Wei; Guan, Yubao

    2016-06-15

    Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLC patients by integrating oversampling, feature selection, and score fusion techniques, and to develop an optimal prediction model. Methods: A dataset involving 94 early stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3-yr disease-free survival (DFS) after surgery. Among the 94 patients, 74 remained disease-free and 20 had cancer recurrence. Applying a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers, and also tested fusion methods to combine QI and CB based prediction results. Results: Using a leave-one-case-out cross-validation method, the computed areas under a receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061 when using the QI and CB based classifiers, respectively. By fusing the scores generated by the two classifiers, the AUC significantly increased to 0.859 ± 0.052 (p < 0.05), with an overall prediction accuracy of 89.4%. Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled the RBFN based classifier to yield improved prediction accuracy.
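
    SMOTE, as used above, generates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The following is a pure-Python sketch on toy 2-D points, not the study's data or implementation:

    ```python
    import random

    def smote(minority, n_new, k=2, seed=0):
        """Generate n_new synthetic samples from the minority-class vectors."""
        rng = random.Random(seed)
        out = []
        for _ in range(n_new):
            x = rng.choice(minority)
            # k nearest minority neighbors of x (excluding x itself)
            nbrs = sorted((m for m in minority if m is not x),
                          key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))[:k]
            nb = rng.choice(nbrs)
            t = rng.random()  # interpolation factor in [0, 1)
            out.append([a + t * (b - a) for a, b in zip(x, nb)])
        return out

    # Toy minority class (e.g. the 20 recurrence cases in the study's setting)
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    synthetic = smote(minority, n_new=4)
    print(len(synthetic))  # -> 4
    ```

    Because each synthetic point lies on a segment between two real minority samples, oversampling this way enlarges the minority class without simply duplicating records.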

  16. MUSQA: a CS method to build a multi-standard quality management system

    NASA Astrophysics Data System (ADS)

    Cros, Elizabeth; Sneed, Isabelle

    2002-07-01

    CS Communication & Systèmes, through its long experience of quality management, has built and evolved its Quality Management System according to client requirements; norms, standards and models (ISO, DO178, ECSS, CMM, ...); evolving norms (the transition from ISO 9001:1994 to ISO 9001:2000); and the TQM approach currently being deployed. The aim of this paper is to show how, from this enriching and instructive experience, CS has defined and formalised its method, MuSQA (Multi-Standard Quality Approach). This method can be used to build a new Quality Management System or to simplify and unify an existing one. MuSQA's objective is to provide any organisation with an open Quality Management System that can evolve easily and becomes a useful instrument for everyone, operational as well as non-operational staff.

  17. Naive Bayes as opinion classifier to evaluate students satisfaction based on student sentiment in Twitter Social Media

    NASA Astrophysics Data System (ADS)

    Candra Permana, Fahmi; Rosmansyah, Yusep; Setiawan Abdullah, Atje

    2017-10-01

    Student activity on social media can provide implicit knowledge and new perspectives for an educational system. Sentiment analysis is a part of text mining that can help analyze and classify opinion data. This research uses text mining with the naive Bayes method as an opinion classifier, to serve as an alternative method for evaluating students' satisfaction with an educational institution. Based on test results, this system can classify opinions in Bahasa Indonesia using naive Bayes with an accuracy of 84%, and the difference between the existing system and the proposed system in evaluating students' satisfaction with the learning process is only 16.49%.
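
    A multinomial naive Bayes opinion classifier of the kind described can be sketched in a few lines. The toy corpus below is illustrative English rather than the study's Bahasa Indonesia tweets:

    ```python
    import math
    from collections import Counter

    def train(docs):
        """docs: list of (tokens, label). Returns class priors and word counts."""
        priors, counts, totals = Counter(), {}, Counter()
        for tokens, label in docs:
            priors[label] += 1
            counts.setdefault(label, Counter()).update(tokens)
            totals[label] += len(tokens)
        vocab = {w for c in counts.values() for w in c}
        return priors, counts, totals, vocab

    def classify(tokens, model):
        priors, counts, totals, vocab = model
        n = sum(priors.values())
        best, best_lp = None, float("-inf")
        for label in priors:
            lp = math.log(priors[label] / n)
            for w in tokens:
                # Laplace (add-one) smoothing over the shared vocabulary
                lp += math.log((counts[label][w] + 1) / (totals[label] + len(vocab)))
            if lp > best_lp:
                best, best_lp = label, lp
        return best

    docs = [(["great", "class", "helpful"], "pos"),
            (["boring", "slow", "class"], "neg"),
            (["helpful", "clear"], "pos"),
            (["confusing", "boring"], "neg")]
    model = train(docs)
    print(classify(["helpful", "clear", "class"], model))  # -> pos
    ```

    Working in log space avoids numeric underflow when opinions contain many words, and the add-one smoothing keeps unseen words from zeroing out a class.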

  18. A Label Propagation Approach for Detecting Buried Objects in Handheld GPR Data

    DTIC Science & Technology

    2016-04-17

    regions of interest that correspond to locations with anomalous signatures. Second, a classifier (or an ensemble of classifiers) is used to assign a...investigated for almost two decades and several classifiers have been developed. Most of these methods are based on the supervised learning paradigm where...labeled target and clutter signatures are needed to train a classifier to discriminate between the two classes. Typically, large and diverse labeled

  19. Advanced Methods for Passive Acoustic Detection, Classification, and Localization of Marine Mammals

    DTIC Science & Technology

    2013-09-30

    N0001411WX21394 Steve W. Martin SPAWAR Systems Center Pacific 53366 Front St. San Diego, CA 92152-6551 phone: (619) 553-9882 email: Steve.W.Martin...multiclass support vector machine (SVM) classifier was previously developed (Jarvis et al. 2008). This classifier both detects and classifies echolocation...whales. Here Moretti's group, particularly S. Jarvis, will improve the SVM classifier by resolving confusion between species whose clicks overlap in

  20. Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.

    PubMed

    Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan

    2017-07-01

    Detection and classification of breast lesions using mammographic images is one of the most difficult tasks in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied to dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary, which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.

  1. A Proposed Defect Tracking Model for Classifying the Inserted Defect Reports to Enhance Software Quality Control

    PubMed Central

    Khedr, Ayman E.; Sayed, Mostafa

    2013-01-01

    Conflict of interest: none declared. Defect tracking systems play an important role in software development organizations, as they store historical information about defects. Much research on defect tracking models and systems has aimed to enhance their tracking capabilities and to adapt them to new technology. Furthermore, different studies have classified bugs in a step-by-step manner to give a clear picture and an applicable method for detecting such bugs. This paper presents a new proposed defect tracking model for classifying inserted defect reports in a step-by-step method, to further enhance software quality. PMID:24039334

  2. EVALUATION OF SURGICAL TREATMENT OF FRACTURES OF THORACOLUMBAR SPINE WITH THIRD-GENERATION MATERIAL FOR INTERNAL FIXATION

    PubMed Central

    Bortoletto, Adalberto; Rodrigues, Luiz Cláudio Lacerda; Matsumoto, Marcelo Hide

    2015-01-01

    Objective: To evaluate the functional results from patients with surgical fractures in the thoracolumbar spine. Method: A prospective study including 100 patients with spinal fractures in the thoracic and lumbar segments was conducted. The lesions were classified in accordance with the AO system, and the patients were treated surgically. The presence of early kyphosis and its evolution after the surgical intervention, and the presence of postoperative pain and its evolution up to the 24th week after the surgery, were evaluated. We compared our data with the literature. Results: One hundred surgical patients were analyzed, of which 37 were type A, 46 were type B and 17 were Type C. Patients who presented Frankel A kept their clinical status, but patients with Frankel B or higher evolved with some improvement. The average improvement in pain based on a visual analog scale was more than four points. All the patients were able to return to their daily routine activities, although we did not take the return to work to be an assessment criterion. Conclusion: Despite controversy regarding the indications for surgery in cases of fractured spine, we believe that the method that we used was satisfactory because of the good results and low complication rate. However, more randomized prospective studies with longer follow-up are needed in order to evaluate this type of fixation. PMID:27047822

  3. Applying analytic hierarchy process to assess healthcare-oriented cloud computing service systems.

    PubMed

    Liao, Wen-Hwa; Qiu, Wan-Li

    2016-01-01

    Numerous differences exist between the healthcare industry and other industries. Difficulties in the business operation of the healthcare industry have continually increased because of the volatility and importance of health care, changes to and requirements of health insurance policies, and the statuses of healthcare providers, which are typically considered not-for-profit organizations. Moreover, because of the financial risks associated with constant changes in healthcare payment methods and constantly evolving information technology, healthcare organizations must continually adjust their business operation objectives; therefore, cloud computing presents both a challenge and an opportunity. As a response to aging populations and the prevalence of the Internet in fast-paced contemporary societies, cloud computing can be used to facilitate the task of balancing the quality and costs of health care. To evaluate cloud computing service systems for use in health care, providing decision makers with a comprehensive assessment method for prioritizing decision-making factors is highly beneficial. Hence, this study applied the analytic hierarchy process, compared items related to cloud computing and health care, executed a questionnaire survey, and then classified the critical factors influencing healthcare cloud computing service systems on the basis of statistical analyses of the questionnaire results. The results indicate that the primary factor affecting the design or implementation of optimal cloud computing healthcare service systems is cost effectiveness, with the secondary factors being practical considerations such as software design and system architecture.

  4. Designing User-Centric Patient Portals: Clinician and Patients' Uses and Gratifications

    PubMed Central

    Krist, Alex H.; Aycock, Rebecca A.; Kreps, Gary L.

    2017-01-01

    Background: Legislation mandates that clinicians make patients' medical information available digitally. This has resulted in hurriedly installed patient portals that do not fully meet the needs of patients or clinicians. This study examined a specific portal, MyPreventiveCare (MPC), a patient-centered portal designed to promote preventive care to consumers, to elicit recommendations from patients and clinicians about how it could be more beneficial by uncovering their uses and gratifications (U&G). Materials and Methods: In-depth interviews with 31 patients and two clinician focus groups were conducted. Multiple methods were utilized, such as grounded theory coding to develop themes and content analysis to classify responses according to the U&G framework. Results: Four main categories emerged that users desire to be included in health portals: integration with technology (27%), coordination of care (27%), incorporation of lifestyle (26%), and increased control (20%). Additional analysis revealed that health portals are mainly utilized to fulfill cognitive and affective needs, with over 80% of recommendations falling in the cognitive and affective U&G categories: cognitive (60%), affective (21%), social integrative (10%), personal integrative (9%), and tension release (0%). Conclusions: Portals will continue to evolve and become important health communication tools if they address the user's perspective and are inclusive of new technological advances. Specifically, portals must become more user-centric, incorporate aspects of patients' lifestyles, and integrate health information technology. PMID:27333468

  5. Evolving a Method to Capture Science Stakeholder Inputs to Optimize Instrument, Payload, and Program Design

    NASA Astrophysics Data System (ADS)

    Clark, P. E.; Rilee, M. L.; Curtis, S. A.; Bailin, S.

    2012-03-01

    We are developing Frontier, a highly adaptable, stably reconfigurable, web-accessible intelligent decision engine capable of optimizing design as well as the simulating operation of complex systems in response to evolving needs and environment.

  6. Inferring Human Activity Recognition with Ambient Sound on Wireless Sensor Nodes.

    PubMed

    Salomons, Etto L; Havinga, Paul J M; van Leeuwen, Henk

    2016-09-27

    A wireless sensor network that consists of nodes with a sound sensor can be used to obtain context awareness in home environments. However, the limited processing power of wireless nodes offers a challenge when extracting features from the signal, and subsequently, classifying the source. Although multiple papers can be found on different methods of sound classification, none of these are aimed at limited hardware or take the efficiency of the algorithms into account. In this paper, we compare and evaluate several classification methods on a real sensor platform using different feature types and classifiers, in order to find an approach that results in a good classifier that can run on limited hardware. To be as realistic as possible, we trained our classifiers using sound waves from many different sources. We conclude that despite the fact that the classifiers are often of low quality due to the highly restricted hardware resources, sufficient performance can be achieved when (1) the window length for our classifiers is increased, and (2) if we apply a two-step approach that uses a refined classification after a global classification has been performed.

  7. Onboard Classifiers for Science Event Detection on a Remote Sensing Spacecraft

    NASA Technical Reports Server (NTRS)

    Castano, Rebecca; Mazzoni, Dominic; Tang, Nghia; Greeley, Ron; Doggett, Thomas; Cichy, Ben; Chien, Steve; Davies, Ashley

    2006-01-01

    Typically, data collected by a spacecraft is downlinked to Earth and pre-processed before any analysis is performed. We have developed classifiers that can be used onboard a spacecraft to identify high priority data for downlink to Earth, providing a method for maximizing the use of a potentially bandwidth limited downlink channel. Onboard analysis can also enable rapid reaction to dynamic events, such as flooding, volcanic eruptions or sea ice break-up. Four classifiers were developed to identify cryosphere events using hyperspectral images. These classifiers include a manually constructed classifier, a Support Vector Machine (SVM), a Decision Tree and a classifier derived by searching over combinations of thresholded band ratios. Each of the classifiers was designed to run in the computationally constrained operating environment of the spacecraft. A set of scenes was hand-labeled to provide training and testing data. Performance results on the test data indicate that the SVM and manual classifiers outperformed the Decision Tree and band-ratio classifiers with the SVM yielding slightly better classifications than the manual classifier.

  8. An efficient ensemble learning method for gene microarray classification.

    PubMed

    Osareh, Alireza; Shadgar, Bita

    2013-01-01

    Gene microarray analysis and classification have been demonstrated as an effective way to diagnose diseases and cancers. However, it has also been revealed that basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of the Rotation Forest and AdaBoost techniques, and in turn preserves both desirable features of an ensemble architecture, namely accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of RotBoost, other nonensemble/ensemble techniques, including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging, are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, namely Bagging and AdaBoost.
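
    RotBoost combines Rotation Forest's rotated feature subsets with AdaBoost's sample reweighting. As a hedged sketch of the boosting half only (toy 1-D data, with single-threshold decision stumps standing in for the full trees; this is not the paper's implementation), a minimal AdaBoost might look like:

    ```python
    import math

    def stump_train(X, y, w):
        """Best weighted threshold stump on a single feature; y in {-1, +1}."""
        best = None
        for thr in sorted(set(X)):
            for sign in (1, -1):
                pred = [sign if x >= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, thr, sign)
        return best

    def adaboost(X, y, rounds=5):
        n = len(X)
        w = [1.0 / n] * n
        model = []
        for _ in range(rounds):
            err, thr, sign = stump_train(X, y, w)
            err = max(err, 1e-10)  # guard against log(0) for a perfect stump
            alpha = 0.5 * math.log((1 - err) / err)
            model.append((alpha, thr, sign))
            # Reweight: increase weight on misclassified samples
            w = [wi * math.exp(-alpha * yi * (sign if x >= thr else -sign))
                 for wi, x, yi in zip(w, X, y)]
            z = sum(w)
            w = [wi / z for wi in w]
        return model

    def predict(model, x):
        s = sum(alpha * (sign if x >= thr else -sign) for alpha, thr, sign in model)
        return 1 if s >= 0 else -1

    X = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
    y = [-1, -1, -1, 1, 1, 1]
    model = adaboost(X, y)
    print([predict(model, x) for x in X])  # -> [-1, -1, -1, 1, 1, 1]
    ```

    Rotation Forest would additionally apply a PCA rotation to random feature subsets before each base learner is trained, which is where the ensemble's diversity comes from.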

  9. Locating articular cartilage in MR images

    NASA Astrophysics Data System (ADS)

    Folkesson, Jenny; Dam, Erik; Pettersen, Paola; Olsen, Ole F.; Nielsen, Mads; Christiansen, Claus

    2005-04-01

    Accurate computation of the thickness of the articular cartilage is of great importance when diagnosing and monitoring the progress of joint diseases such as osteoarthritis. A fully automated cartilage assessment method is preferable to methods using manual interaction, in order to avoid inter- and intra-observer variability. As a first step in the cartilage assessment, we present an automatic method for locating articular cartilage in knee MRI using supervised learning. The next step will be to fit a variable shape model to the cartilage, initiated at the location found using the method presented in this paper. From the model, disease markers will be extracted for the quantitative evaluation of the cartilage. The cartilage is located using an ANN classifier, where every voxel is classified as cartilage or non-cartilage based on prior knowledge of the cartilage structure. The classifier is tested using leave-one-out evaluation, and we found the average sensitivity and specificity to be 91.0% and 99.4%, respectively. The center of mass calculated from voxels classified as cartilage is similar to the corresponding value calculated from manual segmentations, which confirms that this method can find a good initial position for a shape model.
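
    The record reports per-voxel sensitivity (91.0%) and specificity (99.4%); both follow directly from confusion-matrix counts. A small sketch with toy labels (not the study's data):

    ```python
    def sens_spec(truth, pred):
        """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP). Labels are 0/1."""
        tp = sum(t == 1 and p == 1 for t, p in zip(truth, pred))
        tn = sum(t == 0 and p == 0 for t, p in zip(truth, pred))
        fn = sum(t == 1 and p == 0 for t, p in zip(truth, pred))
        fp = sum(t == 0 and p == 1 for t, p in zip(truth, pred))
        return tp / (tp + fn), tn / (tn + fp)

    # Toy voxel labels: 1 = cartilage, 0 = non-cartilage
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    pred  = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(sens_spec(truth, pred))  # -> (0.75, 0.8333...)
    ```

    In voxelwise segmentation the negative class vastly outnumbers the positive one, so a very high specificity (here 99.4%) can still admit many false-positive voxels; reporting both numbers, as the record does, is the right call.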

  10. Arabic Supervised Learning Method Using N-Gram

    ERIC Educational Resources Information Center

    Sanan, Majed; Rammal, Mahmoud; Zreik, Khaldoun

    2008-01-01

    Purpose: Recently, classification of Arabic documents has become a real problem for juridical centers. In this case, some of the Lebanese official journal documents are classified, and the center has to classify new documents based on these documents. This paper aims to study and explain the useful application of a supervised learning method on Arabic texts…

  11. Two Methods for Classifying Jobs into Equal Employment Opportunity Categories. Working Paper 83/84-4-21.

    ERIC Educational Resources Information Center

    Potter, Penny F.; Graham-Moore, Brian E.

    Most organizations planning to assess adverse impact or perform a stock analysis for affirmative action planning must correctly classify their jobs into appropriate occupational categories. Two methods of job classification were assessed in a combination archival and field study. Classification results from expert judgment of functional job…

  12. A new analytical method for the classification of time-location data obtained from the global positioning system (GPS).

    PubMed

    Kim, Taehyun; Lee, Kiyoung; Yang, Wonho; Yu, Seung Do

    2012-08-01

    Although the global positioning system (GPS) has been suggested as an alternative way to determine time-location patterns, its use has been limited. The purpose of this study was to evaluate a new analytical method of classifying time-location data obtained by GPS. A field technician carried a GPS device while simulating various scripted activities and recorded all movements to the second in an activity diary. The GPS device recorded positional data once every 15 s. The daily monitoring was repeated 18 times. The time-location data obtained by the GPS were compared with the activity diary to determine selection criteria for the classification of the GPS data. The GPS data were classified into four microenvironments (residential indoors, other indoors, in transit, and walking outdoors); the selection criteria were the number of satellites used (used-NSAT), speed, and distance from the residence. The GPS data were classified as indoors when the used-NSAT was below 9. Data classified as indoors were further classified as residential indoors when the distance from the residence was less than 40 m; otherwise, they were classified as other indoors. Data classified as outdoors were further classified as in transit when the speed exceeded 2.5 m/s; otherwise, they were classified as walking outdoors. The average simple percentage agreement between the time-location classifications and the activity diary was 84.3 ± 12.4%, and the kappa coefficient was 0.71. The average differences between the time diary and the GPS results were 1.6 ± 2.3 h for the time spent residential indoors, 0.9 ± 1.7 h for the time spent other indoors, 0.4 ± 0.4 h for the time spent in transit, and 0.8 ± 0.5 h for the time spent walking outdoors. This method can be used to determine time-activity patterns in exposure-science studies.
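
    The classification rules reported above translate directly into code. The thresholds (used-NSAT below 9, 40 m, 2.5 m/s) are the ones stated in the abstract; the function itself is our illustrative sketch:

    ```python
    def classify_gps(used_nsat, distance_from_home_m, speed_ms):
        """Classify one 15-s GPS fix into one of four microenvironments."""
        if used_nsat < 9:                      # weak satellite fix -> indoors
            if distance_from_home_m < 40:
                return "residential indoors"
            return "other indoors"
        if speed_ms > 2.5:                     # outdoors and fast -> in transit
            return "transit"
        return "walking outdoors"

    print(classify_gps(used_nsat=6, distance_from_home_m=10, speed_ms=0.0))
    # -> residential indoors
    print(classify_gps(used_nsat=11, distance_from_home_m=500, speed_ms=12.0))
    # -> transit
    ```

    The used-NSAT test works because buildings attenuate satellite signals, so a low satellite count is itself a proxy for being indoors, before any map matching is done.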

  13. Parallel processing implementations of a contextual classifier for multispectral remote sensing data

    NASA Technical Reports Server (NTRS)

    Siegel, H. J.; Swain, P. H.; Smith, B. W.

    1980-01-01

    Contextual classifiers are being developed as a method to exploit the spatial/spectral context of a pixel to achieve accurate classification. Classification algorithms such as the contextual classifier typically require large amounts of computation time. One way to reduce the execution time of these tasks is through the use of parallelism. The applicability of the CDC flexible processor system and of a proposed multimicroprocessor system (PASM) for implementing contextual classifiers is examined.

  14. Performance comparison of classifiers for differentiation among obstructive lung diseases based on features of texture analysis at HRCT

    NASA Astrophysics Data System (ADS)

    Lee, Youngjoo; Seo, Joon Beom; Kang, Bokyoung; Kim, Dongil; Lee, June Goo; Kim, Song Soo; Kim, Namkug; Kang, Suk Ho

    2007-03-01

    The performance of classification algorithms for differentiating among obstructive lung diseases based on features from texture analysis of HRCT (High Resolution Computerized Tomography) images was compared. HRCT can provide accurate information for the detection of various obstructive lung diseases, including centrilobular emphysema, panlobular emphysema and bronchiolitis obliterans. Features on HRCT images can be subtle, however, particularly in the early stages of disease, and image-based diagnosis is subject to inter-observer variation. To automate the diagnosis and improve accuracy, we compared three types of automated classification systems, the naïve Bayesian classifier, ANN (Artificial Neural Net) and SVM (Support Vector Machine), based on their ability to differentiate among normal lung and three types of obstructive lung disease. To assess the performance of these three classifiers, five-fold cross-validation with five randomly chosen groups was used. For a more robust result, each validation was repeated 100 times. SVM showed the best performance, with 86.5% overall sensitivity, significantly different from the other classifiers (one-way ANOVA, p<0.01). We address the characteristics of each classifier affecting performance and the issue of which classifier is the most suitable for clinical applications, and propose an appropriate method to choose the best classifier and determine its optimal parameters for optimal disease discrimination. These results can be applied to classifiers for differentiation of other diseases.

  15. Heidelberg Retina Tomograph 3 machine learning classifiers for glaucoma detection

    PubMed Central

    Townsend, K A; Wollstein, G; Danks, D; Sung, K R; Ishikawa, H; Kagemann, L; Gabriele, M L; Schuman, J S

    2010-01-01

    Aims: To assess performance of classifiers trained on Heidelberg Retina Tomograph 3 (HRT3) parameters for discriminating between healthy and glaucomatous eyes. Methods: Classifiers were trained using HRT3 parameters from 60 healthy subjects and 140 glaucomatous subjects. The classifiers were trained on all 95 variables and on smaller sets created with backward elimination. Seven types of classifiers, including Support Vector Machines with radial basis (SVM-radial) and Recursive Partitioning and Regression Trees (RPART), were trained on the parameters. The area under the ROC curve (AUC) was calculated for classifiers, individual parameters and HRT3 glaucoma probability scores (GPS). Classifier AUCs and leave-one-out accuracy were compared with the highest individual parameter and GPS AUCs and accuracies. Results: The highest AUC and accuracy for an individual parameter were 0.848 and 0.79, for vertical cup/disc ratio (vC/D). For GPS, global GPS performed best, with AUC 0.829 and accuracy 0.78. SVM-radial with all parameters showed significant improvement over global GPS and vC/D, with AUC 0.916 and accuracy 0.85. RPART with all parameters provided significant improvement over global GPS with AUC 0.899, and significant improvement over global GPS and vC/D with accuracy 0.875. Conclusions: Machine learning classifiers of HRT3 data provide significant enhancement over current methods for detection of glaucoma. PMID:18523087

  16. Length-independent structural similarities enrich the antibody CDR canonical class model.

    PubMed

    Nowak, Jaroslaw; Baker, Terry; Georges, Guy; Kelm, Sebastian; Klostermann, Stefan; Shi, Jiye; Sridharan, Sudharsan; Deane, Charlotte M

    2016-01-01

    Complementarity-determining regions (CDRs) are antibody loops that make up the antigen binding site. Here, we show that all CDR types have structurally similar loops of different lengths. Based on these findings, we created length-independent canonical classes for the non-H3 CDRs. Our length variable structural clusters show strong sequence patterns suggesting either that they evolved from the same original structure or result from some form of convergence. We find that our length-independent method not only clusters a larger number of CDRs, but also predicts canonical class from sequence better than the standard length-dependent approach. To demonstrate the usefulness of our findings, we predicted cluster membership of CDR-L3 sequences from 3 next-generation sequencing datasets of the antibody repertoire (over 1,000,000 sequences). Using the length-independent clusters, we can structurally classify an additional 135,000 sequences, which represents a ∼20% improvement over the standard approach. This suggests that our length-independent canonical classes might be a highly prevalent feature of antibody space, and could substantially improve our ability to accurately predict the structure of novel CDRs identified by next-generation sequencing.

  17. Arrogance analysis of several typical pattern recognition classifiers

    NASA Astrophysics Data System (ADS)

    Jing, Chen; Xia, Shengping; Hu, Weidong

    2007-04-01

    Various kinds of classification methods have been developed. However, most of these classical methods, such as Back-Propagation (BP), Bayesian methods, the Support Vector Machine (SVM), and the Self-Organizing Map (SOM), are arrogant. Arrogance, for a human, means that his decisions, even mistaken ones, overstate his actual experience; accordingly, we say that he is arrogant if he frequently makes arrogant decisions. Likewise, some classical pattern classifiers exhibit a similar characteristic of arrogance. Given an input feature vector, we say a classifier is arrogant in its classification if its veracity is high yet its experience is low. Typically, for a new sample that is distinguishable from the original training samples, traditional classifiers recognize it as one of the known targets. Clearly, arrogance in classification is an undesirable attribute. Conversely, a classifier is non-arrogant in its classification if there is a reasonable balance between its veracity and its experience. Inquisitiveness is, in many ways, the opposite of arrogance. In nature, inquisitiveness is an eagerness for knowledge characterized by the drive to question and to seek a deeper understanding. The human capacity to doubt present beliefs allows us to acquire new experiences and to learn from our mistakes. Within the discrete world of computers, inquisitive pattern recognition is the constructive investigation and exploitation of conflict in information. Thus, we quantify this balance and discuss new techniques that will detect arrogance in a classifier.

  18. Detection of Road Markings Recorded in In-Vehicle Camera Images by Using Position-Dependent Classifiers

    NASA Astrophysics Data System (ADS)

    Noda, Masafumi; Takahashi, Tomokazu; Deguchi, Daisuke; Ide, Ichiro; Murase, Hiroshi; Kojima, Yoshiko; Naito, Takashi

    In this study, we propose a method for detecting road markings in images captured by an in-vehicle camera using position-dependent classifiers. Road markings are symbols painted on the road surface that help prevent traffic accidents and keep traffic flowing smoothly. Driver-support systems that detect road markings are therefore in demand, e.g., systems that warn the driver when a traffic sign is overlooked or that assist in stopping the vehicle. Detecting road markings is difficult because their appearance changes with the actual traffic conditions; for example, their shape and resolution vary. These appearance variations depend on the positional relation between the vehicle and the road marking, and on the vehicle's posture. Although the variations are large across an entire image, they are relatively small within a local area of the image. We therefore try to improve detection performance by taking these local appearance variations into account, using a classifier that depends on image position. Further, to train the classifiers efficiently, we propose a generative learning method that takes into consideration the positional relation between the vehicle and the road markings as well as the vehicle's posture. Experimental results showed that the proposed method outperforms a method that uses a single classifier.
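    A minimal sketch of the position-dependent idea, under the assumption (ours, for illustration) that the image is split into a fixed grid with one independently trained classifier per cell:

```python
def cell_of(x, y, img_w, img_h, grid=3):
    """Map a pixel position to the index of its grid cell."""
    cx = min(x * grid // img_w, grid - 1)
    cy = min(y * grid // img_h, grid - 1)
    return cy * grid + cx

class PositionDependentDetector:
    """One classifier per image cell; each models only local appearance."""

    def __init__(self, grid=3):
        self.grid = grid
        self.classifiers = {}                  # cell index -> classifier

    def train_cell(self, cell, classifier):
        self.classifiers[cell] = classifier

    def detect(self, x, y, patch, img_w, img_h):
        cell = cell_of(x, y, img_w, img_h, self.grid)
        clf = self.classifiers.get(cell)
        return clf(patch) if clf else False

det = PositionDependentDetector(grid=3)
# Hypothetical per-cell rules: markings near the bottom appear large and bright.
det.train_cell(7, lambda p: sum(p) / len(p) > 100)   # bottom-center cell
det.train_cell(1, lambda p: sum(p) / len(p) > 150)   # top-center: stricter
print(det.detect(320, 400, [120] * 16, img_w=640, img_h=480))  # True
```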

  19. Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords.

    PubMed

    Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao

    2015-01-01

    For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. A classifier is trained on data represented by several features to decide whether a protein pair mentioned in a sentence interacts. A specific keyword directly related to interaction, such as "bind" or "interact", plays an important role in training classifiers; we call such a keyword a dominant keyword, since it affects the capability of the classifier. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. We therefore propose a method for predicting, for each instance, whether a keyword is dominant. In this method, a keyword that yields imbalanced classification results is initially assumed to be a dominant keyword. Classifiers are then trained separately on the instances with and without the assumed dominant keywords, and the validity of the assumption is evaluated based on the classification results of the generated classifiers. The assumption is updated according to this evaluation, and repeating the process increases the prediction accuracy of the dominant keywords. Experimental results on five corpora show the effectiveness of the proposed method with dominant keyword prediction.
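    The iterative assumption-update loop described above can be sketched as follows; the purity criterion here is a simplified stand-in for the paper's classifier-based evaluation:

```python
def purity(instances):
    """Fraction of instances carrying the majority class label."""
    labels = [i["label"] for i in instances]
    return max(labels.count(l) for l in set(labels)) / len(labels)

def refine_dominant_keywords(instances, candidates, rounds=5):
    """Tentatively assume each candidate keyword is dominant, keep it only if
    the instances containing it classify more cleanly than the corpus as a
    whole, and repeat until the assumed set stabilizes."""
    dominant = set(candidates)                       # initial assumption
    base = purity(instances)
    for _ in range(rounds):
        kept = set()
        for kw in dominant:
            with_kw = [i for i in instances if kw in i["words"]]
            if with_kw and purity(with_kw) > base:   # keyword sharpens the split
                kept.add(kw)
        if kept == dominant:                         # assumption has converged
            break
        dominant = kept
    return dominant

docs = [
    {"words": {"bind", "protein"}, "label": 1},
    {"words": {"bind", "cell"}, "label": 1},
    {"words": {"cell"}, "label": 0},
    {"words": {"protein"}, "label": 0},
]
print(refine_dominant_keywords(docs, {"bind", "cell", "protein"}))  # {'bind'}
```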

  20. Using methods from the data mining and machine learning literature for disease classification and prediction: A case study examining classification of heart failure sub-types

    PubMed Central

    Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.

    2014-01-01

    Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592
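    As an illustration of one of the ensemble methods compared above, here is a minimal pure-Python bagging of decision stumps (bootstrap resamples plus a majority vote); it sketches the general technique, not the study's implementation:

```python
import random

def train_stump(data):
    """Decision stump on a 1-D feature: predict class 1 when x > threshold,
    choosing the threshold with the highest training accuracy."""
    xs = sorted({x for x, _ in data})
    candidates = [xs[0] - 1] + xs              # include a threshold below everything
    best_thr, best_acc = None, -1.0
    for thr in candidates:
        acc = sum((x > thr) == bool(y) for x, y in data) / len(data)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr

def bagging(data, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample; predict by majority vote."""
    rng = random.Random(seed)
    stumps = [train_stump([rng.choice(data) for _ in data])
              for _ in range(n_estimators)]
    def predict(x):
        votes = sum(x > thr for thr in stumps)
        return int(votes * 2 > len(stumps))
    return predict

data = [(i, int(i > 5)) for i in range(11)]    # class 1 above 5, class 0 below
clf = bagging(data)
print(clf(10), clf(0))  # 1 0
```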

  1. Ensemble positive unlabeled learning for disease gene identification.

    PubMed

    Yang, Peng; Li, Xiaoli; Chua, Hon-Nian; Kwoh, Chee-Keong; Ng, See-Kiong

    2014-01-01

    An increasing number of genes have been experimentally confirmed in recent years as causative genes of various human diseases. This newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive-unlabeled learning (PU learning) methods, which require only a positive training set P (confirmed disease genes) and an unlabeled set U (the unknown candidate genes) instead of a negative training set N, have been shown to be effective in uncovering new disease genes in this scenario. However, using only a single source of data for prediction is susceptible to bias from incompleteness and noise in the genomic data, and a single machine learning predictor is prone to bias caused by the inherent limitations of individual methods. In this paper, we propose an effective PU learning framework that integrates multiple biological data sources and an ensemble of powerful machine learning classifiers for disease gene identification. Our proposed method integrates data from multiple biological sources to train PU learning classifiers. A novel ensemble-based PU learning method, EPU, is then used to combine multiple PU learning classifiers to achieve accurate and robust disease gene predictions. Our evaluation experiments across six disease groups showed that EPU achieved significantly better results than various state-of-the-art prediction methods as well as ensemble learning classifiers. By integrating multiple biological data sources for training and the outputs of an ensemble of PU learning classifiers for prediction, we minimize the potential bias and errors of individual data sources and machine learning algorithms, achieving more accurate and robust disease gene predictions. Our EPU method also provides an effective framework for integrating additional biological and computational resources to further improve disease gene prediction.
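    Two ingredients of the approach can be sketched generically; the reliable-negative step is a common PU-learning heuristic and the similarity function is a toy stand-in, not the EPU method itself:

```python
def reliable_negatives(P, U, similarity, threshold=0.3):
    """Step 1 of many PU learners: treat unlabeled genes that resemble no
    known positive as reliable negatives."""
    return [u for u in U if max(similarity(u, p) for p in P) < threshold]

def ensemble_predict(gene, classifiers):
    """Step 2, EPU-style: majority vote over independently trained PU classifiers."""
    votes = sum(clf(gene) for clf in classifiers)
    return int(votes * 2 > len(classifiers))

# Toy usage with genes as numbers and similarity decaying with distance.
sim = lambda a, b: max(0.0, 1.0 - abs(a - b))
print(reliable_negatives([1.0, 0.9], [0.95, 0.1, 0.2], sim))  # [0.1]
```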

  2. Self-similarity Clustering Event Detection Based on Triggers Guidance

    NASA Astrophysics Data System (ADS)

    Zhang, Xianfei; Li, Bicheng; Tian, Yuxuan

    Traditional methods for Event Detection and Characterization (EDC) treat event detection as a classification problem. They use words as training samples for the classifier, which can lead to an imbalance between positive and negative samples; moreover, this approach suffers from data sparseness when the corpus is small. Rather than classifying events with words as samples, this paper clusters events when judging event types. Guided by event triggers, it uses self-similarity to converge on the value of K in the K-means algorithm and thereby optimizes the clustering. Then, by combining named entities with their relative position information, the method pinpoints the event type. The new method avoids the dependence on event templates found in traditional methods, and its event detection results can be readily used in automatic text summarization, text retrieval, and topic detection and tracking.
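    A minimal sketch of trigger-guided selection of K (our simplification, not the paper's algorithm): K starts at the number of trigger-derived seeds and grows, by splitting the least cohesive cluster, until every cluster's self-similarity passes a threshold.

```python
def self_similarity(cluster, sim):
    """Mean pairwise similarity of a cluster's members (1.0 for singletons)."""
    pairs = [(a, b) for i, a in enumerate(cluster) for b in cluster[i + 1:]]
    if not pairs:
        return 1.0
    return sum(sim(a, b) for a, b in pairs) / len(pairs)

def choose_k(points, trigger_seeds, sim, threshold=0.5):
    """Grow K from the trigger-derived seeds until all clusters are cohesive.
    Sketch only: assumes a new split always introduces a fresh seed value."""
    seeds = list(trigger_seeds)
    while True:
        clusters = {s: [] for s in seeds}
        for p in points:                           # assign to nearest seed
            nearest = min(seeds, key=lambda q: abs(q - p))
            clusters[nearest].append(p)
        bad = [s for s, c in clusters.items() if self_similarity(c, sim) < threshold]
        if not bad:
            return len(seeds), clusters
        seeds.append(max(clusters[bad[0]]))        # split the loosest cluster

sim = lambda a, b: 1.0 / (1.0 + abs(a - b))
k, clusters = choose_k([0.0, 0.1, 0.2, 5.0, 5.1, 10.0], [0.0, 5.0], sim)
print(k)  # 3
```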

  3. Evolving cell models for systems and synthetic biology.

    PubMed

    Cao, Hongqing; Romero-Campero, Francisco J; Heeb, Stephan; Cámara, Miguel; Krasnogor, Natalio

    2010-03-01

    This paper proposes a new methodology for the automated design of cell models for systems and synthetic biology. Our modelling framework is based on P systems, a discrete, stochastic and modular formal modelling language. The automated design of biological models comprising the optimization of the model structure and its stochastic kinetic constants is performed using an evolutionary algorithm. The evolutionary algorithm evolves model structures by combining different modules taken from a predefined module library and then it fine-tunes the associated stochastic kinetic constants. We investigate four alternative objective functions for the fitness calculation within the evolutionary algorithm: (1) equally weighted sum method, (2) normalization method, (3) randomly weighted sum method, and (4) equally weighted product method. The effectiveness of the methodology is tested on four case studies of increasing complexity including negative and positive autoregulation as well as two gene networks implementing a pulse generator and a bandwidth detector. We provide a systematic analysis of the evolutionary algorithm's results as well as of the resulting evolved cell models.
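    The four fitness-aggregation schemes can be sketched directly, assuming a vector of per-objective error values (lower is better):

```python
import random

def equally_weighted_sum(errors):
    return sum(errors) / len(errors)

def normalized_sum(errors, scales):
    """Each objective is divided by a typical scale before averaging."""
    return sum(e / s for e, s in zip(errors, scales)) / len(errors)

def randomly_weighted_sum(errors, rng=random):
    """Random weights, drawn anew each evaluation, normalized to sum to 1."""
    w = [rng.random() for _ in errors]
    return sum(wi * e for wi, e in zip(w, errors)) / sum(w)

def equally_weighted_product(errors):
    out = 1.0
    for e in errors:
        out *= e
    return out
```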

  4. Exploiting the systematic review protocol for classification of medical abstracts.

    PubMed

    Frunza, Oana; Inkpen, Diana; Matwin, Stan; Klement, William; O'Blenis, Peter

    2011-01-01

    To determine whether the automatic classification of documents can be useful in systematic reviews on medical topics, and specifically whether the performance of automatic classification can be enhanced by using the particular protocol of questions employed by the human reviewers to create multiple classifiers. The test collection is the data used in a large-scale systematic review on the topic of the dissemination strategy of health care services for elderly people. From a group of 47,274 abstracts marked by human reviewers to be included in or excluded from further screening, we randomly selected 20,000 as a training set, with the remaining 27,274 becoming a separate test set. As the machine learning algorithm we used complement naïve Bayes. We tested both a global classification method, where a single classifier is trained on instances of abstracts and their classification (i.e., included or excluded), and a novel per-question classification method that trains a separate classifier for each question of the systematic review protocol. For the per-question method we tested four ways of combining the results of the classifiers trained for the individual questions. As evaluation measures, we calculated precision and recall for several settings of the two methods. It is most important not to exclude any relevant documents (i.e., to attain high recall for the class of interest), but it is also desirable to exclude most of the non-relevant documents (i.e., to attain high precision on the class of interest) in order to reduce the human workload. For the global method, the highest recall was 67.8% and the highest precision was 37.9%. For the per-question method, the highest recall was 99.2%, and the highest precision was 63%. The human-machine workflow proposed in this paper achieved a recall value of 99.6%, and a precision value of 17.8%. 
The per-question method, which combines classifiers following the specific protocol of the review, leads to better results than the global method in terms of recall. Because neither method is reliable enough to classify abstracts by itself, the technology should be applied in a semi-automatic way, with a human expert still involved. When the workflow includes one human expert and the trained automatic classifier, recall improves to an acceptable level, showing that automatic classification techniques can reduce the human workload in the process of building a systematic review.
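    Three plausible ways of combining per-question votes (illustrative only; the abstract does not spell out the paper's four combination schemes):

```python
def combine(votes, rule="any"):
    """votes: dict mapping protocol question -> True (include) / False (exclude)."""
    v = list(votes.values())
    if rule == "any":        # include if any question matches: favors recall
        return any(v)
    if rule == "majority":   # include on a majority of questions
        return sum(v) * 2 > len(v)
    if rule == "all":        # include only if every question matches: favors precision
        return all(v)
    raise ValueError(rule)

votes = {"population?": True, "intervention?": False, "outcome?": False}
print(combine(votes, "any"), combine(votes, "majority"), combine(votes, "all"))
# True False False
```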

  5. Robust Feature Selection Technique using Rank Aggregation.

    PubMed

    Sarkar, Chandrima; Cooley, Sarah; Srivastava, Jaideep

    2014-01-01

    Although feature selection is a well-developed research area, there is an ongoing need for methods that make classifiers more efficient. One important challenge is the lack of a universal feature selection technique that produces similar outcomes with all types of classifiers, because every feature selection technique has its own statistical biases while classifiers exploit different statistical properties of the data. In numerous situations this puts researchers in a dilemma as to which feature selection method and which classifier to choose from a vast range of options. In this paper, we propose a technique that aggregates the consensus properties of various feature selection methods to develop a more effective solution. The ensemble nature of our technique makes it more robust across various classifiers; in other words, it is stable towards achieving similar, and ideally higher, classification accuracy across a wide variety of classifiers. We quantify this concept of robustness with a measure known as the Robustness Index (RI). We perform an extensive empirical evaluation of our technique on eight data sets of different dimensions, including Arrhythmia, Lung Cancer, Madelon, mfeat-fourier, internet-ads, Leukemia-3c and Embryonal Tumor, and a real-world data set, Acute Myeloid Leukemia (AML). We demonstrate not only that our algorithm is more robust, but also that, compared with other techniques, it improves classification accuracy by approximately 3-4% (on data sets with fewer than 500 features) and by more than 5% (on data sets with more than 500 features), across a wide range of classifiers.
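    One standard rank-aggregation scheme that fits this description is Borda count (the paper's exact aggregation may differ): each method contributes points inversely related to the rank it assigns a feature, and the consensus ranking orders features by total points.

```python
def borda_aggregate(rankings):
    """rankings: one best-first feature list per selection method; returns the
    consensus order by total Borda points (n points for rank 1, n-1 for 2, ...)."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, feat in enumerate(ranking):
            scores[feat] = scores.get(feat, 0) + (n - pos)
    return sorted(scores, key=lambda f: -scores[f])

rankings = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
print(borda_aggregate(rankings))  # ['a', 'b', 'c']
```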

  6. Noninvasive Dissection of Mouse Sleep Using a Piezoelectric Motion Sensor

    PubMed Central

    Yaghouby, Farid; Donohue, Kevin D.; O’Hara, Bruce F.; Sunderam, Sridhar

    2015-01-01

    Background Changes in autonomic control cause regular breathing during NREM sleep to fluctuate during REM. Piezoelectric cage-floor sensors have been used to successfully discriminate sleep and wake states in mice based on signal features related to respiration and other movements. This study presents a classifier for noninvasively classifying REM and NREM using a piezoelectric sensor. New Method Vigilance state was scored manually in 4-second epochs for 24-hour EEG/EMG recordings in twenty mice. An unsupervised classifier clustered piezoelectric signal features quantifying movement and respiration into three states: one active; and two inactive with regular and irregular breathing respectively. These states were hypothesized to correspond to Wake, NREM, and REM respectively. States predicted by the classifier were compared against manual EEG/EMG scores to test this hypothesis. Results Using only piezoelectric signal features, an unsupervised classifier distinguished Wake with high (89% sensitivity, 96% specificity) and REM with moderate (73% sensitivity, 75% specificity) accuracy, but NREM with poor sensitivity (51%) and high specificity (96%). The classifier sometimes confused light NREM sleep—characterized by irregular breathing and moderate delta EEG power—with REM. A supervised classifier improved sensitivities to 90, 81, and 67% and all specificities to over 90% for Wake, NREM, and REM respectively. Comparison with Existing Methods Unlike most actigraphic techniques, which only differentiate sleep from wake, the proposed piezoelectric method further dissects sleep based on breathing regularity into states strongly correlated with REM and NREM. Conclusions This approach could facilitate large-sample screening for genes influencing different sleep traits, besides drug studies or other manipulations. PMID:26582569

  7. Evolving neural networks through augmenting topologies.

    PubMed

    Stanley, Kenneth O; Miikkulainen, Risto

    2002-01-01

    An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT), which outperforms the best fixed-topology method on a challenging benchmark reinforcement learning task. We claim that the increased efficiency is due to (1) employing a principled method of crossover of different topologies, (2) protecting structural innovation using speciation, and (3) incrementally growing from minimal structure. We test this claim through a series of ablation studies that demonstrate that each component is necessary to the system as a whole and to each other. What results is significantly faster learning. NEAT is also an important contribution to GAs because it shows how it is possible for evolution to both optimize and complexify solutions simultaneously, offering the possibility of evolving increasingly complex solutions over generations, and strengthening the analogy with biological evolution.

  8. A multiscale curvature algorithm for classifying discrete return LiDAR in forested environments

    Treesearch

    Jeffrey S. Evans; Andrew T. Hudak

    2007-01-01

    One prerequisite to the use of light detection and ranging (LiDAR) across disciplines is differentiating ground from nonground returns. The objective was to automatically and objectively classify points within unclassified LiDAR point clouds, with few model parameters and minimal postprocessing. Presented is an automated method for classifying LiDAR returns as ground...

  9. Fisher classifier and its probability of error estimation

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.
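    The two-class Fisher classifier rests on the projection direction w = Sw^-1 (m1 - m2); a minimal 2-D illustration:

```python
def mean(xs):
    n = len(xs)
    return [sum(x[0] for x in xs) / n, sum(x[1] for x in xs) / n]

def scatter(xs, m):
    """Within-class scatter matrix contribution of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        d = [x[0] - m[0], x[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(c1, c2):
    """Optimal projection w = Sw^-1 (m1 - m2) for two 2-D classes."""
    m1, m2 = mean(c1), mean(c2)
    s1, s2 = scatter(c1, m1), scatter(c2, m2)
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    dm = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[0][0] * dm[0] + inv[0][1] * dm[1],
            inv[1][0] * dm[0] + inv[1][1] * dm[1]]

c1 = [(0, 0), (1, 1), (0, 1), (1, 0)]
c2 = [(4, 4), (5, 5), (4, 5), (5, 4)]
print(fisher_direction(c1, c2))  # [-2.0, -2.0]
```

Projecting each class onto w separates them completely on this toy data, which is where the optimal classification threshold is placed.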

  10. Detection of inter-patient left and right bundle branch block heartbeats in ECG using ensemble classifiers.

    PubMed

    Huang, Huifang; Liu, Jie; Zhu, Qiang; Wang, Ruiping; Hu, Guangshu

    2014-06-05

    Left bundle branch block (LBBB) and right bundle branch block (RBBB) not only mask electrocardiogram (ECG) changes that reflect diseases but also indicate important underlying pathology. The timely detection of LBBB and RBBB is critical in the treatment of cardiac diseases. Inter-patient heartbeat classification is based on independent training and testing sets to construct and evaluate a heartbeat classification system. Therefore, a heartbeat classification system with a high performance evaluation possesses a strong predictive capability for unknown data. The aim of this study was to propose a method for inter-patient classification of heartbeats to accurately detect LBBB and RBBB from the normal beat (NORM). This study proposed a heartbeat classification method through a combination of three different types of classifiers: a minimum distance classifier constructed between NORM and LBBB; a weighted linear discriminant classifier between NORM and RBBB based on Bayesian decision making using posterior probabilities; and a linear support vector machine (SVM) between LBBB and RBBB. Each classifier was used with matching features to obtain better classification performance. The final types of the test heartbeats were determined using a majority voting strategy through the combination of class labels from the three classifiers. The optimal parameters for the classifiers were selected using cross-validation on the training set. The effects of different lead configurations on the classification results were assessed, and the performance of these three classifiers was compared for the detection of each pair of heartbeat types. The study results showed that a two-lead configuration exhibited better classification results compared with a single-lead configuration. The construction of a classifier with good performance between each pair of heartbeat types significantly improved the heartbeat classification performance. 
The results showed a sensitivity of 91.4% and a positive predictive value of 37.3% for LBBB and a sensitivity of 92.8% and a positive predictive value of 88.8% for RBBB. A multi-classifier ensemble method was proposed based on inter-patient data and demonstrated a satisfactory classification performance. This approach has the potential for application in clinical practice to distinguish LBBB and RBBB from NORM of unknown patients.
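    The majority-voting combination over the three pairwise classifiers can be sketched as follows (the classifiers here are stubs standing in for the trained models):

```python
from collections import Counter

def vote(beat, clf_norm_lbbb, clf_norm_rbbb, clf_lbbb_rbbb):
    """Majority vote over the three pairwise classifiers; on a three-way tie,
    Counter keeps the label seen first."""
    labels = [clf_norm_lbbb(beat), clf_norm_rbbb(beat), clf_lbbb_rbbb(beat)]
    return Counter(labels).most_common(1)[0][0]

label = vote("beat-001",
             lambda b: "LBBB",   # NORM-vs-LBBB classifier says LBBB
             lambda b: "NORM",   # NORM-vs-RBBB classifier says NORM
             lambda b: "LBBB")   # LBBB-vs-RBBB classifier says LBBB
print(label)  # LBBB
```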

  11. Link Prediction in Evolving Networks Based on Popularity of Nodes.

    PubMed

    Wang, Tong; He, Xing-Sheng; Zhou, Ming-Yang; Fu, Zhong-Qian

    2017-08-02

    Link prediction aims to uncover the underlying relationships behind networks, and can be used to predict missing edges or identify spurious ones. The key issue of link prediction is to estimate the likelihood of potential links in a network. Most classical methods based on static structure ignore the temporal aspects of networks; limited by such time-varying features, these approaches perform poorly in evolving networks. In this paper, we propose the hypothesis that the ability of each node to attract links depends not only on its structural importance, but also on its current popularity (activeness), since active nodes are much more likely to attract future links. We then propose a novel approach named the popularity-based structural perturbation method (PBSPM), together with a fast algorithm, to characterize the likelihood of an edge from both the existing connectivity structure and the current popularity of its two endpoints. Experiments on six evolving networks show that the proposed methods outperform state-of-the-art methods in accuracy and robustness. Moreover, visual results and statistical analysis reveal that the proposed methods are inclined to predict future edges between active nodes, rather than edges between inactive nodes.
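    A toy illustration of the paper's hypothesis, combining a structural score (common neighbors) with a popularity boost; this is not PBSPM itself:

```python
def activeness(node, recent_edges):
    """Popularity proxy: how many recent edges touch the node."""
    return sum(1 for e in recent_edges if node in e)

def link_score(u, v, adjacency, recent_edges):
    common = len(adjacency[u] & adjacency[v])            # structural part
    pop = activeness(u, recent_edges) + activeness(v, recent_edges)
    return common * (1 + pop)                            # popularity boost

adjacency = {"a": {"c", "d"}, "b": {"c", "d"}, "e": {"c"}}
recent = [("a", "c"), ("b", "d")]
print(link_score("a", "b", adjacency, recent))  # 6
print(link_score("a", "e", adjacency, recent))  # 2
```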

  12. Thermography based diagnosis of ruptured anterior cruciate ligament (ACL) in canines

    NASA Astrophysics Data System (ADS)

    Lama, Norsang; Umbaugh, Scott E.; Mishra, Deependra; Dahal, Rohini; Marino, Dominic J.; Sackman, Joseph

    2016-09-01

    Anterior cruciate ligament (ACL) rupture in canines is a common orthopedic injury in veterinary medicine. Veterinarians use both imaging and non-imaging methods to diagnose the disease. Common imaging methods such as radiography, computed tomography (CT scan) and magnetic resonance imaging (MRI) have some disadvantages: expensive setup, high radiation dose, and long acquisition times. In this paper, we present an alternative diagnostic method based on feature extraction and pattern classification (FEPC) to diagnose abnormal patterns in ACL thermograms. The proposed method was evaluated on a total of 30 thermograms for each camera view (anterior, lateral and posterior), comprising 14 diseased and 16 non-diseased cases provided by Long Island Veterinary Specialists. The normal and abnormal patterns in the thermograms are analyzed in two steps: feature extraction and pattern classification. Texture features based on gray-level co-occurrence matrices (GLCM), histogram features and spectral features are extracted from the color-normalized thermograms, and the computed feature vectors are applied to a Nearest Neighbor (NN) classifier, a K-Nearest Neighbor (KNN) classifier and a Support Vector Machine (SVM) classifier with leave-one-out validation. The algorithm achieves its best classification success rate of 86.67%, with a sensitivity of 85.71% and a specificity of 87.5%, in ACL rupture detection using the NN classifier for the lateral view and the Norm-RGB-Lum color normalization method. Our results show that the proposed method has the potential to detect ACL rupture in canines.
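    One of the texture-feature families used above, a gray-level co-occurrence matrix with the classic contrast feature, can be sketched for a single horizontal pixel offset:

```python
def glcm(image, levels):
    """Co-occurrence counts of gray levels for the (0, 1) horizontal offset."""
    m = [[0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            m[a][b] += 1
    return m

def contrast(m):
    """Classic GLCM contrast feature: large when adjacent levels differ a lot."""
    total = sum(sum(r) for r in m) or 1
    return sum(m[i][j] * (i - j) ** 2
               for i in range(len(m)) for j in range(len(m))) / total

image = [[0, 0, 1],
         [1, 1, 0]]
print(glcm(image, 2))            # [[1, 1], [1, 1]]
print(contrast(glcm(image, 2)))  # 0.5
```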

  13. The evolution of distributed sensing and collective computation in animal populations

    PubMed Central

    Hein, Andrew M; Rosenthal, Sara Brin; Hagstrom, George I; Berdahl, Andrew; Torney, Colin J; Couzin, Iain D

    2015-01-01

    Many animal groups exhibit rapid, coordinated collective motion. Yet, the evolutionary forces that cause such collective responses to evolve are poorly understood. Here, we develop analytical methods and evolutionary simulations based on experimental data from schooling fish. We use these methods to investigate how populations evolve within unpredictable, time-varying resource environments. We show that populations evolve toward a distinctive regime in behavioral phenotype space, where small responses of individuals to local environmental cues cause spontaneous changes in the collective state of groups. These changes resemble phase transitions in physical systems. Through these transitions, individuals evolve the emergent capacity to sense and respond to resource gradients (i.e. individuals perceive gradients via social interactions, rather than sensing gradients directly), and to allocate themselves among distinct, distant resource patches. Our results yield new insight into how natural selection, acting on selfish individuals, results in the highly effective collective responses evident in nature. DOI: http://dx.doi.org/10.7554/eLife.10955.001 PMID:26652003

  14. Online boosting for vehicle detection.

    PubMed

    Chang, Wen-Chung; Cho, Chih-Wei

    2010-06-01

    This paper presents a real-time vision-based vehicle detection system employing an online boosting algorithm. It is an online AdaBoost approach for a cascade of strong classifiers instead of a single strong classifier. Most existing cascades of classifiers must be trained offline and cannot effectively be updated when online tuning is required. The idea is to develop a cascade of strong classifiers for vehicle detection that is capable of being online trained in response to changing traffic environments. To make the online algorithm tractable, the proposed system must efficiently tune parameters based on incoming images and up-to-date performance of each weak classifier. The proposed online boosting method can improve system adaptability and accuracy to deal with novel types of vehicles and unfamiliar environments, whereas existing offline methods rely much more on extensive training processes to reach comparable results and cannot further be updated online. Our approach has been successfully validated in real traffic environments by performing experiments with an onboard charge-coupled-device camera in a roadway vehicle.
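    A hedged sketch of the online-boosting idea, simplified to a running per-classifier error estimate feeding AdaBoost-style vote weights (not the paper's cascade of strong classifiers):

```python
import math

class OnlineBooster:
    """Keep a running error estimate for each weak classifier and weight its
    vote AdaBoost-style by log((1 - err) / err), updated per example."""

    def __init__(self, weak_classifiers):
        self.weak = weak_classifiers
        self.err = [0.5] * len(weak_classifiers)   # start uninformative

    def update(self, x, y, rate=0.1):
        for i, h in enumerate(self.weak):
            mistake = 1.0 if h(x) != y else 0.0
            self.err[i] = (1 - rate) * self.err[i] + rate * mistake

    def predict(self, x):
        score = 0.0
        for h, e in zip(self.weak, self.err):
            e = min(max(e, 1e-6), 1 - 1e-6)
            alpha = math.log((1 - e) / e)          # AdaBoost-style vote weight
            score += alpha * (1 if h(x) == 1 else -1)
        return int(score > 0)

booster = OnlineBooster([lambda x: int(x > 0),     # sensible weak classifier
                         lambda x: 1])             # useless: always predicts 1
for _ in range(20):
    booster.update(1, 1)
    booster.update(-1, 0)
print(booster.predict(1), booster.predict(-1))  # 1 0
```

After the updates the useless classifier's error estimate stays near 0.5, so its vote weight collapses toward zero while the accurate classifier dominates.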

  15. Study design in high-dimensional classification analysis.

    PubMed

    Sánchez, Brisa N; Wu, Meihua; Song, Peter X K; Wang, Wen

    2016-10-01

    Advances in high-throughput technology have accelerated the use of hundreds to millions of biomarkers to construct classifiers that partition patients into different clinical conditions. Prior to classifier development in actual studies, a critical need is to determine the sample size required to reach a specified classification precision. We develop a systematic approach for sample size determination in high-dimensional (large p, small n) classification analysis. Our method utilizes the probability of correct classification (PCC) as the optimization objective function and incorporates the higher criticism thresholding procedure for classifier development. Further, we derive the theoretical bound on the maximal PCC gain from feature augmentation (e.g. when molecular and clinical predictors are combined in classifier development). Our methods are motivated and illustrated by a study using proteomic markers to classify post-kidney-transplantation patients into stable and rejecting classes.
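    The higher criticism thresholding step mentioned above can be sketched over a list of feature p-values: features are kept up to the index where the HC statistic over the sorted p-values peaks.

```python
import math

def hc_threshold(pvalues):
    """Return the p-value cutoff at the peak of the higher criticism statistic
    HC_i = sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))."""
    p = sorted(pvalues)
    n = len(p)
    best_i, best_hc = 1, float("-inf")
    for i, pi in enumerate(p, start=1):
        if 0 < pi < 1:
            hc = math.sqrt(n) * (i / n - pi) / math.sqrt(pi * (1 - pi))
            if hc > best_hc:
                best_i, best_hc = i, hc
    return p[best_i - 1]           # keep features with p-value <= this cutoff

print(hc_threshold([0.9, 0.001, 0.8, 0.002]))  # 0.002
```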

  16. The Capacity Profile: A Method to Classify Additional Care Needs in Children with Neurodevelopmental Disabilities

    ERIC Educational Resources Information Center

    Meester-Delver, Anke; Beelen, Anita; Hennekam, Raoul; Nollet, Frans; Hadders-Algra, Mijna

    2007-01-01

    The aim of this study was to determine the interrater reliability and stability over time of the Capacity Profile (CAP). The CAP is a standardized method for classifying additional care needs indicated by current impairments in five domains of body functions: physical health, neuromusculoskeletal and movement-related, sensory, mental, and voice…

  17. Application of the Covalent Bond Classification Method for the Teaching of Inorganic Chemistry

    ERIC Educational Resources Information Center

    Green, Malcolm L. H.; Parkin, Gerard

    2014-01-01

    The Covalent Bond Classification (CBC) method provides a means to classify covalent molecules according to the number and types of bonds that surround an atom of interest. This approach is based on an elementary molecular orbital analysis of the bonding involving the central atom (M), with the various interactions being classified according to the…

  18. Super resolution reconstruction of infrared images based on classified dictionary learning

    NASA Astrophysics Data System (ADS)

    Liu, Fei; Han, Pingli; Wang, Yi; Li, Xuan; Bai, Lu; Shao, Xiaopeng

    2018-05-01

    Infrared images often suffer from low resolution owing to the limitations of imaging devices. An economical way to combat this problem is to reconstruct high-resolution images by algorithmic means rather than by upgrading devices. Inspired by compressed sensing theory, this study presents and demonstrates a classified dictionary learning method for reconstructing high-resolution infrared images. It groups the features of the training samples into several reasonable clusters and trains a dictionary pair for each cluster. The optimal pair of dictionaries is chosen for each image reconstruction, so more satisfactory results are achieved without increased computational complexity or time cost. Experiments demonstrate that this is a viable method for infrared image reconstruction, since it improves image resolution and recovers detailed information about targets.

  19. Criticality and big brake singularities in the tachyonic evolutions of closed Friedmann universes with cold dark matter

    NASA Astrophysics Data System (ADS)

    Horváth, Zsolt; Keresztes, Zoltán; Kamenshchik, Alexander Yu.; Gergely, László Á.

    2015-05-01

    The evolution of a closed Friedmann universe filled with a tachyon scalar field with a trigonometric potential and cold dark matter (CDM) is investigated. A subset of the evolutions consistent at the 1σ confidence level with the Union 2.1 supernova data set is identified. The evolutions of the tachyon field are classified. Some of them evolve into a de Sitter attractor, while others proceed through a pseudotachyonic regime into a sudden future singularity. Critical evolutions leading to big brake singularities in the presence of CDM are found, and a new type of cosmological evolution, characterized by singularity avoidance in the pseudotachyonic regime, is presented.

  20. Classifier performance prediction for computer-aided diagnosis using a limited dataset.

    PubMed

    Sahiner, Berkman; Chan, Heang-Ping; Hadjiiski, Lubomir

    2008-04-01

    In a practical classifier design problem, the true population is generally unknown and the available sample is finite-sized. A common approach is to use a resampling technique to estimate the performance of the classifier that will be trained with the available sample. We conducted a Monte Carlo simulation study to compare the ability of the different resampling techniques in training the classifier and predicting its performance under the constraint of a finite-sized sample. The true population for the two classes was assumed to be multivariate normal distributions with known covariance matrices. Finite sets of sample vectors were drawn from the population. The true performance of the classifier is defined as the area under the receiver operating characteristic curve (AUC) when the classifier designed with the specific sample is applied to the true population. We investigated methods based on the Fukunaga-Hayes and the leave-one-out techniques, as well as three different types of bootstrap methods, namely, the ordinary, 0.632, and 0.632+ bootstrap. The Fisher's linear discriminant analysis was used as the classifier. The dimensionality of the feature space was varied from 3 to 15. The sample size n2 from the positive class was varied between 25 and 60, while the number of cases from the negative class was either equal to n2 or 3n2. Each experiment was performed with an independent dataset randomly drawn from the true population. Using a total of 1000 experiments for each simulation condition, we compared the bias, the variance, and the root-mean-squared error (RMSE) of the AUC estimated using the different resampling techniques relative to the true AUC (obtained from training on a finite dataset and testing on the population). 
Our results indicated that, under the study conditions, there can be a large difference in the RMSE obtained using different resampling methods, especially when the feature space dimensionality is relatively large and the sample size is small. Under such conditions, the 0.632 and 0.632+ bootstrap methods have the lowest RMSE, indicating that the differences between the estimated and the true performances obtained using the 0.632 and 0.632+ bootstrap will be statistically smaller than those obtained using the other three resampling methods. Of the three bootstrap methods, the 0.632+ bootstrap provides the lowest bias. Although this investigation was performed under some specific conditions, it reveals important trends for the problem of classifier performance prediction under the constraint of a limited dataset.
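Of the resampling schemes compared above, the 0.632 bootstrap blends the optimistic resubstitution estimate with the pessimistic error on cases left out of each bootstrap sample. A minimal sketch of that estimator (using classification error rather than AUC, with a hypothetical one-dimensional threshold classifier):

```python
import random

def error(classifier, data):
    """Fraction of cases misclassified."""
    return sum(classifier(x) != y for x, y in data) / len(data)

def bootstrap_632(train, fit, n_boot=200, seed=0):
    """0.632 bootstrap: 0.368 * resubstitution error + 0.632 * the
    average error on cases left out of each bootstrap sample."""
    rng = random.Random(seed)
    resub = error(fit(train), train)
    oob_errs = []
    for _ in range(n_boot):
        sample = [rng.choice(train) for _ in train]
        held_out = [case for case in train if case not in sample]
        if held_out:
            oob_errs.append(error(fit(sample), held_out))
    oob = sum(oob_errs) / len(oob_errs)
    return 0.368 * resub + 0.632 * oob

# Hypothetical 1-D two-class data and a midpoint-threshold "classifier".
def fit(data):
    mid = sum(x for x, _ in data) / len(data)
    return lambda x: int(x > mid)

train = [(v / 10, 0) for v in range(10)] + [(v / 10 + 0.8, 1) for v in range(10)]
est = bootstrap_632(train, fit)
print(round(est, 3))
```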

  1. Use of Chest Wall Electromyography to Detect Respiratory Effort during Polysomnography

    PubMed Central

    Berry, Richard B.; Ryals, Scott; Girdhar, Ankur; Wagner, Mary H.

    2016-01-01

    Study Objectives: To evaluate the ability of chest wall EMG (CW-EMG) using surface electrodes to classify apneas as obstructive, mixed, or central compared to classification using dual channel uncalibrated respiratory inductance plethysmography (RIP). Methods: CW-EMG was recorded from electrodes in the eighth intercostal space at the right mid-axillary line. Consecutive adult clinical sleep studies were retrospectively reviewed, and the first 60 studies with at least 10 obstructive and 10 mixed or central apneas and technically adequate tracings were selected. Four obstructive and six central or mixed apneas (as classified by previous clinical scoring) were randomly selected. A blinded experienced scorer classified the apneas on the basis of tracings showing either RIP channels or the CW-EMG channel. The agreement between the two classification methods was determined by kappa analysis and intraclass correlation. Results: The percentage agreement was 89.5%, the kappa statistic was 0.83 (95% confidence interval 0.79 to 0.87), and the intraclass correlation was 0.83, showing good agreement. Of the 249 apneas classified as central by RIP, 26 were classified as obstructive (10.4%) and 7 as mixed (2.8%) by CW-EMG. Of the 229 events classified as central by CW-EMG, 7 (3.1%) were classified as obstructive and 6 (2.6%) as mixed by RIP. Conclusions: Monitoring CW-EMG may provide a clinically useful method of detecting respiratory effort when used with RIP and can prevent false classification of apneas as central. In rare cases, RIP can detect respiratory effort not easily discernible by CW-EMG, and the combination of the two methods is more likely to avoid apnea misclassification. Citation: Berry RB, Ryals S, Girdhar A, Wagner MH. Use of chest wall electromyography to detect respiratory effort during polysomnography. J Clin Sleep Med 2016;12(9):1239–1244. PMID:27306391
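The kappa statistic used above to quantify inter-method agreement can be computed directly from a square agreement table. A small sketch with illustrative counts (not the study's data):

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table, where
    table[i][j] = count scored class i by method A and class j by method B."""
    n = sum(sum(row) for row in table)
    po = sum(table[i][i] for i in range(len(table))) / n       # observed agreement
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(len(table))) for j in range(len(table))]
    pe = sum(row[i] * col[i] for i in range(len(table))) / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical 3-class agreement table (obstructive / mixed / central).
table = [[40, 2, 3],
         [1, 20, 4],
         [2, 3, 45]]
print(round(cohens_kappa(table), 3))  # 0.805
```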

  2. Efficient Radiative Transfer for Dynamically Evolving Stratified Atmospheres

    NASA Astrophysics Data System (ADS)

    Judge, Philip G.

    2017-12-01

    We present a fast multi-level and multi-atom non-local thermodynamic equilibrium radiative transfer method for dynamically evolving stratified atmospheres, such as the solar atmosphere. The preconditioning method of Rybicki & Hummer (RH92) is adopted. However, to gain speed and stability, a “second-order escape probability” scheme is implemented within the framework of the RH92 method, in which the frequency and angle integrals are carried out analytically. This minimizes the computational work needed, but at the expense of numerical accuracy. The iteration scheme is local; the formal solutions for the intensities are the only non-local component. At present the methods have been coded for vertical transport, applicable to atmospheres that are highly stratified. The probabilistic method appears adequately fast, stable, and sufficiently accurate for exploring dynamical interactions between an evolving MHD atmosphere and radiation using current computer hardware. Current 2D and 3D dynamics codes do not include this interaction as consistently as the present method does. The solutions generated may ultimately serve as initial conditions for dynamical calculations including full 3D radiative transfer. The National Center for Atmospheric Research is sponsored by the National Science Foundation.

  3. A Hybrid Template-Based Composite Classification System

    DTIC Science & Technology

    2009-02-01

    Hybrid Classifier: Forced Decision ... Forced Decision Experimental Results ... Test for Statistical Significance ... Results ... Test for Statistical Significance: NDEC Option ... Implementing the Hybrid Classifier with OOL Targets ... complementary in nature. Complementary classifiers are observed by finding an optimal method for partitioning the problem space. For example, the

  4. Classification of cirrhotic liver in Gadolinium-enhanced MR images

    NASA Astrophysics Data System (ADS)

    Lee, Gobert; Uchiyama, Yoshikazu; Zhang, Xuejun; Kanematsu, Masayuki; Zhou, Xiangrong; Hara, Takeshi; Kato, Hiroki; Kondo, Hiroshi; Fujita, Hiroshi; Hoshi, Hiroaki

    2007-03-01

    Cirrhosis of the liver is characterized by widespread nodules and fibrosis in the liver. Fibrosis and nodule formation distort the normal liver architecture, resulting in characteristic texture patterns. Texture patterns are commonly analyzed using co-occurrence-matrix-based features measured on regions of interest (ROIs). A classifier is subsequently used to classify livers as cirrhotic or non-cirrhotic. A problem arises if the classifier employed is supervised, which is a popular choice, because the 'true disease states' of the ROIs are required for training the classifier but are generally not available. A common approach is to adopt the 'true disease state' of the liver as the 'true disease state' of all ROIs in that liver. This paper investigates the use of an unsupervised classifier, the k-means clustering method, for classifying livers as cirrhotic or non-cirrhotic using unlabelled ROI data. A preliminary result with a sensitivity and specificity of 72% and 60%, respectively, demonstrates the feasibility of using the k-means unsupervised clustering method to generate a characteristic cluster structure that can facilitate the classification of cirrhotic and non-cirrhotic livers.
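The k-means step can be illustrated with a minimal one-dimensional implementation; the ROI texture-feature values below are hypothetical:

```python
def kmeans_1d(values, k=2, iters=20):
    """Minimal 1-D k-means: returns (centroids, labels)."""
    centroids = [min(values), max(values)] if k == 2 else values[:k]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid, then recompute means.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j]))
                  for v in values]
        for j in range(k):
            members = [v for v, l in zip(values, labels) if l == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels

# Hypothetical co-occurrence contrast values for ROIs: low values for
# normal-looking texture, high values for nodular/fibrotic texture.
rois = [0.11, 0.14, 0.09, 0.52, 0.58, 0.49]
centroids, labels = kmeans_1d(rois, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```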

  5. Automated anatomical labeling of bronchial branches using multiple classifiers and its application to bronchoscopy guidance based on fusion of virtual and real bronchoscopy

    NASA Astrophysics Data System (ADS)

    Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Mori, Kensaku; Suenaga, Yasuhito; Hasegawa, Yoshinori; Imaizumi, Kazuyoshi; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi

    2008-03-01

    This paper presents a method for automated anatomical labeling of bronchial branches (ALBB) extracted from 3D CT datasets. The proposed method constructs classifiers that output the anatomical names of bronchial branches using a machine-learning approach. We also present its application to a bronchoscopy guidance system. Since the bronchus has a complex tree structure, bronchoscopists easily become disoriented and lose their way to a target location, so a bronchoscopy guidance system is strongly desired to assist them. In such a guidance system, the automated presentation of anatomical names is quite useful. Although several methods for automated ALBB have been reported, most of them constructed models that took only variations of branching patterns into account and did not consider variations in running directions. Since the running directions of bronchial branches differ greatly among individuals, these methods could not perform ALBB accurately when the running directions differed from those of the models. Our method addresses these problems with a machine-learning approach. The procedure consists of three steps: (a) extraction of bronchial tree structures from 3D CT datasets, (b) construction of classifiers using the multi-class AdaBoost technique, and (c) automated classification of bronchial branches using the constructed classifiers. We applied the proposed method to 51 3D CT datasets. The constructed classifiers were evaluated with a leave-one-out scheme. The experimental results showed that the proposed method assigned correct anatomical names to 89.1% of bronchial branches up to the segmental level. We also confirmed that presenting the anatomical names of bronchial branches on real bronchoscopic views is quite useful for assisting bronchoscopy.

  6. Mobile robots traversability awareness based on terrain visual sensory data fusion

    NASA Astrophysics Data System (ADS)

    Shirkhodaie, Amir

    2007-04-01

    In this paper, we present methods that significantly improve a robot's awareness of its terrain traversability conditions. Terrain traversability awareness is achieved by associating terrain image appearances from different poses and fusing information extracted from multimodality imaging and range sensor data to localize and cluster environment landmarks. First, we describe methods for extracting salient terrain features for the purpose of landmark registration from two or more images taken from different via points along the robot's trajectory. Image registration is applied as a means of overlaying two or more views of the same terrain scene taken from different viewpoints; the registration geometrically aligns the salient landmarks of two images (the reference and sensed images). A similarity matching technique is proposed for matching the salient terrain landmarks. Second, we present three terrain classifier models (rule-based, supervised neural network, and fuzzy logic) for classifying terrain conditions under uncertainty and mapping the robot's terrain perception to appropriate traversability measures. This paper addresses the technical challenges and navigational skill requirements of mobile robots for traversability path planning in natural terrain environments similar to Mars surface terrains. We describe different methods for detecting salient terrain features based on imaging texture analysis techniques, and we present three competing techniques for terrain traversability assessment of mobile robots navigating unstructured natural terrain environments: a rule-based terrain classifier, a neural network-based terrain classifier, and a fuzzy-logic terrain classifier. 
Each proposed terrain classifier divides a region of natural terrain into finite sub-terrain regions and classifies the terrain condition within each sub-terrain region based on spatial and textural cues.
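A fuzzy-logic terrain classifier of the kind described can be sketched with simple membership functions. The cue names, breakpoints, and the conservative min-combination below are illustrative assumptions, not the paper's actual rule base:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a,b], flat on [b,c], falls on [c,d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def traversability(roughness, slope_deg):
    """Blend two fuzzy terrain cues into one traversability score in [0, 1]."""
    smooth = trapezoid(roughness, -1.0, 0.0, 0.2, 0.6)
    flat = trapezoid(slope_deg, -1.0, 0.0, 10.0, 30.0)
    return min(smooth, flat)   # conservative fuzzy AND of the two cues

print(traversability(0.1, 5.0))    # smooth, flat terrain: fully traversable
print(traversability(0.5, 25.0))   # rough and steep: low score
```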

  7. Unsupervised machine-learning method for improving the performance of ambulatory fall-detection systems

    PubMed Central

    2012-01-01

    Background Falls can cause trauma, disability and death among older people. Ambulatory accelerometer devices are currently capable of detecting falls in a controlled environment. However, research suggests that most current approaches tend to have insufficient sensitivity and specificity in non-laboratory environments, in part because impacts can be experienced as part of ordinary daily living activities. Method We used a waist-worn wireless tri-axial accelerometer combined with digital signal processing, clustering and neural network classifiers. The method includes the application of Discrete Wavelet Transform, Regrouping Particle Swarm Optimization, Gaussian Distribution of Clustered Knowledge and an ensemble of classifiers including a multilayer perceptron (MLP) and Augmented Radial Basis Function (ARBF) neural networks. Results Preliminary testing with 8 healthy individuals in a home environment yields 98.6% sensitivity to falls and 99.6% specificity for routine Activities of Daily Living (ADL) data. Single ARBF and MLP classifiers were compared with a combined classifier. The combined classifier offers the greatest sensitivity, with a slight reduction in specificity for routine ADL and an increased specificity for exercise activities. In preliminary tests, the approach achieves 100% sensitivity on in-group falls, 97.65% on out-group falls, 99.33% specificity on routine ADL, and 96.59% specificity on exercise ADL. Conclusion The pre-processing and feature-extraction steps appear to simplify the signal while successfully extracting the essential features that are required to characterize a fall. The results suggest this combination of classifiers can perform better than MLP alone. Preliminary testing suggests these methods may be useful for researchers who are attempting to improve the performance of ambulatory fall-detection systems. PMID:22336100
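The combined-classifier idea, pooling the member classifiers' outputs before thresholding, can be sketched as follows. The per-event fall probabilities are hypothetical, and simple averaging stands in for the paper's exact combination rule:

```python
def combined_vote(scores_a, scores_b, threshold=0.5):
    """Average two classifiers' fall probabilities and threshold the mean."""
    return [int((a + b) / 2 >= threshold) for a, b in zip(scores_a, scores_b)]

# Hypothetical per-event fall probabilities from two member classifiers.
mlp_scores = [0.9, 0.2, 0.6, 0.4]
arbf_scores = [0.8, 0.1, 0.7, 0.3]
print(combined_vote(mlp_scores, arbf_scores))  # [1, 0, 1, 0]
```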

  8. Optimization of Support Vector Machine (SVM) for Object Classification

    NASA Technical Reports Server (NTRS)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm for classifying data into classes. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single-kernel SVM known as SVMlight and a modified version, an SVM with K-Means Clustering, were used. These SVM algorithms were tested as classifiers under varying conditions: image noise levels were varied, and the orientation of the targets was changed. The classifiers were then optimized to demonstrate their maximum potential. The results demonstrate the reliability of SVM as a classification method; from trial to trial, SVM produces consistent results.
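A linear SVM of the kind used here can be trained with a simple hinge-loss subgradient scheme. The sketch below is a generic stand-in, not the SVMlight implementation, and the target/clutter feature vectors are made up:

```python
def train_linear_svm(data, lam=0.01, eta=0.1, epochs=100):
    """Hinge-loss stochastic subgradient descent for a linear SVM.
    data: list of (features, label) with label in {-1, +1}."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi - eta * lam * wi for wi in w]   # weight decay (regularizer)
            if margin < 1:                          # hinge-loss violation
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical 2-D features for targets (+1) and clutter (-1).
data = [((1.0, 1.2), 1), ((0.9, 1.0), 1), ((1.1, 0.9), 1),
        ((-1.0, -0.8), -1), ((-0.9, -1.1), -1), ((-1.2, -1.0), -1)]
w, b = train_linear_svm(data)
print([predict(w, b, x) for x, _ in data])  # all training points correct
```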

  9. Analysis of QoS Requirements for e-Health Services and Mapping to Evolved Packet System QoS Classes

    PubMed Central

    Skorin-Kapov, Lea; Matijasevic, Maja

    2010-01-01

    E-Health services comprise a broad range of healthcare services delivered by using information and communication technology. In order to support existing as well as emerging e-Health services over converged next generation network (NGN) architectures, there is a need for network QoS control mechanisms that meet the often stringent requirements of such services. In this paper, we evaluate the QoS support for e-Health services in the context of the Evolved Packet System (EPS), specified by the Third Generation Partnership Project (3GPP) as a multi-access all-IP NGN. We classify heterogeneous e-Health services based on context and network QoS requirements and propose a mapping to existing 3GPP QoS Class Identifiers (QCIs) that serve as a basis for the class-based QoS concept of the EPS. The proposed mapping aims to provide network operators with guidelines for meeting heterogeneous e-Health service requirements. As an example, we present the QoS requirements for a prototype e-Health service supporting tele-consultation between a patient and a doctor and illustrate the use of the proposed mapping to QCIs in standardized QoS control procedures. PMID:20976301
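A mapping of the kind proposed can be represented as a simple lookup table. The QCI numbers below follow the standardized 3GPP class identifiers, but the flow-to-QCI assignments are illustrative assumptions for a tele-consultation scenario, not the paper's actual mapping:

```python
# Illustrative e-Health flow -> 3GPP QoS Class Identifier assignments.
QCI_MAP = {
    "realtime_voice": 1,         # conversational voice (GBR)
    "realtime_video": 2,         # conversational video (GBR)
    "vital_signs_stream": 3,     # low-latency real-time data (GBR, assumed)
    "session_signaling": 5,      # high-priority signaling (non-GBR)
    "medical_image_transfer": 8, # TCP-based bulk transfer (non-GBR)
}

def qci_for(flow):
    """Fall back to the default best-effort class (QCI 9) for unknown flows."""
    return QCI_MAP.get(flow, 9)

print(qci_for("realtime_voice"), qci_for("web_browsing"))  # 1 9
```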

  10. Dynamic facial expressions of emotion transmit an evolving hierarchy of signals over time.

    PubMed

    Jack, Rachael E; Garrod, Oliver G B; Schyns, Philippe G

    2014-01-20

    Designed by biological and social evolutionary pressures, facial expressions of emotion comprise specific facial movements to support a near-optimal system of signaling and decoding. Although facial expressions are highly dynamic, little is known about the form and function of their temporal dynamics. Do facial expressions transmit diagnostic signals simultaneously to optimize categorization of the six classic emotions, or sequentially to support a more complex communication system of successive categorizations over time? Our data support the latter. Using a combination of perceptual expectation modeling, information theory, and Bayesian classifiers, we show that dynamic facial expressions of emotion transmit an evolving hierarchy of "biologically basic to socially specific" information over time. Early in the signaling dynamics, facial expressions systematically transmit few, biologically rooted face signals supporting the categorization of fewer elementary categories (e.g., approach/avoidance). Later transmissions comprise more complex signals that support categorization of a larger number of socially specific categories (i.e., the six classic emotions). Here, we show that dynamic facial expressions of emotion provide a sophisticated signaling system, questioning the widely accepted notion that emotion communication is comprised of six basic (i.e., psychologically irreducible) categories, and instead suggesting four. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Relationship between urban eco-environment and competitiveness with the background of globalization: statistical explanation based on industry type newly classified with environment demand and environment pressure.

    PubMed

    Kang, Xiao-guang; Ma, Qing-Bin

    2005-01-01

    Within the global urban system, the statistical relationship between the urban eco-environment (UE) and urban competitiveness (UC) is investigated. The data showed a statistically inverted-U relationship between UE and UC. An eco-environmental factor is introduced into the classification of industries, yielding six industrial types from two indexes: an industry's eco-environmental demand and its eco-environmental pressure. The statistical results showed that, under this new industrial classification, there is a strong relationship between changes in industrial structure and the evolution of the UE. The driving mechanism of urban eco-environmental evolution, involving human demand and the global division of labor, was analyzed. The conclusion is that a city's development strategy, industrial policies, and environmental policies must fit its rank within the global urban system. In the era of globalization, the rationality of environmental policies cannot be assessed by their level of strictness; rather, such policies enhance a city's competitiveness when they fit its capability to attract and control particular sections of the industrial value chain. Only such environmental policies are likely to enhance UC.

  12. Classification as clustering: a Pareto cooperative-competitive GP approach.

    PubMed

    McIntyre, Andrew R; Heywood, Malcolm I

    2011-01-01

    Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.
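The Gaussian local membership function can be parameterized directly from one individual's clustered outcomes. A minimal sketch, with hypothetical GP outputs for the exemplars one individual claims:

```python
import math

def gaussian_membership(outputs):
    """Fit a Gaussian local membership function to one individual's
    clustered outputs: returns (mu, sigma) and a membership callable."""
    mu = sum(outputs) / len(outputs)
    var = sum((o - mu) ** 2 for o in outputs) / len(outputs)
    sigma = math.sqrt(var) or 1e-6   # guard against a degenerate cluster
    member = lambda gp_out: math.exp(-0.5 * ((gp_out - mu) / sigma) ** 2)
    return mu, sigma, member

# Hypothetical GP output values that clustered together for one individual.
mu, sigma, member = gaussian_membership([1.9, 2.0, 2.1, 2.0])
print(round(member(2.0), 3), round(member(5.0), 3))  # 1.0 0.0
```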

  13. Heterogeneous classifier fusion for ligand-based virtual screening: or, how decision making by committee can be a good thing.

    PubMed

    Riniker, Sereina; Fechner, Nikolas; Landrum, Gregory A

    2013-11-25

    The concept of data fusion - the combination of information from different sources describing the same object with the expectation of generating a more accurate representation - has found application in a very broad range of disciplines. In the context of ligand-based virtual screening (VS), data fusion has been applied to combine knowledge from either different active molecules or different fingerprints to improve similarity search performance. Machine-learning (ML) methods based on the fusion of multiple homogeneous classifiers, in particular random forests, have also been widely applied in the ML literature. The heterogeneous version of classifier fusion - fusing the predictions from different model types - has been less explored. Here, we investigate heterogeneous classifier fusion for ligand-based VS using three different ML methods, random forest (RF), naïve Bayes (NB), and logistic regression (LR), with four 2D fingerprints: atom pairs, topological torsions, the RDKit fingerprint, and a circular fingerprint. The methods are compared using a previously developed benchmarking platform for 2D fingerprints, which is extended to ML methods in this article. The original data sets are filtered for difficulty, and a new set of challenging data sets from ChEMBL is added. Data sets were also generated for a second use case: starting from a small set of related actives instead of diverse actives. The final fused model consistently outperforms the other approaches across the broad variety of targets studied, indicating that heterogeneous classifier fusion is a very promising approach for ligand-based VS. The new data sets, together with the adapted source code for the ML methods, are provided in the Supporting Information.
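One simple way to fuse heterogeneous models (an illustration of the data-fusion idea, not necessarily the paper's exact fusion rule) is to average each molecule's rank across the models' score lists:

```python
def ranks(scores):
    """Rank position (0 = best) of each item under one model's scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    r = [0] * len(scores)
    for pos, i in enumerate(order):
        r[i] = pos
    return r

def rank_fuse(score_lists):
    """Average each candidate's rank across heterogeneous models;
    a lower fused rank means a stronger consensus hit."""
    all_ranks = [ranks(s) for s in score_lists]
    n = len(score_lists[0])
    fused = [sum(r[i] for r in all_ranks) / len(all_ranks) for i in range(n)]
    return sorted(range(n), key=lambda i: fused[i])

# Hypothetical activity scores for four molecules from RF, NB and LR models.
rf = [0.9, 0.4, 0.7, 0.2]
nb = [0.6, 0.5, 0.8, 0.1]
lr = [0.8, 0.3, 0.9, 0.2]
print(rank_fuse([rf, nb, lr]))  # [2, 0, 1, 3]
```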

  14. Real-data comparison of data mining methods in prediction of diabetes in Iran.

    PubMed

    Tapak, Lily; Mahjub, Hossein; Hamidi, Omid; Poorolajal, Jalal

    2013-09-01

    Diabetes is one of the most common non-communicable diseases in developing countries. Early screening and diagnosis play an important role in effective prevention strategies. This study compared two traditional classification methods (logistic regression and Fisher linear discriminant analysis) and four machine-learning classifiers (neural networks, support vector machines, fuzzy c-means, and random forests) to classify persons with and without diabetes. The data set used in this study included 6,500 subjects from the Iranian national non-communicable diseases risk factor surveillance obtained through a cross-sectional survey. The sample was based on cluster sampling of the Iranian population, conducted in 2005-2009 to assess the prevalence of major non-communicable disease risk factors. Ten risk factors that are commonly associated with diabetes were selected to compare the performance of the six classifiers in terms of sensitivity, specificity, total accuracy, and area under the receiver operating characteristic (ROC) curve. Support vector machines showed the highest total accuracy (0.986) as well as area under the ROC curve (0.979). This method also showed high specificity (1.000) and sensitivity (0.820). All other methods produced a total accuracy of more than 85%, but their sensitivity values were very low (less than 0.350). The results of this study indicate that, in terms of sensitivity, specificity, and overall classification accuracy, the support vector machine model ranks first among all the classifiers tested in the prediction of diabetes. Therefore, this approach is a promising classifier for predicting diabetes, and it should be further investigated for the prediction of other diseases.
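The reported criteria follow directly from confusion-matrix counts. A small sketch with illustrative counts (not the study's data):

```python
def diagnostics(tp, fn, tn, fp):
    """Sensitivity, specificity and total accuracy from confusion counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    acc = (tp + tn) / (tp + fn + tn + fp)
    return sens, spec, acc

# Hypothetical confusion counts for a diabetes screen on 1000 subjects.
sens, spec, acc = diagnostics(tp=82, fn=18, tn=900, fp=0)
print(sens, spec, acc)  # 0.82 1.0 0.982
```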

  15. Combining multiple decisions: applications to bioinformatics

    NASA Astrophysics Data System (ADS)

    Yukinawa, N.; Takenouchi, T.; Oba, S.; Ishii, S.

    2008-01-01

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods.
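The ECOC decoding step common to both approaches can be sketched as nearest-codeword assignment under Hamming distance; the 3-class codebook over five binary classifiers below is a toy example:

```python
def ecoc_decode(bits, codebook):
    """Assign the class whose codeword is nearest in Hamming distance
    to the concatenated binary-classifier outputs."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(codebook, key=lambda cls: hamming(bits, codebook[cls]))

# Illustrative codewords for three classes.
codebook = {
    "classA": [1, 1, 0, 0, 1],
    "classB": [0, 1, 1, 0, 0],
    "classC": [1, 0, 1, 1, 0],
}
# One binary classifier flips a bit, yet the code still corrects the error.
print(ecoc_decode([1, 1, 0, 1, 1], codebook))  # classA
```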

  16. Understanding of the naive Bayes classifier in spam filtering

    NASA Astrophysics Data System (ADS)

    Wei, Qijia

    2018-05-01

    With the development of the Internet, the information stream has grown at an unprecedented rate. Methods of information transmission have become increasingly important, and delivering effective information to users is a hot topic in both research and industry. As one of the most common means of communication, email has its own advantages; however, spam floods inboxes, so automatic filtering is needed. This paper discusses the issue from the perspective of the naive Bayes classifier, one of the applications of Bayes' theorem. The concepts and workflow of the naive Bayes classifier are introduced, followed by two examples, and the final section relates the approach to machine learning more broadly. The naive Bayes classifier has proved surprisingly effective, its main limitation being the assumption of independence among attributes, which are usually email words or phrases.
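A naive Bayes spam filter of the kind discussed can be sketched with the standard multinomial model and Laplace smoothing; the tiny corpus below is illustrative:

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (word_list, label). Returns log-probability tables
    with Laplace smoothing (standard multinomial naive Bayes)."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for words, label in docs:
        counts[label].update(words)
        totals[label] += 1
    vocab = set(w for words, _ in docs for w in words)
    priors = {c: math.log(totals[c] / sum(totals.values())) for c in counts}
    likelihood = {
        c: {w: math.log((counts[c][w] + 1) /
                        (sum(counts[c].values()) + len(vocab)))
            for w in vocab}
        for c in counts
    }
    return priors, likelihood

def classify(words, priors, likelihood):
    def score(c):
        return priors[c] + sum(likelihood[c][w] for w in words
                               if w in likelihood[c])
    return max(priors, key=score)

# Tiny illustrative corpus.
docs = [(["win", "money", "now"], "spam"),
        (["free", "money"], "spam"),
        (["meeting", "schedule"], "ham"),
        (["project", "schedule", "notes"], "ham")]
priors, likelihood = train_nb(docs)
print(classify(["free", "money", "now"], priors, likelihood))  # spam
```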

  17. Computer-aided diagnosis of early knee osteoarthritis based on MRI T2 mapping.

    PubMed

    Wu, Yixiao; Yang, Ran; Jia, Sen; Li, Zhanjun; Zhou, Zhiyang; Lou, Ting

    2014-01-01

    This work aimed to study a method for computer-aided diagnosis of early knee OA (OA: osteoarthritis). Based on the technique of MRI (MRI: Magnetic Resonance Imaging) T2 mapping, and through computer image processing, feature extraction, calculation, and analysis via a constructed classifier, an effective computer-aided diagnosis method for knee OA was created to assist doctors in the accurate, timely, and convenient detection of potential OA risk. To evaluate this method, a total of 1380 data points from the MRI images of 46 knee joint samples were collected. These data were modeled through linear regression on an offline general platform using the ImageJ software, and a map of the physical parameter T2 was reconstructed. After image processing, the T2 values of ten regions in the WORMS (WORMS: Whole-organ Magnetic Resonance Imaging Score) areas of the articular cartilage were extracted to be used as eigenvalues in data mining. Then, an RBF (RBF: Radial Basis Function) network classifier was built to classify and identify the collected data. The classifier exhibited a final identification accuracy of 75%, indicating a good result for assisted diagnosis. Since the knee OA classifier constituted by a weights-directly-determined RBF neural network does not require any iteration, our results demonstrated that the optimal weights and an appropriate center and variance can be obtained through simple procedures. Furthermore, the accuracy for both the training samples and the testing samples from the normal group reached 100%. Finally, the classifier was superior, in both time efficiency and classification performance, to frequently used classifiers based on iterative learning. It is thus suitable for use as an aid in computer-aided diagnosis of early knee OA.
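The weights-directly-determined idea, solving a linear system instead of iterating, can be sketched for a one-dimensional feature. The T2-like values and targets below are hypothetical, with RBF centers placed on the training inputs:

```python
import math

def solve(A, y):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def rbf_fit(xs, ys, sigma=1.0):
    """Weights determined directly by solving the interpolation system,
    with Gaussian kernels centered on the training inputs (no iteration)."""
    phi = [[math.exp(-((xi - cj) ** 2) / (2 * sigma ** 2)) for cj in xs]
           for xi in xs]
    return solve(phi, ys)

def rbf_predict(x, xs, w, sigma=1.0):
    return sum(wj * math.exp(-((x - cj) ** 2) / (2 * sigma ** 2))
               for wj, cj in zip(w, xs))

# Hypothetical 1-D feature (e.g. a T2-like value) with class scores 0/1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.0, 1.0, 1.0]
w = rbf_fit(xs, ys)
print(round(rbf_predict(2.0, xs, w), 3))  # reproduces the training target
```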

  18. A Pairwise Naïve Bayes Approach to Bayesian Classification.

    PubMed

    Asafu-Adjei, Josephine K; Betensky, Rebecca A

    2015-10-01

    Despite the relatively high accuracy of the naïve Bayes (NB) classifier, there may be several instances where it is not optimal, i.e. does not have the same classification performance as the Bayes classifier utilizing the joint distribution of the examined attributes. However, the Bayes classifier can be computationally intractable due to its required knowledge of the joint distribution. Therefore, we introduce a "pairwise naïve" Bayes (PNB) classifier that incorporates all pairwise relationships among the examined attributes, but does not require specification of the joint distribution. In this paper, we first describe the necessary and sufficient conditions under which the PNB classifier is optimal. We then discuss sufficient conditions for which the PNB classifier, and not NB, is optimal for normal attributes. Through simulation and actual studies, we evaluate the performance of our proposed classifier relative to the Bayes and NB classifiers, along with the HNB, AODE, LBR and TAN classifiers, using normal density and empirical estimation methods. Our applications show that the PNB classifier using normal density estimation yields the highest accuracy for data sets containing continuous attributes. We conclude that it offers a useful compromise between the Bayes and NB classifiers.

  19. Application of texture analysis method for mammogram density classification

    NASA Astrophysics Data System (ADS)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.
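A gray-level co-occurrence matrix of the kind underlying the GLCM features can be computed directly; the four-level toy patch below is illustrative:

```python
from collections import defaultdict

def glcm(image, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    pairs = defaultdict(int)
    h, w = len(image), len(image[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                pairs[(image[y][x], image[ny][nx])] += 1
    total = sum(pairs.values())
    return {k: v / total for k, v in pairs.items()}

def contrast(p):
    """GLCM contrast feature: sum of p(i, j) * (i - j)^2."""
    return sum(v * (i - j) ** 2 for (i, j), v in p.items())

# Hypothetical 4-level toy "mammogram patch".
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [2, 2, 3, 3],
         [2, 2, 3, 3]]
p = glcm(patch)
print(round(contrast(p), 3))  # 0.333
```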

  20. Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords

    PubMed Central

    Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao

    2015-01-01

    For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. A classifier is generated from training data, represented using several features, to decide whether a protein pair in each sentence has an interaction. A specific keyword that is directly related to interaction, such as “bind” or “interact”, plays an important role in training classifiers. We call such a keyword, which affects the capability of the classifier, a dominant keyword. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is initially assumed to be a dominant keyword. Then classifiers are separately trained from the instances with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers, and the assumption is updated by the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction. PMID:26783534

  1. Optimal aggregation of binary classifiers for multiclass cancer diagnosis using gene expression profiles.

    PubMed

    Yukinawa, Naoto; Oba, Shigeyuki; Kato, Kikuya; Ishii, Shin

    2009-01-01

    Multiclass classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. There have been many studies of aggregating binary classifiers to construct a multiclass classifier based on one-versus-the-rest (1R), one-versus-one (11), or other coding strategies, as well as some comparison studies between them. However, the studies found that the best coding depends on each situation. Therefore, a new problem, which we call the "optimal coding problem," has arisen: how can we determine which coding is the optimal one in each situation? To approach this optimal coding problem, we propose a novel framework for constructing a multiclass classifier, in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. Although there is no a priori answer to the optimal coding problem, our weight tuning method can be a consistent answer to the problem. We apply this method to various classification problems including a synthesized data set and some cancer diagnosis data sets from gene expression profiling. The results demonstrate that, in most situations, our method can improve classification accuracy over simple voting heuristics and is better than or comparable to state-of-the-art multiclass predictors.
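
    The core idea of weighting aggregated binary classifiers can be sketched as follows; the scores and weights are toy values, and the paper's data-driven weight optimization is not reproduced here.

```python
def aggregate(binary_scores, weights):
    """Weighted aggregation of one-versus-the-rest (1R) binary classifiers:
    `binary_scores[k]` is the 'class k vs. rest' score for one sample and
    `weights[k]` a per-classifier reliability weight.  The predicted class
    maximizes the weighted score."""
    weighted = [w * s for w, s in zip(weights, binary_scores)]
    return max(range(len(weighted)), key=weighted.__getitem__)

# Three 1R classifiers; the second is down-weighted as less reliable.
scores = [0.6, 0.7, 0.5]
weights = [1.0, 0.5, 1.0]
pred = aggregate(scores, weights)   # weighted scores: 0.6, 0.35, 0.5 -> class 0
```

    With uniform weights this reduces to simple voting; tuning the weights on observed data is what lets the aggregation adapt to each situation.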

  2. Multiwavelet grading of prostate pathological images

    NASA Astrophysics Data System (ADS)

    Soltanian-Zadeh, Hamid; Jafari-Khouzani, Kourosh

    2002-05-01

    We have developed image analysis methods to automatically grade pathological images of the prostate. The proposed method assigns Gleason grades to images, where each image is given a grade between 1 and 5. This is done using features extracted from multiwavelet transformations. We extract energy and entropy features from the submatrices obtained in the decomposition. Next, we apply a k-NN classifier to grade the image. To find the optimal multiwavelet basis, preprocessing, and classifier, we use features extracted by different multiwavelets with either critically sampled preprocessing or repeated row preprocessing and different k-NN classifiers, and compare their performances, evaluated by the total misclassification rate (TMR). To evaluate sensitivity to noise, we add white Gaussian noise to the images and compare the results (TMRs). We applied the proposed methods to 100 images. We evaluated the first and second levels of decomposition using the Geronimo, Hardin, and Massopust (GHM), Chui and Lian (CL), and Shen (SA4) multiwavelets. We also evaluated the k-NN classifier for k=1,2,3,4,5. Experimental results illustrate that the first level of decomposition is quite noisy. They also show that critically sampled preprocessing outperforms repeated row preprocessing and is less sensitive to noise. Finally, comparison studies indicate that the SA4 multiwavelet and the k-NN classifier (k=1) generate optimal results (with the smallest TMR of 3%).
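
    The grading step reduces to a standard k-NN vote over texture features; a minimal sketch with invented (energy, entropy) feature values standing in for actual multiwavelet outputs:

```python
import math
from collections import Counter

def knn_predict(train, query, k=1):
    """Plain k-nearest-neighbour vote under Euclidean distance.
    `train` is a list of (feature_vector, grade) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(grade for _, grade in nearest)
    return votes.most_common(1)[0][0]

# Toy (energy, entropy) feature vectors for two Gleason grades.
train = [((0.90, 0.20), 1), ((0.85, 0.25), 1),
         ((0.30, 0.80), 4), ((0.35, 0.75), 4)]
grade = knn_predict(train, (0.88, 0.22), k=1)
```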

  3. Research Notes - Openness and Evolvability - Documentation Quality Assessment

    DTIC Science & Technology

    2016-08-01

    These Research Notes, by Michael Haddy and Adam Sbrana, focus on Documentation Quality Assessment as part of a series on Openness and Evolvability methods and processes. The work was undertaken from the late 1990s to 2007. (Only fragments of the report's table of contents were recoverable from this record.)

  4. A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update

    NASA Astrophysics Data System (ADS)

    Lotte, F.; Bougrain, L.; Cichocki, A.; Clerc, M.; Congedo, M.; Rakotomamonjy, A.; Yger, F.

    2018-06-01

    Objective. Most current electroencephalography (EEG)-based brain–computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types that are used in this field, as described in our 2007 review paper. Now, approximately ten years after this review publication, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. Approach. We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs and with what outcomes, and to identify their pros and cons. Main results. We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful, although its benefits remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performance on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training sample settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. Significance. This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, and presents the principles of these methods and guidelines on when and how to use them.
It also identifies a number of challenges to further advance EEG classification in BCI.

  5. Non-negative matrix factorization in texture feature for classification of dementia with MRI data

    NASA Astrophysics Data System (ADS)

    Sarwinda, D.; Bustamam, A.; Ardaneswari, G.

    2017-07-01

    This paper investigates the application of non-negative matrix factorization as a feature selection method to select features from the gray level co-occurrence matrix. The proposed approach is used to classify dementia using MRI data. In this study, texture analysis using the gray level co-occurrence matrix is performed for feature extraction. In the feature extraction process on the MRI data, we obtained seven features from the gray level co-occurrence matrix. Non-negative matrix factorization selected the three most influential of the features produced by feature extraction. A Naïve Bayes classifier is adopted to classify dementia, i.e. Alzheimer's disease, Mild Cognitive Impairment (MCI) and normal control. The experimental results show that non-negative matrix factorization as a feature selection method is able to achieve an accuracy of 96.4% for classification of Alzheimer's versus normal control. The proposed method is also compared with another feature selection method, i.e. Principal Component Analysis (PCA).
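
    A minimal sketch of the factorization underlying this style of feature selection, using the classic Lee–Seung multiplicative updates (an assumption; the record does not specify which NMF algorithm was used). The toy matrix is invented:

```python
import numpy as np

def nmf(V, r, iters=1000, seed=0):
    """Lee-Seung multiplicative-update NMF: V ~= W @ H with W, H >= 0.
    For feature selection, the magnitudes in W and H indicate which
    original features dominate each latent component."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + 1e-3
    H = rng.random((r, n)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # multiplicative updates keep
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)   # all entries nonnegative
    return W, H

# An exactly rank-2 nonnegative matrix is recovered to low error.
V = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0],
              [0.0, 0.0, 3.0]])
W, H = nmf(V, r=2)
err = np.linalg.norm(V - W @ H)
```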

  6. Identifying Audiences of E-Infrastructures - Tools for Measuring Impact

    PubMed Central

    van den Besselaar, Peter

    2012-01-01

    Research evaluation should take into account the intended scholarly and non-scholarly audiences of the research output. This holds too for research infrastructures, which often aim at serving a large variety of audiences. With research and research infrastructures moving to the web, new possibilities are emerging for evaluation metrics. This paper proposes a feasible indicator for measuring the scope of audiences who use web-based e-infrastructures, as well as the frequency of use. In order to apply this indicator, a method is needed for classifying visitors to e-infrastructures into relevant user categories. The paper proposes such a method, based on an inductive logic program and a Bayesian classifier. The method is tested, showing that the visitors are efficiently classified with 90% accuracy into the selected categories. Consequently, the method can be used to evaluate the use of the e-infrastructure within and outside academia. PMID:23239995

  7. Global Optimization Ensemble Model for Classification Methods

    PubMed Central

    Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab

    2014-01-01

    Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of a classifier while solving a supervised learning problem, like the bias-variance tradeoff, the dimensionality of the input space, and noise in the input data space. All these problems affect the accuracy of a classifier and are the reason that there is no globally optimal method for classification. There is no generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382

  8. Intelligent Automatic Classification of True and Counterfeit Notes Based on Spectrum Analysis

    NASA Astrophysics Data System (ADS)

    Matsunaga, Shohei; Omatu, Sigeru; Kosaka, Toshohisa

    The purpose of this paper is to classify bank notes as “true” or “counterfeit” faster and more precisely than a conventional method. We note that thin lines are represented by straight lines in the images of true notes, while in counterfeit notes they are represented by dotted lines. This is due to properties of dot printers or scanner levels. To exploit these properties, we propose two methods to classify a note as true or counterfeit by checking whether the note contains thin lines or dotted lines. First, we use the Fourier transform of the note to find discriminative features and classify the note as true or counterfeit using these features. Then we propose a classification method using the wavelet transform in place of the Fourier transform. Finally, some classification results are illustrated to show the effectiveness of the proposed methods.
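
    The Fourier-based feature can be illustrated on one-dimensional line profiles: a dotted line concentrates spectral energy at high frequencies, while a smooth solid line concentrates it at low ones. The profiles and the half-spectrum cutoff below are invented for illustration:

```python
import numpy as np

def high_freq_ratio(profile):
    """Fraction of spectral energy in the upper half of the one-sided
    spectrum of a 1-D line profile, after removing the mean."""
    spec = np.abs(np.fft.rfft(profile - np.mean(profile))) ** 2
    cut = len(spec) // 2
    return spec[cut:].sum() / spec.sum()

solid = np.cos(np.linspace(0, 2 * np.pi, 64, endpoint=False))  # smooth line profile
dotted = np.array([1.0, 0.0] * 32)                             # periodically broken line

r_solid = high_freq_ratio(solid)
r_dotted = high_freq_ratio(dotted)
```

    Thresholding such a ratio gives a simple true/counterfeit decision rule in the spirit of the paper's Fourier method.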

  9. Argumentation Based Joint Learning: A Novel Ensemble Learning Approach

    PubMed Central

    Xu, Junyi; Yao, Li; Li, Le

    2015-01-01

    Recently, ensemble learning methods have been widely used to improve classification performance in machine learning. In this paper, we present a novel ensemble learning method: argumentation based multi-agent joint learning (AMAJL), which integrates ideas from multi-agent argumentation, ensemble learning, and association rule mining. In AMAJL, argumentation technology is introduced as an ensemble strategy to integrate multiple base classifiers and generate a high performance ensemble classifier. We design an argumentation framework named Arena as a communication platform for knowledge integration. Through argumentation based joint learning, high quality individual knowledge can be extracted, and thus a refined global knowledge base can be generated and used independently for classification. We perform numerous experiments on multiple public datasets using AMAJL and other benchmark methods. The results demonstrate that our method can effectively extract high quality knowledge for ensemble classifier and improve the performance of classification. PMID:25966359

  10. Black hole evolution by spectral methods

    NASA Astrophysics Data System (ADS)

    Kidder, Lawrence E.; Scheel, Mark A.; Teukolsky, Saul A.; Carlson, Eric D.; Cook, Gregory B.

    2000-10-01

    Current methods of evolving a spacetime containing one or more black holes are plagued by instabilities that prohibit long-term evolution. Some of these instabilities may be due to the numerical method used, traditionally finite differencing. In this paper, we explore the use of a pseudospectral collocation (PSC) method for the evolution of a spherically symmetric black hole spacetime in one dimension using a hyperbolic formulation of Einstein's equations. We demonstrate that our PSC method is able to evolve a spherically symmetric black hole spacetime forever without enforcing constraints, even if we add dynamics via a Klein-Gordon scalar field. We find that, in contrast with finite-differencing methods, black hole excision is a trivial operation using PSC applied to a hyperbolic formulation of Einstein's equations. We discuss the extension of this method to three spatial dimensions.
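
    The accuracy advantage motivating pseudospectral collocation can be seen on a toy periodic problem: a Fourier spectral derivative is accurate to machine precision where a second-order finite difference is not. (The paper applies collocation to a hyperbolic formulation of Einstein's equations; this is only a one-dimensional analogue.)

```python
import numpy as np

# Pseudospectral (Fourier) differentiation on a periodic grid versus a
# second-order central finite difference, applied to f(x) = sin(x).
n = 32
x = 2 * np.pi * np.arange(n) / n
f = np.sin(x)
h = 2 * np.pi / n

k = 1j * np.fft.fftfreq(n, d=1.0 / n)               # i * wavenumber
df_spectral = np.real(np.fft.ifft(k * np.fft.fft(f)))
df_fd = (np.roll(f, -1) - np.roll(f, 1)) / (2 * h)  # central difference

err_spectral = np.max(np.abs(df_spectral - np.cos(x)))
err_fd = np.max(np.abs(df_fd - np.cos(x)))
```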

  11. Neutrinoless double beta decay in chiral effective field theory: lepton number violation at dimension seven

    DOE PAGES

    Cirigliano, V.; Dekens, W.; de Vries, J.; ...

    2017-12-15

    Here, we analyze neutrinoless double beta decay (0νββ) within the framework of the Standard Model Effective Field Theory. Apart from the dimension-five Weinberg operator, the first contributions appear at dimension seven. We classify the operators and evolve them to the electroweak scale, where we match them to effective dimension-six, -seven, and -nine operators. In the next step, after renormalization group evolution to the QCD scale, we construct the chiral Lagrangian arising from these operators. We then develop a power-counting scheme and derive the two-nucleon 0νββ currents up to leading order in the power counting for each lepton-number-violating operator. We argue that the leading-order contribution to the decay rate depends on a relatively small number of nuclear matrix elements. We test our power counting by comparing nuclear matrix elements obtained by various methods and by different groups. We find that the power counting works well for nuclear matrix elements calculated from a specific method, while, as in the case of light Majorana neutrino exchange, the overall magnitude of the matrix elements can differ by factors of two to three between methods. We also calculate the constraints that can be set on dimension-seven lepton-number-violating operators from 0νββ experiments and study the interplay between dimension-five and -seven operators, discussing how dimension-seven contributions affect the interpretation of 0νββ in terms of the effective Majorana mass m ββ.

  12. Neutrinoless double beta decay in chiral effective field theory: lepton number violation at dimension seven

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cirigliano, V.; Dekens, W.; de Vries, J.

    Here, we analyze neutrinoless double beta decay (0νββ) within the framework of the Standard Model Effective Field Theory. Apart from the dimension-five Weinberg operator, the first contributions appear at dimension seven. We classify the operators and evolve them to the electroweak scale, where we match them to effective dimension-six, -seven, and -nine operators. In the next step, after renormalization group evolution to the QCD scale, we construct the chiral Lagrangian arising from these operators. We then develop a power-counting scheme and derive the two-nucleon 0νββ currents up to leading order in the power counting for each lepton-number-violating operator. We argue that the leading-order contribution to the decay rate depends on a relatively small number of nuclear matrix elements. We test our power counting by comparing nuclear matrix elements obtained by various methods and by different groups. We find that the power counting works well for nuclear matrix elements calculated from a specific method, while, as in the case of light Majorana neutrino exchange, the overall magnitude of the matrix elements can differ by factors of two to three between methods. We also calculate the constraints that can be set on dimension-seven lepton-number-violating operators from 0νββ experiments and study the interplay between dimension-five and -seven operators, discussing how dimension-seven contributions affect the interpretation of 0νββ in terms of the effective Majorana mass m ββ.

  13. Physician and Stakeholder Perceptions of Conflict of Interest Policies in Oncology

    PubMed Central

    Lockhart, A. Craig; Brose, Marcia S.; Kim, Edward S.; Johnson, David H.; Peppercorn, Jeffrey M.; Michels, Dina L.; Storm, Courtney D.; Schuchter, Lynn M.; Rathmell, W. Kimryn

    2013-01-01

    Purpose The landscape of managing potential conflicts of interest (COIs) has evolved substantially across many disciplines in recent years, but rarely are the issues more intertwined with financial and ethical implications than in the health care setting. Cancer care is a highly technologic arena, with numerous physician-industry interactions. The American Society of Clinical Oncology (ASCO) recognizes the role of a professional organization to facilitate management of these interactions and the need for periodic review of its COI policy (Policy). Methods To gauge the sentiments of ASCO members and nonphysician stakeholders, two surveys were performed. The first asked ASCO members to estimate opinions of the Policy as it relates to presentation of industry-sponsored research. Respondents were classified as consumers or producers of research material based on demographic responses. A similar survey solicited opinions of nonphysician stakeholders, including patients with cancer, survivors, family members, and advocates. Results The ASCO survey was responded to by 1,967 members (1% of those solicited); 80% were producers, and 20% were consumers. Most respondents (93% of producers; 66% of consumers) reported familiarity with the Policy. Only a small proportion regularly evaluated COIs for presented research. Members favored increased transparency about relationships over restrictions on presentations of research. Stakeholders (n = 264) indicated that disclosure was “very important” to “extremely important” and preferred written disclosure (77%) over other methods. Conclusion COI policies are an important and relevant topic among physicians and patient advocates. Methods to simplify the disclosure process, improve transparency, and facilitate responsiveness are critical for COI management. PMID:23530092

  14. An Intercomparison Between Radar Reflectivity and the IR Cloud Classification Technique for the TOGA-COARE Area

    NASA Technical Reports Server (NTRS)

    Carvalho, L. M. V.; Rickenbach, T.

    1999-01-01

    Satellite infrared (IR) and visible (VIS) images from the Tropical Ocean Global Atmosphere - Coupled Ocean Atmosphere Response Experiment (TOGA-COARE) experiment are investigated through the use of Clustering Analysis. The clusters are obtained from the values of IR and VIS counts and the local variance for both channels. The clustering procedure is based on the standardized histogram of each variable obtained from 179 pairs of images. A new approach to classify high clouds using only IR and the clustering technique is proposed. This method allows the separation of the enhanced convection in two main classes: convective tops, more closely related to the most active core of the storm, and convective systems, which produce regions of merged, thick anvil clouds. The resulting classification of different portions of cloudiness is compared to the radar reflectivity field for intensive events. Convective Systems and Convective Tops are followed during their life cycle using the IR clustering method. The areal coverage of precipitation and features related to convective and stratiform rain is obtained from the radar for each stage of the evolving Mesoscale Convective Systems (MCS). In order to compare the IR clustering method with a simple threshold technique, two IR thresholds (Tir) were used to identify different portions of cloudiness, Tir=240K which roughly defines the extent of all cloudiness associated with the MCS, and Tir=220K which indicates the presence of deep convection. It is shown that the IR clustering technique can be used as a simple alternative to identify the actual portion of convective and stratiform rainfall.

  15. Using Trained Pixel Classifiers to Select Images of Interest

    NASA Technical Reports Server (NTRS)

    Mazzoni, D.; Wagstaff, K.; Castano, R.

    2004-01-01

    We present a machine-learning-based approach to ranking images based on learned priorities. Unlike previous methods for image evaluation, which typically assess the value of each image based on the presence of predetermined specific features, this method involves using two levels of machine-learning classifiers: one level is used to classify each pixel as belonging to one of a group of rather generic classes, and another level is used to rank the images based on these pixel classifications, given some example rankings from a scientist as a guide. Initial results indicate that the technique works well, producing new rankings that match the scientist's rankings significantly better than would be expected by chance. The method is demonstrated for a set of images collected by a Mars field-test rover.

  16. The visualCMAT: A web-server to select and interpret correlated mutations/co-evolving residues in protein families.

    PubMed

    Suplatov, Dmitry; Sharapova, Yana; Timonina, Daria; Kopylov, Kirill; Švedas, Vytas

    2018-04-01

    The visualCMAT web-server was designed to assist experimental research in the fields of protein/enzyme biochemistry, protein engineering, and drug discovery by providing an intuitive and easy-to-use interface to the analysis of correlated mutations/co-evolving residues. Sequence and structural information describing homologous proteins are used to predict correlated substitutions by the Mutual information-based CMAT approach, classify them into spatially close co-evolving pairs, which either form a direct physical contact or interact with the same ligand (e.g. a substrate or a crystallographic water molecule), and long-range correlations, and annotate and rank binding sites on the protein surface by the presence of statistically significant co-evolving positions. The results of the visualCMAT are organized for a convenient visual analysis and can be downloaded to a local computer as a content-rich all-in-one PyMol session file with multiple layers of annotation corresponding to bioinformatic, statistical and structural analyses of the predicted co-evolution, or further studied online using the built-in interactive analysis tools. The online interactivity is implemented in HTML5 and therefore neither plugins nor Java are required. The visualCMAT web-server is integrated with the Mustguseal web-server capable of constructing large structure-guided sequence alignments of protein families and superfamilies using all available information about their structures and sequences in public databases. The visualCMAT web-server can be used to understand the relationship between structure and function in proteins, applied to selecting hotspots and compensatory mutations for rational design and directed evolution experiments to produce novel enzymes with improved properties, and employed in studying the mechanisms of selective ligand binding and allosteric communication between topologically independent sites in protein structures.
The web-server is freely available at https://biokinet.belozersky.msu.ru/visualcmat and there are no login requirements.
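
    The mutual-information statistic at the heart of CMAT-style correlated-mutation analysis can be sketched directly on two alignment columns (the columns below are toy examples, invented for illustration):

```python
import math
from collections import Counter

def column_mi(col_a, col_b):
    """Mutual information (in nats) between two alignment columns -- the
    core statistic behind correlated-mutation (CMAT-style) analysis."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum((c / n) * math.log((c / n) / (pa[a] / n * pb[b] / n))
               for (a, b), c in pab.items())

col1 = "AAAALLLL"
col2 = "DDDDKKKK"   # substitutes in lockstep with col1 -> high MI
col3 = "ADADADAD"   # statistically independent of col1 -> zero MI

mi_coupled = column_mi(col1, col2)
mi_indep = column_mi(col1, col3)
```

    A production tool additionally corrects such raw MI values for background conservation and phylogenetic bias before ranking positions.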

  17. Impact of study design on development and evaluation of an activity-type classifier.

    PubMed

    van Hees, Vincent T; Golubic, Rajna; Ekelund, Ulf; Brage, Søren

    2013-04-01

    Methods to classify activity types are often evaluated with an experimental protocol involving prescribed physical activities under confined (laboratory) conditions, which may not reflect real-life conditions. The present study aims to evaluate how study design may impact classifier performance in real life. Twenty-eight healthy participants (21-53 yr) were asked to wear nine triaxial accelerometers while performing 58 activity types selected to simulate activities in real life. For each sensor location, logistic classifiers were trained on subsets of up to 8 activities to distinguish between walking and nonwalking activities and were then evaluated on all 58 activities. Different weighting factors were used to convert the resulting confusion matrices into an estimation of the confusion matrix as would apply in the real-life setting, by creating four different real-life scenarios as well as one traditional laboratory scenario. The sensitivity of a classifier estimated with a traditional laboratory protocol is within the range of estimates derived from real-life scenarios for any body location. The specificity, however, was systematically overestimated by the traditional laboratory scenario. Walking time was systematically overestimated, except for lower back sensor data (range: 7-757%). In conclusion, classifier performance under confined conditions may not accurately reflect classifier performance in real life. Future studies that aim to evaluate activity classification methods should pay special attention to the representativeness of experimental conditions for real-life conditions.
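
    The reweighting idea can be sketched as follows: per-class detection rates from a lab confusion matrix are combined under an assumed real-life class prevalence, showing how a balanced lab protocol can hide a large overestimation of walking time. All numbers are invented for illustration:

```python
def estimated_walking(conf, prevalence):
    """Predicted walking-time fraction under a given true-class prevalence,
    using per-class prediction rates from a lab confusion matrix
    (rows: true walking / true nonwalking; columns: predicted)."""
    rates = [[c / sum(row) for c in row] for row in conf]
    return sum(p * r[0] for p, r in zip(prevalence, rates))

conf = [[45, 5],    # true walking: 90% detected
        [10, 40]]   # true nonwalking: 20% misread as walking

lab_est = estimated_walking(conf, [0.5, 0.5])    # lab protocol: 50/50 split
real_est = estimated_walking(conf, [0.1, 0.9])   # assumed real life: 10/90

overestimate_lab = lab_est / 0.5 - 1     # walking time overestimated by ~10%
overestimate_real = real_est / 0.1 - 1   # same classifier: ~+170% in "real life"
```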

  18. Optimizing computer-aided colonic polyp detection for CT colonography by evolving the Pareto front1

    PubMed Central

    Li, Jiang; Huang, Adam; Yao, Jack; Liu, Jiamin; Van Uitert, Robert L.; Petrick, Nicholas; Summers, Ronald M.

    2009-01-01

    A multiobjective genetic algorithm is designed to optimize a computer-aided detection (CAD) system for identifying colonic polyps. Colonic polyps appear as elliptical protrusions on the inner surface of the colon. Curvature-based features for colonic polyp detection have proved successful in several CT colonography (CTC) CAD systems. Our CTC CAD program uses a sequential classifier to form initial polyp detections on the colon surface. The classifier utilizes a set of thresholds on curvature-based features to cluster suspicious colon surface regions into polyp candidates. The thresholds were previously chosen experimentally by using feature histograms. The chosen thresholds were effective for detecting polyps sized 10 mm or larger in diameter. However, many medium-sized polyps, 6–9 mm in diameter, were missed in the initial detection procedure. In this paper, the task of finding optimal thresholds is formulated as a multiobjective optimization problem and solved with a genetic algorithm that evolves the Pareto front of the Pareto-optimal set. The new CTC CAD system was tested on 792 patients. The sensitivities of the optimized system improved significantly, from 61.68% to 74.71% with an increase of 13.03% (95% CI [6.57%, 19.5%], p=7.78×10−5) for the size category of 6–9 mm polyps, from 65.02% to 77.4% with an increase of 12.38% (95% CI [6.23%, 18.53%], p=7.95×10−5) for polyps 6 mm or larger, and from 82.2% to 90.58% with an increase of 8.38% (95% CI [0.75%, 16%], p=0.03) for polyps 8 mm or larger at comparable false positive rates. The sensitivities of the optimized system are nearly equivalent to those of expert radiologists. PMID:19235388
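
    The non-dominated filtering that defines the Pareto front being evolved can be sketched as follows (toy candidate scores, with false positives negated so that both objectives are maximized):

```python
def pareto_front(points):
    """Non-dominated subset for a problem maximizing all coordinates."""
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Candidate threshold settings scored as (sensitivity, -false_positive_rate).
cands = [(0.62, -2.0), (0.75, -3.5), (0.70, -2.5), (0.60, -4.0), (0.75, -4.0)]
front = pareto_front(cands)   # the last two candidates are dominated
```

    A multiobjective GA repeatedly applies such filtering to its population, keeping and recombining only non-dominated threshold settings.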

  19. Classification of burn wounds using support vector machines

    NASA Astrophysics Data System (ADS)

    Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose

    2004-05-01

    The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds by their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted of segmenting the burn wound from the rest of the image and classifying the burn by its depth. In this paper we focus on the classification problem only. We previously proposed using a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply the five-fold cross-validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which gives us the set of features that yields the smallest classification error for each classifier. The features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As the data of the problem faced here are not linearly separable, the SVM was trained using several different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms the classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7%, whereas the Fuzzy-ARTMAP NN attained 1.6%.
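
    Sequential Forward Selection reduces to a greedy loop over a subset-scoring function; a minimal sketch with an invented additive scorer standing in for the classifier's cross-validated accuracy:

```python
def sfs(features, score, k):
    """Sequential Forward Selection: greedily add the feature that most
    improves score(subset) until k features are chosen."""
    selected = []
    while len(selected) < k:
        best = max((f for f in features if f not in selected),
                   key=lambda f: score(selected + [f]))
        selected.append(best)
    return selected

# Invented per-feature gains stand in for cross-validated accuracy.
gains = {'L_mean': 0.3, 'u_mean': 0.5, 'v_mean': 0.2, 'L_std': 0.4}
score = lambda subset: sum(gains[f] for f in subset)
chosen = sfs(list(gains), score, k=2)   # picks 'u_mean' first, then 'L_std'
```

    SBS runs the same loop in reverse, starting from the full feature set and greedily removing the least useful feature.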

  20. Intelligent reservoir operation system based on evolving artificial neural networks

    NASA Astrophysics Data System (ADS)

    Chaves, Paulo; Chang, Fi-John

    2008-06-01

    We propose a novel intelligent reservoir operation system based on an evolving artificial neural network (ANN). Evolving means the parameters of the ANN model are identified by the GA evolutionary optimization technique; accordingly, the ANN model represents the operational strategies of reservoir operation. The main advantages of the Evolving ANN Intelligent System (ENNIS) are as follows: (i) only a small number of parameters need to be optimized, even for long optimization horizons; (ii) it easily handles multiple decision variables; and (iii) the operation model combines straightforwardly with other prediction models. The developed intelligent system was applied to the operation of the Shihmen Reservoir in northern Taiwan to investigate its applicability and practicability. The proposed method was first applied to a simple formulation of the operation of the Shihmen Reservoir, with a single objective and a single decision variable, and its results were compared to those obtained by dynamic programming. The constructed network proved to be a good operational strategy. The method was then extended and applied to the reservoir with multiple (five) decision variables. The results demonstrated that the evolving neural networks improved the operation performance of the reservoir compared to its current operational strategy. The system was capable of simultaneously handling various decision variables and provided reasonable and suitable decisions.

  1. Testing for genetically modified organisms (GMOs): Past, present and future perspectives.

    PubMed

    Holst-Jensen, Arne

    2009-01-01

    This paper presents an overview of GMO testing methodologies and how these have evolved and may evolve in the next decade. Challenges and limitations for the application of the test methods, as well as for the interpretation of the results they produce, are highlighted and discussed, bearing in mind the various interests and competences of the involved stakeholders. To better understand the suitability and limitations of detection methodologies, the evolution of transformation processes for the creation of GMOs is briefly reviewed.

  2. Improving ensemble decision tree performance using Adaboost and Bagging

    NASA Astrophysics Data System (ADS)

    Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie

    2015-12-01

    Ensemble classifier systems are considered among the most promising approaches for medical data classification, and the performance of a decision tree classifier can be increased by ensembling, which has proven better than single classifiers. However, in an ensemble setting the performance depends on the selection of a suitable base classifier. This research employed two prominent ensemble methods, namely Adaboost and Bagging, with base classifiers such as Random Forest, Random Tree, J48, J48graft and Logistic Model Tree (LMT), selected independently. The empirical study shows that performance varies when different base classifiers are selected, and overfitting was also noted in some cases. The evidence shows that ensemble decision tree classifiers using Adaboost and Bagging improve performance on the selected medical data sets.
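    As a rough illustration of the boosting side of the study, here is a from-scratch AdaBoost over decision stumps on synthetic data. This is a sketch only: the paper used tree-based base classifiers such as Random Forest and J48, not stumps, and the toy dataset is invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_stump(X, y, w):
    """Best single-feature threshold classifier under sample weights w."""
    best = (None, None, None, np.inf)  # (feature, threshold, polarity, error)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) > 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, y, rounds=10):
    """AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    w = np.full(len(y), 1 / len(y))
    stumps, alphas = [], []
    for _ in range(rounds):
        j, t, pol, err = fit_stump(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - t) > 0, 1, -1)
        w *= np.exp(-alpha * y * pred)          # up-weight the mistakes
        w /= w.sum()
        stumps.append((j, t, pol))
        alphas.append(alpha)
    def predict(Xq):
        s = sum(a * np.where(p * (Xq[:, j] - t) > 0, 1, -1)
                for a, (j, t, p) in zip(alphas, stumps))
        return np.sign(s)
    return predict

# Toy 2-D problem a single axis-aligned stump cannot solve exactly.
X = rng.normal(size=(150, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
predict = adaboost(X, y, rounds=20)
print((predict(X) == y).mean())   # training accuracy of the ensemble
```

    Bagging differs only in how diversity is created: each base learner is trained on a bootstrap resample with equal weights, and predictions are combined by majority vote instead of weighted vote.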

  3. Improving ECG Classification Accuracy Using an Ensemble of Neural Network Modules

    PubMed Central

    Javadi, Mehrdad; Ebrahimpour, Reza; Sajedin, Atena; Faridi, Soheil; Zakernejad, Shokoufeh

    2011-01-01

    This paper illustrates the use of a combined neural network model based on the Stacked Generalization method for classification of electrocardiogram (ECG) beats. In the conventional Stacked Generalization method, the combiner learns to map the base classifiers' outputs to the target data. We claim that adding the input pattern to the base classifiers' outputs helps the combiner to obtain knowledge about the input space and, as a result, perform better on the same task. Experimental results support our claim that this additional knowledge about the input space improves the performance of the proposed method, which is called Modified Stacked Generalization. In particular, for classification of 14966 ECG beats that were not previously seen during the training phase, the Modified Stacked Generalization method reduced the error rate by 12.41% in comparison with the best of ten popular classifier fusion methods, including Max, Min, Average, Product, Majority Voting, Borda Count, Decision Templates, Weighted Averaging based on Particle Swarm Optimization, and Stacked Generalization. PMID:22046232
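    The paper's modification, feeding the combiner the raw input pattern alongside the base classifiers' outputs, corresponds to what scikit-learn's `StackingClassifier` exposes as `passthrough=True`. A minimal sketch on synthetic data (the base learners and dataset here are illustrative stand-ins, not the paper's neural network modules or ECG beats):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the ECG feature vectors.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

base = [("nb", GaussianNB()),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0))]

# Conventional stacking: the combiner sees only the base outputs.
plain = StackingClassifier(estimators=base,
                           final_estimator=LogisticRegression(max_iter=1000))

# "Modified" stacking in the paper's sense: passthrough=True appends the
# raw input pattern to the base outputs fed to the combiner.
modified = StackingClassifier(estimators=base,
                              final_estimator=LogisticRegression(max_iter=1000),
                              passthrough=True)

for name, clf in [("plain", plain), ("modified", modified)]:
    clf.fit(X, y)
    print(name, round(clf.score(X, y), 3))
```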

  4. Semi-supervised vibration-based classification and condition monitoring of compressors

    NASA Astrophysics Data System (ADS)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of reciprocating compressors installed in refrigeration appliances are proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or are difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis to extract initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, very good classification performance can be obtained from NN trained by Bayesian regularization, SVM, and ELM classifiers. The method can be effectively applied to the industrial condition monitoring of compressors.

  5. Target discrimination method for SAR images based on semisupervised co-training

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Du, Lan; Dai, Hui

    2018-01-01

    Synthetic aperture radar (SAR) target discrimination is usually performed in a supervised manner. However, supervised methods for SAR target discrimination may need large numbers of labeled training samples, whose acquisition is costly, time consuming, and sometimes impossible. This paper proposes an SAR target discrimination method based on semisupervised co-training, which utilizes a limited number of labeled samples and an abundant number of unlabeled samples. First, Lincoln features, widely used in SAR target discrimination, are extracted from the training samples and partitioned into two sets according to their physical meanings. Second, two support vector machine classifiers are iteratively co-trained with the two extracted feature sets based on the co-training algorithm. Finally, the trained classifiers are exploited to classify the test data. The experimental results on real SAR image data not only validate the effectiveness of the proposed method compared with the traditional supervised methods, but also demonstrate the superiority of co-training over self-training, which uses only one feature set.
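    The co-training loop described above, two classifiers on disjoint feature views adding confident pseudo-labels to a shared labeled pool, can be sketched with a nearest-class-mean classifier standing in for the paper's SVMs. The two-view synthetic data, sample counts, and margin-based confidence below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two synthetic "views" of each sample; the paper instead partitions
# Lincoln features into two sets by physical meaning.
n = 300
y = rng.integers(0, 2, n)
X1 = y[:, None] + rng.normal(size=(n, 3))
X2 = y[:, None] + rng.normal(size=(n, 3))

def nearest_mean(X, y_lab, mask):
    """Fit class means on labelled points; return predictions and margins."""
    m0 = X[mask & (y_lab == 0)].mean(axis=0)
    m1 = X[mask & (y_lab == 1)].mean(axis=0)
    d0 = np.linalg.norm(X - m0, axis=1)
    d1 = np.linalg.norm(X - m1, axis=1)
    return (d1 < d0).astype(int), np.abs(d0 - d1)

def co_train(X1, X2, y_init, rounds=5, k=15):
    y_lab = y_init.copy()                     # -1 marks unlabelled samples
    for _ in range(rounds):
        for X in (X1, X2):                    # each view labels for the pool
            mask = y_lab >= 0
            pred, conf = nearest_mean(X, y_lab, mask)
            cand = np.where(~mask)[0]
            top = cand[np.argsort(-conf[cand])[:k]]
            y_lab[top] = pred[top]            # add confident pseudo-labels
    return y_lab

y_init = np.full(n, -1)
y_init[:20] = y[:20]                          # only 20 labelled samples
y_final = co_train(X1, X2, y_init)
lab = y_final >= 0
print(round(float((y_final[lab] == y[lab]).mean()), 3))
```

    Self-training is the degenerate case of this loop with a single view: the classifier labels new points for itself, so its errors reinforce rather than being checked by an independent feature set.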

  6. Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients

    NASA Astrophysics Data System (ADS)

    Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad

    2010-12-01

    There is extensive overlap of the clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate, and it is desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEG were recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to project the inputs into a lower-dimensional, more separable space better suited to the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, proportional to the genotypic content of the best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. The results of the proposed technique are shown to be more robust than those of the conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.
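    The idea of a mutation rate adapted from the state of the population can be illustrated with a toy genetic algorithm minimizing a quadratic. The adaptation rule below (mutation scale tied to the population's fitness spread, with a hypothetical 0.3 gain and a small floor) is a loose analogue of the paper's operator, not its exact formulation.

```python
import numpy as np

rng = np.random.default_rng(3)

def fitness(pop):
    """Toy objective: maximise -||x||^2 (optimum at the origin)."""
    return -np.sum(pop**2, axis=1)

pop = rng.normal(size=(30, 5))
for gen in range(80):
    f = fitness(pop)
    parents = pop[np.argsort(-f)[:10]]        # truncation selection
    # Adaptive mutation: the scale follows the population's fitness
    # spread, so the search explores while the population is diverse
    # and fine-tunes once it has converged (0.3 gain is hypothetical).
    sigma = min(0.5, 0.3 * float(f.std())) + 1e-3
    children = parents[rng.integers(0, 10, size=30)]
    pop = children + rng.normal(scale=sigma, size=children.shape)

best = pop[np.argmax(fitness(pop))]
print(round(float(np.sum(best**2)), 4))
```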

  7. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.

    PubMed

    Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet

    2017-01-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers, particularly for cancers. Microarray-based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem, requiring careful modeling of the mean-variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and diagonal discriminant classifiers. One of these, voomNSC, brings the voom and NSC approaches together for gene-expression based classification: it is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC algorithm via weighted statistics. A comprehensive simulation study was designed and four real datasets were used for performance assessment. The overall results indicate that voomNSC is the sparsest classifier. It also provides the most accurate results, together with power-transformed Poisson linear discriminant analysis, rlog-transformed support vector machines and random forests. Beyond prediction, the voomNSC classifier can be used to identify potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
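    The sparsity mechanism behind NSC-style classifiers such as voomNSC is soft-thresholding of per-gene centroid offsets: genes whose shrunken offsets vanish for every class drop out of the model. A minimal sketch of that step with hypothetical numbers (voom's precision weighting is omitted here):

```python
import numpy as np

def soft_threshold(d, delta):
    """Shrink offsets toward zero; anything below delta vanishes."""
    return np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)

# Hypothetical standardised centroid offsets for 5 genes x 2 classes.
d = np.array([[ 2.5, -2.5],
              [ 0.3, -0.3],
              [-1.8,  1.8],
              [ 0.1, -0.1],
              [ 0.0,  0.0]])

d_shrunk = soft_threshold(d, delta=0.5)
selected = np.any(d_shrunk != 0, axis=1)   # genes kept by the sparse model
print(selected)
```

    Classification then proceeds as nearest-centroid, but with the shrunken centroids, so only the surviving genes influence the decision; those genes are the candidate biomarkers.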

  8. An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics

    PubMed Central

    Torii, Manabu; Yin, Lanlan; Nguyen, Thang; Mazumdar, Chand T.; Liu, Hongfang; Hartley, David M.; Nelson, Noele P.

    2014-01-01

    Purpose Early detection of infectious disease outbreaks is crucial to protecting the public health of a society. Online news articles provide timely information on disease outbreaks worldwide. In this study, we investigated automated detection of articles relevant to disease outbreaks using machine learning classifiers. In a real-life setting, it is expensive to prepare a training data set for classifiers, which usually consists of manually labeled relevant and irrelevant articles. To mitigate this challenge, we examined the use of randomly sampled unlabeled articles as well as labeled relevant articles. Methods Naïve Bayes and Support Vector Machine (SVM) classifiers were trained on 149 relevant and 149 or more randomly sampled unlabeled articles. Diverse classifiers were trained by varying the number of sampled unlabeled articles and also the number of word features. The trained classifiers were applied to 15 thousand articles published over 15 days. Top-ranked articles from each classifier were pooled and the resulting set of 1337 articles was reviewed by an expert analyst to evaluate the classifiers. Results Daily averages of areas under ROC curves (AUCs) over the 15-day evaluation period were 0.841 and 0.836, respectively, for the naïve Bayes and SVM classifiers. We referenced a database of disease outbreak reports to confirm that the evaluation data set resulting from the pooling method indeed covered incidents recorded in the database during the evaluation period. Conclusions The proposed text classification framework utilizing randomly sampled unlabeled articles can facilitate a cost-effective approach to training machine learning classifiers in a real-life Internet-based biosurveillance project. We plan to examine this framework further using larger data sets and using articles in non-English languages. PMID:21134784
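    The training setup, labeled relevant articles against randomly sampled unlabeled articles treated as the negative class, can be sketched with scikit-learn's multinomial naïve Bayes. The toy headlines below are invented for illustration; the study's news feed and feature counts are not reproduced.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical outbreak-relevant headlines (the positive class).
relevant = [
    "cholera outbreak reported in coastal region",
    "officials confirm new measles outbreak cases",
    "avian influenza outbreak spreads among poultry farms",
]
# Randomly sampled unlabelled articles stand in for explicit negatives.
unlabelled = [
    "stock markets close higher after earnings reports",
    "local team wins championship final",
    "new smartphone model unveiled at trade show",
]

vec = CountVectorizer()
X = vec.fit_transform(relevant + unlabelled)
y = np.array([1, 1, 1, 0, 0, 0])

clf = MultinomialNB().fit(X, y)
test = vec.transform(["officials report cholera cases in region"])
print(clf.predict(test))
```

    In the study's workflow the classifier's ranking score, not a hard label, is what matters: top-ranked articles from each classifier are pooled for expert review.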

  9. Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients.

    PubMed

    Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad

    2010-12-01

    There is extensive overlap of the clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate, and it is desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEG were recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to project the inputs into a lower-dimensional, more separable space better suited to the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, proportional to the genotypic content of the best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. The results of the proposed technique are shown to be more robust than those of the conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.

  10. Evaluation of several schemes for classification of remotely sensed data: Their parameters and performance. [Foster County, North Dakota; Grant County, Kansas; Iroquois County, Illinois, Tippecanoe County, Indiana; and Pottawattamie and Shelby Counties, Iowa

    NASA Technical Reports Server (NTRS)

    Scholz, D.; Fuhs, N.; Hixson, M.; Akiyama, T. (Principal Investigator)

    1979-01-01

    The author has identified the following significant results. Data sets for corn, soybeans, winter wheat, and spring wheat were used to evaluate the following schemes for crop identification: (1) a per-point Gaussian maximum likelihood classifier; (2) a per-point sum of normal densities classifier; (3) a per-point linear classifier; (4) a per-point Gaussian maximum likelihood decision tree classifier; and (5) a texture-sensitive per-field Gaussian maximum likelihood classifier. Test site location and classifier both had significant effects on the classification accuracy of small grains; the classifiers did not differ significantly in overall accuracy, with the majority of the difference among classifiers being attributed to the training method rather than to the classification algorithm applied. The complexity of use and computer costs for the classifiers varied significantly. A linear classification rule which assigns each pixel to the class whose mean is closest in Euclidean distance was the easiest for the analyst and cost the least per classification.
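    The linear rule singled out at the end, assigning each pixel to the class whose mean is closest in Euclidean distance, is a minimum-distance (nearest-centroid) classifier. A minimal sketch with hypothetical 4-band spectral values (the crop names and numbers are invented):

```python
import numpy as np

# Hypothetical 4-band spectral means for two crop classes and two pixels.
class_means = np.array([[62., 30., 88., 40.],    # e.g. corn
                        [55., 41., 70., 52.]])   # e.g. soybeans
pixels = np.array([[60., 32., 85., 42.],
                   [56., 40., 72., 50.]])

# Minimum-Euclidean-distance rule: assign each pixel to the nearest mean.
d = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
labels = d.argmin(axis=1)
print(labels)  # → [0 1]
```

    Its cheapness relative to the Gaussian maximum likelihood rule is visible here: only class means are needed, with no covariance matrices to estimate or invert.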

  11. A method for classification of transient events in EEG recordings: application to epilepsy diagnosis.

    PubMed

    Tzallas, A T; Karvelis, P S; Katsis, C D; Fotiadis, D I; Giannopoulos, S; Konitsiotis, S

    2006-01-01

    The aim of this paper is to analyze transient events in inter-ictal EEG recordings and classify epileptic activity as focal or generalized epilepsy using an automated method. A two-stage approach is proposed. In the first stage the observed transient events of a single channel are classified into four categories: epileptic spike (ES), muscle activity (EMG), eye blinking activity (EOG), and sharp alpha activity (SAA). The process is based on an artificial neural network. Different artificial neural network architectures were tried and the network with the lowest error was selected using the hold-out approach. In the second stage a knowledge-based system is used to produce a diagnosis of focal or generalized epileptic activity. The classification of transient events achieved high overall accuracy (84.48%), while the knowledge-based system for epilepsy diagnosis correctly classified nine out of ten cases. The proposed method is advantageous since it effectively detects and classifies the undesirable activity into appropriate categories and produces a final outcome related to the existence of epilepsy.

  12. A Novel Data-Driven Learning Method for Radar Target Detection in Nonstationary Environments

    DTIC Science & Technology

    2016-05-01

    …"Classifier ensembles for changing environments," in Multiple Classifier Systems, vol. 3077, F. Roli, J. Kittler and T. Windeatt, Eds. New York, NY… Dec. 2006, pp. 1113–1118.
    [21] J. Z. Kolter and M. A. Maloof, "Dynamic weighted majority: An ensemble method for drifting concepts," J. Mach. Learn…
    …Trans. Neural Netw., vol. 22, no. 10, pp. 1517–1531, Oct. 2011.
    [23] R. Polikar, "Ensemble learning," in Ensemble Machine Learning: Methods and…

  13. Testing of the Support Vector Machine for Binary-Class Classification

    NASA Technical Reports Server (NTRS)

    Scholten, Matthew

    2011-01-01

    The Support Vector Machine is a powerful algorithm, useful for classifying data into classes. The Support Vector Machines implemented in this research were used as classifiers for the final stage in a Multistage Autonomous Target Recognition system. A single-kernel SVM known as SVMlight, and a modified version known as a Support Vector Machine with K-Means Clustering, were used. These SVM algorithms were tested as classifiers under varying conditions: image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification: from trial to trial, SVM produces consistent results.

  14. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems.

    PubMed

    Cho, Ming-Yuan; Hoang, Thi Thom

    2017-01-01

    Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.
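    The PSO search over SVM hyperparameters can be sketched with a standard particle swarm minimizing a surrogate objective. Here a synthetic quadratic bowl stands in for the cross-validated SVM error over (log C, log gamma), so the optimum location and all constants below are hypothetical, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(4)

def surrogate_cv_error(p):
    """Synthetic stand-in for CV error over (log C, log gamma);
    minimum placed at (1.0, -2.0) for illustration."""
    return np.sum((p - np.array([1.0, -2.0]))**2, axis=-1)

n, dim = 20, 2
pos = rng.uniform(-5, 5, size=(n, dim))
vel = np.zeros((n, dim))
pbest = pos.copy()
pbest_val = surrogate_cv_error(pos)
gbest = pbest[pbest_val.argmin()].copy()

for _ in range(60):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    # Standard velocity update: inertia + cognitive + social terms.
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    val = surrogate_cv_error(pos)
    improved = val < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = val[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(np.round(gbest, 2))
```

    In the paper's setting each evaluation of the objective would run an SVM cross-validation, and the particle position can additionally encode a binary feature-selection mask alongside the kernel parameters.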

  15. PyEvolve: a toolkit for statistical modelling of molecular evolution.

    PubMed

    Butterfield, Andrew; Vedagiri, Vivek; Lang, Edward; Lawrence, Cath; Wakefield, Matthew J; Isaev, Alexander; Huttley, Gavin A

    2004-01-05

    Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences - ignoring the biological significance of sequence differences. A suite of sophisticated likelihood based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpGs, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real world performance for parameter rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours. PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field.
The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter rich likelihood functions solvable within hours on multi-cpu hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software.

  16. Biodegradation of Poly(butylene succinate) Powder in a Controlled Compost at 58 °C Evaluated by Naturally-Occurring Carbon 14 Amounts in Evolved CO2 Based on the ISO 14855-2 Method

    PubMed Central

    Kunioka, Masao; Ninomiya, Fumi; Funabashi, Masahiro

    2009-01-01

    The biodegradabilities of poly(butylene succinate) (PBS) powders in a controlled compost at 58 °C have been studied using a Microbial Oxidative Degradation Analyzer (MODA) based on the ISO 14855-2 method, entitled “Determination of the ultimate aerobic biodegradability of plastic materials under controlled composting conditions—Method by analysis of evolved carbon dioxide—Part 2: Gravimetric measurement of carbon dioxide evolved in a laboratory-scale test”. The evolved CO2 was trapped by an additional aqueous Ba(OH)2 solution. The trapped BaCO3 was transformed into graphite via serial vaporization and reduction reactions using a gas-tight tube and vacuum manifold system. This graphite was analyzed by accelerator mass spectrometry (AMS) to determine the percent modern carbon [pMC (sample)] based on the 14C radiocarbon concentration. Using the relation that pMC (sample) is the mixture-weighted sum of pMC (compost) (109.87%) and pMC (PBS) (0%) over the measurement period, the CO2 from compost respiration could be calculated from a single reaction vessel. It was found that the biodegradabilities determined from the CO2 amount attributed to PBS in the sample vessel were about 30% lower than those based on the ISO method. These differences between the ISO and AMS methods arise because part of the carbon from PBS is converted into metabolites by the microorganisms in the compost rather than into CO2. PMID:20057944
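    The 14C partitioning logic, pMC (sample) as a mixture of compost-derived CO2 at 109.87 pMC and fossil-based PBS-derived CO2 at 0 pMC, reduces to a one-line mixing equation. A small numeric sketch follows; the AMS reading and the CO2 mass are hypothetical, only pMC (compost) = 109.87% comes from the abstract.

```python
# 14C mixing model: pmc_sample = f * pmc_compost + (1 - f) * pmc_pbs,
# where f is the fraction of trapped CO2 that came from compost respiration.
pmc_compost = 109.87   # percent modern carbon of compost-derived CO2
pmc_pbs = 0.0          # fossil-based PBS contains no 14C
pmc_sample = 82.40     # hypothetical AMS reading for the trapped CO2

f_compost = (pmc_sample - pmc_pbs) / (pmc_compost - pmc_pbs)

total_co2_c_g = 50.0   # hypothetical mass of trapped CO2 carbon
co2_from_pbs = total_co2_c_g * (1 - f_compost)
print(round(f_compost, 3), round(co2_from_pbs, 2))
```

    This is what lets the AMS variant work with a single reaction vessel: the compost-respiration contribution is inferred from the isotope ratio rather than subtracted using a separate blank vessel as in the gravimetric ISO procedure.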

  17. Automatic Estimation of Osteoporotic Fracture Cases by Using Ensemble Learning Approaches.

    PubMed

    Kilic, Niyazi; Hosgormez, Erkan

    2016-03-01

    Ensemble learning methods are among the most powerful tools for pattern classification problems. In this paper, the effects of ensemble learning methods and some physical bone densitometry parameters on osteoporotic fracture detection were investigated. Six feature set models were constructed, including different physical parameters, and fed into the ensemble classifiers as input features. As ensemble learning techniques, bagging, gradient boosting and the random subspace method (RSM) were used. Instance-based learning (IBk) and random forest (RF) classifiers were applied to the six feature set models. The patients were classified into three groups, namely osteoporosis, osteopenia and control (healthy), using the ensemble classifiers. Total classification accuracy and f-measure were used to evaluate the diagnostic performance of the proposed ensemble classification system. The classification accuracy reached 98.85% with model 6 (five BMD + five T-score values) using the RSM-RF classifier. The findings of this paper suggest that patients can be warned before a bone fracture occurs, just by examining some physical parameters that can easily be measured without invasive operations.

  18. Non-Mutually Exclusive Deep Neural Network Classifier for Combined Modes of Bearing Fault Diagnosis.

    PubMed

    Duong, Bach Phi; Kim, Jong-Myon

    2018-04-07

    The simultaneous occurrence of various types of defects in bearings makes their diagnosis more challenging owing to the resultant complexity of the constituent parts of the acoustic emission (AE) signals. To address this issue, a new approach is proposed in this paper for the detection of multiple combined faults in bearings. The proposed methodology uses a deep neural network (DNN) architecture to effectively diagnose the combined defects. The DNN structure is based on the stacked denoising autoencoder non-mutually exclusive classifier (NMEC) method for combined modes. The NMEC-DNN is trained using data for a single fault and it classifies both single faults and multiple combined faults. The results of experiments conducted on AE data collected through an experimental test-bed demonstrate that the DNN achieves good classification performance with a maximum accuracy of 95%. The proposed method is compared with a multi-class classifier based on support vector machines (SVMs). The NMEC-DNN yields better diagnostic performance in comparison to the multi-class classifier based on SVM. The NMEC-DNN reduces the number of necessary data collections and improves the bearing fault diagnosis performance.
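    The non-mutually exclusive output layer is the key difference from a conventional multi-class head: independent sigmoid units can flag several faults at once, whereas softmax probabilities must compete and sum to one. A minimal sketch with hypothetical logits (the network producing them is not modeled here):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical final-layer logits for three bearing fault types
# (outer race, inner race, roller) on a signal with two defects at once.
logits = np.array([2.1, 1.9, -2.0])

# Mutually exclusive head: probabilities compete and must sum to 1,
# so only one fault can dominate.
print(np.round(softmax(logits), 2))

# Non-mutually-exclusive head: each fault gets an independent probability,
# so thresholding at 0.5 can flag several faults simultaneously.
probs = sigmoid(logits)
print(probs > 0.5)
```

    This is also what allows the paper's network to be trained on single-fault data yet recognize combined faults: each output unit learns its own fault signature independently.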

  19. Classification of breast abnormalities using artificial neural network

    NASA Astrophysics Data System (ADS)

    Zaman, Nur Atiqah Kamarul; Rahman, Wan Eny Zarina Wan Abdul; Jumaat, Abdul Kadir; Yasiran, Siti Salmah

    2015-05-01

    Classification is the process of recognizing, differentiating and categorizing objects into groups. Breast abnormalities such as calcifications are tumor markers that indicate the presence of cancer in the breast. The aims of this research are to classify the types of breast abnormalities using an artificial neural network (ANN) classifier and to evaluate its accuracy using the receiver operating characteristic (ROC) curve. The methods used in this research are an ANN for breast abnormality classification and the Canny edge detector for feature extraction. Previously the ANN classifier provided only the number of benign and malignant cases, without providing information for specific cases; in this research, the type of abnormality for each image can be obtained. The existing MIAS MiniMammographic database describes the mammogram images by only three features, namely the characteristic of the background tissue, the class of abnormality and the radius of the abnormality. In this research three further features are added: the number of spots, and the area and shape of the abnormalities. Lastly the performance of the ANN classifier is evaluated using the ROC curve. It is found that the ANN has an accuracy of 97.9%, which is considered acceptable.

  20. Dynamic Blowout Risk Analysis Using Loss Functions.

    PubMed

    Abimbola, Majeed; Khan, Faisal

    2018-02-01

    Most risk analysis approaches are static, failing to capture evolving conditions. Blowout, the most feared accident during a drilling operation, is a complex and dynamic event. Traditional risk analysis methods are useful in the early design stage of a drilling operation but fall short during evolving operational decision making. A new dynamic risk analysis approach is presented to capture evolving situations through dynamic probability and consequence models. The dynamic consequence models, the focus of this study, are developed in terms of loss functions. These models are subsequently integrated with the probability to estimate operational risk, providing a real-time risk analysis. The real-time evolving situation is considered dependent on the changing bottom-hole pressure as drilling progresses. The application of the methodology and models is demonstrated with a case study of an offshore drilling operation evolving to a blowout. © 2017 Society for Risk Analysis.
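    The core of the approach, risk at each stage as the product of an updated blowout probability and a loss function of the evolving operating state, can be sketched in a few lines. The quadratic loss and all numbers below are hypothetical illustrations, not the paper's models or data.

```python
# Dynamic risk sketch: risk(t) = P(blowout at t) x loss(state at t),
# both re-evaluated as bottom-hole pressure evolves during drilling.

def quadratic_loss(pressure, target, scale=1e4):
    """Hypothetical loss function: cost grows with deviation from the
    target bottom-hole pressure (the paper develops richer loss shapes)."""
    return scale * (pressure - target) ** 2

def risk(p_blowout, pressure, target):
    return p_blowout * quadratic_loss(pressure, target)

# Risk re-evaluated at three drilling stages as conditions deteriorate:
# probability and pressure deviation both grow, so risk grows faster.
for p, bhp in [(1e-4, 5000.0), (5e-4, 5200.0), (2e-3, 5600.0)]:
    print(round(risk(p, bhp, target=5000.0), 1))
```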

  1. Experiences Using Lightweight Formal Methods for Requirements Modeling

    NASA Technical Reports Server (NTRS)

    Easterbrook, Steve; Lutz, Robyn; Covington, Rick; Kelly, John; Ampo, Yoko; Hamilton, David

    1997-01-01

    This paper describes three case studies in the lightweight application of formal methods to requirements modeling for spacecraft fault protection systems. The case studies differ from previously reported applications of formal methods in that formal methods were applied very early in the requirements engineering process, to validate the evolving requirements. The results were fed back into the projects, to improve the informal specifications. For each case study, we describe what methods were applied, how they were applied, how much effort was involved, and what the findings were. In all three cases, formal methods enhanced the existing verification and validation processes, by testing key properties of the evolving requirements, and helping to identify weaknesses. We conclude that the benefits gained from early modeling of unstable requirements more than outweigh the effort needed to maintain multiple representations.

  2. Rapid methods for the detection of foodborne bacterial pathogens: principles, applications, advantages and limitations

    PubMed Central

    Law, Jodi Woan-Fei; Ab Mutalib, Nurul-Syakima; Chan, Kok-Gan; Lee, Learn-Han

    2015-01-01

    The incidence of foodborne diseases has increased over the years, resulting in a major public health problem globally. Foodborne pathogens can be found in various foods, and it is important to detect them to provide a safe food supply and to prevent foodborne diseases. The conventional methods used to detect foodborne pathogens are time consuming and laborious. Hence, a variety of methods have been developed for rapid detection of foodborne pathogens, as required in many food analyses. Rapid detection methods can be categorized into nucleic acid-based, biosensor-based and immunological-based methods. This review emphasizes the principles and applications of recent rapid methods for the detection of foodborne bacterial pathogens. The detection methods included are simple polymerase chain reaction (PCR), multiplex PCR, real-time PCR, nucleic acid sequence-based amplification (NASBA), loop-mediated isothermal amplification (LAMP) and oligonucleotide DNA microarrays, which are classified as nucleic acid-based methods; optical, electrochemical and mass-based biosensors, which are classified as biosensor-based methods; and enzyme-linked immunosorbent assay (ELISA) and lateral flow immunoassay, which are classified as immunological-based methods. In general, rapid detection methods are time-efficient, sensitive, specific and labor-saving. The development of rapid detection methods is vital to the prevention and treatment of foodborne diseases. PMID:25628612

  3. Virtually Endless Possibilities for Business Communication

    ERIC Educational Resources Information Center

    Jennings, Susan Evans

    2010-01-01

    Business communication educators need to realize that as technology changes and evolves, they must also change and evolve their teaching methods and content. Cell phones, email, blogs, wikis, and text messaging are just a few examples of business communication technologies that not so long ago were viewed as entertainment for teens or techies, but…

  4. Reconstruction of instantaneous surface normal velocity of a vibrating structure using interpolated time-domain equivalent source method

    NASA Astrophysics Data System (ADS)

    Geng, Lin; Bi, Chuan-Xing; Xie, Feng; Zhang, Xiao-Zheng

    2018-07-01

The interpolated time-domain equivalent source method is extended to reconstruct the instantaneous surface normal velocity of a vibrating structure using the time-evolving particle velocity as the input, providing a non-contact way to gain an overall understanding of the instantaneous vibration behavior of the structure. In this method, the time-evolving particle velocity in the near field is first modeled by a set of equivalent sources positioned inside the vibrating structure; the integrals of the equivalent source strengths are then solved by an iterative process and used to calculate the instantaneous surface normal velocity. An experiment on a semi-cylindrical steel plate impacted by a steel ball is investigated to examine the ability of the extended method: the time-evolving normal particle velocity and pressure on the hologram surface, measured by a Microflown pressure-velocity probe, are used as the inputs of the extended method and of the pressure-based method, respectively, and the instantaneous surface normal velocity of the plate measured by a laser Doppler vibrometer is used as the reference for comparison. The experimental results demonstrate that the extended method is a powerful tool for visualizing the instantaneous surface normal velocity of a vibrating structure in both the time and space domains, and that it obtains more accurate results than the method based on pressure measurements.

  5. Detection of urban expansion in an urban-rural landscape with multitemporal QuickBird images

    PubMed Central

    Lu, Dengsheng; Hetrick, Scott; Moran, Emilio; Li, Guiying

    2011-01-01

Accurately detecting urban expansion with remote sensing techniques is a challenge due to the complexity of urban landscapes. This paper explored methods for detecting urban expansion with multitemporal QuickBird images in Lucas do Rio Verde, Mato Grosso, Brazil. Different techniques, including image differencing, principal component analysis (PCA), and comparison of impervious surface images classified with the matched filtering method, were examined for urbanization detection. An impervious surface image classified with the hybrid method was used to refine the urbanization detection results. For comparison, both the original multispectral image and segmentation-based mean-spectral images were used during the detection of urbanization. This research indicates that the comparison of impervious surface images classified with the matched filtering method provides the best change detection performance, followed by image differencing based on segmentation-based mean-spectral images. PCA was not a good method for urban change detection in this study. Shadows and high spectral variation within impervious surfaces represent major challenges to the detection of urban expansion when high spatial resolution images are used. PMID:21799706

  6. Health condition identification of multi-stage planetary gearboxes using a mRVM-based method

    NASA Astrophysics Data System (ADS)

    Lei, Yaguo; Liu, Zongyao; Wu, Xionghui; Li, Naipeng; Chen, Wu; Lin, Jing

    2015-08-01

Multi-stage planetary gearboxes are widely applied in the aerospace, automotive and heavy industries. Their key components, such as gears and bearings, can easily suffer damage due to tough working environments. Health condition identification of planetary gearboxes aims to prevent accidents and save costs. This paper proposes a method based on the multiclass relevance vector machine (mRVM) to identify the health condition of multi-stage planetary gearboxes. In this method, an mRVM algorithm is adopted as the classifier, and two features, i.e., accumulative amplitudes of carrier orders (AACO) and energy ratio based on difference spectra (ERDS), are used as the input of the classifier to distinguish different health conditions of multi-stage planetary gearboxes. To test the proposed method, seven health conditions of a two-stage planetary gearbox are considered, and vibration data are acquired from the gearbox under different motor speeds and loading conditions. The results of three tests based on different data show that the proposed method achieves better identification performance and robustness than the existing method.

  7. Generative Models for Similarity-based Classification

    DTIC Science & Technology

    2007-01-01

NC), local nearest centroid (local NC), k-nearest neighbors (kNN), and condensed nearest neighbors (CNN) are all similarity-based classifiers which... vector machine to the k nearest neighbors of the test sample [80]. The SVM-KNN method was developed to address the robustness and dimensionality... concerns that afflict nearest neighbors and SVMs. Similarly to the nearest-means classifier, the SVM-KNN is a hybrid local and global classifier developed

  8. Ensemble Clustering Classification compete SVM and One-Class classifiers applied on plant microRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbedAllah, Loai

    2016-12-22

The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to the successful application of these algorithms. We have demonstrated that k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. A clustering ensemble is used to define the distance between points with respect to how often they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named the ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved the highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. The averaged results show that EC-kNN outperforms all other methods employed here as well as previously published results for the same data. In conclusion, this study shows that the chosen classifier achieves high performance when the distance metric is carefully chosen.
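    The co-clustering distance described above can be sketched in a few lines of stdlib Python. This is an illustrative reconstruction, not the authors' implementation: the ensemble is given as precomputed cluster label assignments, the distance is one minus the fraction of clusterings in which two points share a cluster, and a plain kNN vote uses that distance. All names and the toy data are hypothetical.

    ```python
    from collections import Counter

    def co_association_distance(i, j, clusterings):
        """Distance = fraction of ensemble clusterings that place points
        i and j in different clusters (1 - co-clustering rate)."""
        agree = sum(1 for labels in clusterings if labels[i] == labels[j])
        return 1.0 - agree / len(clusterings)

    def ec_knn_predict(test_idx, train_idxs, y, clusterings, k=3):
        """Majority vote among the k labeled points closest to test_idx
        under the co-association distance."""
        neighbours = sorted(
            train_idxs,
            key=lambda j: co_association_distance(test_idx, j, clusterings))[:k]
        return Counter(y[j] for j in neighbours).most_common(1)[0][0]

    # Toy ensemble: three clusterings of six points; points 0-2 tend to
    # co-cluster, as do points 3-5.  Point 2 is the unlabeled query.
    clusterings = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 1, 1],
    ]
    y = {0: "pos", 1: "pos", 3: "neg", 4: "neg"}  # labels of training points
    print(ec_knn_predict(2, [0, 1, 3, 4], y, clusterings, k=3))  # -> "pos"
    ```

    The query co-clusters with points 0 and 1 in two of three clusterings, so they dominate the vote even though one "neg" neighbour also enters the top three.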

  9. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbdAllah, Loai

    2016-12-01

The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to the successful application of these algorithms. We have demonstrated that k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. A clustering ensemble is used to define the distance between points with respect to how often they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named the ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved the highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. The averaged results show that EC-kNN outperforms all other methods employed here as well as previously published results for the same data. In conclusion, this study shows that the chosen classifier achieves high performance when the distance metric is carefully chosen.

  10. Intratumor heterogeneity of DCE-MRI reveals Ki-67 proliferation status in breast cancer

    NASA Astrophysics Data System (ADS)

    Cheng, Hu; Fan, Ming; Zhang, Peng; Liu, Bin; Shao, Guoliang; Li, Lihua

    2018-03-01

Breast cancer is a highly heterogeneous disease both biologically and clinically, and certain pathologic parameters, such as Ki-67 expression, are useful in predicting patient prognosis. The aim of this study is to identify intratumor heterogeneity of breast cancer for predicting Ki-67 proliferation status in estrogen receptor (ER)-positive breast cancer patients. A dataset of 77 patients who underwent dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) examination was collected; of these, 51 had high and 26 had low Ki-67 expression. We partitioned each breast tumor into subregions using two methods based on the values of time to peak (TTP) and peak enhancement rate (PER). Within each tumor subregion, image features were extracted from DCE-MRI, including statistical and morphological features. Classification models were applied to each region separately to assess whether classifiers based on features extracted from the various subregions differ in predictive performance. The area under the receiver operating characteristic curve (AUC) was computed using the leave-one-out cross-validation (LOOCV) method. The classifier using features related to moderate time to peak achieved the best performance, with an AUC of 0.826, compared with those based on the other regions. Using a multi-classifier fusion method, the AUC was significantly (P=0.03) increased to 0.858+/-0.032, compared with an AUC of 0.778 for the classifier using features from the entire tumor. The results demonstrate that features reflecting heterogeneity in intratumoral subregions can improve classifier performance in predicting Ki-67 proliferation status relative to a classifier using features from the entire tumor alone.
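    The subregion classifiers above are compared by AUC. A minimal stdlib sketch of the rank-statistic definition of AUC (the probability that a random positive case outscores a random negative one) is shown below; the scores and labels are illustrative, not data from the study.

    ```python
    def auc(scores, labels):
        """AUC as a rank statistic: the probability that a randomly chosen
        positive case receives a higher score than a randomly chosen negative
        one (ties count 0.5).  Equivalent to the area under the ROC curve."""
        pos = [s for s, l in zip(scores, labels) if l == 1]
        neg = [s for s, l in zip(scores, labels) if l == 0]
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    # Two high-Ki-67 cases (label 1) and two low-Ki-67 cases (label 0):
    # one positive is outscored by one negative, so 3 of 4 pairs are correct.
    print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))  # -> 0.75
    ```

    In an LOOCV setting, each held-out case contributes one score, and the AUC is computed once over all held-out scores.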

  11. Heterogeneous Ensemble Combination Search Using Genetic Algorithm for Class Imbalanced Data Classification.

    PubMed

    Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo

    2016-01-01

Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble depends on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimal combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data to evaluate the quality of each candidate ensemble. To combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with a random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) - k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmark datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction, and we expect the proposed GA-EoC to perform consistently in other cases.
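    The search idea behind GA-EoC can be sketched with a toy elitist genetic algorithm: each individual is a bitmask selecting base classifiers, and fitness is the accuracy of their majority vote. This is a hypothetical simplification (cached predictions on one tiny dataset instead of 10-fold cross-validation); all names and data are illustrative.

    ```python
    import random

    y_true = [0, 0, 1, 1, 1, 0]      # ground-truth labels (toy data)
    preds = [                         # cached predictions of 4 base classifiers
        [0, 0, 1, 1, 1, 0],           # perfect
        [1, 1, 0, 0, 0, 1],           # always wrong
        [0, 0, 1, 1, 0, 0],           # 5/6 correct
        [0, 1, 1, 1, 1, 0],           # 5/6 correct
    ]

    def majority_vote(member_preds):
        """Fuse member prediction vectors by per-sample majority vote."""
        return [max(set(col), key=col.count) for col in zip(*member_preds)]

    def fitness(mask, preds, y):
        """Accuracy of the majority vote over the classifiers selected by mask."""
        chosen = [p for bit, p in zip(mask, preds) if bit]
        if not chosen:
            return 0.0
        fused = majority_vote(chosen)
        return sum(f == t for f, t in zip(fused, y)) / len(y)

    def ga_search(preds, y, pop_size=20, gens=30, seed=1):
        """Evolve bitmasks that select ensemble members (elitist GA)."""
        rng = random.Random(seed)
        n = len(preds)
        pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
        for _ in range(gens):
            pop.sort(key=lambda m: fitness(m, preds, y), reverse=True)
            survivors = pop[:pop_size // 2]       # elitism: keep the top half
            children = []
            while len(survivors) + len(children) < pop_size:
                a, b = rng.sample(survivors, 2)   # parent selection
                cut = rng.randrange(1, n)         # one-point crossover
                child = a[:cut] + b[cut:]
                child[rng.randrange(n)] ^= 1      # point mutation
                children.append(child)
            pop = survivors + children
        return max(pop, key=lambda m: fitness(m, preds, y))

    best = ga_search(preds, y_true)
    ```

    Because the top half of each generation is carried over unchanged, the best ensemble found never degrades across generations; the real GA-EoC replaces this toy fitness with cross-validated accuracy.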

  12. Please Don't Move-Evaluating Motion Artifact From Peripheral Quantitative Computed Tomography Scans Using Textural Features.

    PubMed

    Rantalainen, Timo; Chivers, Paola; Beck, Belinda R; Robertson, Sam; Hart, Nicolas H; Nimphius, Sophia; Weeks, Benjamin K; McIntyre, Fleur; Hands, Beth; Siafarikas, Aris

Most imaging methods, including peripheral quantitative computed tomography (pQCT), are susceptible to motion artifacts, particularly in fidgety pediatric populations. Methods currently used to address motion artifact include manual screening (visual inspection) and objective assessments of the scans. However, previously reported objective methods either cannot be applied to the reconstructed image or have not been tested for distal bone sites. Therefore, the purpose of the present study was to develop and validate motion artifact classifiers to quantify motion artifact in pQCT scans. We tested whether textural features could provide adequate motion artifact classification in two adolescent datasets with pQCT scans of the tibial and radial diaphyses and epiphyses. The first dataset was split into training (66% of the sample) and validation (33% of the sample) datasets, with visual classification used as the ground truth. Moderate to substantial classification performance (J48 classifier, kappa coefficients from 0.57 to 0.80) was observed in the validation dataset with the novel texture-based classifier. Applying the same classifier to the second, cross-sectional dataset yielded slight-to-fair (κ = 0.01-0.39) classification performance. Overall, this novel textural analysis-based classifier provided moderate-to-substantial classification of motion artifact when the classifier was specifically trained for the measurement device and population. Classification based on textural features may be used to prescreen obviously acceptable and unacceptable scans, with subsequent human-operated visual classification of any remaining scans. Copyright © 2017 The International Society for Clinical Densitometry. Published by Elsevier Inc. All rights reserved.

  13. Segmentation and analysis of mouse pituitary cells with graphic user interface (GUI)

    NASA Astrophysics Data System (ADS)

    González, Erika; Medina, Lucía.; Hautefeuille, Mathieu; Fiordelisio, Tatiana

    2018-02-01

In this work we present a method to perform pituitary cell segmentation in image stacks acquired by fluorescence microscopy from pituitary slice preparations. Although many procedures have been developed for cell segmentation, they are generally based on edge detection and require high-resolution images. In the biological preparations we worked with, however, the cells are not well delineated: experts identify them by their intracellular calcium activity, seen as fluorescence intensity changes in different regions over time. These intensity changes were associated with time series over regions and, because they exhibit a characteristic behavior, were used in a classification procedure to perform cell segmentation. Two logistic regression classifiers were implemented for the time series classification task, using as features the area under the curve and skewness in the first classifier, and skewness and kurtosis in the second. After finding both decision boundaries in two different feature spaces by training on 120 time series, the decision boundaries were tested on 12 image stacks through a Python graphical user interface (GUI), generating binary images in which white pixels correspond to cells and black pixels to background. Results show that the area-skewness classifier reduces the time an expert dedicates to locating cells by up to 75% in some stacks, versus 92% for the kurtosis-skewness classifier, as evaluated on the number of regions the method found. Given these promising results, we expect that this method can be improved by adding more relevant features to the classifier.
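    The three per-trace features named above (area under the curve, skewness, kurtosis) are standard statistics and can be sketched with the stdlib alone. This is an illustrative sketch, not the authors' code; it assumes non-constant traces (the standardized moments divide by the standard deviation).

    ```python
    from statistics import mean, pstdev

    def skewness(ts):
        """Third standardized moment of a (non-constant) time series."""
        m, s = mean(ts), pstdev(ts)
        return sum((x - m) ** 3 for x in ts) / (len(ts) * s ** 3)

    def kurtosis(ts):
        """Fourth standardized moment (equals 3.0 for a normal distribution)."""
        m, s = mean(ts), pstdev(ts)
        return sum((x - m) ** 4 for x in ts) / (len(ts) * s ** 4)

    def area(ts, dt=1.0):
        """Trapezoidal area under the fluorescence trace, sampled at step dt."""
        return sum((a + b) / 2.0 * dt for a, b in zip(ts, ts[1:]))
    ```

    Each region's trace is reduced to a 2-D feature vector (e.g. area and skewness), and the logistic regression decision boundary then labels it as cell or background.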

  14. Optimizing a machine learning based glioma grading system using multi-parametric MRI histogram and texture features

    PubMed Central

    Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin

    2017-01-01

Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of various machine learning methods in differentiating low-grade gliomas (LGGs) from high-grade gliomas (HGGs), as well as WHO grade II, III and IV gliomas, based on multi-parametric MRI images was performed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using a leave-one-out cross-validation (LOOCV) strategy. The influence of parameter selection on classification performance was also investigated. We found that the support vector machine (SVM) exhibited superior performance to the other classifiers. By combining all tumor attributes with the synthetic minority over-sampling technique (SMOTE), the highest classification accuracy of 0.945 for LGG versus HGG, or 0.961 for grade II, III and IV gliomas, was achieved. Application of the Recursive Feature Elimination (RFE) attribute selection strategy further improved the classification accuracies. In addition, the performances of the LibSVM, SMO and IBk classifiers were influenced by key parameters such as kernel type, C, gamma and K. SVM is a promising tool for developing an automated preoperative glioma grading system, especially when combined with the RFE strategy. Model parameters should be considered in glioma grading model optimization. PMID:28599282
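    The LOOCV strategy used to evaluate the classifiers can be sketched in plain Python: each sample is held out once, the model is fit on the remaining samples, and the held-out prediction is scored. As a stand-in for the SVM, the sketch uses a trivial nearest-centroid learner on hypothetical 2-D "feature vectors"; all names and data are illustrative.

    ```python
    def loocv_accuracy(X, y, fit, predict):
        """Leave-one-out cross-validation: hold out each sample in turn,
        train on the remaining n-1 samples, and score the held-out one."""
        hits = 0
        for i in range(len(X)):
            model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
            hits += predict(model, X[i]) == y[i]
        return hits / len(X)

    def fit_centroids(X, y):
        """Per-class mean vectors (a nearest-centroid stand-in for the SVM)."""
        cents = {}
        for label in set(y):
            pts = [x for x, t in zip(X, y) if t == label]
            cents[label] = [sum(col) / len(pts) for col in zip(*pts)]
        return cents

    def predict_centroid(cents, x):
        """Label of the closest class centroid (squared Euclidean distance)."""
        return min(cents,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])))

    # Three well-separated toy feature vectors per grade (illustrative only)
    X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]
    y = ["LGG", "LGG", "LGG", "HGG", "HGG", "HGG"]
    print(loocv_accuracy(X, y, fit_centroids, predict_centroid))  # -> 1.0
    ```

    With n = 120 patients, LOOCV fits the model 120 times; its advantage over k-fold splits is that every sample is tested exactly once against a model trained on all the others.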

  15. Heterogeneous Ensemble Combination Search Using Genetic Algorithm for Class Imbalanced Data Classification

    PubMed Central

    Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo

    2016-01-01

Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble depends on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimal combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data to evaluate the quality of each candidate ensemble. To combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with a random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) - k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmark datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction, and we expect the proposed GA-EoC to perform consistently in other cases. PMID:26764911

  16. Optimizing a machine learning based glioma grading system using multi-parametric MRI histogram and texture features.

    PubMed

    Zhang, Xin; Yan, Lin-Feng; Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin

    2017-07-18

Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of various machine learning methods in differentiating low-grade gliomas (LGGs) from high-grade gliomas (HGGs), as well as WHO grade II, III and IV gliomas, based on multi-parametric MRI images was performed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using a leave-one-out cross-validation (LOOCV) strategy. The influence of parameter selection on classification performance was also investigated. We found that the support vector machine (SVM) exhibited superior performance to the other classifiers. By combining all tumor attributes with the synthetic minority over-sampling technique (SMOTE), the highest classification accuracy of 0.945 for LGG versus HGG, or 0.961 for grade II, III and IV gliomas, was achieved. Application of the Recursive Feature Elimination (RFE) attribute selection strategy further improved the classification accuracies. In addition, the performances of the LibSVM, SMO and IBk classifiers were influenced by key parameters such as kernel type, C, gamma and K. SVM is a promising tool for developing an automated preoperative glioma grading system, especially when combined with the RFE strategy. Model parameters should be considered in glioma grading model optimization.

  17. Evolving mobile robots able to display collective behaviors.

    PubMed

    Baldassarre, Gianluca; Nolfi, Stefano; Parisi, Domenico

    2003-01-01

    We present a set of experiments in which simulated robots are evolved for the ability to aggregate and move together toward a light target. By developing and using quantitative indexes that capture the structural properties of the emerged formations, we show that evolved individuals display interesting behavioral patterns in which groups of robots act as a single unit. Moreover, evolved groups of robots with identical controllers display primitive forms of situated specialization and play different behavioral functions within the group according to the circumstances. Overall, the results presented in the article demonstrate that evolutionary techniques, by exploiting the self-organizing behavioral properties that emerge from the interactions between the robots and between the robots and the environment, are a powerful method for synthesizing collective behavior.

  18. Population Analysis of Disabled Children by Departments in France

    NASA Astrophysics Data System (ADS)

    Meidatuzzahra, Diah; Kuswanto, Heri; Pech, Nicolas; Etchegaray, Amélie

    2017-06-01

In this study, a statistical analysis is performed by modelling the variation in the disabled population aged 0-19 years across French departments. The aim is to classify the departments according to their profile determinants (socioeconomic and behavioural profiles). The analysis focuses on two methods, principal component analysis (PCA) and multiple correspondence analysis (MCA), to determine which is better for interpreting the correlations among the determinants of disability (the independent variables). PCA proved to be the better method: it reduces the 14 determinants of disability to 4 axes, keeps 80% of the total information, and classifies the departments into 7 classes. MCA reduces the determinants to 3 axes, retains only 30% of the information, and classifies them into 4 classes.

  19. A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

    PubMed

    S K, Somasundaram; P, Alli

    2017-11-09

The main complication of diabetes is diabetic retinopathy (DR), a retinal vascular disease that can lead to blindness. Regular screening for early DR detection is a labor- and resource-intensive task, so automatic, computational detection of DR is an attractive solution. An automatic method can reliably determine the presence of an abnormality in fundus images (FI), but the classification step has typically been performed poorly. Recently, a few research works have been designed to analyze the texture discrimination capacity of FI to distinguish healthy images. However, the feature extraction (FE) process was not performed well due to high dimensionality. Therefore, to identify retinal features for DR diagnosis and early detection, a machine learning and ensemble classification method called the Machine Learning Bagging Ensemble Classifier (ML-BEC) is designed. The ML-BEC method comprises two stages. The first stage is the extraction of candidate objects from retinal images (RI). The candidate objects, or features for DR diagnosis, include blood vessels, optic nerve, neural tissue, neuroretinal rim, optic disc size, thickness and variance. These features are initially extracted by applying a machine learning technique called t-distributed Stochastic Neighbor Embedding (t-SNE). t-SNE generates a probability distribution across high-dimensional images in which the images are separated into similar and dissimilar pairs, and then describes a similar probability distribution across the points in a low-dimensional map. This minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points on the map. The second stage applies ensemble classifiers to the extracted features to provide accurate analysis of digital FI using machine learning.
In this stage, automatic detection of DR for a screening system using a Bagging Ensemble Classifier (BEC) is investigated. Through its voting process, bagging minimizes the error due to the variance of the base classifier. Using publicly available retinal image databases, our classifier is trained with 25% of the RI. Results show that the ensemble classifier can achieve better classification accuracy (CA) than single classification models. Empirical experiments suggest that the machine learning-based ensemble classifier is efficient for further reducing DR classification time (CT).
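    The bagging step can be sketched generically: each base learner is trained on a bootstrap resample of the training set and the final prediction is the majority vote. This is a hedged stdlib sketch, not the paper's ML-BEC pipeline; the 1-nearest-neighbour base learner and the 1-D toy "features" are stand-ins for illustration.

    ```python
    import random
    from collections import Counter

    def fit_1nn(X, y):
        """Trivial base learner: memorize the training set."""
        return (X, y)

    def predict_1nn(model, x):
        """Label of the nearest stored point (1-nearest neighbour)."""
        Xb, yb = model
        return min(zip(Xb, yb), key=lambda p: abs(p[0] - x))[1]

    def bagging_predict(X, y, x_new, fit, predict, n_estimators=25, seed=0):
        """Train each base learner on a bootstrap resample of (X, y),
        then combine the individual predictions by majority vote."""
        rng = random.Random(seed)
        votes = []
        for _ in range(n_estimators):
            idx = [rng.randrange(len(X)) for _ in range(len(X))]
            model = fit([X[i] for i in idx], [y[i] for i in idx])
            votes.append(predict(model, x_new))
        return Counter(votes).most_common(1)[0][0]

    # Toy 1-D "feature" per image (illustrative, not the paper's features)
    X = [0.1, 0.2, 0.3, 0.4, 0.5, 1.0]
    y = ["healthy", "healthy", "healthy", "healthy", "healthy", "DR"]
    print(bagging_predict(X, y, 0.15, fit_1nn, predict_1nn))
    ```

    Because each estimator sees a different resample, their individual errors are partly decorrelated, and the vote averages that variance away, which is the rationale the abstract gives for using bagging.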

  20. Physician and stakeholder perceptions of conflict of interest policies in oncology.

    PubMed

    Lockhart, A Craig; Brose, Marcia S; Kim, Edward S; Johnson, David H; Peppercorn, Jeffrey M; Michels, Dina L; Storm, Courtney D; Schuchter, Lynn M; Rathmell, W Kimryn

    2013-05-01

    The landscape of managing potential conflicts of interest (COIs) has evolved substantially across many disciplines in recent years, but rarely are the issues more intertwined with financial and ethical implications than in the health care setting. Cancer care is a highly technologic arena, with numerous physician-industry interactions. The American Society of Clinical Oncology (ASCO) recognizes the role of a professional organization to facilitate management of these interactions and the need for periodic review of its COI policy (Policy). To gauge the sentiments of ASCO members and nonphysician stakeholders, two surveys were performed. The first asked ASCO members to estimate opinions of the Policy as it relates to presentation of industry-sponsored research. Respondents were classified as consumers or producers of research material based on demographic responses. A similar survey solicited opinions of nonphysician stakeholders, including patients with cancer, survivors, family members, and advocates. The ASCO survey was responded to by 1,967 members (1% of those solicited); 80% were producers, and 20% were consumers. Most respondents (93% of producers; 66% of consumers) reported familiarity with the Policy. Only a small proportion regularly evaluated COIs for presented research. Members favored increased transparency about relationships over restrictions on presentations of research. Stakeholders (n = 264) indicated that disclosure was "very important" to "extremely important" and preferred written disclosure (77%) over other methods. COI policies are an important and relevant topic among physicians and patient advocates. Methods to simplify the disclosure process, improve transparency, and facilitate responsiveness are critical for COI management.

  1. Case base classification on digital mammograms: improving the performance of case base classifier

    NASA Astrophysics Data System (ADS)

    Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.

    2011-10-01

Breast cancer continues to be a significant public health problem in the world. Early detection is the key to improving breast cancer prognosis. The aim of the research presented here is twofold. The first stage involves machine learning techniques that segment and extract features from masses in digital mammograms. The second stage is a problem-solving approach that classifies masses with a performance-based case-based classifier. In this paper we build a case-based classifier to diagnose mammographic images, and we explain the methods and behaviors added to the classifier to improve its performance. The initial performance-based classifier with bagging proposed in this paper has been implemented and shows an improvement in both specificity and sensitivity.

  2. Irreversibility of financial time series: A graph-theoretical approach

    NASA Astrophysics Data System (ADS)

    Flanagan, Ryan; Lacasa, Lucas

    2016-04-01

    The relation between time series irreversibility and entropy production has been recently investigated in thermodynamic systems operating away from equilibrium. In this work we explore this concept in the context of financial time series. We make use of visibility algorithms to quantify, in graph-theoretical terms, time irreversibility of 35 financial indices evolving over the period 1998-2012. We show that this metric is complementary to standard measures based on volatility and exploit it to both classify periods of financial stress and to rank companies accordingly. We then validate this approach by finding that a projection in principal components space of financial years, based on time irreversibility features, clusters together periods of financial stress from stable periods. Relations between irreversibility, efficiency and predictability are briefly discussed.
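    The visibility-algorithm step mentioned above can be illustrated with the directed horizontal visibility graph (HVG): two data points are linked when every value between them lies strictly below both. The sketch below is a naive O(n²) stdlib implementation for illustration only; irreversibility is then typically estimated by comparing the out-degree distribution of the series with that of its time reversal (e.g. via a Kullback-Leibler divergence), which is not shown here.

    ```python
    def hvg_out_degrees(ts):
        """Directed horizontal visibility graph: link t -> s (t < s) when every
        intermediate value lies strictly below both endpoints; return the
        out-degree of each node."""
        deg = [0] * len(ts)
        for t in range(len(ts)):
            for s in range(t + 1, len(ts)):
                if all(ts[k] < min(ts[t], ts[s]) for k in range(t + 1, s)):
                    deg[t] += 1
        return deg

    print(hvg_out_degrees([1, 3, 2, 4]))  # -> [1, 2, 1, 0]
    ```

    For a statistically reversible series the forward and backward degree distributions coincide, so their divergence serves as the irreversibility metric applied to the 35 financial indices.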

  3. Future Research Directions in Asthma. An NHLBI Working Group Report.

    PubMed

    Levy, Bruce D; Noel, Patricia J; Freemer, Michelle M; Cloutier, Michelle M; Georas, Steve N; Jarjour, Nizar N; Ober, Carole; Woodruff, Prescott G; Barnes, Kathleen C; Bender, Bruce G; Camargo, Carlos A; Chupp, Geoff L; Denlinger, Loren C; Fahy, John V; Fitzpatrick, Anne M; Fuhlbrigge, Anne; Gaston, Ben M; Hartert, Tina V; Kolls, Jay K; Lynch, Susan V; Moore, Wendy C; Morgan, Wayne J; Nadeau, Kari C; Ownby, Dennis R; Solway, Julian; Szefler, Stanley J; Wenzel, Sally E; Wright, Rosalind J; Smith, Robert A; Erzurum, Serpil C

    2015-12-01

    Asthma is a common chronic disease without cure. Our understanding of asthma onset, pathobiology, classification, and management has evolved substantially over the past decade; however, significant asthma-related morbidity and excess healthcare use and costs persist. To address this important clinical condition, the NHLBI convened a group of extramural investigators for an Asthma Research Strategic Planning workshop on September 18-19, 2014, to accelerate discoveries and their translation to patients. The workshop focused on (1) in utero and early-life origins of asthma, (2) the use of phenotypes and endotypes to classify disease, (3) defining disease modification, (4) disease management, and (5) implementation research. This report summarizes the workshop and produces recommendations to guide future research in asthma.

  4. Classification of proteins with shared motifs and internal repeats in the ECOD database

    PubMed Central

    Kinch, Lisa N.; Liao, Yuxing

    2016-01-01

    Abstract Proteins and their domains evolve by a set of events commonly including the duplication and divergence of small motifs. The presence of short repetitive regions in domains has generally constituted a difficult case for structural domain classifications and their hierarchies. We developed the Evolutionary Classification Of protein Domains (ECOD) in part to implement a new schema for the classification of these types of proteins. Here we document the ways in which ECOD classifies proteins with small internal repeats, widespread functional motifs, and assemblies of small domain-like fragments in its evolutionary schema. We illustrate the ways in which the structural genomics project impacted the classification and characterization of new structural domains and sequence families over the past decade. PMID:26833690

  5. Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria

    PubMed Central

    Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.

    2008-01-01

    Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
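
    The CART idea can be illustrated with a single-split "stump" on one discriminating tetramer. The sequences and labels below are toy stand-ins for the 195 genomes; only the choice of GAGA as a discriminating tetramer is taken from the abstract:

    ```python
    def tetramer_freq(seq, tetramer):
        """Frequency of one overlapping tetramer in a sequence."""
        n = len(seq) - 3
        return sum(1 for i in range(n) if seq[i:i + 4] == tetramer) / n

    def best_stump(freqs, labels):
        """Single CART-style split: the threshold on one tetramer's frequency
        that minimises misclassification over the training set."""
        best_t, best_correct = None, -1
        for t in sorted(set(freqs)):
            correct = sum((f >= t) == bool(y) for f, y in zip(freqs, labels))
            correct = max(correct, len(labels) - correct)  # allow flipped polarity
            if correct > best_correct:
                best_t, best_correct = t, correct
        return best_t, best_correct

    # Toy "genomes": purine-loaded (GAGA-rich) thermophiles vs. others.
    hot = ["GAGA" * 25, "GAGAGAGATT" * 10]
    cold = ["ATCG" * 25, "TTACGGCA" * 12]
    freqs = [tetramer_freq(s, "GAGA") for s in hot + cold]
    ```

    A full CART tree simply recurses this split search over the remaining tetramers and sample subsets.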

  6. Diatoms in comets

    NASA Technical Reports Server (NTRS)

    Hoover, R.; Hoyle, F.; Wallis, M. K.; Wickramasinghe, N. C.

    1986-01-01

    The fossil record of the microscopic algae classified as diatoms suggests they were injected onto Earth at the Cretaceous boundary. Not only could diatoms remain viable in the cometary environment, but many species might also replicate in illuminated surface layers or early interior layers of cometary ice. Presumably they reached the solar system on an interstellar comet as an already-evolved assemblage of organisms. Diatoms might cause color changes to comet nuclei as their outgassing decays and revives around highly elliptical orbits. Just as for interstellar absorption, high-resolution IR observations are capable of distinguishing whether the 10-micron feature arises from siliceous diatom material or mineral silicates. The 10-30-micron band and the UV 220-nm region can also provide evidence of biological material.

  7. Review of calcium methodologies.

    PubMed

    Zak, B; Epstein, E; Baginski, E S

    1975-01-01

    A review of calcium methodologies for serum is presented. The analytical systems developed over the past century are classified by type, beginning with gravimetry and extending to isotope dilution-mass spectrometry, covering all of the commonly used techniques that have evolved during that period. Screening and referee procedures are discussed, along with the comparative sensitivities of atomic absorption spectrophotometry and molecular absorption spectrophotometry. A procedure involving a simple direct reaction for serum calcium using cresolphthalein complexone is recommended, in which high blanks are minimized by repressing the ionization of the color reagent through lowering the dielectric-constant characteristics of the mixture with dimethylsulfoxide. Reaction characteristics, errors that can be encountered, normal ranges, and an interpretative resume are included in the discussion.

  8. X-ray induced mutations in jute (Corchorus capsularis L. and Corchorus olitorius L.)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singh, D.P.; Sharma, B.K.; Banerjee, S.C.

    1973-09-30

    Dry dormant seeds of three varieties of jute (C. capsularis and C. olitorius), which yield commercial fibers, were irradiated with x-ray doses ranging from 10 kR to 100 kR at 10 kR intervals. The percentage of germination, survival rates, the resulting morphological abnormalities in different generations, the total abnormalities, the total mutation frequency including chlorophyll mutations, and the complete mutation spectrum are described in detail. Mutations were classified into different groups and each mutant is briefly described. Several directly useful mutations were observed, with emphasis on fiber yield. Interesting results were obtained after crossing mutants, whereby the first high-yielding hybrid was evolved by the senior author.

  9. Future Research Directions in Asthma. An NHLBI Working Group Report

    PubMed Central

    Levy, Bruce D.; Freemer, Michelle M.; Cloutier, Michelle M.; Georas, Steve N.; Jarjour, Nizar N.; Ober, Carole; Woodruff, Prescott G.; Barnes, Kathleen C.; Bender, Bruce G.; Camargo, Carlos A.; Chupp, Geoff L.; Denlinger, Loren C.; Fahy, John V.; Fitzpatrick, Anne M.; Fuhlbrigge, Anne; Gaston, Ben M.; Hartert, Tina V.; Kolls, Jay K.; Lynch, Susan V.; Moore, Wendy C.; Morgan, Wayne J.; Nadeau, Kari C.; Ownby, Dennis R.; Solway, Julian; Szefler, Stanley J.; Wenzel, Sally E.; Wright, Rosalind J.; Smith, Robert A.; Erzurum, Serpil C.

    2015-01-01

    Asthma is a common chronic disease without cure. Our understanding of asthma onset, pathobiology, classification, and management has evolved substantially over the past decade; however, significant asthma-related morbidity and excess healthcare use and costs persist. To address this important clinical condition, the NHLBI convened a group of extramural investigators for an Asthma Research Strategic Planning workshop on September 18–19, 2014, to accelerate discoveries and their translation to patients. The workshop focused on (1) in utero and early-life origins of asthma, (2) the use of phenotypes and endotypes to classify disease, (3) defining disease modification, (4) disease management, and (5) implementation research. This report summarizes the workshop and produces recommendations to guide future research in asthma. PMID:26305520

  10. Methodological approaches of health technology assessment.

    PubMed

    Goodman, C S; Ahn, R

    1999-12-01

    In this era of evolving health care systems throughout the world, technology remains the substance of health care. Medical informatics comprises a growing contribution to the technologies used in the delivery and management of health care. Diverse, evolving technologies include artificial neural networks, computer-assisted surgery, computer-based patient records, hospital information systems, and more. Decision-makers increasingly demand well-founded information to determine whether or how to develop these technologies, allow them on the market, acquire them, use them, pay for their use, and more. The development and wider use of health technology assessment (HTA) reflects this demand. While HTA offers systematic, well-founded approaches for determining the value of medical informatics technologies, HTA must continue to adapt and refine its methods in response to these evolving technologies. This paper provides a basic overview of HTA principles and methods.

  11. Classification and identification of molecules through factor analysis method based on terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Huang, Jianglou; Liu, Jinsong; Wang, Kejia; Yang, Zhengang; Liu, Xiaming

    2018-06-01

    By means of a factor analysis approach, a molecule classification method is built on the measured terahertz absorption spectra of the molecules. A data matrix is obtained by sampling the absorption spectra at different frequency points and is then decomposed into the product of two matrices: a weight matrix and a characteristic matrix. By applying K-means clustering to the weight matrix, the molecules can be classified. A group of samples (spirobenzopyran, indole, styrene derivatives and inorganic salts) was prepared and measured with a terahertz time-domain spectrometer. The samples are classified with 75% accuracy relative to the classification given directly by their molecular formulas.
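
    A minimal sketch of the decompose-then-cluster pipeline. A truncated SVD stands in for the factor-analysis step, the K-means is hand-rolled with deterministic initialisation, and the Gaussian "absorption spectra" are synthetic stand-ins for the measured samples:

    ```python
    import numpy as np

    def decompose(X, n_factors=2):
        """Factor the data matrix into weights x characteristic spectra.
        A truncated SVD plays the role of the factor analysis here."""
        U, S, Vt = np.linalg.svd(X, full_matrices=False)
        weights = U[:, :n_factors] * S[:n_factors]   # one row per sample
        characteristic = Vt[:n_factors]              # one row per factor
        return weights, characteristic

    def kmeans(W, k=2, iters=20):
        """Minimal deterministic k-means on the weight matrix."""
        centroids = W[::max(1, len(W) // k)][:k].copy()
        labels = np.zeros(len(W), dtype=int)
        for _ in range(iters):
            d = ((W[:, None, :] - centroids[None]) ** 2).sum(-1)
            labels = d.argmin(1)
            for c in range(k):
                if (labels == c).any():
                    centroids[c] = W[labels == c].mean(0)
        return labels

    # Synthetic spectra: two molecule groups with absorption peaks at
    # different frequency points (shapes are illustrative only).
    rng = np.random.default_rng(0)
    freqs = np.arange(100)
    peak = lambda c: np.exp(-0.5 * ((freqs - c) / 3.0) ** 2)
    X = np.vstack([peak(25) + 0.02 * rng.standard_normal(100) for _ in range(5)]
                  + [peak(70) + 0.02 * rng.standard_normal(100) for _ in range(5)])
    W, _ = decompose(X)
    labels = kmeans(W)
    ```

    Clustering happens in the low-dimensional weight space rather than on the raw spectra, which is the essential point of the decomposition.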

  12. Automated Decision Tree Classification of Corneal Shape

    PubMed Central

    Twa, Michael D.; Parthasarathy, Srinivasan; Roberts, Cynthia; Mahmoud, Ashraf M.; Raasch, Thomas W.; Bullimore, Mark A.

    2011-01-01

    Purpose: The volume and complexity of data produced during videokeratography examinations present a challenge of interpretation. As a consequence, results are often analyzed qualitatively by subjective pattern recognition or reduced to comparisons of summary indices. We describe the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way. We then compared this method with other known classification methods. Methods: The corneal surface was modeled with a seventh-order Zernike polynomial for 132 normal eyes of 92 subjects and 112 eyes of 71 subjects diagnosed with keratoconus. A decision tree classifier was induced using the C4.5 algorithm, and its classification performance was compared with the modified Rabinowitz–McDonnell index, Schwiegerling's Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index using recommended classification thresholds for each method. We also evaluated the area under the receiver operator characteristic (ROC) curve for each classification method. Results: Our decision tree classifier performed equal to or better than the other classifiers tested: accuracy was 92% and the area under the ROC curve was 0.97. Our decision tree classifier reduced the information needed to distinguish between normal and keratoconus eyes, using four of 36 Zernike polynomial coefficients. The four surface features selected as classification attributes by the decision tree method were inferior elevation, greater sagittal depth, oblique toricity, and trefoil. Conclusions: Automated decision tree classification of corneal shape through Zernike polynomials is an accurate quantitative method of classification that is interpretable and can be generated from any instrument platform capable of raw elevation data output. This method of pattern classification is extendable to other classification problems. PMID:16357645
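
    The area under the ROC curve used in that comparison can be computed for any score-producing classifier directly from ranks (the Mann-Whitney formulation); a generic sketch:

    ```python
    def auc(scores, labels):
        """Probability that a randomly chosen positive case scores higher
        than a randomly chosen negative one (ties count half)."""
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        return wins / (len(pos) * len(neg))
    ```

    An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to a perfect separation of the two classes, so the paper's 0.97 indicates near-perfect ranking of keratoconic over normal eyes.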

  13. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy

    PubMed Central

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing

    2016-01-01

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and hold therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing the physiological processes of oxidation/antioxidation balance and to developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The predictions of the ensemble are determined by averaging the predictions of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48, with an accuracy of 0.925. A Relief method combined with IFS (Incremental Feature Selection) is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset and, encouragingly, performs better than previous studies. In addition, it achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method is a promising candidate for antioxidant protein prediction. For public access, we provide a user-friendly web server for antioxidant protein identification, freely accessible at http://antioxidant.weka.cc. PMID:27662651

  14. [Object-oriented segmentation and classification of forest gap based on QuickBird remote sensing image].

    PubMed

    Mao, Xue Gang; Du, Zi Han; Liu, Jia Qian; Chen, Shu Xin; Hou, Ji Yu

    2018-01-01

    Traditional field investigation and artificial interpretation cannot satisfy the need for forest gap extraction at the regional scale; high-spatial-resolution remote sensing imagery makes regional extraction possible. In this study, we used an object-oriented classification method to segment and classify forest gaps based on QuickBird high-resolution optical remote sensing imagery in the Jiangle National Forestry Farm of Fujian Province. In the object-oriented classification process, 10 scales (10-100, with a step length of 10) were adopted to segment the QuickBird image, and the intersection area of the reference object (RA_or) and the intersection area of the segmented object (RA_os) were adopted to evaluate the segmentation result at each scale. For the segmentation result at each scale, 16 spectral characteristics and a support vector machine (SVM) classifier were further used to classify forest gaps, non-forest gaps and others. The results showed that the optimal segmentation scale was 40, where RA_or was equal to RA_os. The accuracy difference between the maximum and minimum across segmentation scales was 22%. At the optimal scale, the overall classification accuracy was 88% (Kappa = 0.82) with the SVM classifier. Combining high-resolution remote sensing imagery with an object-oriented classification method can replace traditional field investigation and artificial interpretation to identify and classify forest gaps at the regional scale.

  15. An automatic aerosol classification for earlinet: application and results

    NASA Astrophysics Data System (ADS)

    Papagiannopoulos, Nikolaos; Mona, Lucia; Amiridis, Vassilis; Binietoglou, Ioannis; D'Amico, Giuseppe; Guma-Claramunt, P.; Schwarz, Anja; Alados-Arboledas, Lucas; Amodeo, Aldo; Apituley, Arnoud; Baars, Holger; Bortoli, Daniele; Comeron, Adolfo; Guerrero-Rascado, Juan Luis; Kokkalis, Panos; Nicolae, Doina; Papayannis, Alex; Pappalardo, Gelsomina; Wandinger, Ulla; Wiegner, Matthias

    2018-04-01

    Aerosol typing is essential for understanding the impact of different aerosol sources on climate, weather systems and air quality. An aerosol classification method for EARLINET (European Aerosol Research Lidar Network) measurements is introduced which makes use of the Mahalanobis distance classifier. The performance of the automatic classification is tested against manually classified EARLINET data. Results of the application of the method to an extensive aerosol dataset are presented.
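
    The Mahalanobis distance classifier at the core of the automatic typing can be sketched generically. The two-class 2-D training set below is a synthetic stand-in; the actual EARLINET lidar-derived feature set is not reproduced:

    ```python
    import numpy as np

    def fit_mahalanobis(X, y):
        """Per-class mean vectors plus a pooled covariance matrix."""
        y = np.asarray(y)
        classes = sorted(set(y.tolist()))
        means = {c: X[y == c].mean(0) for c in classes}
        pooled = sum(np.cov(X[y == c].T) * ((y == c).sum() - 1) for c in classes)
        pooled /= len(y) - len(classes)
        return means, np.linalg.inv(pooled)

    def classify(x, means, inv_cov):
        """Assign the class whose mean is closest in Mahalanobis distance."""
        return min(means, key=lambda c: (x - means[c]) @ inv_cov @ (x - means[c]))

    # Synthetic two-type training data (labels are illustrative only).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal([5.0, 0.0], 0.5, (20, 2)),
                   rng.normal([0.0, 5.0], 0.5, (20, 2))])
    y = ["dust"] * 20 + ["smoke"] * 20
    means, inv_cov = fit_mahalanobis(X, y)
    ```

    Unlike plain Euclidean distance to class means, the Mahalanobis form accounts for the scale and correlation of the features via the covariance matrix.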

  16. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    PubMed

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed, among which the support vector machine (SVM) may be the best. However, SVM is a binary classifier; when applied to the many classes of OPCCR, its computation is time-consuming. We therefore propose a neighbor-classes-based SVM (NC-SVM) to reduce the computational cost of SVM. Experiments on NC-SVM classification for OPCCR show that the proposed NC-SVM effectively reduces the computation time in OPCCR.
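
    The neighbor-classes idea can be sketched generically: a coarse centroid search first keeps only the k classes nearest to the input, and the pairwise (one-vs-one) stage then runs among those neighbors alone, so k(k-1)/2 pairwise evaluations replace C(C-1)/2 for C classes. In this dependency-free sketch a midpoint linear discriminant stands in for each pairwise SVM:

    ```python
    import numpy as np

    def centroids(X, y):
        return {c: X[y == c].mean(0) for c in np.unique(y)}

    def predict_neighbor_classes(x, cent, k=3):
        """Coarse step: keep the k classes with the nearest centroids.
        Fine step: one-vs-one voting among those neighbours only; a midpoint
        linear discriminant stands in for each pairwise SVM."""
        near = sorted(cent, key=lambda c: np.linalg.norm(x - cent[c]))[:k]
        votes = {c: 0 for c in near}
        evaluations = 0
        for i, a in enumerate(near):
            for b in near[i + 1:]:
                evaluations += 1
                w = cent[a] - cent[b]              # normal of the midpoint plane
                m = (cent[a] + cent[b]) / 2.0
                votes[a if (x - m) @ w > 0 else b] += 1
        return max(votes, key=votes.get), evaluations

    # Ten toy "character classes" arranged on a ring.
    rng = np.random.default_rng(2)
    angles = np.linspace(0.0, 2 * np.pi, 10, endpoint=False)
    X = np.vstack([rng.normal([10 * np.cos(a), 10 * np.sin(a)], 0.3, (5, 2))
                   for a in angles])
    y = np.repeat(np.arange(10), 5)
    cent = centroids(X, y)
    ```

    With 10 classes, full one-vs-one would need 45 pairwise evaluations per input; the neighbor-classes version needs only 3 when k = 3.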

  17. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested on a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and a feed-forward, back-propagating artificial neural network. The objective was to determine the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field in southwest Kansas. Study data include 3600 samples with known rock facies class (from core), each sample having either four or five measured properties (wire-line log curves) and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high-dimensional and not linearly correlated) and the feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network's effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified.

  18. The Impact of Theoretical Orientation and Training on Preference for Diagnostic Models of Personality Pathology.

    PubMed

    Paggeot, Amy; Nelson, Sharon; Huprich, Steven

    2017-01-01

    The role of theoretical orientation in determining preference for different methods of diagnosis has been largely unexplored. The goal of the present study was to explore ratings of the usefulness of four diagnostic methods after applying them to a patient: prototype ratings derived from the SWAP-II, the DSM-5 Section III specific personality disorders, the DSM-5 Section III trait model, and prototype ratings derived from the Psychodynamic Diagnostic Manual (PDM). Three hundred and twenty-nine trainees in APA-accredited doctoral programs and internships rated one of their current patients with each of the four diagnostic methods. Individuals who classified their theoretical orientation as "cognitive-behavioral" displayed a significantly greater preference for the proposed DSM-5 personality disorder prototypes than individuals who classified their orientation as "psychodynamic/psychoanalytic," while individuals who considered themselves psychodynamic or psychoanalytic rated the PDM as significantly more useful than those who considered themselves cognitive-behavioral. Individuals who classified their graduate program as a PsyD program were also more likely than those in PhD programs to rate the DSM-5 Section III and PDM models as more useful diagnostic methods. Implications and future directions are discussed.

  19. Selection-Fusion Approach for Classification of Datasets with Missing Values

    PubMed Central

    Ghannad-Rezaie, Mostafa; Soltanian-Zadeh, Hamid; Ying, Hao; Dong, Ming

    2010-01-01

    This paper proposes a new approach based on missing value pattern discovery for classifying incomplete data. This approach is particularly designed for the classification of datasets with a small number of samples and a high percentage of missing values, where available missing-value treatment approaches do not usually work well. Based on the pattern of the missing values, the proposed approach finds subsets of samples for which most of the features are available and trains a classifier for each subset. Then, it combines the outputs of the classifiers. Subset selection is translated into a clustering problem, allowing derivation of a mathematical framework for it. A trade-off is established between the computational complexity (number of subsets) and the accuracy of the overall classifier. To deal with this trade-off, a numerical criterion is proposed for the prediction of the overall performance. The proposed method is applied to seven datasets from the popular University of California, Irvine data mining archive and an epilepsy dataset from Henry Ford Hospital, Detroit, Michigan (eight datasets in total). Experimental results show that the classification accuracy of the proposed method is superior to those of the widely used multiple imputation method and four other methods. They also show that the level of superiority depends on the pattern and percentage of missing values. PMID:20212921
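
    The pattern-based decomposition can be illustrated with a deliberately simple stand-in: one nearest-centroid model per missing-value pattern, with `None` marking a gap. The paper's clustering formulation, classifier combination, and performance criterion are omitted here:

    ```python
    from collections import defaultdict

    def split_by_pattern(rows, labels):
        """Group samples by which features are observed (None marks a gap)."""
        groups = defaultdict(list)
        for row, y in zip(rows, labels):
            groups[tuple(v is not None for v in row)].append((row, y))
        return groups

    def train(groups):
        """One nearest-centroid classifier per pattern, using only the
        features observed under that pattern."""
        models = {}
        for mask, samples in groups.items():
            sums, counts = {}, {}
            for row, y in samples:
                obs = [v for v, m in zip(row, mask) if m]
                if y not in sums:
                    sums[y], counts[y] = [0.0] * len(obs), 0
                sums[y] = [s + v for s, v in zip(sums[y], obs)]
                counts[y] += 1
            models[mask] = {y: [s / counts[y] for s in sums[y]] for y in sums}
        return models

    def predict(row, models):
        """Route a new sample to the classifier trained on its own pattern."""
        mask = tuple(v is not None for v in row)
        obs = [v for v, m in zip(row, mask) if m]
        cents = models[mask]
        return min(cents,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(obs, cents[y])))

    # Toy incomplete dataset with two missing-value patterns and two classes.
    rows = [(0.0, 0.1, None), (0.2, 0.0, None), (1.0, 0.9, None), (0.9, 1.1, None),
            (0.1, None, 0.2), (1.0, None, 0.8)]
    labels = ["A", "A", "B", "B", "A", "B"]
    models = train(split_by_pattern(rows, labels))
    ```

    Each sub-classifier sees only complete data for its own feature subset, which is the key trick: no imputation is needed at all.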

  20. A Novel Feature Selection Technique for Text Classification Using Naïve Bayes.

    PubMed

    Dey Sarkar, Subhajit; Goswami, Saptarsi; Agarwal, Aman; Aktar, Javed

    2014-01-01

    With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. Many classification algorithms are available; naïve Bayes remains one of the oldest and most popular. Its implementation is simple and it requires relatively little training data. The literature, however, finds that naïve Bayes performs poorly compared to other classifiers in text classification, which makes it unattractive in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method: first a univariate feature selection to reduce the search space, then feature clustering to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is also shown to outperform traditional methods such as greedy-search-based wrappers and CFS.
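
    The two-step selection can be sketched with a standardised mean-difference score for step one and a greedy correlation filter standing in for the feature-clustering step. The data, score, and thresholds below are illustrative assumptions, not the paper's exact procedure:

    ```python
    import numpy as np

    def univariate_scores(X, y):
        """Step 1: score each feature by standardised class-mean separation."""
        a, b = X[y == 0], X[y == 1]
        return np.abs(a.mean(0) - b.mean(0)) / (X.std(0) + 1e-9)

    def select(X, y, top=3, corr_max=0.9):
        """Step 2: keep the top-scoring features, then greedily drop any
        feature highly correlated with one already kept (a simple stand-in
        for the feature-clustering step)."""
        order = np.argsort(univariate_scores(X, y))[::-1][:top]
        picked = []
        for j in order:
            if all(abs(np.corrcoef(X[:, j], X[:, p])[0, 1]) < corr_max
                   for p in picked):
                picked.append(int(j))
        return picked

    # Synthetic features: f0 informative, f1 a near-duplicate of f0,
    # f2 pure noise, f3 independently informative.
    rng = np.random.default_rng(3)
    y = np.array([0] * 50 + [1] * 50)
    f0 = y + 0.1 * rng.standard_normal(100)
    f1 = f0 + 0.01 * rng.standard_normal(100)
    f2 = rng.standard_normal(100)
    f3 = y + 0.5 * rng.standard_normal(100)
    X = np.column_stack([f0, f1, f2, f3])
    picked = select(X, y)
    ```

    The univariate pass discards the noise feature; the clustering pass then removes the redundant near-duplicate, leaving a compact, relatively independent feature set, which is exactly the condition under which naïve Bayes' independence assumption is least violated.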

  1. Evolved atmospheric entry corridor with safety factor

    NASA Astrophysics Data System (ADS)

    Liang, Zixuan; Ren, Zhang; Li, Qingdong

    2018-02-01

    Atmospheric entry corridors are established in previous research based on the equilibrium glide condition which assumes the flight-path angle to be zero. To get a better understanding of the highly constrained entry flight, an evolved entry corridor that considers the exact flight-path angle is developed in this study. Firstly, the conventional corridor in the altitude vs. velocity plane is extended into a three-dimensional one in the space of altitude, velocity, and flight-path angle. The three-dimensional corridor is generated by a series of constraint boxes. Then, based on a simple mapping method, an evolved two-dimensional entry corridor with safety factor is obtained. The safety factor is defined to describe the flexibility of the flight-path angle for a state within the corridor. Finally, the evolved entry corridor is simulated for the Space Shuttle and the Common Aero Vehicle (CAV) to demonstrate the effectiveness of the corridor generation approach. Compared with the conventional corridor, the evolved corridor is much wider and provides additional information. Therefore, the evolved corridor would benefit more to the entry trajectory design and analysis.

  2. Classifying Cereal Data (Earlier Methods)

    Cancer.gov

    The DSQ includes questions about cereal intake and allows respondents up to two responses on which cereals they consume. We classified each cereal reported first by hot or cold, and then along four dimensions: density of added sugars, whole grains, fiber, and calcium.

  3. Classification of chemical substances, reactions, and interactions: The effect of expertise

    NASA Astrophysics Data System (ADS)

    Stains, Marilyne Nicole Olivia

    2007-12-01

    This project explored the strategies that undergraduate and graduate chemistry students engaged in when solving classification tasks involving microscopic (particulate) representations of chemical substances and microscopic and symbolic representations of different chemical reactions. We were specifically interested in characterizing the basic features to which students pay attention while classifying, identifying the patterns of reasoning that they follow, and comparing the performance of students with different levels of preparation in the discipline. In general, our results suggest that advanced levels of expertise in chemical classification do not necessarily evolve in a linear and continuous way with academic training. Novice students had a tendency to reduce the cognitive demand of the task and rely on common-sense reasoning; they had difficulties differentiating concepts (conceptual undifferentiation) and based their classification decisions on only one variable (reduction). These ways of thinking led them to consider extraneous features, pay more attention to explicit or surface features than implicit features, and overlook important and relevant features. However, unfamiliar levels of representation (the microscopic level) seemed to trigger deeper and more meaningful thinking processes. On the other hand, expert students classified entities using a specific set of rules that they applied throughout the classification tasks. They considered a larger variety of implicit features, and unfamiliarity with the microscopic level of representation did not affect their reasoning processes. Consequently, novices created numerous small groups, few of them chemically meaningful, while experts created few but large chemically meaningful groups. Novices also had difficulties correctly classifying entities into chemically meaningful groups. Finally, expert chemists in our study used classification schemes that are not necessarily traditionally taught in classroom chemistry (e.g., the structure of substances is more relevant to them than their composition when classifying substances as compounds or elements). This result suggests that practice in the field may develop different types of knowledge frameworks than those usually presented in chemistry textbooks.

  4. Utilizing Electronic Medical Records to Discover Changing Trends of Medical Behaviors Over Time*

    PubMed Central

    Yin, Liangying; Dong, Wei; He, Chunhua; Duan, Huilong

    2017-01-01

    Objectives: Medical behaviors play significant roles in the delivery of high-quality and cost-effective health services, and timely discovery of their changing frequencies is beneficial for the improvement of health services. The main objective of this work is to discover the changing trends of medical behaviors over time. Methods: This study proposes a two-step approach to detect essential changing patterns of medical behaviors from Electronic Medical Records (EMRs). In detail, a probabilistic topic model, Latent Dirichlet Allocation (LDA), is first applied to disclose yearly treatment patterns with regard to the risk stratification of patients from a large volume of EMRs. After that, changing trends are detected and analyzed by comparing essential/critical medical behaviors in a specific time period, including changes of significant patient features with their values and changes of critical treatment interventions with their occurring time stamps. Results: We verify the effectiveness of the proposed approach on a clinical dataset containing 12,152 patient cases with a time range of 10 years. In total, 135 patient features and 234 treatment interventions in three treatment patterns were selected to detect their changing trends. In particular, evolving trends of yearly occurring probabilities of the selected medical behaviors were categorized into six content changing patterns (112 growing, 123 declining, 43 up-down, 16 down-up, 35 steady, and 40 jumping). Changing trends of execution time of treatment interventions were classified into three occurring-time changing patterns (175 early-implemented, 50 steady-implemented, and 9 delay-implemented). Conclusions: Experimental results show that our approach is able to utilize EMRs to discover essential evolving trends of medical behaviors, and thus has significant potential to be further explored for health services redesign and improvement. PMID:28474729

  5. Early home-based recognition of anaemia via general danger signs, in young children, in a malaria endemic community in north-east Tanzania

    PubMed Central

    Ringsted, Frank M; Bygbjerg, Ib C; Samuelsen, Helle

    2006-01-01

    Background Ethnographic studies from East Africa suggest that cerebral malaria and anaemia are not classified in local knowledge as malaria complications, but as illnesses in their own right. Cerebral malaria 'degedege' has been most researched, in spite of anaemia being a much more frequent complication in infants, and not much is known on how this is interpreted by caretakers. Anaemia is difficult to recognize clinically, even by health workers. Methods Ethnographic longitudinal cohort field study for 14 months, with monthly home-visits in families of 63 newborn babies, identified by community census, followed throughout April – November 2003 and during follow-up in April-May 2004. Interviews with care-takers (mostly mothers) and observational studies of infants and social environment were combined with three haemoglobin (Hb) screenings, supplemented with reports from mothers after health facility use. Results General danger signs, reported by mothers, e.g. infant unable to breast-feed or sit, too weak to be carried on back – besides of more alarming signs such as sleeping all time, loosing consciousness or convulsing – were well associated with actual or evolving moderate to severe anaemia (Hb ≤ 5–8 g/dl). By integrating the local descriptions of danger symptoms and signs, and comparing with actual or evolving low Hb, an algorithm to detect anaemia was developed, with significant sensitivity and specificity. For most danger signs, mothers twice as often took young children to traditional healers for herbal treatment, rather than having their children admitted to hospital. As expected, pallor was more rarely recognized by mothers, or primary reason for treatment seeking. Conclusion Mothers do recognize and respond to symptoms and danger signs related to development of anaemia, the most frequent complication of malaria in young children in malaria endemic areas. 
Mothers' observations and actions should be reconsidered and integrated into the management of childhood illness programmes. PMID:17116250

  6. Active machine learning for rapid landslide inventory mapping with VHR satellite images (Invited)

    NASA Astrophysics Data System (ADS)

    Stumpf, A.; Lachiche, N.; Malet, J.; Kerle, N.; Puissant, A.

    2013-12-01

    VHR satellite images have become a primary source for landslide inventory mapping after major triggering events such as earthquakes and heavy rainfall. Visual image interpretation is still the prevailing standard for operational purposes, but it is time-consuming and not well suited to fully exploiting the increasingly rich supply of remote sensing data. Recent studies have addressed the development of more automated image analysis workflows for landslide inventory mapping. In particular, object-oriented approaches that account for spatial and textural image information have been demonstrated to be more adequate than pixel-based classification, but manually elaborated rule-based classifiers are difficult to adapt under changing scene characteristics. Machine learning algorithms can learn classification rules for complex image patterns from labelled examples and can be adapted straightforwardly with available training data. To reduce the amount of costly training data, active learning (AL) has evolved as a key concept to guide sampling for many applications. The underlying idea of AL is to initialize a machine learning model with a small training set, and then to exploit the model state and data structure to iteratively select the most valuable samples for the user to label. With relatively few queries and labelled samples, an AL strategy yields higher accuracies than an equivalent classifier trained with many randomly selected samples. This study addressed the development of an AL method for landslide mapping from VHR remote sensing images, with special consideration of the spatial distribution of the samples. Our approach [1] is based on the Random Forest algorithm and considers the classifier uncertainty as well as the variance of potential sampling regions to guide the user towards the most valuable sampling areas. 
The algorithm explicitly searches for compact regions and thereby avoids the spatially disperse sampling pattern inherent to most other AL methods. The accuracy, the sampling time and the computational runtime of the algorithm were evaluated on multiple satellite images capturing recent large-scale landslide events. By sampling between 1% and 4% of the study areas, accuracies between 74% and 80% were achieved, whereas standard sampling schemes yielded accuracies of only 28% to 50% at equal sampling cost. Compared to commonly used point-wise AL algorithms, the proposed approach significantly reduces the number of iterations and hence the computational runtime. Since the user can focus on relatively few compact areas (rather than on hundreds of distributed points), the overall labeling time is reduced by more than 50% compared to point-wise queries. An experimental evaluation of multiple expert mappings demonstrated strong relationships between the uncertainties of the experts and of the machine learning model. It revealed that the achieved accuracies are within the range of the inter-expert disagreement and that considering ground-truth uncertainties will be indispensable for further enhancements in the future. The proposed method is generally applicable to a wide range of optical satellite images and landslide types. [1] A. Stumpf, N. Lachiche, J.-P. Malet, N. Kerle, and A. Puissant, Active learning in the spatial domain for remote sensing image classification, IEEE Transactions on Geoscience and Remote Sensing, 2013, DOI 10.1109/TGRS.2013.2262052.
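
    The uncertainty-guided, region-based query step can be sketched as follows. This is a minimal illustration on toy data, not the authors' implementation: candidate grid cells are scored by mean Random Forest uncertainty only, and the variance term and explicit compactness search described in the abstract are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy 2-class scene: pixel coordinates plus one noisy feature band.
coords = rng.uniform(0, 10, size=(400, 2))
X = np.hstack([coords, (coords[:, :1] > 5).astype(float) + rng.normal(0, 0.3, (400, 1))])
y = (coords[:, 0] > 5).astype(int)

# AL initialisation: a small random labelled set.
labeled = rng.choice(400, size=20, replace=False)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X[labeled], y[labeled])

# Per-pixel uncertainty: 1 - max posterior probability.
uncertainty = 1.0 - rf.predict_proba(X).max(axis=1)

# Region-based query: partition the scene into grid cells, score each cell
# by mean uncertainty, and propose the most uncertain compact region for
# labelling as one batch (instead of scattered point-wise queries).
cell_id = (coords // 2.5).astype(int) @ np.array([4, 1])   # 4x4 grid -> ids 0..15
scores = {c: uncertainty[cell_id == c].mean() for c in np.unique(cell_id)}
best_region = max(scores, key=scores.get)
query_idx = np.where(cell_id == best_region)[0]            # samples to label next
```

    In the real method a new Random Forest would be trained after each labelled batch and the loop repeated until the accuracy stabilises.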

  7. Inference of Time-Evolving Coupled Dynamical Systems in the Presence of Noise

    NASA Astrophysics Data System (ADS)

    Stankovski, Tomislav; Duggento, Andrea; McClintock, Peter V. E.; Stefanovska, Aneta

    2012-07-01

    A new method is introduced for analysis of interactions between time-dependent coupled oscillators, based on the signals they generate. It distinguishes unsynchronized dynamics from noise-induced phase slips and enables the evolution of the coupling functions and other parameters to be followed. It is based on phase dynamics, with Bayesian inference of the time-evolving parameters achieved by shaping the prior densities to incorporate knowledge of previous samples. The method is tested numerically and applied to reveal and quantify the time-varying nature of cardiorespiratory interactions.

  8. An integrated multi-label classifier with chemical-chemical interactions for prediction of chemical toxicity effects.

    PubMed

    Liu, Tao; Chen, Lei; Pan, Xiaoyong

    2018-05-31

    Chemical toxicity is one of the major reasons for rejecting candidate drugs, and detecting the toxicity effects of chemicals early can accelerate drug discovery. However, it is time-consuming and expensive to identify the toxicity effects of a given chemical through traditional experiments, so designing quick, reliable and non-animal-involved computational methods is an attractive alternative. In this study, a novel integrated multi-label classifier was proposed. First, based on five types of chemical-chemical interactions retrieved from STITCH, each capturing one aspect of the chemicals, five individual classifiers were built. Then, several integrated classifiers were built by integrating some or all of the individual classifiers. By testing the integrated classifiers on a dataset containing chemicals with their toxicity effects from the Accelrys Toxicity database together with non-toxic chemicals, and evaluating performance with the jackknife test, an optimal integrated classifier was selected as the proposed classifier, which achieved high prediction accuracy and broad applicability.

  9. Decimated Input Ensembles for Improved Generalization

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Oza, Nikunj C.; Norvig, Peter (Technical Monitor)

    1999-01-01

    Recently, many researchers have demonstrated that using classifier ensembles (e.g., averaging the outputs of multiple classifiers before reaching a classification decision) leads to improved performance for many difficult generalization problems. However, in many domains there are serious impediments to such "turnkey" accuracy improvements. Most notable among these is the deleterious effect of highly correlated classifiers on ensemble performance. One particular solution to this problem is generating "new" training sets by sampling the original one. However, with a finite number of patterns, this reduces the number of training patterns each classifier sees, often resulting in considerably worsened generalization performance for each individual classifier (particularly in high-dimensional data domains). Generally, this drop in individual classifier accuracy more than offsets any potential gains due to combining, unless diversity among classifiers is actively promoted. In this work, we introduce a method that: (1) reduces the correlation among the classifiers; (2) reduces the dimensionality of the data, thus lessening the impact of the "curse of dimensionality"; and (3) improves the classification performance of the ensemble.
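
    The input-decimation idea can be sketched as follows: each ensemble member is trained on a different random subset of the input features, which both reduces dimensionality and decorrelates the members, and the outputs are averaged. This is a minimal sketch on synthetic data; logistic regression stands in for the paper's neural networks, and random subsets stand in for its feature-selection criterion.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=40, n_informative=10,
                           random_state=0)

# Each member sees a different random subset ("decimation") of the inputs.
n_members, kept = 7, 15
subsets = [rng.choice(40, size=kept, replace=False) for _ in range(n_members)]
members = [LogisticRegression(max_iter=1000).fit(X[:, s], y) for s in subsets]

# Combine by averaging the members' output probabilities.
avg = np.mean([m.predict_proba(X[:, s]) for m, s in zip(members, subsets)], axis=0)
y_hat = avg.argmax(axis=1)
```

    Each member sees the full set of training patterns (unlike resampling-based ensembles), so diversity comes from the differing feature subsets rather than from reduced training data.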

  10. Hybrid ANN optimized artificial fish swarm algorithm based classifier for classification of suspicious lesions in breast DCE-MRI

    NASA Astrophysics Data System (ADS)

    Janaki Sathya, D.; Geetha, K.

    2017-12-01

    Automatic mass or lesion classification systems are developed to aid in distinguishing between malignant and benign lesions in breast DCE-MR images; to be successful for clinical use, such systems need to improve both the sensitivity and specificity of DCE-MR image interpretation. A new classifier (a set of features together with a classification method) based on artificial neural networks trained using the artificial fish swarm optimization (AFSO) algorithm is proposed in this paper. The basic idea behind the proposed classifier is to use the AFSO algorithm to search for the best combination of synaptic weights for the neural network. An optimal set of features based on statistical textural features is also presented. Experimental results confirm that the proposed suspicious-lesion classifier performs better than comparable classifiers reported in the literature, demonstrating that improvements in both sensitivity and specificity are possible through automated image analysis.
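
    The idea of training a network's weights by population-based search rather than backpropagation can be sketched as follows. This is a much-simplified stand-in for AFSO (a basic elitist "swarm around the best candidate" search) on toy 2-D data, not the paper's algorithm or features.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 2-feature binary "lesion" data.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def forward(w, X):
    # 2-4-1 network; w packs all weights and biases (2*4 + 4 + 4 + 1 = 17).
    W1, b1 = w[:8].reshape(2, 4), w[8:12]
    W2, b2 = w[12:16], w[16]
    h = np.tanh(X @ W1 + b1)
    return 1 / (1 + np.exp(-(h @ W2 + b2)))

def loss(w):
    p = forward(w, X)
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

# Population-based search: each candidate is a full weight vector that is
# resampled around the best one found so far (elitism keeps the best).
pop = rng.normal(size=(30, 17))
for _ in range(200):
    fitness = np.array([loss(w) for w in pop])
    best = pop[fitness.argmin()]
    pop = best + rng.normal(scale=0.3, size=pop.shape)
    pop[0] = best                      # elitism: never lose the best candidate

acc = ((forward(best, X) > 0.5) == y).mean()
```

    Real AFSO adds prey-seeking, swarming and following behaviours per artificial fish; the sketch only keeps the core idea of a derivative-free population search over the synaptic weights.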

  11. Visual terrain mapping for traversable path planning of mobile robots

    NASA Astrophysics Data System (ADS)

    Shirkhodaie, Amir; Amrani, Rachida; Tunstel, Edward W.

    2004-10-01

    In this paper, we have primarily discussed technical challenges and navigational skill requirements of mobile robots for traversability path planning in natural terrain environments similar to Mars surface terrains. We have described different methods for detection of salient terrain features based on imaging texture analysis techniques. We have also presented three competing techniques for terrain traversability assessment of mobile robots navigating in unstructured natural terrain environments. These three techniques include: a rule-based terrain classifier, a neural network-based terrain classifier, and a fuzzy-logic terrain classifier. Each proposed terrain classifier divides a region of natural terrain into finite sub-terrain regions and classifies the terrain condition exclusively within each sub-terrain region based on terrain visual cues. The Kalman Filtering technique is applied for aggregative fusion of the sub-terrain assessment results. The last two terrain classifiers are shown to have remarkable capability for terrain traversability assessment of natural terrains. We have conducted a comparative performance evaluation of all three terrain classifiers and presented the results in this paper.
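
    The aggregative fusion step can be illustrated with a scalar Kalman update that sequentially merges noisy traversability scores. The scores and variances below are purely illustrative, not values from the paper.

```python
# Scalar Kalman-filter fusion of noisy traversability scores, in the spirit
# of the aggregative fusion step described above (values are illustrative).
def kalman_fuse(estimates, variances):
    """Sequentially fuse (estimate, variance) pairs into one score."""
    x, p = estimates[0], variances[0]
    for z, r in zip(estimates[1:], variances[1:]):
        k = p / (p + r)          # Kalman gain: trust the less uncertain source
        x = x + k * (z - x)      # updated traversability estimate
        p = (1 - k) * p          # updated (reduced) uncertainty
    return x, p

# Three classifiers rate the same sub-terrain region (0 = blocked, 1 = clear).
score, var = kalman_fuse([0.7, 0.8, 0.6], [0.04, 0.02, 0.09])
```

    The fused estimate is pulled toward the most confident assessment, and the fused variance is smaller than any single classifier's variance, which is why fusion across sub-terrain assessments is attractive.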

  12. Soft computing-based terrain visual sensing and data fusion for unmanned ground robotic systems

    NASA Astrophysics Data System (ADS)

    Shirkhodaie, Amir

    2006-05-01

    In this paper, we have primarily discussed technical challenges and navigational skill requirements of mobile robots for traversability path planning in natural terrain environments similar to Mars surface terrains. We have described different methods for detection of salient terrain features based on imaging texture analysis techniques. We have also presented three competing techniques for terrain traversability assessment of mobile robots navigating in unstructured natural terrain environments. These three techniques include: a rule-based terrain classifier, a neural network-based terrain classifier, and a fuzzy-logic terrain classifier. Each proposed terrain classifier divides a region of natural terrain into finite sub-terrain regions and classifies the terrain condition exclusively within each sub-terrain region based on terrain visual cues. The Kalman Filtering technique is applied for aggregative fusion of the sub-terrain assessment results. The last two terrain classifiers are shown to have remarkable capability for terrain traversability assessment of natural terrains. We have conducted a comparative performance evaluation of all three terrain classifiers and presented the results in this paper.

  13. A comparative study of nonparametric methods for pattern recognition

    NASA Technical Reports Server (NTRS)

    Hahn, S. F.; Nelson, G. D.

    1972-01-01

    The applied research discussed in this report determines and compares the correct-classification percentages of the nonparametric sign test, Wilcoxon's signed-rank test, and the K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct-classification percentage is shown graphically for differences in the modes and/or means of the probability density functions for four, eight and sixteen samples. The K-class classifier performed very well with respect to the other classifiers used. Since the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even when they are not. The K-class classifier has the advantage over the Bayes classifier that it works well with non-Gaussian data without requiring the probability density function of the data to be determined. It should be noted that the data in this experiment were always unimodal.
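
    A sign-test style decision can be sketched as follows: given K samples from one of two classes with known locations, count the signs of (sample - midpoint) and let the majority vote decide. This is a simplified illustration on Laplacian data, assuming known class medians; it is not the report's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def sign_test_classify(samples, m0, m1):
    """Nonparametric two-class decision from K samples: count how many
    samples fall above the midpoint between the class locations and pick
    the class the majority of signs favours (no density assumption)."""
    mid = 0.5 * (m0 + m1)
    votes_for_1 = int(np.sum(np.asarray(samples) > mid))
    return 1 if votes_for_1 > len(samples) / 2 else 0

# Laplacian (non-Gaussian) data: 200 batches of 8 samples from class 1.
trials = [sign_test_classify(rng.laplace(loc=2.0, size=8), 0.0, 2.0)
          for _ in range(200)]
acc = float(np.mean(trials))   # fraction of batches classified as class 1
```

    A Bayes classifier built under a Gaussian assumption would mis-model the heavy Laplacian tails, which is the situation in which the nonparametric tests in the study held their own.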

  14. Automated detection of neovascularization for proliferative diabetic retinopathy screening.

    PubMed

    Roychowdhury, Sohini; Koozekanani, Dara D; Parhi, Keshab K

    2016-08-01

    Neovascularization is the primary manifestation of proliferative diabetic retinopathy (PDR) that can lead to acquired blindness. This paper presents a novel method that classifies neovascularizations in the 1-optic disc (OD) diameter region (NVD) and elsewhere (NVE) separately to achieve low false positive rates of neovascularization classification. First, the OD region and blood vessels are extracted. Next, the major blood vessel segments in the 1-OD diameter region are classified for NVD, and minor blood vessel segments elsewhere are classified for NVE. For NVD and NVE classifications, optimal region-based feature sets of 10 and 6 features, respectively, are used. The proposed method achieves classification sensitivity, specificity and accuracy for NVD and NVE of 74%, 98.2%, 87.6%, and 61%, 97.5%, 92.1%, respectively. Also, the proposed method achieves 86.4% sensitivity and 76% specificity for screening images with PDR from public and local data sets. Thus, the proposed NVD and NVE detection methods can play a key role in automated screening and prioritization of patients with diabetic retinopathy.
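
    The region-split design, where segments inside and outside the 1-OD-diameter region get separate classifiers, can be sketched as follows. The features, labels and logistic models below are toy stand-ins; the paper uses optimal region-based feature sets of 10 (NVD) and 6 (NVE) features with its own classifiers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
dist_from_od = rng.uniform(0, 5, n)       # segment distance in OD diameters
feats = rng.normal(size=(n, 4))           # toy region-based features
y = (feats[:, 0] > 0).astype(int)         # toy "neovascularization" label

# Region-specific classifiers, mirroring the NVD / NVE split above.
in_od = dist_from_od <= 1.0
nvd_clf = LogisticRegression(max_iter=1000).fit(feats[in_od], y[in_od])
nve_clf = LogisticRegression(max_iter=1000).fit(feats[~in_od], y[~in_od])

# At prediction time each vessel segment is routed to its region's classifier.
pred = np.where(in_od, nvd_clf.predict(feats), nve_clf.predict(feats))
```

    Training separate models per region lets each one use the feature set best suited to that region, which is how the method keeps false positive rates low.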

  15. Automated segmentation of dental CBCT image with prior-guided sequential random forests

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Li; Gao, Yaozong; Shi, Feng

    Purpose: Cone-beam computed tomography (CBCT) is an increasingly utilized imaging modality for the diagnosis and treatment planning of patients with craniomaxillofacial (CMF) deformities. Accurate segmentation of CBCT images is an essential step in generating 3D models for the diagnosis and treatment planning of patients with CMF deformities. However, due to image artifacts caused by beam hardening, imaging noise, inhomogeneity, truncation, and maximal intercuspation, CBCT images are difficult to segment. Methods: In this paper, the authors present a new automatic segmentation method to address these problems. Specifically, the authors first employ a majority voting method to estimate the initial segmentation probability maps of both mandible and maxilla based on multiple aligned expert-segmented CBCT images. These probability maps provide an important prior guidance for CBCT segmentation. The authors then extract both appearance features from the CBCTs and context features from the initial probability maps to train the first layer of random forest classifier, which can select discriminative features for segmentation. Based on the first layer of trained classifier, the probability maps are updated and employed to train the next layer of random forest classifier. By iteratively training subsequent random forest classifiers using both the original CBCT features and the updated segmentation probability maps, a sequence of classifiers can be derived for accurate segmentation of CBCT images. Results: Segmentation results on CBCTs of 30 subjects were both quantitatively and qualitatively validated against manually labeled ground truth. The average Dice ratios of mandible and maxilla by the authors' method were 0.94 and 0.91, respectively, which are significantly better than the state-of-the-art method based on sparse representation (p-value < 0.001). 
    Conclusions: The authors have developed and validated a novel fully automated method for CBCT segmentation.
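
    The layered (auto-context style) training loop can be sketched as follows: each random forest layer is trained on the appearance features plus the probability map produced by the previous layer. The per-voxel features and prior below are synthetic stand-ins, not CBCT data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))              # toy per-voxel appearance features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# A toy prior probability map (in the paper: majority voting over atlases).
prior = 1 / (1 + np.exp(-X[:, 0]))

# Layer 1: appearance features + prior-map context feature.
X1 = np.column_stack([X, prior])
rf1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X1, y)
prob1 = rf1.predict_proba(X1)[:, 1]

# Layer 2: the same appearance features plus the *updated* probability map,
# so each layer refines the previous layer's segmentation.
X2 = np.column_stack([X, prob1])
rf2 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X2, y)
seg = rf2.predict(X2)
```

    Further layers repeat the same pattern; in the full method the context features are spatial neighbourhoods sampled from the probability maps rather than a single scalar per voxel.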

  16. A dictionary learning approach for human sperm heads classification.

    PubMed

    Shaker, Fariba; Monadjemi, S Amirhassan; Alirezaie, Javad; Naghsh-Nilchi, Ahmad Reza

    2017-12-01

    To diagnose infertility in men, semen analysis is conducted, in which sperm morphology is one of the factors evaluated. Since manual assessment of sperm morphology is time-consuming and subjective, automatic classification methods are being developed. Automatic classification of sperm heads is a complicated task due to the intra-class differences and inter-class similarities of the class objects. In this research, a Dictionary Learning (DL) technique is utilized to construct a dictionary of sperm head shapes, which is used to classify sperm heads into four different classes. Square patches are extracted from the sperm head images, and columnized patches from each class are used to learn class-specific dictionaries. The patches from a test image are reconstructed using each class-specific dictionary, and the overall reconstruction error for each class is used to select the best-matching class. Average accuracy, precision, recall, and F-score are used to evaluate the classification method on two publicly available datasets of human sperm head shapes. The proposed DL-based method achieved an average accuracy of 92.2% on the HuSHeM dataset and an average recall of 62% on the SCIAN-MorphoSpermGS dataset. The results show a significant improvement over a previously published shape-feature-based method. In addition, the proposed approach offers a more balanced classifier in which all four classes are recognized with high precision and recall, showing that Dictionary Learning is far more effective in classifying human sperm heads than classifiers using shape-based features. A dataset of human sperm head shapes is also introduced to facilitate future research.
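
    The reconstruction-error classification rule can be sketched with scikit-learn's `MiniBatchDictionaryLearning`: learn one dictionary per class, reconstruct a test patch with each, and assign the class with the lowest error. The two synthetic "shape classes" below (random low-dimensional subspaces) are toy stand-ins for sperm-head patch sets.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
# Two toy "shape classes" of 16-D patches with different latent structure.
basis0 = rng.normal(size=(3, 16))
basis1 = rng.normal(size=(3, 16))
X0 = rng.normal(size=(80, 3)) @ basis0
X1 = rng.normal(size=(80, 3)) @ basis1

# Learn one class-specific dictionary per class.
dicts = [MiniBatchDictionaryLearning(n_components=3, transform_n_nonzero_coefs=3,
                                     random_state=0).fit(X)
         for X in (X0, X1)]

def classify(patch):
    """Assign the class whose dictionary reconstructs the patch best."""
    errs = []
    for d in dicts:
        code = d.transform(patch[None, :])        # sparse code for the patch
        recon = code @ d.components_              # reconstruction from atoms
        errs.append(np.linalg.norm(patch - recon[0]))
    return int(np.argmin(errs))

test1 = rng.normal(size=3) @ basis1               # a new class-1 patch
test0 = rng.normal(size=3) @ basis0               # a new class-0 patch
```

    A patch drawn from one class's subspace reconstructs with low error under that class's dictionary and with high error under the other's, which is what makes the minimum-error rule work.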

  17. A comprehensive simulation study on classification of RNA-Seq data.

    PubMed

    Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet

    2017-01-01

    RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging and powerful approach for diagnosis, disease classification and monitoring at the molecular level, as well as for providing potential markers of disease. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data, which violate both their data-structure and distributional assumptions. However, it is possible to apply these algorithms to RNA-Seq data with appropriate modifications. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performance. A comprehensive simulation study was conducted and the results were compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. As with differential-expression studies, the classification of RNA-Seq data requires careful attention to data overdispersion. 
We conclude that, as a count-based classifier, power-transformed PLDA and, as microarray-based classifiers, vst- or rlog-transformed RF and SVM may be good choices for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.
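
    The "bring counts closer to microarrays, then apply a microarray classifier" route can be sketched as follows. The simulated negative-binomial counts are toy data, and a simple log2 transform stands in for the vst/rlog normalisations used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n, g = 100, 50
y = rng.integers(0, 2, n)

# Overdispersed negative-binomial counts; 10 of 50 genes are
# differentially expressed between the two classes.
mu = np.ones((n, g)) * 20
mu[y == 1, :10] *= 3
counts = rng.negative_binomial(5, 5 / (5 + mu))   # mean mu, dispersion 1/5

# Transform the counts toward a continuous, variance-stabilised scale
# (a simple stand-in here for vst/rlog), then apply an RF classifier.
X = np.log2(counts + 1)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
acc = rf.score(X, y)
```

    A count-based alternative such as PLDA would instead model the counts directly with a Poisson likelihood per class, avoiding the transformation step entirely.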

  18. An Event-Driven Classifier for Spiking Neural Networks Fed with Synthetic or Dynamic Vision Sensor Data.

    PubMed

    Stromatias, Evangelos; Soto, Miguel; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé

    2017-01-01

    This paper introduces a novel methodology for training an event-driven classifier within a Spiking Neural Network (SNN) system, capable of yielding good classification results when using both synthetic input data and real data captured from Dynamic Vision Sensor (DVS) chips. The proposed supervised method uses the spiking activity provided by an arbitrary topology of prior SNN layers to build histograms and train the classifier in the frame domain using the stochastic gradient descent algorithm. In addition, this approach can cope with leaky integrate-and-fire neuron models within the SNN, a desirable feature for real-world SNN applications, where neural activation must fade away after some time in the absence of inputs. Consequently, this way of building histograms captures the dynamics of spikes immediately before the classifier. We tested our method on the MNIST data set using different synthetic encodings and on real DVS sensory data sets such as N-MNIST, MNIST-DVS, and Poker-DVS, using the same network topology and feature maps. We demonstrate the effectiveness of our approach by achieving the highest classification accuracy reported to date with a spiking convolutional network on the N-MNIST (97.77%) and Poker-DVS (100%) real DVS data sets. Moreover, by using the proposed method we were able to retrain the output layer of a previously reported spiking neural network and increase its performance by 2%, suggesting that the proposed classifier can be used as the output layer in works where features are extracted using unsupervised spike-based learning methods. We also analyze SNN performance figures such as total event activity and network latencies, which are relevant for eventual hardware implementations. In summary, the paper combines unsupervised-trained SNN layers with a supervised-trained SNN classifier and applies them to heterogeneous sets of benchmarks, both synthetic and from real DVS chips.
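
    The "histogram in the frame domain, then SGD" training step can be sketched as follows. The Poisson spike counts below are a toy stand-in for the activity of the preceding SNN layers, and a linear SGD classifier stands in for the paper's output layer; none of the DVS benchmarks are involved.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
n_neurons, n_samples = 30, 200
y = rng.integers(0, 2, n_samples)

def spike_histogram(label):
    """Toy event stream -> per-neuron spike-count histogram.
    Class-dependent firing rates mimic feature-layer activity."""
    rates = np.full(n_neurons, 2.0)
    rates[:10] += 3.0 * label          # first 10 neurons fire more for class 1
    return rng.poisson(rates)

# One histogram "frame" per presented sample.
X = np.array([spike_histogram(c) for c in y]).astype(float)

# Train the output classifier on the histograms with stochastic gradient
# descent, as in the frame-domain training described above.
clf = SGDClassifier(random_state=0).fit(X, y)
acc = clf.score(X, y)
```

    At inference time the learned weights can be mapped back onto spiking output neurons, so classification remains fully event-driven even though training happened in the frame domain.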

  19. A scaling transformation for classifier output based on likelihood ratio: Applications to a CAD workstation for diagnosis of breast cancer

    PubMed Central

    Horsch, Karla; Pesce, Lorenzo L.; Giger, Maryellen L.; Metz, Charles E.; Jiang, Yulei

    2012-01-01

    Purpose: The authors developed scaling methods that monotonically transform the output of one classifier to the “scale” of another. Such transformations affect the distribution of classifier output while leaving the ROC curve unchanged. In particular, they investigated transformations between radiologists and computer classifiers, with the goal of addressing the problem of comparing and interpreting case-specific values of output from two classifiers. Methods: Using both simulated and radiologists’ rating data of breast imaging cases, the authors investigated a likelihood-ratio-scaling transformation based on “matching” classifier likelihood ratios. For comparison, three other scaling transformations were investigated, based on matching classifier true positive fraction, false positive fraction, or cumulative distribution function, respectively. The authors explored modifying the computer output to reflect the scale of the radiologist, as well as modifying the radiologist’s ratings to reflect the scale of the computer. They also evaluated how dataset size affects the transformations. Results: When the ROC curves of two classifiers differed substantially, the four transformations were found to be quite different. The likelihood-ratio-scaling transformation was found to vary widely from radiologist to radiologist, and similar results were found for the other transformations. Simulations explored the effect of dataset size on the accuracy with which the scaling transformations are estimated. Conclusions: The likelihood-ratio-scaling transformation that the authors developed and evaluated was shown to be capable of reliably transforming computer and radiologist outputs to a common scale, thereby allowing comparison of computer and radiologist outputs on the basis of a clinically relevant statistic. PMID:22559651
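
    The simplest of the four transformations, CDF matching, can be sketched as empirical quantile mapping: a computer output is mapped to the radiologist rating that occupies the same position in its distribution. The simulated output distributions below are illustrative, not the study's data, and the likelihood-ratio variant would require density estimates rather than just CDFs.

```python
import numpy as np

rng = np.random.default_rng(0)
computer = rng.beta(2, 5, 1000)                       # computer outputs in [0, 1]
radiologist = rng.integers(1, 6, 1000).astype(float)  # 1-5 rating scale

def cdf_scale(x, source, target):
    """Monotonically map x from the source classifier's scale to the
    target's by matching empirical cumulative distribution functions
    (the CDF-matching variant of the scaling transformations above)."""
    u = np.mean(source <= x)          # empirical CDF of the source at x
    return np.quantile(target, u)     # target value at the same CDF level

scaled = cdf_scale(0.9, computer, radiologist)        # high computer output
low = cdf_scale(0.2, computer, radiologist)
high = cdf_scale(0.8, computer, radiologist)
```

    Because the mapping is monotone, case rankings (and hence the ROC curve) are preserved; only the distribution of output values changes, which is exactly the property the abstract describes.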

  20. Multicentre prospective validation of a urinary peptidome-based classifier for the diagnosis of type 2 diabetic nephropathy

    PubMed Central

    Siwy, Justyna; Schanstra, Joost P.; Argiles, Angel; Bakker, Stephan J.L.; Beige, Joachim; Boucek, Petr; Brand, Korbinian; Delles, Christian; Duranton, Flore; Fernandez-Fernandez, Beatriz; Jankowski, Marie-Luise; Al Khatib, Mohammad; Kunt, Thomas; Lajer, Maria; Lichtinghagen, Ralf; Lindhardt, Morten; Maahs, David M; Mischak, Harald; Mullen, William; Navis, Gerjan; Noutsou, Marina; Ortiz, Alberto; Persson, Frederik; Petrie, John R.; Roob, Johannes M.; Rossing, Peter; Ruggenenti, Piero; Rychlik, Ivan; Serra, Andreas L.; Snell-Bergeon, Janet; Spasovski, Goce; Stojceva-Taneva, Olivera; Trillini, Matias; von der Leyen, Heiko; Winklhofer-Roob, Brigitte M.; Zürbig, Petra; Jankowski, Joachim

    2014-01-01

    Background Diabetic nephropathy (DN) is one of the major late complications of diabetes. Treatment aimed at slowing down the progression of DN is available but methods for early and definitive detection of DN progression are currently lacking. The ‘Proteomic prediction and Renin angiotensin aldosterone system Inhibition prevention Of early diabetic nephRopathy In TYpe 2 diabetic patients with normoalbuminuria trial’ (PRIORITY) aims to evaluate the early detection of DN in patients with type 2 diabetes (T2D) using a urinary proteome-based classifier (CKD273). Methods In this ancillary study of the recently initiated PRIORITY trial we aimed to validate for the first time the CKD273 classifier in a multicentre (9 different institutions providing samples from 165 T2D patients) prospective setting. In addition we also investigated the influence of sample containers, age and gender on the CKD273 classifier. Results We observed a high consistency of the CKD273 classification scores across the different centres with areas under the curves ranging from 0.95 to 1.00. The classifier was independent of age (range tested 16–89 years) and gender. Furthermore, the use of different urine storage containers did not affect the classification scores. Analysis of the distribution of the individual peptides of the classifier over the nine different centres showed that fragments of blood-derived and extracellular matrix proteins were the most consistently found. Conclusion We provide for the first time validation of this urinary proteome-based classifier in a multicentre prospective setting and show the suitability of the CKD273 classifier to be used in the PRIORITY trial. PMID:24589724

  1. Antimicrobial Peptides from Plants

    PubMed Central

    Tam, James P.; Wang, Shujing; Wong, Ka H.; Tan, Wei Liang

    2015-01-01

    Plant antimicrobial peptides (AMPs) have evolved differently from AMPs of other life forms. They are generally rich in cysteine residues, which form multiple disulfides. In turn, the disulfides cross-brace plant AMPs as cystine-rich peptides, conferring on them extraordinarily high chemical, thermal and proteolytic stability. These cystine-rich peptides, commonly known as cysteine-rich peptides (CRPs), are classified into families based on their sequence similarity, the cysteine motifs that determine their distinctive disulfide-bond patterns, and their tertiary structure folds. Cystine-rich plant AMP families include the thionins, defensins, hevein-like peptides, knottin-type peptides (linear and cyclic), lipid transfer proteins, α-hairpinins and snakins. In addition, there are AMPs rich in other amino acids. The organization of plant AMPs into specific families with conserved structural folds enables sequence variation of the non-Cys residues encased in the same scaffold within a particular family, allowing them to play multiple functions. Furthermore, the ability of plant AMPs to tolerate hypervariable sequences within a conserved scaffold provides the diversity to recognize different targets by varying the sequence of the non-cysteine residues. These properties bode well for developing plant AMPs as potential therapeutics and for the protection of crops through transgenic methods. This review provides an overview of the major families of plant AMPs, including their structures, functions, and putative mechanisms. PMID:26580629

  2. Analysis of herbaceous plant succession and dispersal mechanisms in deglaciated terrain on Mt. Yulong, China.

    PubMed

    Chang, Li; He, Yuanqing; Yang, Taibao; Du, Jiankuo; Niu, Hewen; Pu, Tao

    2014-01-01

    Ecological succession can serve as a theoretical reference for ecosystem restoration and reconstruction. Glacier forelands are ideal places for investigating plant succession because they preserve representative ecological succession records at long temporal scales. Based on field observations and experimental data from the foreland of Baishui number 1 Glacier on Mt. Yulong, the succession and dispersal mechanisms of dominant plant species were examined by using numerical classification and ordination methods. Fifty samples were first classified into nine community types and then into three succession stages. The three succession stages occurred about 9-13, 13-102, and 110-400 years ago, respectively. The earliest succession stage contained the association of Arenaria delavayi + Meconopsis horridula. The middle stage contained the associations of Arenaria delavayi + Kobresia fragilis, Carex capilliformis + Polygonum macrophyllum, Carex kansuensis, and also Pedicularis rupicola. The last stage included the associations of Kobresia fragilis + Carex capilliformis, Kobresia fragilis, Kobresia fragilis + Ligusticum rechingerana, and Kobresia fragilis + Ligusticum sikiangense. The tendency of the succession was from bare land to sparse vegetation and then to alpine meadow. In addition, three modes of dispersal were observed, namely, anemochory, mammalichory, and myrmecochory. The dispersal modes of dominant species in the plant succession process evolved from anemochory to zoochory.

  3. Analysis of Herbaceous Plant Succession and Dispersal Mechanisms in Deglaciated Terrain on Mt. Yulong, China

    PubMed Central

    He, Yuanqing; Yang, Taibao; Du, Jiankuo; Niu, Hewen; Pu, Tao

    2014-01-01

    Ecological succession can serve as a theoretical reference for ecosystem restoration and reconstruction. Glacier forelands are ideal places for investigating plant succession because they preserve representative ecological succession records at long temporal scales. Based on field observations and experimental data from the foreland of Baishui number 1 Glacier on Mt. Yulong, the succession and dispersal mechanisms of dominant plant species were examined by using numerical classification and ordination methods. Fifty samples were first classified into nine community types and then into three succession stages. The three succession stages occurred about 9–13, 13–102, and 110–400 years ago, respectively. The earliest succession stage contained the association of Arenaria delavayi + Meconopsis horridula. The middle stage contained the associations of Arenaria delavayi + Kobresia fragilis, Carex capilliformis + Polygonum macrophyllum, Carex kansuensis, and also Pedicularis rupicola. The last stage included the associations of Kobresia fragilis + Carex capilliformis, Kobresia fragilis, Kobresia fragilis + Ligusticum rechingerana, and Kobresia fragilis + Ligusticum sikiangense. The tendency of the succession was from bare land to sparse vegetation and then to alpine meadow. In addition, three modes of dispersal were observed, namely, anemochory, mammalichory, and myrmecochory. The dispersal modes of dominant species in the plant succession process evolved from anemochory to zoochory. PMID:25401125

  4. Evolution of transversus abdominis plane infiltration techniques for postsurgical analgesia following abdominal surgeries.

    PubMed

    Gadsden, Jeffrey; Ayad, Sabry; Gonzales, Jeffrey J; Mehta, Jaideep; Boublik, Jan; Hutchins, Jacob

    2015-01-01

    Transversus abdominis plane (TAP) infiltration is a regional anesthesia technique that has been demonstrated to be effective for management of postsurgical pain after abdominal surgery. There are several different clinical variations in the approaches used for achieving analgesia via TAP infiltration, and methods for identification of the TAP have evolved considerably since the landmark-guided technique was first described in 2001. There are many factors that impact the analgesic outcomes following TAP infiltration, and the various nuances of this technique have led to debate regarding procedural classification of TAP infiltration. Based on our current understanding of fascial and neuronal anatomy of the anterior abdominal wall, as well as available evidence from studies assessing local anesthetic spread and cutaneous sensory block following TAP infiltration, it is clear that TAP infiltration techniques are appropriately classified as field blocks. While the objectives of peripheral nerve block and TAP infiltration are similar in that both approaches block sensory response in order to achieve analgesia, the technical components of the two procedures are different. Unlike peripheral nerve block, which involves identification or stimulation of a specific nerve or nerve plexus, followed by administration of a local anesthetic in close proximity, TAP infiltration involves administration and spread of local anesthetic within an anatomical plane of the surgical site.

  5. Drug designs fulfilling the requirements of clinical trials aiming at personalizing medicine

    PubMed Central

    Mandrekar, Sumithra J.; Sargent, Daniel J.

    2014-01-01

    In the current era of stratified medicine and biomarker-driven therapies, the focus has shifted from predictions based on the traditional anatomic staging systems to guide the choice of treatment for an individual patient to an integrated approach using the genetic makeup of the tumor and the genotype of the patient. The clinical trial designs utilized in the developmental pathway for biomarkers and biomarker-directed therapies from discovery to clinical practice are rapidly evolving. While several issues need careful consideration, two critical issues that surround the validation of biomarkers are the choice of the clinical trial design (which is based on the strength of the preliminary evidence and marker prevalence), and biomarker assay related issues surrounding the marker assessment methods such as the reliability and reproducibility of the assay. In this review, we focus on trial designs aiming at personalized medicine in the context of early phase trials for initial marker validation, as well as in the context of larger definitive trials. Designs for biomarker validation are broadly classified as retrospective (i.e., using data from previously well-conducted randomized controlled trials (RCTs)) versus prospective (enrichment, all-comers, hybrid or adaptive). We believe that the systematic evaluation and implementation of these design strategies are essential to accelerate the clinical validation of biomarker guided therapy. PMID:25414851

  6. Seed germination in parasitic plants: what insights can we expect from strigolactone research?

    PubMed

    Brun, Guillaume; Braem, Lukas; Thoiron, Séverine; Gevaert, Kris; Goormachtig, Sofie; Delavault, Philippe

    2018-04-23

    Obligate root-parasitic plants belonging to the Orobanchaceae family are deadly pests for major crops all over the world. Because these heterotrophic plants severely damage their hosts even before emerging from the soil, there is an unequivocal need to design early and efficient methods for their control. The germination process of these species has probably undergone numerous selective pressure events in the course of evolution, in that the perception of host-derived molecules is a necessary condition for seeds to germinate. Although most of these molecules belong to the strigolactones, structurally different molecules have been identified. Since strigolactones are also classified as novel plant hormones that regulate several physiological processes other than germination, the use of autotrophic model plant species has allowed the identification of many actors involved in the strigolactone biosynthesis, perception, and signal transduction pathways. Nevertheless, many questions remain to be answered regarding the germination process of parasitic plants. For instance, how did parasitic plants evolve to germinate in response to a wide variety of molecules, while autotrophic plants do not? What particular features are associated with their lack of spontaneous germination? In this review, we attempt to illustrate to what extent conclusions from research into strigolactones could be applied to better understand the biology of parasitic plants.

  7. Automatic migraine classification via feature selection committee and machine learning techniques over imaging and questionnaire data.

    PubMed

    Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos

    2017-04-13

    Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform on diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating the diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence pain perception. We selected 52 adult subjects for the study, divided into three groups: a control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance imaging with diffusion tensor sequences to assess the white matter pathway integrity of the regions of interest involved in pain and emotion. The questionnaires also gathered data about the pathology. The DTI images and test results were then fed into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce the features of the first dataset, and into classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to classify the migraine group. Moreover, we implemented a committee method to improve the classification accuracy based on the feature selection algorithms. When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% with the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, and from 93 to 94% with boosting. The features determined to be most useful for classification were related to pain, analgesics and the left uncinate region of the brain (associated with pain and emotion). The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy with all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
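The committee idea described above can be sketched as majority voting over the top-k features chosen by several independent scoring functions. This is an illustrative implementation, not the authors' code: the paper's committee combines Gradient Tree Boosting, L1-based, Random Forest and Univariate selectors, whereas the two simple scorers and the toy data below are assumptions chosen to keep the sketch self-contained.

```python
def pearson_abs(X, y, j):
    """|Pearson correlation| between feature column j and the labels."""
    col = [row[j] for row in X]
    n = len(col)
    mx, my = sum(col) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
    vx = sum((a - mx) ** 2 for a in col) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return abs(cov / (vx * vy))

def class_mean_gap(X, y, j):
    """Absolute difference between the feature's per-class means."""
    g0 = [row[j] for row, yi in zip(X, y) if yi == 0]
    g1 = [row[j] for row, yi in zip(X, y) if yi == 1]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

def committee_select(X, y, k, scorers):
    """Keep the features placed in the top-k by a majority of the scorers."""
    n_feat = len(X[0])
    votes = [0] * n_feat
    for score in scorers:
        ranked = sorted(range(n_feat), key=lambda j: score(X, y, j),
                        reverse=True)
        for j in ranked[:k]:
            votes[j] += 1
    majority = len(scorers) // 2 + 1
    return [j for j in range(n_feat) if votes[j] >= majority]
```

Requiring agreement among selectors tends to discard features that only one ranking criterion happens to favour, which is one plausible reason a committee can be more robust than any single selector.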

  8. Research on classified real-time flood forecasting framework based on K-means cluster and rough set.

    PubMed

    Xu, Wei; Peng, Yong

    2015-01-01

    This research presents a new classified real-time flood forecasting framework. In this framework, historical floods are classified by a K-means cluster according to the spatial and temporal distribution of precipitation, the time variance of precipitation intensity and other hydrological factors. Based on the classified results, a rough set is used to extract the identification rules for real-time flood forecasting. Then, the parameters of different categories within the conceptual hydrological model are calibrated using a genetic algorithm. In real-time forecasting, the corresponding category of parameters is selected for flood forecasting according to the obtained flood information. This research tests the new classified framework on Guanyinge Reservoir and compares the framework with the traditional flood forecasting method. It finds that the performance of the new classified framework is significantly better in terms of accuracy. Furthermore, the framework remains applicable in catchments with few historical floods.

  9. Quantifying selection and diversity in viruses by entropy methods, with application to the haemagglutinin of H3N2 influenza

    PubMed Central

    Pan, Keyao; Deem, Michael W.

    2011-01-01

    Many viruses evolve rapidly. For example, haemagglutinin (HA) of the H3N2 influenza A virus evolves to escape antibody binding. This evolution of the H3N2 virus means that people who have previously been exposed to an influenza strain may be infected by a newly emerged virus. In this paper, we use Shannon entropy and relative entropy to measure the diversity and selection pressure by an antibody in each amino acid site of H3 HA between the 1992–1993 season and the 2009–2010 season. Shannon entropy and relative entropy are two independent state variables that we use to characterize H3N2 evolution. The entropy method estimates future H3N2 evolution and migration using currently available H3 HA sequences. First, we show that the rate of evolution increases with the virus diversity in the current season. The Shannon entropy of the sequence in the current season predicts relative entropy between sequences in the current season and those in the next season. Second, a global migration pattern of H3N2 is assembled by comparing the relative entropy flows of sequences sampled in China, Japan, the USA and Europe. We verify this entropy method by describing two aspects of historical H3N2 evolution. First, we identify 54 amino acid sites in HA that have evolved in the past to evade the immune system. Second, the entropy method shows that epitopes A and B on the top of HA evolve most vigorously to escape antibody binding. Our work provides a novel entropy-based method to predict and quantify future H3N2 evolution and to describe the evolutionary history of H3N2. PMID:21543352
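The two state variables used above can be sketched in a few lines of code. This is an illustrative implementation of per-site Shannon entropy and relative entropy, not the authors' code; the pseudocount floor used to handle residues unseen in the earlier season is an assumption of this sketch.

```python
import math
from collections import Counter

def shannon_entropy(site):
    """Shannon entropy (in bits) of the residues observed at one site."""
    counts = Counter(site)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def relative_entropy(site_now, site_next, floor=1e-9):
    """Kullback-Leibler divergence D(next || now) between the residue
    distributions at the same site in two seasons; the floor avoids a
    zero denominator for residues absent from the current season."""
    p = Counter(site_now)
    q = Counter(site_next)
    n_p, n_q = sum(p.values()), sum(q.values())
    total = 0.0
    for residue in set(p) | set(q):
        q_prob = q[residue] / n_q
        if q_prob > 0:
            p_prob = max(p[residue] / n_p, floor)
            total += q_prob * math.log2(q_prob / p_prob)
    return total
```

A fully conserved site has zero entropy, while a site whose dominant residue changes between seasons accumulates a large relative entropy, which is the signal the study uses to detect selection by antibodies.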

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wattimanela, H. J., E-mail: hwattimaela@yahoo.com; Institute of Technology Bandung, Bandung; Pasaribu, U. S.

    In this paper, we present a model to classify the earthquakes that occurred in Molluca Province. We use the K-means clustering method to classify the earthquakes based on their magnitude and depth. The results can be used for disaster mitigation and for designing evacuation routes in Molluca Province.
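Clustering events on (magnitude, depth) can be sketched with plain Lloyd's-algorithm K-means. This is a minimal sketch, not the record's implementation: the deterministic seeding from the first k points and the toy earthquake values are assumptions, and a real application would standardise the two features so that depth (in km) does not dominate the distance.

```python
def kmeans(points, k, iters=100):
    """Lloyd's algorithm on (magnitude, depth) pairs;
    the first k points seed the centres (deterministic for the demo)."""
    centers = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: (p[0] - centers[c][0]) ** 2
                                        + (p[1] - centers[c][1]) ** 2)
            clusters[nearest].append(p)
        # Recompute each centre as its cluster mean; keep old centre if empty.
        new = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
               for i, c in enumerate(clusters)]
        if new == centers:
            break
        centers = new
    return centers, clusters
```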

  11. Classifying Web Pages by Using Knowledge Bases for Entity Retrieval

    NASA Astrophysics Data System (ADS)

    Kiritani, Yusuke; Ma, Qiang; Yoshikawa, Masatoshi

    In this paper, we propose a novel method to classify Web pages by using knowledge bases for entity search, a typical kind of Web search for information related to a person, location or organization. First, we map a Web page to entities according to the similarities between the page and the entities. Various methods for computing such similarity are applied. For example, we can compute the similarity between a given page and a Wikipedia article describing a certain entity. The frequency of an entity appearing in the page is another factor used in computing the similarity. Second, we construct a directed acyclic graph, named the PEC graph, based on the relations among Web pages, entities, and categories, by referring to YAGO, a knowledge base built on Wikipedia and WordNet. Finally, by analyzing the PEC graph, we classify Web pages into categories. The results of some preliminary experiments validate the methods proposed in this paper.

  12. Using the concept of pseudo amino acid composition to predict resistance gene against Xanthomonas oryzae pv. oryzae in rice: an approach from chaos games representation.

    PubMed

    Jingbo, Xia; Silan, Zhang; Feng, Shi; Huijuan, Xiong; Xuehai, Hu; Xiaohui, Niu; Zhi, Li

    2011-09-07

    To evaluate whether an unknown protein is a resistance gene against Xanthomonas oryzae pv. oryzae, a different mode of pseudo amino acid composition (PseAAC) is proposed to formulate the protein samples by integrating the amino acid composition with the chaos game representation (CGR) method. Numerical comparisons of triangle, quadrangle and 12-vertex polygon CGR are carried out to evaluate the efficiency of using these fractal figures in classifiers. The numerical results show that, among the three polygon methods, the triangle method offers good fractal visualization and performs best in classifier construction. By using triangle + 12-vertex polygon CGR as the mathematical feature, the classifier achieves 98.13% accuracy in the jackknife test, with an MCC of 0.8462. Copyright © 2011 Elsevier Ltd. All rights reserved.
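The chaos game representation underlying these features is a simple iterated midpoint walk: each symbol pulls the current point halfway toward its assigned polygon vertex. The sketch below is illustrative only; the three-letter alphabet and vertex layout are assumptions (the paper maps the 20 amino acids onto triangle, quadrangle and 12-vertex polygons), and the resulting point cloud is what gets summarised into classifier features.

```python
def cgr_points(seq, vertices, start=(0.5, 0.5)):
    """Chaos-game representation: repeatedly move halfway toward the
    polygon vertex assigned to each symbol of the sequence."""
    pos = start
    pts = []
    for ch in seq:
        vx, vy = vertices[ch]
        pos = ((pos[0] + vx) / 2.0, (pos[1] + vy) / 2.0)
        pts.append(pos)
    return pts
```

Because each step halves the distance to a vertex, every prefix of the sequence maps to a distinct sub-region of the polygon, which is what gives CGR plots their fractal structure.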

  13. Improvement of training set structure in fusion data cleaning using Time-Domain Global Similarity method

    NASA Astrophysics Data System (ADS)

    Liu, J.; Lan, T.; Qin, H.

    2017-10-01

    Traditional data cleaning identifies dirty data by classifying original data sequences, which is a class-imbalanced problem since the proportion of incorrect data is much smaller than that of correct data for most diagnostic systems in Magnetic Confinement Fusion (MCF) devices. When machine learning algorithms are used to classify diagnostic data based on a class-imbalanced training set, most classifiers are biased towards the majority class and show very poor classification rates on the minority class. By transforming the direct classification problem about original data sequences into a classification problem about the physical similarity between data sequences, the class-balancing effect of the Time-Domain Global Similarity (TDGS) method on training set structure is investigated in this paper. Meanwhile, the impact of the improved training set structure on the data cleaning performance of the TDGS method is demonstrated with an application example in the EAST POlarimetry-INTerferometry (POINT) system.

  14. Random forests for classification in ecology

    USGS Publications Warehouse

    Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.

    2007-01-01

    Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. © 2007 by the Ecological Society of America.
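The core mechanism behind random forests, bootstrap aggregation with majority voting, can be sketched in miniature. This is a deliberately reduced sketch, not RF itself: it bags one-split decision stumps and omits the full trees and per-split random feature subsampling that distinguish a real random forest, and the one-feature presence/absence data are invented for the demo.

```python
import random

def stump_fit(X, y):
    """Single-feature threshold classifier minimising training error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum((1 if sign * (row[j] - t) >= 0 else 0) != yi
                          for row, yi in zip(X, y))
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best[1:]

def stump_predict(model, row):
    j, t, sign = model
    return 1 if sign * (row[j] - t) >= 0 else 0

def forest_fit(X, y, n_trees=25, seed=1):
    """Bagging: fit each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]   # sample with replacement
        models.append(stump_fit([X[i] for i in idx], [y[i] for i in idx]))
    return models

def forest_predict(models, row):
    """Majority vote over the ensemble."""
    votes = sum(stump_predict(m, row) for m in models)
    return 1 if 2 * votes >= len(models) else 0
```

Averaging many weak learners trained on perturbed copies of the data reduces variance, which is one reason RF achieves the high accuracies reported above.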

  15. A Novel Acoustic Sensor Approach to Classify Seeds Based on Sound Absorption Spectra

    PubMed Central

    Gasso-Tortajada, Vicent; Ward, Alastair J.; Mansur, Hasib; Brøchner, Torben; Sørensen, Claus G.; Green, Ole

    2010-01-01

    A non-destructive and novel in situ acoustic sensor approach based on the sound absorption spectra was developed for identifying and classifying different seed types. The absorption coefficient spectra were determined by using the impedance tube measurement method. Subsequently, a multivariate statistical analysis, i.e., principal component analysis (PCA), was performed as a way to generate a classification of the seeds based on the soft independent modelling of class analogy (SIMCA) method. The results show that the sound absorption coefficient spectra of different seed types present characteristic patterns which are highly dependent on seed size and shape. In general, seed particle size and sphericity were inversely related with the absorption coefficient. PCA presented reliable grouping capabilities within the diverse seed types, since 95% of the total spectral variance was described by the first two principal components. Furthermore, the SIMCA classification model based on the absorption spectra achieved optimal results as 100% of the evaluation samples were correctly classified. This study contains the initial structuring of an innovative method that will present new possibilities in agriculture and industry for classifying and determining physical properties of seeds and other materials. PMID:22163455
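The PCA step above, projecting absorption spectra onto a few components that capture most of the variance, can be sketched via an SVD of the mean-centred spectra. This is an illustrative sketch only: the two synthetic "absorption patterns" below are invented stand-ins for measured spectra, and the full SIMCA class-modelling stage is omitted.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Scores and explained-variance ratios of the top principal
    components, computed from the SVD of the mean-centred data."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = (S ** 2) / (S ** 2).sum()
    return scores, explained[:n_components]
```

When two seed types have distinct spectral patterns, the first component aligns with the between-group direction, so the groups land on opposite sides of the origin in score space, mirroring the grouping behaviour reported in the abstract.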

  16. A mutual information-Dempster-Shafer based decision ensemble system for land cover classification of hyperspectral data

    NASA Astrophysics Data System (ADS)

    Pahlavani, Parham; Bigdeli, Behnaz

    2017-12-01

    Hyperspectral images contain extremely rich spectral information that offers great potential to discriminate between various land cover classes. However, these images are usually composed of tens or hundreds of spectrally close bands, which results in high redundancy and a great amount of computation time in hyperspectral classification. Furthermore, in the presence of mixed-coverage pixels, crisp classifiers produce omission and commission errors. This paper presents a mutual information-Dempster-Shafer system through an ensemble classification approach for the classification of hyperspectral data. First, mutual information is applied to split the data into a few independent partitions to overcome high dimensionality. Then, a fuzzy maximum likelihood classifier is applied to each band subset. Finally, Dempster-Shafer fusion is applied to combine the results of the fuzzy classifiers. In order to assess the proposed method, a crisp ensemble system, based on a support vector machine as the crisp classifier and weighted majority voting as the crisp fusion method, is applied to the hyperspectral data. Furthermore, a dimension reduction system is utilized to assess the effectiveness of the mutual information band splitting of the proposed method. The proposed methodology provides interesting conclusions on the effectiveness and potentiality of mutual information-Dempster-Shafer based classification of hyperspectral data.
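The fusion step rests on Dempster's rule of combination, which multiplies the mass each classifier assigns to its focal sets, keeps the non-empty intersections, and renormalises by the total conflict. The sketch below shows only that rule; the class names and mass values are invented for illustration and are not the paper's fuzzy maximum likelihood outputs.

```python
def dempster(m1, m2):
    """Dempster's rule of combination. Each mass function maps a
    frozenset of class labels (a focal set) to a belief mass."""
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            inter = set_a & set_b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b   # contradictory evidence
    norm = 1.0 - conflict
    return {s: v / norm for s, v in combined.items()}
```

Unlike crisp majority voting, this lets a classifier spread mass over a set of classes (e.g. {urban, water} for a mixed pixel) instead of being forced into a single hard label.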

  17. Modelling cell motility and chemotaxis with evolving surface finite elements

    PubMed Central

    Elliott, Charles M.; Stinner, Björn; Venkataraman, Chandrasekhar

    2012-01-01

    We present a mathematical and a computational framework for the modelling of cell motility. The cell membrane is represented by an evolving surface, with the movement of the cell determined by the interaction of various forces that act normal to the surface. We consider external forces such as those that may arise owing to inhomogeneities in the medium and a pressure that constrains the enclosed volume, as well as internal forces that arise from the reaction of the cells' surface to stretching and bending. We also consider a protrusive force associated with a reaction–diffusion system (RDS) posed on the cell membrane, with cell polarization modelled by this surface RDS. The computational method is based on an evolving surface finite-element method. The general method can account for the large deformations that arise in cell motility and allows the simulation of cell migration in three dimensions. We illustrate applications of the proposed modelling framework and numerical method by reporting on numerical simulations of a model for eukaryotic chemotaxis and a model for the persistent movement of keratocytes in two and three space dimensions. Movies of the simulated cells can be obtained from http://homepages.warwick.ac.uk/∼maskae/CV_Warwick/Chemotaxis.html. PMID:22675164

  18. Automatic Identification of Messages Related to Adverse Drug Reactions from Online User Reviews using Feature-based Classification.

    PubMed

    Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie

    2014-11-01

    User-generated medical messages on the Internet contain extensive information related to adverse drug reactions (ADRs) and are known as valuable resources for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature sets and different classification techniques. First, messages were collected from three communities - an allergy community, a schizophrenia community and a pain management community - and 3,000 messages were annotated. Second, an N-gram-based feature set and a medical domain-specific feature set were generated. Third, three classification techniques, SVM, C4.5 and Naïve Bayes, were used to perform the classification tasks separately. Finally, we evaluated the performance of each method, comparing metrics including accuracy and F-measure. In terms of accuracy, the SVM classifier exceeded 0.8, whereas the C4.5 and Naïve Bayes classifiers remained below 0.8; meanwhile, the combined feature set, including both the n-gram-based and domain-specific features, consistently outperformed either single feature set. In terms of F-measure, the highest value of 0.895 was achieved by using the combined feature set with an SVM classifier. In all, the best classification performance for automatically identifying ADR-related messages from online user reviews was obtained by combining both feature sets with an SVM classifier.
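The n-gram feature extraction described above can be sketched end-to-end with a small text classifier. As a stand-in for the paper's SVM (which needs an optimisation library), this sketch pairs bag-of-n-gram features with a multinomial Naive Bayes, one of the three classifiers the study compared; the four toy "reviews" are invented, and the smoothing constant is an assumption.

```python
import math
from collections import Counter

def ngram_features(text, n=1):
    """Bag of word n-grams (unigrams by default)."""
    tokens = text.lower().split()
    return Counter(' '.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    def fit(self, docs, labels, n=1):
        self.n = n
        self.counts = {c: Counter() for c in set(labels)}
        self.priors = Counter(labels)
        for doc, c in zip(docs, labels):
            self.counts[c].update(ngram_features(doc, n))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, doc):
        feats = ngram_features(doc, self.n)
        best_class, best_score = None, -math.inf
        for c, cnt in self.counts.items():
            total = sum(cnt.values())
            score = math.log(self.priors[c])
            for f, k in feats.items():
                p = (cnt[f] + 1) / (total + len(self.vocab) + 1)  # smoothed
                score += k * math.log(p)
            if score > best_score:
                best_class, best_score = c, score
        return best_class
```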

  19. Multiple disturbances classifier for electric signals using adaptive structuring neural networks

    NASA Astrophysics Data System (ADS)

    Lu, Yen-Ling; Chuang, Cheng-Long; Fahn, Chin-Shyurng; Jiang, Joe-Air

    2008-07-01

    This work proposes a novel classifier to recognize multiple disturbances in the electric signals of power systems. The proposed classifier consists of a series of pipeline-based processing components, including an amplitude estimator, a transient disturbance detector, a transient impulsive detector, a wavelet transform and a brand-new neural network for recognizing multiple disturbances in a power quality (PQ) event. Most previously proposed methods treated a PQ event as a single disturbance at a time. In practice, however, a PQ event often consists of various types of disturbances occurring at the same time, so the performance of those methods might be limited in real power systems. This work considers a PQ event as a combination of several disturbances, including steady-state and transient disturbances, which is more analogous to the real status of a power system. Six types of commonly encountered power quality disturbances are considered for training and testing the proposed classifier. The proposed classifier has been tested on electric signals that contain a single disturbance or several disturbances at a time. Experimental results indicate that the proposed PQ disturbance classification algorithm can achieve a high accuracy of more than 97% in various complex testing cases.

  20. Non-Mutually Exclusive Deep Neural Network Classifier for Combined Modes of Bearing Fault Diagnosis

    PubMed Central

    Kim, Jong-Myon

    2018-01-01

    The simultaneous occurrence of various types of defects in bearings makes their diagnosis more challenging owing to the resultant complexity of the constituent parts of the acoustic emission (AE) signals. To address this issue, a new approach is proposed in this paper for the detection of multiple combined faults in bearings. The proposed methodology uses a deep neural network (DNN) architecture to effectively diagnose the combined defects. The DNN structure is based on the stacked denoising autoencoder non-mutually exclusive classifier (NMEC) method for combined modes. The NMEC-DNN is trained using data for a single fault and it classifies both single faults and multiple combined faults. The results of experiments conducted on AE data collected through an experimental test-bed demonstrate that the DNN achieves good classification performance with a maximum accuracy of 95%. The proposed method is compared with a multi-class classifier based on support vector machines (SVMs). The NMEC-DNN yields better diagnostic performance in comparison to the multi-class classifier based on SVM. The NMEC-DNN reduces the number of necessary data collections and improves the bearing fault diagnosis performance. PMID:29642466

  1. A mapping closure for turbulent scalar mixing using a time-evolving reference field

    NASA Technical Reports Server (NTRS)

    Girimaji, Sharath S.

    1992-01-01

    A general mapping-closure approach for modeling scalar mixing in homogeneous turbulence is developed. This approach is different from the previous methods in that the reference field also evolves according to the same equations as the physical scalar field. The use of a time-evolving Gaussian reference field results in a model that is similar to the mapping closure model of Pope (1991), which is based on the methodology of Chen et al. (1989). Both models yield identical relationships between the scalar variance and higher-order moments, which are in good agreement with heat conduction simulation data and can be consistent with any type of epsilon(phi) evolution. The present methodology can be extended to any reference field whose behavior is known. The possibility of a beta-pdf reference field is explored. The shortcomings of the mapping closure methods are discussed, and the limit at which the mapping becomes invalid is identified.

  2. Using Chou's pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location.

    PubMed

    Jiang, Xiaoying; Wei, Rong; Zhao, Yanjun; Zhang, Tongliang

    2008-05-01

    The knowledge of subnuclear localization in eukaryotic cells is essential for understanding the function of the nucleus. Developing prediction methods and tools for protein subnuclear localization has become an important research field in protein science owing to the special characteristics of the cell nucleus. In this study, a novel approach is proposed to predict protein subnuclear localization. Each protein sample is represented by a Pseudo Amino Acid (PseAA) composition based on the approximate entropy (ApEn) concept, which reflects the complexity of a time series. A novel ensemble classifier is designed incorporating three AdaBoost classifiers, whose base classifier algorithms are decision stumps, a fuzzy K-nearest-neighbours classifier, and radial basis-support vector machines, respectively. Different PseAA compositions are used as input data for the different AdaBoost classifiers in the ensemble. A genetic algorithm is used to optimize the dimension and weight factors of the PseAA composition. Two datasets often used in published works are used to validate the performance of the proposed approach. The results of the jackknife cross-validation test are higher and more balanced than those of other methods on the same datasets. These promising results indicate that the proposed approach is effective and practical, and it might become a useful tool in protein subnuclear localization. The software in Matlab and supplementary materials are available freely by contacting the corresponding author.
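The AdaBoost component with decision stumps, the first of the three base learners mentioned above, can be sketched as follows. This is a minimal textbook AdaBoost, not the paper's system: the fuzzy KNN and RBF-SVM boosters, the PseAA features, and the genetic-algorithm weight optimisation are all omitted, and the one-dimensional toy data are invented.

```python
import math

def weighted_stump(X, y, w):
    """Threshold stump minimising weighted error; labels are -1/+1."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (1 if sign * (row[j] - t) >= 0 else -1) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def adaboost(X, y, rounds=10):
    """Classic AdaBoost: reweight samples towards those the previous
    weak learners misclassified."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, t, sign = weighted_stump(X, y, w)
        if err >= 0.5:
            break                                   # no better than chance
        err = max(err, 1e-10)                       # guard log(0) on perfect fits
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, t, sign))
        preds = [1 if sign * (row[j] - t) >= 0 else -1 for row in X]
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, preds)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(a * (1 if s * (row[j] - t) >= 0 else -1)
                for a, j, t, s in ensemble)
    return 1 if score >= 0 else -1
```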

  3. Searching for δ Scuti-type pulsation and characterising northern pre-main-sequence field stars

    NASA Astrophysics Data System (ADS)

    Díaz-Fraile, D.; Rodríguez, E.; Amado, P. J.

    2014-08-01

    Context. Pre-main-sequence (PMS) stars are objects evolving from the birthline to the zero-age main sequence (ZAMS). Given a mass range near the ZAMS, the temperatures and luminosities of PMS and main-sequence stars are very similar. Moreover, their evolutionary tracks intersect one another, causing some ambiguity in the determination of their evolutionary status. In this context, the detection and study of pulsations in PMS stars is crucial for differentiating between both types of stars by obtaining information on their interiors via asteroseismic techniques. Aims: A photometric variability study of a sample of northern field stars previously classified as either PMS or Herbig Ae/Be objects has been undertaken with the purpose of detecting δ Scuti-type pulsations. Determination of physical parameters for these stars has also been carried out to locate them on the Hertzsprung-Russell diagram and check the instability strip for this type of pulsator. Methods: Multichannel photomultiplier and CCD time-series photometry in the uvby Strömgren and BVI Johnson bands was obtained during four consecutive years from 2007 to 2010. The light curves have been analysed, and a variability criterion has been established. Among the objects classified as variable stars, we have selected those which present periodicities above 4 d^-1, which was established as the lowest limit for δ Scuti-type pulsations in this investigation. Finally, these variable stars have been placed in a colour-magnitude diagram using the physical parameters derived from the collected uvbyβ Strömgren-Crawford photometry. Results: Five PMS δ Scuti- and three probable β Cephei-type stars have been detected. Two additional PMS δ Scuti stars are also confirmed in this work. Moreover, three new δ Scuti- and two γ Doradus-type stars have been detected among the main-sequence objects used as comparison or check stars.

  4. Multimodal fusion of polynomial classifiers for automatic person recognition

    NASA Astrophysics Data System (ADS)

    Broun, Charles C.; Zhang, Xiaozheng

    2001-03-01

    With the prevalence of the information age, privacy and personalization are forefront in today's society. As such, biometrics are viewed as essential components of current evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current generation speaker verification systems. The first is the difficulty in acquiring clean audio signals in all environments without encumbering the user with a head- mounted close-talking microphone. Second, unimodal biometric systems do not work with a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as improve overall accuracy across the population. We propose a multi modal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality-giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual. In addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within Markov random field MRF framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process. 
A late integration approach, based on a probabilistic model, is employed to combine the two modalities. The system is tested on the XM2VTS database combined with AWGN in the audio domain over a range of signal-to-noise ratios.
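    The abstract does not give the exact probabilistic fusion rule; a minimal sketch of one common late-integration scheme, a weighted sum of per-modality log-likelihoods (the product rule in probability space), might look like this. The weight `w` and the decision threshold are illustrative assumptions, not values from the paper:

```python
def late_fuse(audio_ll, visual_ll, w=0.6):
    """Weighted sum of per-modality log-likelihoods; equivalent to a
    weighted product of the modality likelihoods. w balances the
    audio and visual streams."""
    return w * audio_ll + (1.0 - w) * visual_ll

def verify(audio_ll, visual_ll, threshold=-1.0, w=0.6):
    """Accept the identity claim when the fused log-likelihood clears
    a decision threshold (which would be tuned on held-out data)."""
    return late_fuse(audio_ll, visual_ll, w) >= threshold
```

    In practice the weight would be tuned per signal-to-noise ratio, shifting trust toward the visual stream as audio conditions degrade.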

  5. Boosting drug named entity recognition using an aggregate classifier.

    PubMed

    Korkontzelos, Ioannis; Piliouras, Dimitrios; Dowsey, Andrew W; Ananiadou, Sophia

    2015-10-01

    Drug named entity recognition (NER) is a critical step for complex biomedical NLP tasks such as the extraction of pharmacogenomic, pharmacodynamic and pharmacokinetic parameters. Large quantities of high-quality training data are almost always a prerequisite for employing supervised machine-learning techniques to achieve high classification performance. However, the human labour needed to produce and maintain such resources is a significant limitation. In this study, we improve the performance of drug NER without relying exclusively on manual annotations. We perform drug NER using either a small gold-standard corpus (120 abstracts) or no corpus at all. In our approach, we develop a voting system to combine a number of heterogeneous models, based on dictionary knowledge, gold-standard corpora and silver annotations, to enhance performance. To improve recall, we employed genetic programming to evolve 11 regular-expression patterns that capture common drug suffixes and used them as an extra means for recognition. Our approach uses a dictionary of drug names, i.e. DrugBank, a small manually annotated corpus, i.e. the pharmacokinetic corpus, and a part of the UKPMC database as raw biomedical text. Gold-standard and silver annotated data are used to train maximum entropy and multinomial logistic regression classifiers. Aggregating drug NER methods based on gold-standard annotations, dictionary knowledge and patterns improved the performance of models trained only on gold-standard annotations, achieving a maximum F-score of 95%. In addition, combining models trained on silver annotations, dictionary knowledge and patterns is shown to achieve performance comparable to models trained exclusively on gold-standard data. The main reason appears to be the morphological similarities shared among drug names. We conclude that gold-standard data are not a hard requirement for drug NER.
Combining heterogeneous models built on dictionary knowledge can achieve classification performance similar or comparable to that of the best-performing model trained on gold-standard annotations. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
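    The paper's 11 evolved patterns are not listed in the abstract, but the voting idea can be sketched with a couple of illustrative suffix patterns and a stand-in dictionary. All names and patterns below are hypothetical examples, not the paper's actual resources:

```python
import re

# Illustrative suffix patterns (common drug-name endings); the paper
# evolves its patterns with genetic programming instead.
SUFFIX_PATTERNS = [re.compile(p, re.I) for p in
                   (r"\w+mab", r"\w+vir", r"\w+cillin", r"\w+azole")]

DICTIONARY = {"aspirin", "rituximab", "aciclovir"}  # stand-in for DrugBank

def votes(token):
    """Count how many heterogeneous recognisers fire on a token."""
    v = 0
    if token.lower() in DICTIONARY:       # dictionary-based model
        v += 1
    if any(p.fullmatch(token) for p in SUFFIX_PATTERNS):  # pattern model
        v += 1
    return v

def is_drug(token, min_votes=1):
    """Tag the token as a drug name when enough recognisers agree."""
    return votes(token) >= min_votes
```

    The full system also counts votes from corpus-trained classifiers; raising `min_votes` trades recall for precision.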

  6. Link prediction boosted psychiatry disorder classification for functional connectivity network

    NASA Astrophysics Data System (ADS)

    Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang

    2017-02-01

    Functional connectivity network (FCN) is an effective tool for psychiatric disorder classification, and represents the cross-correlation of regional blood oxygenation level dependent signals. However, an FCN is often incomplete, suffering from missing and spurious edges. To accurately classify psychiatric disorders and healthy controls with incomplete FCNs, we first `repair' the FCN with link prediction, and then extract the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers to improve classification accuracy. Our method was tested on three psychiatric disorder datasets, covering Alzheimer's disease, schizophrenia and attention deficit hyperactivity disorder. The experimental results show that our method not only significantly improves the classification accuracy, but also efficiently reconstructs the incomplete FCN.
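    The clustering-coefficient feature can be sketched generically on an adjacency matrix (the paper's link-prediction repair and boosting stages are omitted here):

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Local clustering coefficient: the fraction of a node's
    neighbour pairs that are themselves connected. Computed per brain
    region, these values form the feature vector of one FCN."""
    nbrs = [v for v, w in enumerate(adj[node]) if w and v != node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if adj[u][v])
    return 2.0 * links / (len(nbrs) * (len(nbrs) - 1))
```

    Each subject's vector of per-region coefficients would then feed one weak classifier, with boosting combining the weak classifiers across subjects' networks.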

  7. Authentication of Organically and Conventionally Grown Basils by Gas Chromatography/Mass Spectrometry Chemical Profiles

    PubMed Central

    Wang, Zhengfang; Chen, Pei; Yu, Liangli; Harrington, Peter de B.

    2013-01-01

    Basil plants cultivated by organic and conventional farming practices were accurately classified by pattern recognition of gas chromatography/mass spectrometry (GC/MS) data. A novel extraction procedure was devised to extract characteristic compounds from ground basil powders. Two in-house fuzzy classifiers, i.e., the fuzzy rule-building expert system (FuRES) and, for the first time, the fuzzy optimal associative memory (FOAM), were used to build classification models. Two crisp classifiers, i.e., soft independent modeling by class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA), were used as control methods. Prior to data processing, baseline correction and retention time alignment were performed. Classifiers were built with the two-way data sets, the total ion chromatogram representation of the data sets, and the total mass spectrum representation of the data sets, separately. Bootstrapped Latin partition (BLP) was used as an unbiased evaluation of the classifiers. Using the two-way data sets, average classification rates with FuRES, FOAM, SIMCA, and PLS-DA were 100 ± 0%, 94.4 ± 0.4%, 93.3 ± 0.4%, and 100 ± 0%, respectively, for 100 independent evaluations. The established classifiers were used to classify a new validation set collected 2.5 months later with no parametric changes except that the training set and validation set were individually mean-centered. For the new two-way validation set, classification rates with FuRES, FOAM, SIMCA, and PLS-DA were 100%, 83%, 97%, and 100%, respectively. Thereby, GC/MS analysis was demonstrated as a viable approach for organic basil authentication. It is the first time that a FOAM has been applied to classification. A novel baseline correction method was also used for the first time. FuRES and FOAM are demonstrated to be powerful tools for modeling and classifying GC/MS data of complex samples, and the data pretreatments are demonstrated to be useful in improving the performance of classifiers.
PMID:23398171

  8. Ship localization in Santa Barbara Channel using machine learning classifiers.

    PubMed

    Niu, Haiqiang; Ozanich, Emma; Gerstoft, Peter

    2017-11-01

    Machine learning classifiers are shown to outperform conventional matched field processing for a deep water (600 m depth) ocean acoustic-based ship range estimation problem in the Santa Barbara Channel Experiment when limited environmental information is known. Recordings of three different ships of opportunity on a vertical array were used as training and test data for the feed-forward neural network and support vector machine classifiers, demonstrating the feasibility of machine learning methods to locate unseen sources. The classifiers perform well up to 10 km range whereas the conventional matched field processing fails at about 4 km range without accurate environmental information.

  9. A Method of Neighbor Classes Based SVM Classification for Optical Printed Chinese Character Recognition

    PubMed Central

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed for the recognition task. Among these, the support vector machine (SVM) might be the best classifier. However, SVM is inherently a two-class classifier; when it is used for the many classes in OPCCR, its computation is time-consuming. Thus, we propose a neighbor-classes-based SVM (NC-SVM) to reduce the computational cost of SVM. Experiments on NC-SVM classification for OPCCR have been conducted, and their results show that the proposed NC-SVM can effectively reduce the computation time in OPCCR. PMID:23536777

  10. Subsocial behaviour and brood adoption in mixed-species colonies of two theridiid spiders

    NASA Astrophysics Data System (ADS)

    Grinsted, Lena; Agnarsson, Ingi; Bilde, Trine

    2012-12-01

    Cooperation and group living often evolve through kin selection. However, associations between unrelated organisms, such as different species, can evolve if both parties benefit from the interaction. Group living is rare in spiders, but occurs in cooperative, permanently social spiders, as well as in territorial, colonial spiders. Mixed-species spider colonies, involving closely related species, have rarely been documented. We examined social interactions in newly discovered mixed-species colonies of theridiid spiders on Bali, Indonesia. Our aim was to test the degree of intra- and interspecific tolerance, aggression and cooperation through behavioural experiments, and to examine the potential for adoption of foreign brood. Morphological and genetic analyses confirmed that colonies consisted of two related species: Chikunia nigra (O.P. Cambridge, 1880) new combination (previously Chrysso nigra) and a yet undescribed Chikunia sp. Females defended territories and did not engage in cooperative prey capture, but interestingly, both species seemed to provide extended maternal care of young and indiscriminate care for foreign brood. Future studies may reveal whether these species adopt only intra-specific young, or also inter-specifically. We classify both Chikunia species as subsocial and intra- and interspecifically colonial, and discuss the evolutionary significance of a system where one or both species may potentially benefit from mutual tolerance and brood adoption.

  11. Evolutionary Analysis and Expression Profiling of Zebra Finch Immune Genes

    PubMed Central

    Ekblom, Robert; French, Lisa; Slate, Jon; Burke, Terry

    2010-01-01

    Genes of the immune system are generally considered to evolve rapidly due to host–parasite coevolution. They are therefore of great interest in evolutionary biology and molecular ecology. In this study, we manually annotated 144 avian immune genes from the zebra finch (Taeniopygia guttata) genome and conducted evolutionary analyses of these by comparing them with their orthologs in the chicken (Gallus gallus). Genes classified as immune receptors showed elevated dN/dS ratios compared with other classes of immune genes. Immune genes in general also appear to be evolving more rapidly than other genes, as inferred from a higher dN/dS ratio compared with the rest of the genome. Furthermore, ten genes (of 27) for which sequence data were available from at least three bird species showed evidence of positive selection acting on specific codons. From transcriptome data of eight different tissues, we found evidence for expression of 106 of the studied immune genes, with primary expression of most of these in bursa, blood, and spleen. These immune-related genes showed a more tissue-specific expression pattern than other genes in the zebra finch genome. Several of the avian immune genes investigated here provide strong candidates for in-depth studies of molecular adaptation in birds. PMID:20884724

  12. Neuropeptides in the Gonads: From Evolution to Pharmacology

    PubMed Central

    McGuire, Nicolette L.; Bentley, George E.

    2010-01-01

    Vertebrate gonads are the sites of synthesis and binding of many peptides that were initially classified as neuropeptides. These gonadal neuropeptide systems are neither well understood in isolation, nor in their interactions with other neuropeptide systems. Further, our knowledge of the control of these gonadal neuropeptides by peripheral hormones that bind to the gonads, and which themselves are under regulation by true neuropeptide systems from the hypothalamus, is relatively meager. This review discusses the existence of a variety of neuropeptides and their receptors which have been discovered in vertebrate gonads, and the possible way in which such systems could have evolved. We then focus on two key neuropeptides for regulation of the hypothalamo-pituitary-gonadal axis: gonadotropin-releasing hormone (GnRH) and gonadotropin-inhibitory hormone (GnIH). Comparative studies have provided us with a degree of understanding as to how a gonadal GnRH system might have evolved, and they have been responsible for the discovery of GnIH and its gonadal counterpart. We attempt to highlight what is known about these two key gonadal neuropeptides, how their actions differ from their hypothalamic counterparts, and how we might learn from comparative studies of them and other gonadal neuropeptides in terms of pharmacology, reproductive physiology and evolutionary biology. PMID:21607065

  13. J plots: a new method for characterizing structures in the interstellar medium

    NASA Astrophysics Data System (ADS)

    Jaffa, S. E.; Whitworth, A. P.; Clarke, S. D.; Howard, A. D. P.

    2018-06-01

    Large-scale surveys have brought about a revolution in astronomy. To analyse the resulting wealth of data, we need automated tools to identify, classify, and quantify the important underlying structures. We present here a method for classifying and quantifying a pixelated structure, based on its principal moments of inertia. The method enables us to automatically detect, and objectively compare, centrally condensed cores, elongated filaments, and hollow rings. We illustrate the method by applying it to (i) observations of surface density from Hi-GAL, and (ii) simulations of filament growth in a turbulent medium. We limit the discussion here to 2D data; in a future paper, we will extend the method to 3D data.

  14. Classification of epileptic EEG signals based on simple random sampling and sequential feature selection.

    PubMed

    Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui

    2016-06-01

    Electroencephalogram (EEG) signals are used broadly in the medical field. The main applications of EEG signals are the diagnosis and treatment of conditions such as epilepsy, Alzheimer's disease and sleep disorders. This paper presents a new method which extracts and selects features from multi-channel EEG signals. The research focuses on three main points. Firstly, the simple random sampling (SRS) technique is used to extract features from the time domain of the EEG signals. Secondly, the sequential feature selection (SFS) algorithm is applied to select the key features and to reduce the dimensionality of the data. Finally, the selected features are forwarded to a least squares support vector machine (LS_SVM) classifier, which classifies the EEG signals based on the features extracted and selected by SRS and SFS. The experimental results show that the method achieves 99.90, 99.80 and 100 % for classification accuracy, sensitivity and specificity, respectively.
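    The SRS and SFS steps can be sketched as below. The scoring function passed to the selector stands in for the LS_SVM-based evaluation and is an illustrative assumption:

```python
import random

def srs_features(signal, k, seed=0):
    """Simple random sampling: draw k sample points from the raw
    time-domain EEG segment to form a feature vector."""
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(signal)), k))
    return [signal[i] for i in idx]

def sfs(features, score, d):
    """Greedy sequential forward selection: repeatedly add the feature
    index that maximises `score` (e.g. cross-validated classifier
    accuracy) until d indices are chosen."""
    chosen = []
    while len(chosen) < d:
        best = max((i for i in range(len(features)) if i not in chosen),
                   key=lambda i: score(chosen + [i]))
        chosen.append(best)
    return chosen
```

    With a classifier-accuracy scorer, SFS evaluates O(n·d) candidate subsets instead of all 2^n, which is what makes the dimensionality reduction tractable.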

  15. Feature and Score Fusion Based Multiple Classifier Selection for Iris Recognition

    PubMed Central

    Islam, Md. Rabiul

    2014-01-01

    The aim of this work is to propose a new feature and score fusion based iris recognition approach in which a voting method over a Multiple Classifier Selection technique has been applied. The outputs of four discrete hidden Markov model classifiers, that is, a left-iris-based unimodal system, a right-iris-based unimodal system, a left-right iris feature fusion based multimodal system, and a left-right iris likelihood-ratio score fusion based multimodal system, are combined using the voting method to achieve the final recognition result. The CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, the recognition accuracy of the proposed system has been compared with the existing hamming distance score fusion approach proposed by Ma et al., the log-likelihood ratio score fusion approach proposed by Schmid et al., and the single-level feature fusion approach proposed by Hollingsworth et al. PMID:25114676

  16. Feature and score fusion based multiple classifier selection for iris recognition.

    PubMed

    Islam, Md Rabiul

    2014-01-01

    The aim of this work is to propose a new feature and score fusion based iris recognition approach in which a voting method over a Multiple Classifier Selection technique has been applied. The outputs of four discrete hidden Markov model classifiers, that is, a left-iris-based unimodal system, a right-iris-based unimodal system, a left-right iris feature fusion based multimodal system, and a left-right iris likelihood-ratio score fusion based multimodal system, are combined using the voting method to achieve the final recognition result. The CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, the recognition accuracy of the proposed system has been compared with the existing hamming distance score fusion approach proposed by Ma et al., the log-likelihood ratio score fusion approach proposed by Schmid et al., and the single-level feature fusion approach proposed by Hollingsworth et al.

  17. Improved semi-supervised online boosting for object tracking

    NASA Astrophysics Data System (ADS)

    Li, Yicui; Qi, Lin; Tan, Shukun

    2016-10-01

    An online semi-supervised boosting method treats object tracking as a classification problem, training a binary classifier from labeled and unlabeled examples, with appropriate object features selected based on real-time changes in the object. However, the online semi-supervised boosting method faces one key problem: traditional self-training uses the classification results to update the classifier itself, which often leads to drifting or tracking failure due to the error accumulated during each update of the tracker. To overcome these disadvantages of semi-supervised online boosting based object tracking methods, the contribution of this paper is an improved online semi-supervised boosting method in which the learning process is guided by positive (P) and negative (N) constraints, termed P-N constraints, which restrict the labeling of the unlabeled samples. First, we train the classifier by online semi-supervised boosting. Then, this classifier is used to process the next frame. Finally, the classification is analyzed by the P-N constraints, which verify whether the labels assigned to unlabeled data by the classifier are in line with the assumptions made about positive and negative samples. The proposed algorithm can effectively improve the discriminative ability of the classifier and significantly alleviate the drifting problem in tracking applications. In the experiments, we demonstrate real-time tracking on several challenging test sequences, where our tracker outperforms other related online tracking methods and achieves promising tracking performance.

  18. Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech

    PubMed Central

    Álvarez, Aitor; Sierra, Basilio; Arruti, Andoni; López-Gil, Juan-Miguel; Garay-Vitoria, Nestor

    2015-01-01

    In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented to deal with speech emotion recognition. The new approach consists of an improvement of a bi-level multi-classifier system known as stacking generalization by means of an integration of an estimation of distribution algorithm (EDA) in the first layer to select the optimal subset from the standard base classifiers. The good performance of the proposed new paradigm was demonstrated over different configurations and datasets. First, several CSS stacking classifiers were constructed on the RekEmozio dataset, using some specific standard base classifiers and a total of 123 spectral, quality and prosodic features computed using in-house feature extraction algorithms. These initial CSS stacking classifiers were compared to other multi-classifier systems and the employed standard classifiers built on the same set of speech features. Then, new CSS stacking classifiers were built on RekEmozio using a different set of both acoustic parameters (extended version of the Geneva Minimalistic Acoustic Parameter Set (eGeMAPS)) and standard classifiers and employing the best meta-classifier of the initial experiments. The performance of these two CSS stacking classifiers was evaluated and compared. Finally, the new paradigm was tested on the well-known Berlin Emotional Speech database. We compared the performance of single, standard stacking and CSS stacking systems using the same parametrization of the second phase. All of the classifications were performed at the categorical level, including the six primary emotions plus the neutral one. PMID:26712757
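    The subset-selection idea can be sketched with exhaustive search in place of the paper's estimation of distribution algorithm. This is feasible only for a handful of base classifiers; the EDA exists precisely to avoid this enumeration:

```python
from itertools import combinations

def accuracy(pred, true):
    """Fraction of positions where prediction matches the label."""
    return sum(p == t for p, t in zip(pred, true)) / len(true)

def best_subset(preds, y_true, score):
    """Search base-classifier subsets for the one whose majority vote
    scores best on validation labels. `preds` maps classifier name to
    its list of class predictions."""
    names = list(preds)
    best = (None, -1.0)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            voted = [max(set(col), key=col.count)
                     for col in zip(*(preds[n] for n in subset))]
            s = score(voted, y_true)
            if s > best[1]:
                best = (subset, s)
    return best
```

    In CSS stacking, the selected subset's outputs then feed a meta-classifier rather than a simple vote; the exhaustive loop above is purely to make the selection criterion concrete.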

  19. A simple plug-in bagging ensemble based on threshold-moving for classifying binary and multiclass imbalanced data.

    PubMed

    Collell, Guillem; Prelec, Drazen; Patil, Kaustubh R

    2018-01-31

    Class imbalance presents a major hurdle in the application of classification methods. A commonly taken approach is to learn ensembles of classifiers using rebalanced data. Examples include bootstrap averaging (bagging) combined with either undersampling or oversampling of the minority class examples. However, rebalancing methods entail asymmetric changes to the examples of different classes, which in turn can introduce their own biases. Furthermore, these methods often require specifying the performance measure of interest a priori, i.e., before learning. An alternative is to employ the threshold-moving technique, which applies a threshold to the continuous output of a model, offering the possibility to adapt to a performance measure a posteriori, i.e., a plug-in method. Surprisingly, little attention has been paid to this combination of a bagging ensemble and threshold-moving. In this paper, we study this combination and demonstrate its competitiveness. Contrary to the other resampling methods, we preserve the natural class distribution of the data, resulting in well-calibrated posterior probabilities. Additionally, we extend the proposed method to handle multiclass data. We validated our method on binary and multiclass benchmark data sets using both decision trees and neural networks as base classifiers. We perform analyses that provide insights into the proposed method.
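    The threshold-moving step can be sketched as a plug-in sweep over candidate thresholds on the ensemble's averaged posterior scores, here optimizing F1 as an example target measure chosen after training:

```python
def f1(y_true, y_pred):
    """F1 score for binary labels."""
    tp = sum(p and t for p, t in zip(y_pred, y_true))
    fp = sum(p and not t for p, t in zip(y_pred, y_true))
    fn = sum((not p) and t for p, t in zip(y_pred, y_true))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def tune_threshold(scores, y_true):
    """Plug-in step: sweep candidate thresholds over the bagging
    ensemble's averaged scores on validation data and keep the one
    maximising the chosen measure (F1 here)."""
    best_t, best_f = 0.5, -1.0
    for t in sorted(set(scores)):
        pred = [s >= t for s in scores]
        f = f1(y_true, pred)
        if f > best_f:
            best_t, best_f = t, f
    return best_t
```

    Because the bagging ensemble is trained once on the natural class distribution, the same fitted model can be re-thresholded for a different measure without retraining.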

  20. A method to determine agro-climatic zones based on correlation and cluster analyses

    NASA Astrophysics Data System (ADS)

    Borges Valeriano, Taynara Tuany; de Souza Rolim, Glauco; de Oliveira Aparecido, Lucas Eduardo

    2017-12-01

    Determining agro-climatic zones (ACZs) is traditionally done by cross-comparing meteorological elements such as air temperature, rainfall, and water deficit (DEF). This study proposes a new method based on correlations between monthly DEFs during the crop cycle and annual yield, and performs a multivariate cluster analysis on these correlations. This `correlation method' was applied to all municipalities in the state of São Paulo to determine ACZs for coffee plantations. A traditional ACZ method for coffee, based on temperature and DEF ranges (Evangelista et al.; RBEAA, 6:445-452, 2002), was applied to the study area for comparison against the correlation method. The traditional ACZ method classified the "Alta Mogiana," "Média Mogiana," and "Garça and Marília" regions, all traditional coffee regions, as merely suitable or even restricted for coffee plantations. These traditional regions have produced coffee since 1800 and should not be classified as restricted. The correlation method classified those areas as high-producing regions and expanded them into other areas. The proposed method is innovative because it is more detailed than common ACZ methods: each developmental crop phase was analyzed based on correlations between the monthly DEF and yield, giving greater weight to crop physiology in relation to climate.
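    The core of the correlation method, one Pearson correlation per month of the crop cycle between water deficit and yield, can be sketched as follows (the subsequent clustering of these profiles into zones is not shown):

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def def_yield_profile(monthly_def, yields):
    """One correlation per month of the crop cycle: how strongly that
    month's water deficit tracks annual yield across seasons.
    monthly_def is a list of seasons, each a list of monthly DEFs."""
    return [pearson([season[m] for season in monthly_def], yields)
            for m in range(len(monthly_def[0]))]
```

    Each municipality gets one such profile; a multivariate cluster analysis on the profiles then groups municipalities whose yields respond similarly to deficit timing.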

  1. Machine learning of swimming data via wisdom of crowd and regression analysis.

    PubMed

    Xie, Jiang; Xu, Junfu; Nie, Celine; Nie, Qing

    2017-04-01

    Every performance, in an officially sanctioned meet, by a registered USA swimmer is recorded into an online database with times dating back to 1980. For the first time, statistical analysis and machine learning methods are systematically applied to 4,022,631 swim records. In this study, we investigate performance features for all strokes as a function of age and gender. The variances in performance of males and females for different ages and strokes were studied, and the correlations of performances for different ages were estimated using the Pearson correlation. Regression analyses show the performance trends for both males and females at different ages and suggest critical ages for peak training. Moreover, we assess twelve popular machine learning methods to predict or classify swimmer performance. Each method exhibited different strengths or weaknesses in different cases, indicating no one method could predict well for all strokes. To address this problem, we propose a new method that combines multiple inference methods to derive a Wisdom of Crowd Classifier (WoCC). Our simulation experiments demonstrate that the WoCC is a consistent method with better overall prediction accuracy. Our study reveals several new age-dependent trends in swimming and provides an accurate method for classifying and predicting swimming times.
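    The abstract does not spell out the WoCC combination rule; one plausible sketch is an accuracy-weighted vote over the individual methods' predictions. The weighting scheme below is an assumption for illustration, not the paper's definition:

```python
def wisdom_of_crowd(predictions, weights):
    """Combine per-method class predictions by weighted vote, where
    each method's weight could be, e.g., its validation accuracy."""
    tally = {}
    for pred, w in zip(predictions, weights):
        tally[pred] = tally.get(pred, 0.0) + w
    return max(tally, key=tally.get)
```

    With equal weights this reduces to a plain majority vote; unequal weights let well-performing methods dominate strokes where the others fail.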

  2. An Ensemble Framework Coping with Instability in the Gene Selection Process.

    PubMed

    Castellanos-Garzón, José A; Ramos, Juan; López-Sánchez, Daniel; de Paz, Juan F; Corchado, Juan M

    2018-03-01

    This paper proposes an ensemble framework for gene selection, which is aimed at addressing instability problems presented in the gene filtering task. The complex process of gene selection from gene expression data faces different instability problems from the informative gene subsets found by different filter methods. This makes the identification of significant genes by the experts difficult. The instability of results can come from filter methods, gene classifier methods, different datasets of the same disease and multiple valid groups of biomarkers. Even though there is a wide number of proposals, the complexity imposed by this problem remains a challenge today. This work proposes a framework involving five stages of gene filtering to discover biomarkers for diagnosis and classification tasks. This framework performs a process of stable feature selection, facing the problems above and, thus, providing a more suitable and reliable solution for clinical and research purposes. Our proposal involves a process of multistage gene filtering, in which several ensemble strategies for gene selection were added in such a way that different classifiers simultaneously assess gene subsets to face instability. Firstly, we apply an ensemble of recent gene selection methods to obtain diversity in the genes found (stability according to filter methods). Next, we apply an ensemble of known classifiers to filter genes relevant to all classifiers at a time (stability according to classification methods). The achieved results were evaluated in two different datasets of the same disease (pancreatic ductal adenocarcinoma), in search of stability according to the disease, for which promising results were achieved.

  3. The importance of serum albumin determination method to classify patients based on nutritional status.

    PubMed

    Alcorta, M Duque; Alvarez, P Chanca; Cabetas, R Nuñez; Martín, Mj Alcaide; Valero, M; Candela, C Gómez

    2018-06-01

    The global health community has recognized the role of food and nutrition in health maintenance and disease prevention. Undernutrition is an important problem in clinical settings but is still not widely considered by specialists. The consequences of undernutrition for the immune system are well known. Its main consequences are increased morbidity and mortality rates, postoperative complications, length of stay and number of early hospital readmissions, all of which increase health-care costs. Total quality of care could be improved by an automatic system for detecting undernutrition. In our hospital, we use the screening tool "CONtrolling NUTritional status" (CONUT). To measure albumin, the laboratory can use the bromocresol green (BCG) or the bromocresol purple (BCP) method. The aim of this study is to evaluate the CONUT tool for classifying patients using these two different albumin measurement methods. Albumin and cholesterol were measured on an Advia 2400 analyzer, using the bromocresol green and purple methods for albumin. Total lymphocytes were measured on an Advia 2120. We calculated the CONUT index and classified the patients based on nutritional status. When we classified our patients based on nutritional status (CONUT), 28% were misclassified, mostly in the moderate and severe groups. This is very important because the tool triggers a multidisciplinary action for the patient. Therefore, in the clinical laboratory we have to know the methods we use, the validity of these methods in future tools/indices, and the management and outcome of the patients. Copyright © 2018 European Society for Clinical Nutrition and Metabolism. Published by Elsevier Ltd. All rights reserved.
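    The CONUT index itself can be sketched from its three inputs. The cut-offs below are the commonly published ones and are shown for illustration only; any clinical use must follow the local laboratory's validated protocol:

```python
def conut(albumin_g_dl, lymphocytes_per_mm3, cholesterol_mg_dl):
    """CONUT score from serum albumin, total lymphocyte count and
    total cholesterol, with the usual four-level classification."""
    alb = (0 if albumin_g_dl >= 3.5 else
           2 if albumin_g_dl >= 3.0 else
           4 if albumin_g_dl >= 2.5 else 6)
    lym = (0 if lymphocytes_per_mm3 >= 1600 else
           1 if lymphocytes_per_mm3 >= 1200 else
           2 if lymphocytes_per_mm3 >= 800 else 3)
    cho = (0 if cholesterol_mg_dl >= 180 else
           1 if cholesterol_mg_dl >= 140 else
           2 if cholesterol_mg_dl >= 100 else 3)
    total = alb + lym + cho
    status = ("normal" if total <= 1 else
              "light" if total <= 4 else
              "moderate" if total <= 8 else "severe")
    return total, status
```

    Because the albumin points depend on fixed g/dL cut-offs, a systematic BCG-vs-BCP offset near a boundary is exactly what shifts patients between CONUT classes, which is the misclassification the study measures.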

  4. Emergence of novel domains in proteins

    PubMed Central

    2013-01-01

    Background Proteins are composed of a combination of discrete, well-defined, sequence domains, associated with specific functions that have arisen at different times during evolutionary history. The emergence of novel domains is related to protein functional diversification and adaptation. But currently little is known about how novel domains arise and how they subsequently evolve. Results To gain insights into the impact of recently emerged domains in protein evolution we have identified all human young protein domains that have emerged in approximately the past 550 million years. We have classified them into vertebrate-specific and mammalian-specific groups, and compared them to older domains. We have found 426 different annotated young domains, totalling 995 domain occurrences, which represent about 12.3% of all human domains. We have observed that 61.3% of them arose in newly formed genes, while the remaining 38.7% are found combined with older domains, and have very likely emerged in the context of a previously existing protein. Young domains are preferentially located at the N-terminus of the protein, indicating that, at least in vertebrates, novel functional sequences often emerge there. Furthermore, young domains show significantly higher non-synonymous to synonymous substitution rates than older domains using human and mouse orthologous sequence comparisons. This is also true when we compare young and old domains located in the same protein, suggesting that recently arisen domains tend to evolve in a less constrained manner than older domains. Conclusions We conclude that proteins tend to gain domains over time, becoming progressively longer. We show that many proteins are made of domains of different age, and that the fastest evolving parts correspond to the domains that have been acquired more recently. PMID:23425224

  5. Identification of dual-tropic HIV-1 using evolved neural networks.

    PubMed

    Fogel, Gary B; Lamers, Susanna L; Liu, Enoch S; Salemi, Marco; McGrath, Michael S

    2015-11-01

    Blocking the binding of the envelope HIV-1 protein to immune cells is a popular concept for development of anti-HIV therapeutics. R5 HIV-1 binds CCR5, X4 HIV-1 binds CXCR4, and dual-tropic HIV-1 can bind either coreceptor for cellular entry. R5 viruses are associated with early infection and over time can evolve to X4 viruses that are associated with immune failure. Dual-tropic HIV-1 is less studied; however, it represents functional antigenic intermediates during the transition of R5 to X4 viruses. Viral tropism is linked partly to the HIV-1 envelope V3 domain, where the amino acid sequence helps dictate the receptor a particular virus will target; however, using V3 sequence information to identify dual-tropic HIV-1 isolates has remained difficult. Our goal in this study was to elucidate features of dual-tropic HIV-1 isolates that assist in the biological understanding of dual-tropism and develop an approach for their detection. Over 1559 HIV-1 subtype B sequences with known tropisms were analyzed. Each sequence was represented by 73 structural, biochemical and regional features. These features were provided to an evolved neural network classifier and evaluated using balanced and unbalanced data sets. The study resolved R5X4 viruses from R5 with an accuracy of 81.8% and from X4 with an accuracy of 78.8%. The approach also identified a set of V3 features (hydrophobicity, structural and polarity) that are associated with tropism transitions. The ability to distinguish R5X4 isolates will improve computational tropism decisions for R5 vs. X4 and assist in HIV-1 research and drug development efforts. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  6. Identifying stereotypic evolving micro-scale seizures (SEMS) in the hypoxic-ischemic EEG of the pre-term fetal sheep with a wavelet type-II fuzzy classifier.

    PubMed

    Abbasi, Hamid; Bennet, Laura; Gunn, Alistair J; Unsworth, Charles P

    2016-08-01

    Perinatal hypoxic-ischemic encephalopathy (HIE) around the time of birth due to lack of oxygen can lead to debilitating neurological conditions such as epilepsy and cerebral palsy. Experimental data have shown that brain injury evolves over time, but during the first 6-8 hours after HIE the brain has recovered oxidative metabolism in a latent phase, and brain injury is reversible. Treatments such as therapeutic cerebral hypothermia (brain cooling) are effective when started during the latent phase, and continued for several days. Effectiveness of hypothermia is lost if started after the latent phase. Post-occlusion monitoring of particular micro-scale transients in the hypoxic-ischemic (HI) electroencephalogram (EEG), from an asphyxiated fetal sheep model in utero, could provide precursory evidence to identify potential biomarkers of injury when brain damage is still treatable. In our studies, we have reported how it is possible to automatically detect HI EEG transients in the form of spikes and sharp waves during the latent phase of the HI EEG of the preterm fetal sheep. This paper describes how to identify stereotypic evolving micro-scale seizures (SEMS) which have a relatively abrupt onset and termination in a frequency range of 1.8-3 Hz (Delta waves) superimposed on a suppressed EEG amplitude background post occlusion. This research demonstrates how a Wavelet Type-II Fuzzy Logic System (WT-Type-II-FLS) can be used to automatically identify subtle abnormal SEMS that occur during the latent phase with a preliminary average validation overall performance of 78.71%±6.63 over the 390 minutes of the latent phase, post insult, using in utero pre-term hypoxic fetal sheep models.

  7. On the Nature of the Enigmatic Object IRAS 19312+1950: A Rare Phase of Massive Star Formation?

    NASA Technical Reports Server (NTRS)

    Cordiner, M. A.; Boogert, A. C. A.; Charnley, S. B.; Justtanont, K.; Cox, N. L. J.; Smith, R. G.; Tielens, A. G. G. M.; Wirstrom, E. S.; Milam, S. N.; Keane, J. V.

    2016-01-01

    IRAS 19312+1950 is a peculiar object that has eluded firm characterization since its discovery, with combined maser properties similar to an evolved star and a young stellar object (YSO). To help determine its true nature, we obtained infrared spectra of IRAS 19312+1950 in the range 5-550 microns using the Herschel and Spitzer space observatories. The Herschel PACS maps exhibit a compact, slightly asymmetric continuum source at 170 microns, indicative of a large, dusty circumstellar envelope. The far-IR CO emission line spectrum reveals two gas temperature components: approx. 0.22 solar masses of material at 280+/-18 K, and approx. 1.6 solar masses of material at 157+/-3 K. The OI 63 micron line is detected on-source but no significant emission from atomic ions was found. The HIFI observations display shocked, high-velocity gas with outflow speeds up to 90 km/s along the line of sight. From Spitzer spectroscopy, we identify ice absorption bands due to H2O at 5.8 microns and CO2 at 15 microns. The spectral energy distribution is consistent with a massive, luminous (approx. 2 × 10^4 solar luminosities) central source surrounded by a dense, warm circumstellar disk and envelope of total mass approx. 500-700 solar masses with large bipolar outflow cavities. The combination of distinctive far-IR spectral features suggests that IRAS 19312+1950 should be classified as an accreting, high-mass YSO rather than an evolved star. In light of this reclassification, IRAS 19312+1950 becomes only the fifth high-mass protostar known to exhibit SiO maser activity, and demonstrates that 18 cm OH maser line ratios may not be reliable observational discriminators between evolved stars and YSOs.

  8. Research on Classification of Chinese Text Data Based on SVM

    NASA Astrophysics Data System (ADS)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia, and text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification; KNN, NB, AB, SVM, decision trees and other classification methods all show good classification performance. The support vector machine (SVM) is a well-studied classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, applying support vector machines to classify Chinese text and aiming to connect academic research with practical application.
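
A minimal version of SVM-based Chinese text classification can be assembled with scikit-learn. The tiny corpus, the labels, and the choice of character n-grams (which sidestep Chinese word segmentation) are assumptions of this sketch, not details from the paper:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus (texts and labels are made up for the sketch).
texts = ["股市 上涨 投资 基金", "球队 比赛 进球 冠军",
         "银行 利率 贷款 经济", "足球 教练 联赛 球员"]
labels = ["finance", "sports", "finance", "sports"]

# Character 1- and 2-grams avoid the need for a word segmenter.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 2)),
    LinearSVC(),
)
clf.fit(texts, labels)
print(clf.predict(["利率 经济 股市"]))
```

A real system would replace the toy corpus with a labeled Chinese document collection and tune the n-gram range and SVM regularization by cross-validation.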

  9. Detection and Classification of Pole-Like Objects from Mobile Mapping Data

    NASA Astrophysics Data System (ADS)

    Fukano, K.; Masuda, H.

    2015-08-01

    Laser scanners on a vehicle-based mobile mapping system can capture 3D point-clouds of roads and roadside objects. Since roadside objects have to be maintained periodically, their 3D models are useful for planning maintenance tasks. In our previous work, we proposed a method for detecting cylindrical poles and planar plates in a point-cloud. However, it is often required to further classify pole-like objects into utility poles, streetlights, traffic signals and signs, which are managed by different organizations. In addition, our previous method may fail to extract low pole-like objects, which are often observed in urban residential areas. In this paper, we propose new methods for extracting and classifying pole-like objects. In our method, we robustly extract a wide variety of poles by converting point-clouds into wireframe models and calculating cross-sections between wireframe models and horizontal cutting planes. For classifying pole-like objects, we subdivide a pole-like object into five subsets by extracting poles and planes, and calculate feature values of each subset. Then we apply a supervised machine learning method using feature variables of subsets. In our experiments, our method could achieve excellent results for detection and classification of pole-like objects.

  10. Modeling misidentification errors in capture-recapture studies using photographic identification of evolving marks

    USGS Publications Warehouse

    Yoshizaki, J.; Pollock, K.H.; Brownie, C.; Webster, R.A.

    2009-01-01

    Misidentification of animals is potentially important when naturally existing features (natural tags) are used to identify individual animals in a capture-recapture study. Photographic identification (photoID) typically uses photographic images of animals' naturally existing features as tags (photographic tags) and is subject to two main causes of identification errors: those related to quality of photographs (non-evolving natural tags) and those related to changes in natural marks (evolving natural tags). The conventional methods for analysis of capture-recapture data do not account for identification errors, and to do so requires a detailed understanding of the misidentification mechanism. Focusing on the situation where errors are due to evolving natural tags, we propose a misidentification mechanism and outline a framework for modeling the effect of misidentification in closed population studies. We introduce methods for estimating population size based on this model. Using a simulation study, we show that conventional estimators can seriously overestimate population size when errors due to misidentification are ignored, and that, in comparison, our new estimators have better properties except in cases with low capture probabilities (<0.2) or low misidentification rates (<2.5%). © 2009 by the Ecological Society of America.
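
The overestimation effect described above is easy to reproduce in a small simulation. The sketch below uses a deliberately simplified misidentification mechanism (a true recapture is recognised only with probability alpha, otherwise recorded as a new animal) and the Chapman form of the Lincoln-Petersen estimator; both are illustrative choices, not the authors' model:

```python
import random

random.seed(1)

def simulate(N=500, p=0.5, alpha=0.9, reps=200):
    """Two-sample closed-population study: each animal is caught with
    probability p on each occasion; a second-occasion capture of a marked
    animal is correctly matched with probability alpha (evolving marks)."""
    estimates = []
    for _ in range(reps):
        first = {i for i in range(N) if random.random() < p}
        n1 = len(first)
        n2 = m = 0
        for i in range(N):
            if random.random() < p:
                n2 += 1
                if i in first and random.random() < alpha:
                    m += 1  # correctly recognised as a recapture
        # Chapman estimator of population size.
        estimates.append((n1 + 1) * (n2 + 1) / (m + 1) - 1)
    return sum(estimates) / reps

print(simulate(alpha=1.0))  # no misidentification: close to N = 500
print(simulate(alpha=0.9))  # missed matches deflate m and inflate the estimate
```

With alpha below 1, recaptures are under-counted, so the estimator systematically exceeds the true N, mirroring the paper's finding for conventional estimators.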

  11. Area Determination of Diabetic Foot Ulcer Images Using a Cascaded Two-Stage SVM-Based Classification.

    PubMed

    Wang, Lei; Pedersen, Peder C; Agu, Emmanuel; Strong, Diane M; Tulu, Bengisu

    2017-09-01

    The standard chronic wound assessment method based on visual examination is potentially inaccurate and also represents a significant clinical workload. Hence, computer-based systems providing quantitative wound assessment may be valuable for accurately monitoring wound healing status, with the wound area the best suited for automated analysis. Here, we present a novel approach, using support vector machines (SVM) to determine the wound boundaries on foot ulcer images captured with an image capture box, which provides controlled lighting and range. After superpixel segmentation, a cascaded two-stage classifier operates as follows: in the first stage, a set of k binary SVM classifiers are trained and applied to different subsets of the entire training images dataset, and incorrectly classified instances are collected. In the second stage, another binary SVM classifier is trained on the incorrectly classified set. We extracted various color and texture descriptors from superpixels that are used as input for each stage in the classifier training. Specifically, color and bag-of-word representations of local dense scale invariant feature transformation features are descriptors for ruling out irrelevant regions, and color and wavelet-based features are descriptors for distinguishing healthy tissue from wound regions. Finally, the detected wound boundary is refined by applying the conditional random field method. We have implemented the wound classification on a Nexus 5 smartphone platform, except for training which was done offline. Results are compared with other classifiers and show that our approach provides high global performance rates (average sensitivity = 73.3%, specificity = 94.6%) and is sufficiently efficient for a smartphone-based image analysis.
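
The cascaded two-stage idea — k first-stage SVMs trained on subsets of the training data, with a second-stage SVM trained on the instances they get wrong — might be sketched as follows. The toy data, the consensus rule for deferring to stage 2, and all parameters are assumptions of this sketch, not the paper's pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Stage 1: k SVMs, each trained on a different random half of the data;
# collect the training instances any of them misclassifies.
k = 5
stage1, wrong_idx = [], set()
for _ in range(k):
    idx = rng.choice(len(X), size=len(X) // 2, replace=False)
    clf = SVC(kernel="rbf").fit(X[idx], y[idx])
    stage1.append(clf)
    wrong_idx.update(idx[clf.predict(X[idx]) != y[idx]].tolist())

# Stage 2: one SVM trained only on the hard (misclassified) instances.
hard = np.array(sorted(wrong_idx))
stage2 = None
if len(hard) and len(np.unique(y[hard])) > 1:
    stage2 = SVC(kernel="rbf").fit(X[hard], y[hard])

def cascade_predict(x):
    votes = [int(c.predict(x.reshape(1, -1))[0]) for c in stage1]
    if stage2 is not None and len(set(votes)) > 1:
        # No stage-1 consensus: defer to the stage-2 "hard case" SVM.
        return int(stage2.predict(x.reshape(1, -1))[0])
    return max(set(votes), key=votes.count)

preds = np.array([cascade_predict(x) for x in X])
print((preds == y).mean())
```

In the paper, the inputs are superpixel colour and texture descriptors rather than synthetic features, and the output boundary is further refined with a conditional random field.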

  12. Using random forest for reliable classification and cost-sensitive learning for medical diagnosis.

    PubMed

    Yang, Fan; Wang, Hua-zhen; Mi, Hong; Lin, Cheng-de; Cai, Wei-wen

    2009-01-30

    Most machine-learning classifiers output label predictions for new instances without indicating how reliable the predictions are. The applicability of these classifiers is limited in critical domains where incorrect predictions have serious consequences, like medical diagnosis. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. In this paper, we present a modified random forest classifier which is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme, using Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method shows the well-calibrated property that performance can be set prior to classification: the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor which takes into account different costs for misclassifications in different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resultant classifier is well-calibrated and able to control the specific risk of each class. The method of using the RF outlier measure to design a nonconformity measure benefits the resultant predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem, relying on label-wise predefined confidence levels. The target of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.

  13. Breast tissue classification in digital tomosynthesis images based on global gradient minimization and texture features

    NASA Astrophysics Data System (ADS)

    Qin, Xulei; Lu, Guolan; Sechopoulos, Ioannis; Fei, Baowei

    2014-03-01

    Digital breast tomosynthesis (DBT) is a pseudo-three-dimensional x-ray imaging modality proposed to decrease the effect of tissue superposition present in mammography, potentially resulting in an increase in clinical performance for the detection and diagnosis of breast cancer. Tissue classification in DBT images can be useful in risk assessment, computer-aided detection and radiation dosimetry, among other aspects. However, classifying breast tissue in DBT is a challenging problem because DBT images include complicated structures, image noise, and out-of-plane artifacts due to limited angular tomographic sampling. In this project, we propose an automatic method to classify fatty and glandular tissue in DBT images. First, the DBT images are pre-processed to enhance the tissue structures and to decrease image noise and artifacts. Second, a global smooth filter based on L0 gradient minimization is applied to eliminate detailed structures and enhance large-scale ones. Third, the similar structure regions are extracted and labeled by fuzzy C-means (FCM) classification. At the same time, the texture features are also calculated. Finally, each region is classified into different tissue types based on both intensity and texture features. The proposed method is validated using five patient DBT images, with manual segmentation as the gold standard. The Dice scores and the confusion matrix are utilized to evaluate the classification results. The evaluation results demonstrated the feasibility of the proposed method for classifying breast glandular and fat tissue on DBT images.
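
The fuzzy C-means step in the pipeline above can be sketched in plain NumPy; the two-feature toy data below is a stand-in for the intensity/texture features computed from DBT regions, not the paper's actual inputs:

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy C-means: returns cluster centers and the membership
    matrix U, where U[i, k] is the degree to which point i belongs to
    cluster k (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Standard FCM membership update: U proportional to d^(-2/(m-1)).
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Toy intensity/texture feature vectors for two tissue-like clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.2, 0.05, (100, 2)),
               rng.normal(0.8, 0.05, (100, 2))])
centers, U = fuzzy_cmeans(X)
labels = U.argmax(axis=1)
print(np.bincount(labels))  # roughly 100 points per cluster
```

In the paper, the soft memberships label similar-structure regions, which are then assigned to tissue types using both intensity and texture features.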

  14. Classifying with confidence from incomplete information.

    DOE PAGES

    Parrish, Nathan; Anderson, Hyrum S.; Gupta, Maya R.; ...

    2013-12-01

    For this paper, we consider the problem of classifying a test sample given incomplete information. This problem arises naturally when data about a test sample is collected over time, or when costs must be incurred to compute the classification features. For example, in a distributed sensor network only a fraction of the sensors may have reported measurements at a certain time, and additional time, power, and bandwidth is needed to collect the complete data to classify. A practical goal is to assign a class label as soon as enough data is available to make a good decision. We formalize this goal through the notion of reliability—the probability that a label assigned given incomplete data would be the same as the label assigned given the complete data, and we propose a method to classify incomplete data only if some reliability threshold is met. Our approach models the complete data as a random variable whose distribution is dependent on the current incomplete data and the (complete) training data. The method differs from standard imputation strategies in that our focus is on determining the reliability of the classification decision, rather than just the class label. We show that the method provides useful reliability estimates of the correctness of the imputed class labels on a set of experiments on time-series data sets, where the goal is to classify the time-series as early as possible while still guaranteeing that the reliability threshold is met.
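
The reliability idea — estimate how likely the label from incomplete data is to match the label that would be assigned given complete data — can be sketched with a toy generative model. The independent-Gaussian class model and minimum-distance classifier below are stand-ins chosen for clarity, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy class-conditional model: three independent Gaussian features per class.
means = {0: np.array([-1.0, -1.0, -1.0]), 1: np.array([1.0, 1.0, 1.0])}

def classify(x):
    # Minimum-distance classifier on complete data.
    return min(means, key=lambda c: np.sum((x - means[c]) ** 2))

def reliability(x_obs, obs_idx, n_mc=2000):
    """Monte Carlo estimate of P(label from a sampled completion equals
    the label assigned from the observed features alone)."""
    label_obs = min(means, key=lambda c: np.sum((x_obs - means[c][obs_idx]) ** 2))
    # Posterior over classes from the observed features (equal priors).
    d = np.array([np.sum((x_obs - means[c][obs_idx]) ** 2) for c in means])
    post = np.exp(-0.5 * d)
    post /= post.sum()
    miss_idx = [i for i in range(3) if i not in obs_idx]
    agree = 0
    for _ in range(n_mc):
        c = rng.choice(2, p=post)            # sample a class,
        x_full = np.empty(3)
        x_full[obs_idx] = x_obs
        x_full[miss_idx] = rng.normal(means[c][miss_idx], 1.0)  # then the rest
        agree += classify(x_full) == label_obs
    return agree / n_mc

print(reliability(np.array([0.9]), obs_idx=[0]))          # one feature seen
print(reliability(np.array([0.9, 1.1]), obs_idx=[0, 1]))  # two features seen
```

A system following the paper's scheme would commit to a label only once this reliability estimate exceeds a chosen threshold, otherwise waiting for more data.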

  15. Support vector inductive logic programming outperforms the naive Bayes classifier and inductive logic programming for the classification of bioactive chemical compounds.

    PubMed

    Cannon, Edward O; Amini, Ata; Bender, Andreas; Sternberg, Michael J E; Muggleton, Stephen H; Glen, Robert C; Mitchell, John B O

    2007-05-01

    We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures. The performance of the methods is evaluated via a number of statistical measures, namely recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is for seven out of the 11 classes the superior method. The results show that the Bayes Classifier gives the best recall performance for eight of the 11 targets, but has a much lower precision, specificity and F-measure. The SVILP model on the other hand has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of the SVILP superiority, we employ McNemar's test which shows that SVILP performs significantly (p < 5%) better than both other methods for six out of 11 activity classes, while being superior with less significance for three of the remaining classes. While previously the Bayes Classifier was shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.
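
The evaluation metrics used here are worth making concrete. The sketch below computes recall, precision, specificity, F-measure and MCC from a confusion matrix; the two example matrices are invented to mirror the high-recall/low-precision versus low-recall/high-precision trade-off the abstract describes:

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return dict(recall=recall, precision=precision,
                specificity=specificity, f1=f_measure, mcc=mcc)

# Invented high-recall / low-precision profile (Bayes-classifier-like)...
print(confusion_metrics(tp=90, fp=60, tn=840, fn=10))
# ...versus an invented lower-recall / high-precision profile (SVILP-like).
print(confusion_metrics(tp=75, fp=5, tn=895, fn=25))
```

The F-measure rewards the second profile despite its lower recall, which is exactly why the abstract's F-measure ranking favours SVILP even though the Bayes classifier wins on recall.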

  16. The value of nodal information in predicting lung cancer relapse using 4DPET/4DCT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Heyse, E-mail: heyse.li@mail.utoronto.ca; Becker, Nathan; Raman, Srinivas

    2015-08-15

    Purpose: There is evidence that computed tomography (CT) and positron emission tomography (PET) imaging metrics are prognostic and predictive in nonsmall cell lung cancer (NSCLC) treatment outcomes. However, few studies have explored the use of standardized uptake value (SUV)-based image features of nodal regions as predictive features. The authors investigated and compared the use of tumor and node image features extracted from the radiotherapy target volumes to predict relapse in a cohort of NSCLC patients undergoing chemoradiation treatment. Methods: A prospective cohort of 25 patients with locally advanced NSCLC underwent 4DPET/4DCT imaging for radiation planning. Thirty-seven image features were derived from the CT-defined volumes and SUVs of the PET image from both the tumor and nodal target regions. The machine learning methods of logistic regression and repeated stratified five-fold cross-validation (CV) were used to predict local and overall relapses in 2 yr. The authors used well-known feature selection methods (Spearman’s rank correlation, recursive feature elimination) within each fold of CV. Classifiers were ranked on their Matthews correlation coefficient (MCC) after CV. Area under the curve, sensitivity, and specificity values are also presented. Results: For predicting local relapse, the best classifier found had a mean MCC of 0.07 and was composed of eight tumor features. For predicting overall relapse, the best classifier found had a mean MCC of 0.29 and was composed of a single feature: the volume greater than 0.5 times the maximum SUV (N). Conclusions: The best classifier for predicting local relapse had only tumor features. In contrast, the best classifier for predicting overall relapse included a node feature. Overall, the methods showed that nodes add value in predicting overall relapse but not local relapse.
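
The evaluation protocol — logistic regression with recursive feature elimination inside stratified five-fold CV, ranked by MCC — can be sketched with scikit-learn. The synthetic data below stands in for the study's 37 image features (the real cohort had only 25 patients; a larger toy sample is used here so the folds are stable):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for 37 tumor/node image features.
X, y = make_classification(n_samples=100, n_features=37, n_informative=5,
                           random_state=0)

mccs = []
for train, test in StratifiedKFold(n_splits=5, shuffle=True,
                                   random_state=0).split(X, y):
    # Feature selection must run inside each fold to avoid information leakage.
    rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)
    rfe.fit(X[train], y[train])
    mccs.append(matthews_corrcoef(y[test], rfe.predict(X[test])))

print(np.mean(mccs))
```

Candidate feature sets (tumor-only, node-only, combined) would each be scored this way and ranked by their mean MCC, as in the study.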

  17. Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework

    PubMed Central

    2014-01-01

    Motivation Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. Results We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set. PMID:24646119

  18. Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework.

    PubMed

    Simha, Ramanuja; Shatkay, Hagit

    2014-03-19

    Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set.
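
One lightweight way to model inter-dependencies among labels, in the spirit of this work, is a classifier chain, in which each per-location classifier also sees the predictions made for earlier locations. This is a simpler proxy than the paper's Bayesian network classifiers, and the data below is synthetic:

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multioutput import ClassifierChain

# Synthetic multi-label data standing in for proteins with multiple locations.
X, Y = make_multilabel_classification(n_samples=300, n_features=20,
                                      n_classes=5, random_state=0)

# Baseline: independent per-location classifiers (no inter-dependencies).
indep = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
# Chain: each location's classifier also receives earlier locations' predictions.
chain = ClassifierChain(LogisticRegression(max_iter=1000),
                        random_state=0).fit(X, Y)

print((indep.predict(X) == Y).mean(), (chain.predict(X) == Y).mean())
```

Unlike a method restricted to location-combinations seen in training, both models here can emit novel combinations, which is one of the properties the abstract highlights.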

  19. Relationship between global structural parameters and Enzyme Commission hierarchy: implications for function prediction.

    PubMed

    Boareto, Marcelo; Yamagishi, Michel E B; Caticha, Nestor; Leite, Vitor B P

    2012-10-01

    In protein databases there is a substantial number of proteins structurally determined but without function annotation. Understanding the relationship between function and structure can be useful to predict function on a large scale. We have analyzed the similarities in global physicochemical parameters for a set of enzymes which were classified according to the four Enzyme Commission (EC) hierarchical levels. Using relevance theory we introduced a distance between proteins in the space of physicochemical characteristics. This was done by minimizing a cost function of the metric tensor built to reflect the EC classification system. Using an unsupervised clustering method on a set of 1025 enzymes, we obtained no relevant clustering formation compatible with EC classification. The distance distributions between enzymes from the same EC group and from different EC groups were compared by histograms. Such analysis was also performed using sequence alignment similarity as a distance. Our results suggest that global structure parameters are not sufficient to segregate enzymes according to EC hierarchy. This indicates that features essential for function are rather local than global. Consequently, methods for predicting function based on global attributes should not obtain high accuracy in main EC classes prediction without relying on similarities between enzymes from training and validation datasets. Furthermore, these results are consistent with a substantial number of studies suggesting that function evolves fundamentally by recruitment, i.e., a same protein motif or fold can be used to perform different enzymatic functions and a few specific amino acids (AAs) are actually responsible for enzyme activity. These essential amino acids should belong to active sites and an effective method for predicting function should be able to recognize them. Copyright © 2012 Elsevier Ltd. All rights reserved.
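
The paper's core check — whether unsupervised clusters of global parameters recover known classes — can be scored with the adjusted Rand index. The sketch below runs k-means on the Iris data, where clusters do align with the classes (ARI well above 0), in contrast to the null result the authors report for EC classes:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

# Stand-in for enzymes described by global physicochemical parameters.
X, y = load_iris(return_X_y=True)

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(adjusted_rand_score(y, clusters))  # near 1 would mean clusters match classes
```

An ARI near zero, as the authors effectively observe for enzymes, indicates the clustering is no better than chance at recovering the class labels.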

  20. Measurement Techniques for Hypervelocity Impact Test Fragments

    NASA Technical Reports Server (NTRS)

    Hill, Nicole E.

    2008-01-01

    The ability to classify the size and shape of individual orbital debris fragments provides a better understanding of the orbital debris environment as a whole. The characterization of breakup fragmentation debris has gradually evolved from a simplistic, spherical assumption towards that of describing debris in terms of size, material, and shape parameters. One of the goals of the NASA Orbital Debris Program Office is to develop high-accuracy techniques to measure these parameters and apply them to orbital debris observations. Measurement of the physical characteristics of debris resulting from groundbased, hypervelocity impact testing provides insight into the shapes and sizes of debris produced from potential impacts in orbit. Current techniques for measuring these ground-test fragments require determination of dimensions based upon visual judgment. This leads to reduced accuracy and provides little or no repeatability for the measurements. With the common goal of mitigating these error sources, allaying any misunderstandings, and moving forward in fragment shape determination, the NASA Orbital Debris Program Office recently began using a computerized measurement system. The goal of using these new techniques is to improve knowledge of the relation between commonly used dimensions and overall shape. The immediate objective is to scan a single fragment, measure its size and shape properties, and import the fragment into a program that renders a 3D model that adequately demonstrates how the object could appear in orbit. This information would then be used to aid optical methods in orbital debris shape determination. This paper provides a description of the measurement techniques used in this initiative and shows results of this work. The tradeoffs of the computerized methods are discussed, as well as the means of repeatability in the measurements of these fragments. This paper serves as a general description of methods for the measurement and shape analysis of orbital debris.
