Science.gov

Sample records for classifying coding dna

  1. DNA as a Binary Code: How the Physical Structure of Nucleotide Bases Carries Information

    ERIC Educational Resources Information Center

    McCallister, Gary

    2005-01-01

    The DNA triplet code also functions as a binary code. Because double-ring compounds cannot bind to double-ring compounds in the DNA code, the sequence of bases classified simply as purines or pyrimidines can encode for smaller groups of possible amino acids. This is an intuitive approach to teaching the DNA code. (Contains 6 figures.)
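
    The purine/pyrimidine reading described in this record can be sketched in a few lines. This is only an illustration: the function name `to_binary` and the choice of 1 for purines and 0 for pyrimidines are assumptions, not taken from the article.

```python
# Purines are the double-ring bases, pyrimidines the single-ring bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def to_binary(seq: str) -> str:
    """Map each base to '1' (purine) or '0' (pyrimidine).

    The 1/0 assignment is an arbitrary convention chosen for this sketch.
    """
    bits = []
    for base in seq.upper():
        if base in PURINES:
            bits.append("1")
        elif base in PYRIMIDINES:
            bits.append("0")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(bits)


if __name__ == "__main__":
    # A=1, T=0, G=1, G=1, C=0, C=0
    print(to_binary("ATGGCC"))
```

    Each codon then collapses to one of eight three-bit patterns, which is the sense in which a purine/pyrimidine reading narrows the set of possible amino acids.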

  2. DNA: Polymer and molecular code

    NASA Astrophysics Data System (ADS)

    Shivashankar, G. V.

    1999-10-01

    The thesis work focusses upon two aspects of DNA, the polymer and the molecular code. Our approach was to bring single molecule micromanipulation methods to the study of DNA. It included a home-built optical microscope combined with an atomic force microscope and an optical tweezer. This combined approach led to a novel method to graft a single DNA molecule onto a force cantilever using the optical tweezer and local heating. With this method, a force versus extension assay of double stranded DNA was realized. The resolution was about 10 pN. To improve on this force measurement resolution, a simple light backscattering technique was developed and used to probe the DNA polymer flexibility and its fluctuations. It combined the optical tweezer to trap a DNA tethered bead and the laser backscattering to detect the bead's Brownian fluctuations. With this technique the resolution was about 0.1 pN with a millisecond access time, and the whole entropic part of the DNA force-extension was measured. With this experimental strategy, we measured the polymerization of the protein RecA on an isolated double stranded DNA. We observed the progressive decoration of RecA on the λ DNA molecule, which results in the extension of λ, due to unwinding of the double helix. The dynamics of polymerization, the resulting change in the DNA entropic elasticity and the role of ATP hydrolysis were the main parts of the study. A simple model for RecA assembly on DNA was proposed. This work presents a first step in the study of genetic recombination. Recently we have started a study of equilibrium binding which utilizes fluorescence polarization methods to probe the polymerization of RecA on single stranded DNA. In addition to the study of material properties of DNA and DNA-RecA, we have developed experiments for which the code of the DNA is central. We studied one aspect of DNA as a molecular code, using different techniques. In particular the programmatic use of template specificity makes

  3. Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing

    DTIC Science & Technology

    2010-03-01

    that the hybridization that occurs between a DNA strand and its Watson-Crick complement can be used to perform mathematical computation. This research... Watson-Crick (WC) duplex, e.g., TCGCA TCGCA . Note that non-WC duplexes can form and such a formation is called a cross-hybridization. Cross...5’GAAAGTCGCGTA3’ Watson-Crick (WC) Duplexes TACGCGACTTTC Cross Hybridized (CH) Duplexes ATTTTTGCGTTA GAAAAAGAAGAA Coding Strands for Ligation

  4. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    DTIC Science & Technology

    2008-07-01

    Dates covered: 6 Jul 08 to 11 Jul 08. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. ...coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences.

  5. IN-MACA-MCC: Integrated Multiple Attractor Cellular Automata with Modified Clonal Classifier for Human Protein Coding and Promoter Prediction.

    PubMed

    Pokkuluri, Kiran Sree; Inampudi, Ramesh Babu; Nedunuri, S S S N Usha Devi

    2014-01-01

    Protein coding and promoter region predictions are very important challenges of bioinformatics (Attwood and Teresa, 2000). The identification of these regions plays a crucial role in understanding the genes. Many novel computational and mathematical methods are introduced as well as existing methods that are getting refined for predicting both of the regions separately; still there is a scope for improvement. We propose a classifier that is built with MACA (multiple attractor cellular automata) and MCC (modified clonal classifier) to predict both regions with a single classifier. The proposed classifier is trained and tested with Fickett and Tung (1992) datasets for protein coding region prediction for DNA sequences of lengths 54, 108, and 162. This classifier is trained and tested with MMCRI datasets for protein coding region prediction for DNA sequences of lengths 252 and 354. The proposed classifier is trained and tested with promoter sequences from DBTSS (Yamashita et al., 2006) dataset and nonpromoters from EID (Saxonov et al., 2000) and UTRdb (Pesole et al., 2002) datasets. The proposed model can predict both regions with an average accuracy of 90.5% for promoter and 89.6% for protein coding region predictions. The specificity and sensitivity values of promoter and protein coding region predictions are 0.89 and 0.92, respectively.

  6. Coding capacity of complementary DNA strands.

    PubMed Central

    Casino, A; Cipollaro, M; Guerrini, A M; Mastrocinque, G; Spena, A; Scarlato, V

    1981-01-01

    A Fortran computer algorithm has been used to analyze the nucleotide sequence of several structural genes. The analysis performed on both coding and complementary DNA strands shows that whereas open reading frames shorter than 100 codons are randomly distributed on both DNA strands, open reading frames longer than 100 codons ("virtual genes") are significantly more frequent on the complementary DNA strand than on the coding one. These "virtual genes" were further investigated by looking at intron sequences, splicing points, signal sequences and by analyzing gene mutations. On the basis of this analysis coding and complementary DNA strands of several eukaryotic structural genes cannot be distinguished. In particular we suggest that the complementary DNA strand of the human epsilon-globin gene might indeed code for a protein. PMID:7015290
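
    The kind of scan this record describes, counting long open reading frames on a strand and on its complement, can be sketched as follows. This is not the authors' Fortran algorithm. It assumes an ORF runs from an ATG to the first in-frame stop codon, and the names `find_orfs` and `orfs_on_both_strands` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100):
    """Return (frame, start, n_codons) for each ATG..stop ORF in the three frames.

    n_codons counts the ATG and every codon up to, but not including, the stop.
    Assumption of this sketch: an ORF begins at the first ATG after a stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((frame, start, n_codons))
                start = None
    return orfs


def orfs_on_both_strands(seq: str, min_codons: int = 100):
    """Scan the given strand and its reverse complement with the same threshold."""
    return {
        "given": find_orfs(seq, min_codons),
        "reverse_complement": find_orfs(reverse_complement(seq), min_codons),
    }


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 4 + "TAA"  # toy sequence, far shorter than a real gene
    print(orfs_on_both_strands(toy, min_codons=3))
```

    The default threshold of 100 codons mirrors the cutoff quoted in the abstract; the toy call lowers it only so that the short example sequence produces a hit.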

  7. Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing

    DTIC Science & Technology

    2008-01-01

    complements of one another and the DNA duplex formed is a Watson-Crick (WC) duplex. However, there are many instances when the formation of non-WC...that the user’s requirements for probe selection are met based on the Watson-Crick probe locality within a target. The second type, called

  8. Lung Cancer Classification Employing Proposed Real Coded Genetic Algorithm Based Radial Basis Function Neural Network Classifier

    PubMed Central

    Deepa, S. N.

    2016-01-01

    A proposed real coded genetic algorithm based radial basis function neural network classifier is employed to perform effective classification of healthy and cancer affected lung images. Real Coded Genetic Algorithm (RCGA) is proposed to overcome the Hamming Cliff problem encountered with the Binary Coded Genetic Algorithm (BCGA). Radial Basis Function Neural Network (RBFNN) classifier is chosen as a classifier model because of its Gaussian Kernel function and its effective learning process to avoid local and global minima problem and enable faster convergence. This paper specifically focused on tuning the weights and bias of RBFNN classifier employing the proposed RCGA. The operators used in RCGA enable the algorithm flow to compute weights and bias value so that minimum Mean Square Error (MSE) is obtained. With both the lung healthy and cancer images from Lung Image Database Consortium (LIDC) database and Real time database, it is noted that the proposed RCGA based RBFNN classifier has performed effective classification of the healthy lung tissues and that of the cancer affected lung nodules. The classification accuracy computed using the proposed approach is noted to be higher in comparison with that of the classifiers proposed earlier in the literatures. PMID:28050198

  9. Lung Cancer Classification Employing Proposed Real Coded Genetic Algorithm Based Radial Basis Function Neural Network Classifier.

    PubMed

    Selvakumari Jeya, I Jasmine; Deepa, S N

    2016-01-01

    A proposed real coded genetic algorithm based radial basis function neural network classifier is employed to perform effective classification of healthy and cancer affected lung images. Real Coded Genetic Algorithm (RCGA) is proposed to overcome the Hamming Cliff problem encountered with the Binary Coded Genetic Algorithm (BCGA). Radial Basis Function Neural Network (RBFNN) classifier is chosen as a classifier model because of its Gaussian Kernel function and its effective learning process to avoid local and global minima problem and enable faster convergence. This paper specifically focused on tuning the weights and bias of RBFNN classifier employing the proposed RCGA. The operators used in RCGA enable the algorithm flow to compute weights and bias value so that minimum Mean Square Error (MSE) is obtained. With both the lung healthy and cancer images from Lung Image Database Consortium (LIDC) database and Real time database, it is noted that the proposed RCGA based RBFNN classifier has performed effective classification of the healthy lung tissues and that of the cancer affected lung nodules. The classification accuracy computed using the proposed approach is noted to be higher in comparison with that of the classifiers proposed earlier in the literatures.

  10. Telomeres, histone code, and DNA damage response.

    PubMed

    Misri, S; Pandita, S; Kumar, R; Pandita, T K

    2008-01-01

    Genomic stability is maintained by telomeres, the end terminal structures that protect chromosomes from fusion or degradation. Shortening or loss of telomeric repeats or altered telomere chromatin structure is correlated with telomere dysfunction such as chromosome end-to-end associations that could lead to genomic instability and gene amplification. The structure at the end of telomeres is such that its DNA differs from DNA double strand breaks (DSBs) to avoid nonhomologous end-joining (NHEJ), which is accomplished by forming a unique higher order nucleoprotein structure. Telomeres are attached to the nuclear matrix and have a unique chromatin structure. Whether this special structure is maintained by specific chromatin changes is yet to be thoroughly investigated. Chromatin modifications implicated in transcriptional regulation are thought to be the result of a code on the histone proteins (histone code). This code, involving phosphorylation, acetylation, methylation, ubiquitylation, and sumoylation of histones, is believed to regulate chromatin accessibility either by disrupting chromatin contacts or by recruiting non-histone proteins to chromatin. The histone code in which distinct histone tail-protein interactions promote engagement may be the deciding factor for choosing specific DSB repair pathways. Recent evidence suggests that such mechanisms are involved in DNA damage detection and repair. Altered telomere chromatin structure has been linked to defective DNA damage response (DDR), and eukaryotic cells have evolved DDR mechanisms utilizing proficient DNA repair and cell cycle checkpoints in order to maintain genomic stability. Recent studies suggest that chromatin modifying factors play a critical role in the maintenance of genomic stability. This review will summarize the role of DNA damage repair proteins specifically ataxia-telangiectasia mutated (ATM) and its effectors and the telomere complex in maintaining genome stability.

  11. Classifying DNA assembly protocols for devising cellular architectures.

    PubMed

    Wang, Xi; Sa, Na; Tian, Ping-fang; Tan, Tian-wei

    2011-01-01

    DNA assembly is one of the most fundamental techniques in synthetic biology. Efficient methods can turn traditional DNA cloning into time-saving and higher efficiency practice, which is a foundation to accomplish the dreams of synthetic biologists for devising cellular architectures, reprogramming cellular behaviors, or creating synthetic cells. In this review, typical strategies of DNA assembly are discussed with special emphasis on the assembly of long and multiple DNA fragments into intact plasmids or assembled compositions. Constructively, all reported strategies were categorized into in vivo and in vitro types, and protocols are presented in a functional and practice-oriented way in order to portray the general nature of DNA assembly applications. Significantly, a five-step blueprint is proposed for devising cell architectures that produce valuable chemicals.

  12. Indications for spine surgery: validation of an administrative coding algorithm to classify degenerative diagnoses

    PubMed Central

    Lurie, Jon D.; Tosteson, Anna N.A.; Deyo, Richard A.; Tosteson, Tor; Weinstein, James; Mirza, Sohail K.

    2014-01-01

    Study Design: Retrospective analysis of Medicare claims linked to a multi-center clinical trial. Objective: The Spine Patient Outcomes Research Trial (SPORT) provided a unique opportunity to examine the validity of a claims-based algorithm for grouping patients by surgical indication. SPORT enrolled patients for lumbar disc herniation, spinal stenosis, and degenerative spondylolisthesis. We compared the surgical indication derived from Medicare claims to that provided by SPORT surgeons, the “gold standard”. Summary of Background Data: Administrative data are frequently used to report procedure rates, surgical safety outcomes, and costs in the management of spinal surgery. However, the accuracy of using diagnosis codes to classify patients by surgical indication has not been examined. Methods: Medicare claims were linked to beneficiaries enrolled in SPORT. The sensitivity and specificity of three claims-based approaches to group patients based on surgical indications were examined: 1) using the first listed diagnosis; 2) using all diagnoses independently; and 3) using a diagnosis hierarchy based on the support for fusion surgery. Results: Medicare claims were obtained from 376 SPORT participants, including 21 with disc herniation, 183 with spinal stenosis, and 172 with degenerative spondylolisthesis. The hierarchical coding algorithm was the most accurate approach for classifying patients by surgical indication, with sensitivities of 76.2%, 88.1%, and 84.3% for disc herniation, spinal stenosis, and degenerative spondylolisthesis cohorts, respectively. The specificity was 98.3% for disc herniation, 83.2% for spinal stenosis, and 90.7% for degenerative spondylolisthesis. Misclassifications were primarily due to codes attributing more complex pathology to the case. Conclusion: Standardized approaches for using claims data to accurately group patients by surgical indications have widespread interest. We found that a hierarchical coding approach correctly classified over 90

  13. Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers.

    PubMed

    Yu, Hualong; Hong, Shufang; Yang, Xibei; Ni, Jun; Dan, Yuanyuan; Qin, Bin

    2013-01-01

    DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they merely focus on binary-class problem. In this paper, we dealt with multiclass imbalanced classification problem, as encountered in cancer DNA microarray, by using ensemble learning. We utilized one-against-all coding strategy to transform multiclass to multiple binary classes, each of them carrying out feature subspace, which is an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, support vector machine was used as base classifier, and a novel voting rule called counter voting was presented for making a final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that unlike many traditional classification approaches, our methods are insensitive to class imbalance.
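
    Two of the steps named in this abstract, the one-against-all relabelling and random undersampling of the majority class, can be sketched as below. The feature subspace step, the SVM base classifier and the counter voting rule are not reproduced, and the function names are hypothetical.

```python
import random
from collections import Counter


def one_vs_all_labels(labels, positive_class):
    """Relabel a multiclass problem as binary: 1 for positive_class, 0 otherwise."""
    return [1 if y == positive_class else 0 for y in labels]


def random_undersample(samples, binary_labels, rng):
    """Randomly drop majority-class samples until both classes are the same size."""
    pos = [i for i, y in enumerate(binary_labels) if y == 1]
    neg = [i for i, y in enumerate(binary_labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [binary_labels[i] for i in kept]


if __name__ == "__main__":
    # Toy, made-up data: 10 samples in three unevenly sized classes.
    labels = ["a"] * 2 + ["b"] * 5 + ["c"] * 3
    samples = list(range(len(labels)))
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    for cls in sorted(set(labels)):
        binary = one_vs_all_labels(labels, cls)
        _, balanced = random_undersample(samples, binary, rng)
        print(cls, Counter(binary), "->", Counter(balanced))
```

    Each binary subproblem ends up with equal class counts, which is the imbalance correction the abstract attributes to undersampling; decision threshold adjustment would be the alternative correction.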

  14. DNA Code Validation Using Experimental Fluorescence Measurements and Thermodynamic Calculations

    DTIC Science & Technology

    2004-03-01

    Summary: A DNA code is a collection of single-stranded DNA molecules. In DNA hybridization assays, the formation of any Watson-Crick ...combinations represent the canonical Watson-Crick pairings. To obtain the reverse complement of a strand of DNA, one must first reverse the order of the... DNA codes. Using software designed by A. Macula and V. Rykov (Macula, 2003), a set of 13 pairs, (X, WC(X)), of Watson-Crick reverse complementary
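
    The reverse-complement operation mentioned in this snippet is short enough to show directly. The sketch below follows the notation WC(X) from the snippet; the function names are assumptions, and the example strand is the one quoted in record 3 of this listing.

```python
# Canonical Watson-Crick pairings: A with T, C with G.
PAIRING = str.maketrans("ACGT", "TGCA")


def wc(strand: str) -> str:
    """Reverse complement: reverse the base order, then swap each base for its partner."""
    return strand.upper()[::-1].translate(PAIRING)


def is_wc_pair(x: str, y: str) -> bool:
    """True when y is the Watson-Crick reverse complement of x."""
    return wc(x) == y.upper()


if __name__ == "__main__":
    x = "GAAAGTCGCGTA"
    print(x, wc(x), is_wc_pair(x, "TACGCGACTTTC"))
```

    Applying `wc` twice returns the original strand, so each pair (X, WC(X)) is symmetric.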

  15. V(D)J recombination coding junction formation without DNA homology: processing of coding termini.

    PubMed Central

    Boubnov, N V; Wills, Z P; Weaver, D T

    1993-01-01

    Coding junction formation in V(D)J recombination generates diversity in the antigen recognition structures of immunoglobulin and T-cell receptor molecules by combining processes of deletion of terminal coding sequences and addition of nucleotides prior to joining. We have examined the role of coding end DNA composition in junction formation with plasmid substrates containing defined homopolymers flanking the recombination signal sequence elements. We found that coding junctions formed efficiently with or without terminal DNA homology. The extent of junctional deletion was conserved independent of coding ends with increased, partial, or no DNA homology. Interestingly, G/C homopolymer coding ends showed reduced deletion regardless of DNA homology. Therefore, DNA homology cannot be the primary determinant that stabilizes coding end structures for processing and joining. PMID:8413286

  16. Classifier assessment and feature selection for recognizing short coding sequences of human genes.

    PubMed

    Song, Kai; Zhang, Ze; Tong, Tuo-Peng; Wu, Fang

    2012-03-01

    With the ever-increasing pace of genome sequencing, there is a great need for fast and accurate computational tools to automatically identify genes in these genomes. Although great progress has been made in the development of gene-finding algorithms during the past decades, there is still room for further improvement. In particular, the issue of recognizing short exons in eukaryotes is still not solved satisfactorily. This article is devoted to assessing various linear and kernel-based classification algorithms and selecting the best combination of Z-curve features for further improvement of the issue. Eight state-of-the-art linear and kernel-based supervised pattern recognition techniques were used to identify the short (21-192 bp) coding sequences of human genes. By measuring the prediction accuracy, the tradeoff between sensitivity and specificity and the time consumption, partial least squares (PLS) and kernel partial least squares (KPLS) algorithms were verified to be the best linear and kernel-based classifiers, respectively. A surprising result was that, by making good use of the interpretability of the PLS and the Z-curve methods, 93 Z-curve features were proved to be the best selective combination. Using them, the average recognition accuracy was improved by as much as 7.7% by means of KPLS when compared with what was obtained by the Fisher discriminant analysis using 189 Z-curve variables (Gao and Zhang, 2004). The codes used are freely available from the following sources (implemented in MATLAB and supported on Linux and MS Windows): (1) SVM: http://www.support-vector-machines.org/SVM_soft.html. (2) GP: http://www.gaussianprocess.org. (3) KPLS and KFDA: Taylor, J.S., and Cristianini, N. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge, UK. (4) PLS: Wise, B.M., and Gallagher, N.B. 2011. PLS-Toolbox for use with MATLAB: ver 1.5.2. Eigenvector Technologies, Manson, WA. Supplementary Material for this article is
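
    For readers unfamiliar with Z-curve features, the sketch below computes the nine phase-specific mononucleotide components, using the standard Z-curve transform of base frequencies at each codon position. It is an illustration only: the 189-variable and 93-feature sets discussed in the abstract are not reproduced, and the function name is hypothetical.

```python
def zcurve_phase_features(seq: str):
    """Return 9 values: (x, y, z) for each of the three codon positions.

    With a, c, g, t the base frequencies at one codon position:
      x = (a + g) - (c + t)   purine versus pyrimidine
      y = (a + c) - (g + t)   amino versus keto
      z = (a + t) - (g + c)   weak versus strong hydrogen bonding
    """
    seq = seq.upper()
    features = []
    for phase in range(3):
        bases = seq[phase::3]  # every third base, starting at this codon position
        if not bases:
            raise ValueError("sequence must contain at least one full codon")
        n = len(bases)
        a, c, g, t = (bases.count(b) / n for b in "ACGT")
        features += [(a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)]
    return features


if __name__ == "__main__":
    print(zcurve_phase_features("ATGATGATG"))
```

    A classifier such as PLS would take a vector like this (extended with di- and trinucleotide terms) as the input for each candidate sequence.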

  17. BioCode: Two biologically compatible Algorithms for embedding data in non-coding and coding regions of DNA

    PubMed Central

    2013-01-01

    Background: In recent times, the application of deoxyribonucleic acid (DNA) has diversified with the emergence of fields such as DNA computing and DNA data embedding. DNA data embedding, also known as DNA watermarking or DNA steganography, aims to develop robust algorithms for encoding non-genetic information in DNA. Inherently DNA is a digital medium whereby the nucleotide bases act as digital symbols, a fact which underpins all bioinformatics techniques, and which also makes simple information encoding using DNA straightforward. However, the situation is more complex in methods which aim at embedding information in the genomes of living organisms. DNA is susceptible to mutations, which act as a noisy channel from the point of view of information encoded using DNA. This means that the DNA data embedding field is closely related to digital communications. Moreover it is a particularly unique digital communications area, because important biological constraints must be observed by all methods. Many DNA data embedding algorithms have been presented to date, all of which operate in one of two regions: non-coding DNA (ncDNA) or protein-coding DNA (pcDNA). Results: This paper proposes two novel DNA data embedding algorithms jointly called BioCode, which operate in ncDNA and pcDNA, respectively, and which comply fully with stricter biological restrictions. Existing methods comply with some elementary biological constraints, such as preserving protein translation in pcDNA. However there exist further biological restrictions which no DNA data embedding methods to date account for. Observing these constraints is key to increasing the biocompatibility and in turn, the robustness of information encoded in DNA. Conclusion: The algorithms encode information in near optimal ways from a coding point of view, as we demonstrate by means of theoretical and empirical (in silico) analyses. Also, they are shown to encode information in a robust way, such that mutations have isolated
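
    The remark that nucleotide bases act as digital symbols can be made concrete with the simplest possible mapping, two bits per base. This is not the BioCode algorithm and it ignores every biological constraint the abstract mentions; the bit-to-base table and function names are assumptions chosen for illustration.

```python
# Arbitrary two-bit assignment; any fixed one-to-one mapping would work.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Write each byte as 8 bits and emit one base per pair of bits."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Invert encode(); the input length must be a multiple of four bases."""
    if len(dna) % 4:
        raise ValueError("expected four bases per byte")
    bits = "".join(BASE_TO_BITS[base] for base in dna.upper())
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand, decode(strand))
```

    A single substituted base corrupts two bits of the payload here, which illustrates why the abstract treats mutations as a noisy channel that a practical scheme has to tolerate.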

  18. Chloroplast DNA codes for transfer RNA.

    PubMed Central

    McCrea, J M; Hershberger, C L

    1976-01-01

    Transfer RNA's were isolated from Euglena gracilis. Chloroplast cistrons for tRNA were quantitated by hybridizing tRNA to ct DNA. Species of tRNA hybridizing to ct DNA were partially purified by hybridization-chromatography. The tRNA's hybridizing to ct DNA and nuclear DNA appear to be different. Total cellular tRNA was hybridized to ct DNA to an equivalent of approximately 25 cistrons. The total cellular tRNA was also separated into 2 fractions by chromatography on dihydroxyboryl substituted amino ethyl cellulose. Fraction I hybridized to both nuclear and ct DNA. Hybridizations to ct DNA indicated approximately 18 cistrons. Fraction II-tRNA hybridized only to ct DNA, saturating at a level of approximately 7 cistrons. The tRNA from isolated chloroplasts hybridized to both chloroplast and nuclear DNA. The level of hybridization to ct DNA indicated approximately 18 cistrons. Fraction II-type tRNA could not be detected in the isolated chloroplasts. PMID:823529

  19. DNA Barcoding through Quaternary LDPC Codes

    PubMed Central

    Tapia, Elizabeth; Spetale, Flavio; Krsticevic, Flavia; Angelone, Laura; Bulacio, Pilar

    2015-01-01

    For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine-scale with regard to size of barcodes (BCH) or have intrinsic poor error correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10^-2 per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate in the order of 10^-9 at the expense of a rate of read losses just in the order of 10^-6. PMID:26492348
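
    The two failure modes named in this abstract, read loss and sample misidentification, can be illustrated with a much simpler decoder than LDPC: assign each read to the nearest barcode by Hamming distance and discard ambiguous or distant reads. The barcodes, sample names and threshold below are made up for the sketch.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read_barcode, barcodes, max_mismatches=1):
    """Return the uniquely closest sample name, or None to signal a read loss."""
    ranked = sorted((hamming(read_barcode, bc), name) for name, bc in barcodes.items())
    best_dist, best_name = ranked[0]
    if best_dist > max_mismatches:
        return None  # too far from every barcode: discard the read
    if len(ranked) > 1 and ranked[1][0] == best_dist:
        return None  # tie between two samples: discard rather than guess
    return best_name


if __name__ == "__main__":
    # Hypothetical 6-nt barcodes for three samples.
    barcodes = {"s1": "AAAAAA", "s2": "CCCGGG", "s3": "TTTACG"}
    print(assign_sample("AAAATA", barcodes))  # one mismatch from s1
    print(assign_sample("ACAGTA", barcodes))  # three mismatches from s1
```

    Raising `max_mismatches` lowers the read-loss rate but increases the chance of assigning a read to the wrong sample, which is the trade-off the quoted error rates describe.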

  20. DNA Barcoding through Quaternary LDPC Codes.

    PubMed

    Tapia, Elizabeth; Spetale, Flavio; Krsticevic, Flavia; Angelone, Laura; Bulacio, Pilar

    2015-01-01

    For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine-scale with regard to size of barcodes (BCH) or have intrinsic poor error correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10(-2) per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate in the order of 10(-9) at the expense of a rate of read losses just in the order of 10(-6).

  1. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  2. DNA barcode goes two-dimensions: DNA QR code web server.

    PubMed

    Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

    2012-01-01

    The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.

  3. Advances in SCA and RF-DNA Fingerprinting Through Enhanced Linear Regression Attacks and Application of Random Forest Classifiers

    DTIC Science & Technology

    2014-09-18

    Dissertation, Hiren... Report number AFIT-ENG-DS-14-S-03. Approved for public release; distribution unlimited.

  4. Protection of the genome and central protein-coding sequences by non-coding DNA against DNA damage from radiation.

    PubMed

    Qiu, Guo-Hua

    2015-01-01

    Non-coding DNA comprises a very large proportion of the total genomic content in higher organisms, but its function remains largely unclear. Non-coding DNA sequences constitute the majority of peripheral heterochromatin, which has been hypothesized to be the genome's 'bodyguard' against DNA damage from chemicals and radiation for almost four decades. The bodyguard protective function of peripheral heterochromatin in genome defense has been strengthened by the results from numerous recent studies, which are summarized in this review. These data have suggested that cells and/or organisms with a higher level of heterochromatin and more non-coding DNA sequences, including longer telomeric DNA and rDNAs, exhibit a lower frequency of DNA damage, higher radioresistance and longer lifespan after IR exposure. In addition, the majority of heterochromatin is peripherally located in the three-dimensional structure of genome organization. Therefore, the peripheral heterochromatin with non-coding DNA could play a protective role in genome defense against DNA damage from ionizing radiation by both absorbing the radicals from water radiolysis in the cytosol and reducing the energy of IR. However, the bodyguard protection by heterochromatin has been challenged by the observation that DNA damage is less frequently detected in peripheral heterochromatin than in euchromatin, which is inconsistent with the expectation and simulation results. Previous studies have also shown that the DNA damage in peripheral heterochromatin is rarely repaired and moves more quickly, broadly and outwardly to approach the nuclear pore complex (NPC). Additionally, it has been shown that extrachromosomal circular DNAs (eccDNAs) are formed in the nucleus, highly detectable in the cytoplasm (particularly under stress conditions) and shuttle between the nucleus and the cytoplasm. Based on these studies, this review speculates that the sites of DNA damage in peripheral heterochromatin could occur more

  5. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
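
As a rough illustration of the kind of correlation statistic the abstract refers to, the sketch below builds a purine/pyrimidine "DNA walk" and estimates how the root-mean-square fluctuation F(l) of its displacement scales with window length l. The ±1 mapping, the function names and the window lengths are assumptions chosen for illustration; this is not the authors' algorithm, only a minimal example of a scaling analysis of this type.

```python
import math

def dna_walk(seq):
    """Cumulative displacement: +1 for a pyrimidine (C/T), -1 for a purine (A/G)."""
    walk = [0]
    for base in seq.upper():
        step = 1 if base in "CT" else -1  # assumed mapping, for illustration
        walk.append(walk[-1] + step)
    return walk

def fluctuation(walk, l):
    """RMS fluctuation of the displacement over all windows of length l."""
    if l >= len(walk):
        raise ValueError("window length must be shorter than the walk")
    deltas = [walk[i + l] - walk[i] for i in range(len(walk) - l)]
    mean = sum(deltas) / len(deltas)
    var = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(var)

def scaling_exponent(seq, lengths=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(l) against log l."""
    walk = dna_walk(seq)
    xs, ys = [], []
    for l in lengths:
        f = fluctuation(walk, l)
        if f <= 0:
            raise ValueError("zero fluctuation; the slope is undefined")
        xs.append(math.log(l))
        ys.append(math.log(f))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

For an uncorrelated sequence the slope is expected to be close to 0.5; a slope that stays clearly above 0.5 over a wide range of l would be the signature of long-range correlation.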

  6. Nonextensive statistical approach to non-coding human DNA

    NASA Astrophysics Data System (ADS)

    Oikonomou, Th.; Provata, A.; Tirnakli, U.

    2008-04-01

    We use q-exponential distributions, which maximize the nonextensive entropy $S_q$ (defined as $S_q \equiv (1-\sum_i p_i^q)/(q-1)$), to study the size distributions of non-coding DNA (including introns and intergenic regions) in all human chromosomes. We show that the value of the exponent q describing the non-coding size distributions is similar for all chromosomes and varies between 2 ≤ q ≤ 2.3 with the exception of chromosomes X and Y.
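
The entropy defined above, and a decaying q-exponential of the kind used to describe heavy-tailed size distributions, can be sketched in a few lines. The function names and the scale parameter `s0` are assumptions for illustration; the abstract does not give the fitted functional form in detail.

```python
import math

def tsallis_entropy(p, q):
    """S_q = (1 - sum_i p_i**q) / (q - 1); reduces to Shannon entropy as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

def q_exp_decay(s, q, s0=1.0):
    """Decaying q-exponential [1 + (q - 1) s / s0]**(-1 / (q - 1)) for q > 1, s >= 0.

    For large s it falls off as a power law with exponent 1 / (q - 1).
    """
    if q <= 1.0:
        raise ValueError("this sketch only covers q > 1")
    return (1.0 + (q - 1.0) * s / s0) ** (-1.0 / (q - 1.0))
```

With q = 2 the decay is simply 1 / (1 + s / s0), so values of q between 2 and 2.3 correspond to tails that fall off roughly as s to the power of -1 to -0.77.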

  7. Within- and Cross-Participant Classifiers Reveal Different Neural Coding of Information

    PubMed Central

    Clithero, John A.; Smith, David V.; Carter, R. McKell; Huettel, Scott A.

    2010-01-01

    Analyzing distributed patterns of brain activation using multivariate pattern analysis (MVPA) has become a popular approach for using functional magnetic resonance imaging (fMRI) data to predict mental states. While the majority of studies currently build separate classifiers for each participant in the sample, in principle a single classifier can be derived from and tested on data from all participants. These two approaches, within- and cross-participant classification, rely on potentially different sources of variability and thus may provide distinct information about brain function. Here, we used both approaches to identify brain regions that contain information about passively-received monetary rewards (i.e., images of currency that influenced participant payment) and social rewards (i.e., images of human faces). Our within-participant analyses implicated regions in the ventral visual processing stream – including fusiform gyrus and primary visual cortex – and ventromedial prefrontal cortex (VMPFC). Two key results indicate these regions may contain statistically discriminable patterns that contain different informational representations. First, cross-participant analyses implicated additional brain regions, including striatum and anterior insula. The cross-participant analyses also revealed systematic changes in predictive power across brain regions, with the pattern of change consistent with the functional properties of regions. Second, individual differences in classifier performance in VMPFC were related to individual differences in preferences between our two reward modalities. We interpret these results as reflecting a distinction between patterns reflecting participant-specific functional organization and those indicating aspects of brain organization that generalize across individuals. PMID:20347995

  8. Coding-complete sequencing classifies parrot bornavirus 5 into a novel virus species.

    PubMed

    Marton, Szilvia; Bányai, Krisztián; Gál, János; Ihász, Katalin; Kugler, Renáta; Lengyel, György; Jakab, Ferenc; Bakonyi, Tamás; Farkas, Szilvia L

    2015-11-01

    In this study, we determined the sequence of the coding region of an avian bornavirus detected in a blue-and-yellow macaw (Ara ararauna) with pathological/histopathological changes characteristic of proventricular dilatation disease. The genomic organization of the macaw bornavirus is similar to that of other bornaviruses, and its nucleotide sequence is nearly identical to the available partial parrot bornavirus 5 (PaBV-5) sequences. Phylogenetic analysis showed that these strains formed a monophyletic group distinct from other mammalian and avian bornaviruses and in calculations performed with matrix protein coding sequences, the PaBV-5 and PaBV-6 genotypes formed a common cluster, suggesting that according to the recently accepted classification system for bornaviruses, these two genotypes may belong to a new species, provisionally named Psittaciform 2 bornavirus.

  9. Diversity and recombination of dispersed ribosomal DNA and protein coding genes in microsporidia.

    PubMed

    Ironside, Joseph Edward

    2013-01-01

    Microsporidian strains are usually classified on the basis of their ribosomal DNA (rDNA) sequences. Although rDNA occurs as multiple copies, in most non-microsporidian species copies within a genome occur as tandem arrays and are homogenised by concerted evolution. In contrast, microsporidian rDNA units are dispersed throughout the genome in some species, and on this basis are predicted to undergo reduced concerted evolution. Furthermore many microsporidian species appear to be asexual and should therefore exhibit reduced genetic diversity due to a lack of recombination. Here, DNA sequences are compared between microsporidia with different life cycles in order to determine the effects of concerted evolution and sexual reproduction upon the diversity of rDNA and protein coding genes. Comparisons of cloned rDNA sequences between microsporidia of the genus Nosema with different life cycles provide evidence of intragenomic variability coupled with strong purifying selection. This suggests a birth and death process of evolution. However, some concerted evolution is suggested by clustering of rDNA sequences within species. Variability of protein-coding sequences indicates that considerable intergenomic variation also occurs between microsporidian cells within a single host. Patterns of variation in microsporidian DNA sequences indicate that additional diversity is generated by intragenomic and/or intergenomic recombination between sequence variants. The discovery of intragenomic variability coupled with strong purifying selection in microsporidian rRNA sequences supports the hypothesis that concerted evolution is reduced when copies of a gene are dispersed rather than repeated tandemly. The presence of intragenomic variability also renders the use of rDNA sequences for barcoding microsporidia questionable. Evidence of recombination in the single-copy genes of putatively asexual microsporidia suggests that these species may undergo cryptic sexual reproduction, a

  10. Diversity and Recombination of Dispersed Ribosomal DNA and Protein Coding Genes in Microsporidia

    PubMed Central

    Ironside, Joseph Edward

    2013-01-01

    Microsporidian strains are usually classified on the basis of their ribosomal DNA (rDNA) sequences. Although rDNA occurs as multiple copies, in most non-microsporidian species copies within a genome occur as tandem arrays and are homogenised by concerted evolution. In contrast, microsporidian rDNA units are dispersed throughout the genome in some species, and on this basis are predicted to undergo reduced concerted evolution. Furthermore many microsporidian species appear to be asexual and should therefore exhibit reduced genetic diversity due to a lack of recombination. Here, DNA sequences are compared between microsporidia with different life cycles in order to determine the effects of concerted evolution and sexual reproduction upon the diversity of rDNA and protein coding genes. Comparisons of cloned rDNA sequences between microsporidia of the genus Nosema with different life cycles provide evidence of intragenomic variability coupled with strong purifying selection. This suggests a birth and death process of evolution. However, some concerted evolution is suggested by clustering of rDNA sequences within species. Variability of protein-coding sequences indicates that considerable intergenomic variation also occurs between microsporidian cells within a single host. Patterns of variation in microsporidian DNA sequences indicate that additional diversity is generated by intragenomic and/or intergenomic recombination between sequence variants. The discovery of intragenomic variability coupled with strong purifying selection in microsporidian rRNA sequences supports the hypothesis that concerted evolution is reduced when copies of a gene are dispersed rather than repeated tandemly. The presence of intragenomic variability also renders the use of rDNA sequences for barcoding microsporidia questionable. Evidence of recombination in the single-copy genes of putatively asexual microsporidia suggests that these species may undergo cryptic sexual reproduction, a

  11. Free Energy Gap and Statistical Thermodynamic Fidelity of DNA Codes

    DTIC Science & Technology

    2007-10-01

    reverse-complement unless otherwise stated. For strand x, let Nx denote its complement. A (perfect) Watson-Crick duplex is the joining of complement...is possible for complementary sequences to form a non-perfectly aligned duplex, we will call any x W Nx duplex a Watson-Crick (WC) duplex. Two...

  12. Structural Code for DNA Recognition Revealed in Crystal Structures of Papillomavirus E2-DNA Targets

    NASA Astrophysics Data System (ADS)

    Rozenberg, Haim; Rabinovich, Dov; Frolow, Felix; Hegde, Rashmi S.; Shakked, Zippora

    1998-12-01

    Transcriptional regulation in papillomaviruses depends on sequence-specific binding of the regulatory protein E2 to several sites in the viral genome. Crystal structures of bovine papillomavirus E2 DNA targets reveal a conformational variant of B-DNA characterized by a roll-induced writhe and helical repeat of 10.5 bp per turn. A comparison between the free and the protein-bound DNA demonstrates that the intrinsic structure of the DNA regions contacted directly by the protein and the deformability of the DNA region that is not contacted by the protein are critical for sequence-specific protein/DNA recognition and hence for gene-regulatory signals in the viral system. We show that the selection of dinucleotide or longer segments with appropriate conformational characteristics, when positioned at correct intervals along the DNA helix, can constitute a structural code for DNA recognition by regulatory proteins. This structural code facilitates the formation of a complementary protein-DNA interface that can be further specified by hydrogen bonds and nonpolar interactions between the protein amino acids and the DNA bases.

  13. Superimposed Code Theoretic Analysis of Deoxyribonucleic Acid (DNA) Codes and DNA Computing

    DTIC Science & Technology

    2010-01-01

    hybridization that occurs between a DNA strand and its Watson-Crick complement can be used to perform mathematical computation. This research addresses how the...are 5′→3′ and strands with strikethrough are 3′→5′. A dsDNA duplex formed between a strand and its reverse complement is called a Watson-Crick (WC... [Figure: Watson-Crick (WC) duplexes; 5′-TACGCGACTTTC-3′ paired with 5′-GAAAGTCGCGTA-3′, and ATCAAACGATGC paired with GCATCGTTTGAT]
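
The reverse-complement relationship that defines a WC duplex is easy to check in code. The sketch below is a minimal illustration; the function names are hypothetical, and the example strands are the ones that appear in the record above.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand):
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]

def is_wc_duplex(x, y):
    """True if y (written 5'->3') is the exact reverse complement of x."""
    return y.upper() == reverse_complement(x)

print(reverse_complement("TACGCGACTTTC"))
print(is_wc_duplex("ATCAAACGATGC", "GCATCGTTTGAT"))
```

Both strand pairs quoted in the record satisfy this test, which is what makes them perfect WC duplexes.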

  14. Extra-coding RNAs regulate neuronal DNA methylation dynamics

    PubMed Central

    Savell, Katherine E.; Gallus, Nancy V. N.; Simon, Rhiana C.; Brown, Jordan A.; Revanna, Jasmin S.; Osborn, Mary Katherine; Song, Esther Y.; O'Malley, John J.; Stackhouse, Christian T.; Norvil, Allison; Gowher, Humaira; Sweatt, J. David; Day, Jeremy J.

    2016-01-01

    Epigenetic mechanisms such as DNA methylation are essential regulators of the function and information storage capacity of neurons. DNA methylation is highly dynamic in the developing and adult brain, and is actively regulated by neuronal activity and behavioural experiences. However, it is presently unclear how methylation status at individual genes is targeted for modification. Here, we report that extra-coding RNAs (ecRNAs) interact with DNA methyltransferases and regulate neuronal DNA methylation. Expression of ecRNA species is associated with gene promoter hypomethylation, is altered by neuronal activity, and is overrepresented at genes involved in neuronal function. Knockdown of the Fos ecRNA locus results in gene hypermethylation and mRNA silencing, and hippocampal expression of Fos ecRNA is required for long-term fear memory formation in rats. These results suggest that ecRNAs are fundamental regulators of DNA methylation patterns in neuronal systems, and reveal a promising avenue for therapeutic targeting in neuropsychiatric disease states. PMID:27384705

  15. Imperfect DNA mirror repeats in E. coli TnsA and other protein-coding DNA.

    PubMed

    Lang, Dorothy M

    2005-09-01

    DNA imperfect mirror repeats (DNA-IMRs) are ubiquitous in protein-coding DNA. However, they overlap and often have different centers of symmetry, making it difficult to evaluate their relationship to each other and to specific DNA and protein motifs and structures. This paper describes a systematic method of determining a hierarchy for DNA-IMRs and evaluates their relationship to protein structural elements (PSEs)--helices, turns and beta-sheets. DNA-IMRs are identified by two different methods--DNA-IMRs terminated by reverse dinucleotides (rd-IMRs) and DNA-IMRs terminated by a single (mono) matching nucleotide (m-IMRs). Both rd-IMRs and m-IMRs are evaluated in 17 proteins, and illustrated in detail for TnsA. For each of the proteins, Fisher's exact test (FET) is used to measure the coincidence between the terminal dinucleotides of rd-IMRs and the terminal amino acids of individual PSEs. A significant correlation over a span of about 3 nt was found for each protein. The correlation is robust and for most genes, all rd-IMRs16 nt contain approximately 88% of the potential functional motifs. The protein translation of the longest rd- and m-IMRs span sequences important to the protein's structure and function. In all 17 proteins studied, the population of rd-IMRs is substantially less than the expected number and the population of m-IMRs greater than the expected number, indicating strong selective pressures. The association of rd-IMRs with PSEs restricts their spatial distribution, and therefore, their number. The greater than predicted number of m-IMRs indicates that DNA symmetry exists throughout the entire protein-coding region and may stabilize the sequence.

  16. Multifractal detrended cross-correlation analysis of coding and non-coding DNA sequences through chaos-game representation

    NASA Astrophysics Data System (ADS)

    Pal, Mayukha; Satish, B.; Srinivas, K.; Rao, P. Madhusudana; Manimaran, P.

    2015-10-01

    We propose a new approach combining the chaos game representation and the two dimensional multifractal detrended cross correlation analysis methods to examine multifractal behavior in power law cross correlation between any pair of nucleotide sequences of unequal lengths. In this work, we analyzed the characteristic behavior of coding and non-coding DNA sequences of eight prokaryotes. The results show the presence of strong multifractal nature between coding and non-coding sequences of all data sets. We found that this integrative approach helps us to consider complete DNA sequences for characterization, and further it may be useful for classification, clustering, identification of class affiliation of nucleotide sequences etc. with high precision.
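
For readers unfamiliar with the chaos game representation (CGR), the sketch below shows the basic construction: each base moves the current point halfway toward a corner of the unit square, and the visited points can then be binned into a two-dimensional grid. The corner assignment, the starting point and the function names are assumptions for illustration; the abstract does not specify the authors' conventions, and the multifractal cross-correlation step is not reproduced here.

```python
# Assumed corner assignment (one common convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def chaos_game(seq):
    """Return the CGR points of a sequence, starting from the square's centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:  # skip ambiguous symbols such as N
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0  # move halfway to the corner
        points.append((x, y))
    return points

def cgr_grid(seq, k):
    """Count CGR points in a 2**k by 2**k grid (rows index y, columns index x)."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for x, y in chaos_game(seq):
        col = min(int(x * n), n - 1)
        row = min(int(y * n), n - 1)
        grid[row][col] += 1
    return grid
```

Because the grid has a fixed size regardless of sequence length, two sequences of unequal length yield arrays of the same shape, which is presumably what makes a pairwise two-dimensional comparison possible.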

  17. Coding DNA repeated throughout intergenic regions of the Arabidopsis thaliana genome: Evolutionary footprints of RNA silencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pyknons are non-random sequence patterns significantly repeated throughout non-coding genomic DNA that also appear at least once among genes. They are interesting because they portend an unforeseen connection between coding and non-coding DNA. Pyknons have only been discovered in the human genome,...

  18. Evolutionary analysis of DNA-protein-coding regions based on a genetic code cube metric.

    PubMed

    Sanchez, Robersy

    2014-01-01

    The right estimation of the evolutionary distance between DNA or protein sequences is the cornerstone of the current phylogenetic analysis based on distance methods. Herein, it is demonstrated that the Manhattan distance (dw), weighted by the evolutionary importance of the nucleotide bases in the codon, is a naturally derived metric in the standard genetic code cube inserted into the three-dimensional Euclidean space. Based on the application of distance dw, a novel evolutionary model is proposed. This model includes insertion/deletion mutations that are very important for cancer studies, but usually discarded in classical evolutionary models. In this study, the new evolutionary model was applied to the phylogenetic analysis of the DNA protein-coding regions of 13 mammal mitochondrial genomes and of four cancer genetic-susceptibility genes (ATM, BRCA1, BRCA2 and p53) from nine mammals. The opossum (a marsupial) was used as an out-group species for both sets of sequences. The new evolutionary model yielded the correct topology, while the current models failed to separate the evolutionarily distant species of mouse and opossum.

  19. Non-extensive trends in the size distribution of coding and non-coding DNA sequences in the human genome

    NASA Astrophysics Data System (ADS)

    Oikonomou, Th.; Provata, A.

    2006-03-01

    We study the primary DNA structure of four of the most completely sequenced human chromosomes (including chromosome 19 which is the most dense in coding), using non-extensive statistics. We show that the exponents governing the spatial decay of the coding size distributions vary between 5.2 ≤ r ≤ 5.7 for the short scales and 1.45 ≤ q ≤ 1.50 for the large scales. On the contrary, the exponents governing the spatial decay of the non-coding size distributions in these four chromosomes, take the values 2.4 ≤ r ≤ 3.2 for the short scales and 1.50 ≤ q ≤ 1.72 for the large scales. These results, in particular the values of the tail exponent q, indicate the existence of correlations in the coding and non-coding size distributions with tendency for higher correlations in the non-coding DNA.

  20. In search of coding and non-coding regions of DNA sequences based on balanced estimation of diffusion entropy.

    PubMed

    Zhang, Jin; Zhang, Wenqing; Yang, Huijie

    2016-01-01

    Identification of coding regions in DNA sequences remains challenging. Various methods have been proposed, but these are limited by species-dependence and the need for adequate training sets. The elements in DNA coding regions are known to be distributed in a quasi-random way, while those in non-coding regions have typical similar structures. For short sequences, these statistical characteristics cannot be extracted correctly and cannot even be detected. This paper introduces a new way to solve the problem: balanced estimation of diffusion entropy (BEDE).

  1. An Integrated Prognostic Classifier for Stage I Lung Adenocarcinoma based on mRNA, microRNA and DNA Methylation Biomarkers

    PubMed Central

    Robles, Ana I.; Arai, Eri; Mathé, Ewy A.; Okayama, Hirokazu; Schetter, Aaron J.; Brown, Derek; Petersen, David; Bowman, Elise D.; Noro, Rintaro; Welsh, Judith A.; Edelman, Daniel C.; Stevenson, Holly S.; Wang, Yonghong; Tsuchiya, Naoto; Kohno, Takashi; Skaug, Vidar; Mollerup, Steen; Haugen, Aage; Meltzer, Paul S.; Yokota, Jun; Kanai, Yae

    2015-01-01

    Introduction: Up to 30% of Stage I lung cancer patients suffer recurrence within 5 years of curative surgery. We sought to improve existing protein-coding gene and microRNA expression prognostic classifiers by incorporating epigenetic biomarkers. Methods: Genome-wide screening of DNA methylation and pyrosequencing analysis of HOXA9 promoter methylation were performed in two independently collected cohorts of Stage I lung adenocarcinoma. The prognostic value of HOXA9 promoter methylation alone and in combination with mRNA and miRNA biomarkers was assessed by Cox regression and Kaplan-Meier survival analysis in both cohorts. Results: Promoters of genes marked by Polycomb in Embryonic Stem Cells were methylated de novo in tumors and identified patients with poor prognosis. The HOXA9 locus was methylated de novo in Stage I tumors (P < 0.0005). High HOXA9 promoter methylation was associated with worse cancer-specific survival (Hazard Ratio [HR], 2.6; P = 0.02) and recurrence-free survival (HR, 3.0; P = 0.01), and identified high-risk patients in stratified analysis of Stage IA and IB. The four protein-coding gene classifier (XPO1, BRCA1, HIF1α, DLC1), miR-21 expression and HOXA9 promoter methylation were each independently associated with outcome (HR, 2.8; P = 0.002; HR, 2.3; P = 0.01; and HR, 2.4; P = 0.005, respectively), and, when combined, identified high-risk, therapy-naïve, Stage I patients (HR, 10.2; P = 3 × 10⁻⁵). All associations were confirmed in two independently collected cohorts. Conclusion: A prognostic classifier comprising three types of genomic and epigenomic data may help guide the postoperative management of Stage I lung cancer patients at high risk of recurrence. PMID:26134223

  2. What Information is Stored in DNA: Does it Contain Digital Error Correcting Codes?

    NASA Astrophysics Data System (ADS)

    Liebovitch, Larry

    1998-03-01

    The longest term correlations in living systems are the information stored in DNA which reflects the evolutionary history of an organism. The 4 bases (A,T,G,C) encode sequences of amino acids as well as locations of binding sites for proteins that regulate DNA. The fidelity of this important information is maintained by ANALOG error check mechanisms. When a single strand of DNA is replicated the complementary base is inserted in the new strand. Sometimes the wrong base is inserted that sticks out disrupting the phosphate backbone. The new base is not yet methylated, so repair enzymes, that slide along the DNA, can tear out the wrong base and replace it with the right one. The bases in DNA form a sequence of 4 different symbols and so the information is encoded in a DIGITAL form. All the digital codes in our society (ISBN book numbers, UPC product codes, bank account numbers, airline ticket numbers) use error checking code, where some digits are functions of other digits to maintain the fidelity of transmitted information. Does DNA also utilize a DIGITAL error checking code to maintain the fidelity of its information and increase the accuracy of replication? That is, are some bases in DNA functions of other bases upstream or downstream? This raises the interesting mathematical problem: How does one determine whether some symbols in a sequence of symbols are a function of other symbols. It also bears on the issue of determining algorithmic complexity: What is the function that generates the shortest algorithm for reproducing the symbol sequence. The error checking codes most used in our technology are linear block codes. We developed an efficient method to test for the presence of such codes in DNA. We coded the 4 bases as (0,1,2,3) and used Gaussian elimination, modified for modulus 4, to test if some bases are linear combinations of other bases. We used this method to analyze the base sequence in the genes from the lac operon and cytochrome C. We did not find
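
The question posed here, whether one symbol in a block is a linear function (mod 4) of the others, can be illustrated with a small brute-force search. This sketch deliberately swaps the abstract's modified Gaussian elimination for exhaustive enumeration of coefficients, which is only practical for short blocks. The base-to-integer mapping, the block layout (check symbol last) and the function names are assumptions for illustration.

```python
from itertools import product

# Assumed coding of the four bases as integers mod 4.
BASE_TO_INT = {"A": 0, "T": 1, "G": 2, "C": 3}

def encode(block):
    """Convert a block of bases into a list of integers mod 4."""
    return [BASE_TO_INT[b] for b in block.upper()]

def find_check_rule(blocks):
    """Search for coefficients c such that, in every block, the last symbol
    equals sum(c_j * symbol_j) mod 4 over the preceding symbols.

    Returns the first matching coefficient tuple, or None if no rule fits.
    """
    words = [encode(b) for b in blocks]
    n = len(words[0])
    if any(len(w) != n for w in words):
        raise ValueError("all blocks must have the same length")
    for coeffs in product(range(4), repeat=n - 1):
        # zip stops after the n - 1 data symbols, leaving w[-1] as the check.
        if all(sum(c * s for c, s in zip(coeffs, w)) % 4 == w[-1] for w in words):
            return coeffs
    return None
```

With only a few blocks a rule can fit by chance, so any real test of this kind would need many blocks and a comparison against shuffled sequences.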

  3. Sequences encoding identical peptides for the analysis and manipulation of coding DNA

    PubMed Central

    Sánchez, Joaquín

    2013-01-01

    The use of sequences encoding identical peptides (SEIP) for the in silico analysis of coding DNA from different species has not been reported; the study of such sequences could directly reveal properties of coding DNA that are independent of peptide sequences. For practical purposes SEIP might also be manipulated for, e.g., heterologous protein expression. We extracted 1,551 SEIP from human and E. coli and 2,631 SEIP from human and D. melanogaster. We then analyzed codon usage and intercodon dinucleotide tendencies and found differences in both, with more conspicuous disparities between human and E. coli than between human and D. melanogaster. We also briefly manipulated SEIP to find out if they could be used to create new coding sequences. We hence attempted replacement of human by E. coli codons via dicodon exchange but found that full replacement was not possible, which indicated robust species-specific dicodon tendencies. To test another form of codon replacement we isolated SEIP from human and the jellyfish green fluorescent protein (GFP) and we then re-constructed the GFP coding DNA with human tetra-peptide-coding sequences. Results provide proof-of-principle that SEIP may be used to reveal differences in the properties of coding DNA and to reconstruct in pieces a protein coding DNA with sequences from a different organism; the latter might be exploited in heterologous protein expression. PMID:23861567
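
To make the idea of SEIP concrete, the sketch below translates two coding sequences with the standard genetic code, tallies codon usage, and lists peptides of a given length that occur in both sequences together with the DNA that encodes them in each. The function names and the in-frame windowing are assumptions for illustration; the abstract does not describe the authors' extraction procedure.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons(cds):
    """Split a coding sequence into in-frame codons, dropping a trailing partial codon."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]

def translate(cds):
    """Translate a coding sequence; raises KeyError on non-ACGT symbols."""
    return "".join(CODON_TABLE[c] for c in codons(cds))

def codon_usage(cds):
    """Count how often each codon occurs."""
    return Counter(codons(cds))

def shared_peptides(cds_a, cds_b, k):
    """Map each k-residue peptide found in both sequences to the pair of
    DNA segments (first occurrence in each sequence) that encode it."""
    def index(cds):
        cs = codons(cds)
        found = {}
        for i in range(len(cs) - k + 1):
            window = cs[i:i + k]
            peptide = "".join(CODON_TABLE[c] for c in window)
            found.setdefault(peptide, "".join(window))
        return found
    a, b = index(cds_a), index(cds_b)
    return {p: (a[p], b[p]) for p in a.keys() & b.keys()}
```

For example, `shared_peptides("ATGGCCAAA", "ATGGCTAAG", 3)` pairs two different DNA segments that both encode the tripeptide MAK, which is the kind of pair on which codon and dicodon preferences can be compared.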

  4. Sequences encoding identical peptides for the analysis and manipulation of coding DNA.

    PubMed

    Sánchez, Joaquín

    2013-01-01

    The use of sequences encoding identical peptides (SEIP) for the in silico analysis of coding DNA from different species has not been reported; the study of such sequences could directly reveal properties of coding DNA that are independent of peptide sequences. For practical purposes SEIP might also be manipulated for, e.g., heterologous protein expression. We extracted 1,551 SEIP from human and E. coli and 2,631 SEIP from human and D. melanogaster. We then analyzed codon usage and intercodon dinucleotide tendencies and found differences in both, with more conspicuous disparities between human and E. coli than between human and D. melanogaster. We also briefly manipulated SEIP to find out if they could be used to create new coding sequences. We hence attempted replacement of human by E. coli codons via dicodon exchange but found that full replacement was not possible, which indicated robust species-specific dicodon tendencies. To test another form of codon replacement we isolated SEIP from human and the jellyfish green fluorescent protein (GFP) and we then re-constructed the GFP coding DNA with human tetra-peptide-coding sequences. Results provide proof-of-principle that SEIP may be used to reveal differences in the properties of coding DNA and to reconstruct in pieces a protein coding DNA with sequences from a different organism; the latter might be exploited in heterologous protein expression.

  5. Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA.

    PubMed

    Schmitz, Jonathan F; Bornberg-Bauer, Erich

    2017-01-01

    Over the last few years, there has been an increasing amount of evidence for the de novo emergence of protein-coding genes, i.e. out of non-coding DNA. Here, we review the current literature and summarize the state of the field. We focus specifically on open questions and challenges in the study of de novo protein-coding genes such as the identification and verification of de novo-emerged genes. The greatest obstacle to date is the lack of high-quality genomic data with very short divergence times which could help precisely pin down the location of origin of a de novo gene. We conclude that, while there is plenty of evidence from a genetics perspective, there is a lack of functional studies of bona fide de novo genes and almost no knowledge about protein structures and how they come about during the emergence of de novo protein-coding genes. We suggest that future studies should concentrate on the functional and structural characterization of de novo protein-coding genes as well as the detailed study of the emergence of functional de novo protein-coding genes.

  6. Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA

    PubMed Central

    Schmitz, Jonathan F; Bornberg-Bauer, Erich

    2017-01-01

    Over the last few years, there has been an increasing amount of evidence for the de novo emergence of protein-coding genes, i.e. out of non-coding DNA. Here, we review the current literature and summarize the state of the field. We focus specifically on open questions and challenges in the study of de novo protein-coding genes such as the identification and verification of de novo-emerged genes. The greatest obstacle to date is the lack of high-quality genomic data with very short divergence times which could help precisely pin down the location of origin of a de novo gene. We conclude that, while there is plenty of evidence from a genetics perspective, there is a lack of functional studies of bona fide de novo genes and almost no knowledge about protein structures and how they come about during the emergence of de novo protein-coding genes. We suggest that future studies should concentrate on the functional and structural characterization of de novo protein-coding genes as well as the detailed study of the emergence of functional de novo protein-coding genes. PMID:28163910

  7. TOWARDS A PROBABILISTIC RECOGNITION CODE FOR PROTEIN-DNA INTERACTIONS

    SciTech Connect

    P. BENOS; ET AL

    2000-09-01

    We are investigating the rules that govern protein-DNA interactions, using a statistical mechanics based formalism that is related to the Boltzmann Machine of the neural net literature. Our approach is data-driven: probabilistic algorithms are used to model protein-DNA interactions, given SELEX and phage data as input. Under the "one-to-one" model for interactions (i.e. one amino acid contacts one base), we can successfully identify the wild-type binding sites of EGR and MIG protein families. The predictions of our method are as good as or better than those of existing methods in the literature, and our methodology offers the potential to capitalize in quantitative detail on more data as it becomes available.

  8. [Cloning and insertion mutagenesis of DNA fragment coding for the luminescent system of Photobacterium leiognathi].

    PubMed

    Ptitsyn, L R; Gurevich, V B; Barsanova, T G; Shenderov, A N; Khaĭkinson, M Ia

    1988-10-01

    Fragments of DNA, obtained from the luminescent bacterium Photobacterium leiognathi and inserted into the plasmid pBR322, were found to code for the luminescence expressed in E. coli cells. The genetic functions necessary for light production in E. coli are localized on a DNA fragment of about 7 kbp. The insertion mutagenesis was used to define the luminescence functions encoded by the hybrid plasmid.

  9. DNA methylation patterns of protein-coding genes and long non-coding RNAs in males with schizophrenia.

    PubMed

    Liao, Qi; Wang, Yunliang; Cheng, Jia; Dai, Dongjun; Zhou, Xingyu; Zhang, Yuzheng; Li, Jinfeng; Yin, Honglei; Gao, Shugui; Duan, Shiwei

    2015-11-01

    Schizophrenia (SCZ) is one of the most complex mental illnesses affecting ~1% of the population worldwide. SCZ pathogenesis is considered to be a result of genetic as well as epigenetic alterations. Previous studies have aimed to identify the causative genes of SCZ. However, DNA methylation of long non-coding RNAs (lncRNAs) involved in SCZ has not been fully elucidated. In the present study, a comprehensive genome-wide analysis of DNA methylation was conducted using samples from two male patients with paranoid and undifferentiated SCZ, respectively. Methyl-CpG binding domain protein-enriched genome sequencing was used. In the two patients with paranoid and undifferentiated SCZ, 1,397 and 1,437 peaks were identified, respectively. Bioinformatic analysis demonstrated that peaks were enriched in protein-coding genes, which exhibited nervous system and brain functions. A number of these peaks in gene promoter regions may affect gene expression and, therefore, influence SCZ-associated pathways. Furthermore, 7 and 20 lncRNAs, respectively, in the Refseq database were hypermethylated. According to the lncRNA dataset in the NONCODE database, ~30% of intergenic peaks overlapped with novel lncRNA loci. The results of the present study demonstrated that aberrant hypermethylation of lncRNA genes may be an important epigenetic factor associated with SCZ. However, further studies using larger sample sizes are required.

  10. Correcting sequencing errors in DNA coding regions using a dynamic programming approach

    SciTech Connect

    Xu, Y.; Mural, R.J.; Uberbacher, E.C.

    1994-12-01

    This paper presents an algorithm for detecting and "correcting" sequencing errors that occur in DNA coding regions. The types of sequencing error addressed include insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of "neutral" bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. The authors have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false messages on the "corrected" sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the "corrupted" sequences using the standard GRAIL II method. The method uses a dynamic programming algorithm, and runs in time and space linear in the size of the input sequence.
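
    The frame-tracking step described in this abstract can be illustrated with a small dynamic program. The sketch below is not the GRAIL implementation: the per-position frame scores, the `switch_penalty` parameter and the function names are hypothetical, and it only locates likely reading frame transitions rather than inserting neutral bases.

```python
def best_frame_path(scores, switch_penalty=2.0):
    """Viterbi-style DP over three reading frames.

    scores[i][f] is an assumed coding score for frame f (0, 1, 2) at
    position i; higher means frame f looks more like coding sequence.
    Returns the highest-scoring frame assignment, where every change of
    frame between adjacent positions costs switch_penalty.
    """
    if not scores:
        return []
    best = [list(scores[0])]   # best[i][f]: best total score ending in frame f
    back = [[0, 0, 0]]         # back[i][f]: previous frame on that best path
    for i in range(1, len(scores)):
        row, ptr = [], []
        for f in range(3):
            # Either stay in frame f or pay the penalty to switch into it.
            cands = [best[i - 1][g] - (switch_penalty if g != f else 0.0)
                     for g in range(3)]
            g_best = max(range(3), key=lambda g: cands[g])
            row.append(cands[g_best] + scores[i][f])
            ptr.append(g_best)
        best.append(row)
        back.append(ptr)
    # Trace back from the best final frame.
    f = max(range(3), key=lambda g: best[-1][g])
    path = [f]
    for i in range(len(scores) - 1, 0, -1):
        f = back[i][f]
        path.append(f)
    path.reverse()
    return path


def frame_transitions(path):
    """Positions where the preferred frame changes (candidate indel sites)."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]
```

    Each position is visited once with a constant number of frame comparisons, so this toy version also runs in time and space linear in the input length.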

  11. Correcting sequencing errors in DNA coding regions using a dynamic programming approach.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1995-04-01

    This paper presents an algorithm for detecting and 'correcting' sequencing errors that occur in DNA coding regions. The types of sequencing errors addressed are insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. This would permit improved sequencing efficiency and reduce genome sequencing costs. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of 'neutral' bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. We have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. Preliminary test results have shown the usefulness of this algorithm and also exhibited some of its weakness, providing possible directions for further improvement. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false message on the 'corrected' sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the 'corrupted' sequences using standard GRAIL II method (version 1.2).(ABSTRACT TRUNCATED AT 250 WORDS)

  12. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
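
    The two tests named in this abstract can be sketched on a plain DNA string. This is a minimal illustration, assuming overlapping n-grams and one common normalization of redundancy; the function names are hypothetical and the authors' exact definitions may differ.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count overlapping n-grams (n-tuples) in a DNA string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_entropy(seq, n):
    """Shannon entropy in bits of the observed n-gram distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def ngram_redundancy(seq, n, alphabet_size=4):
    """1 - H_n / (n * log2(alphabet_size)); 0 for a uniform random source."""
    h_max = n * math.log2(alphabet_size)
    return 1.0 - ngram_entropy(seq, n) / h_max


def zipf_ranking(seq, n):
    """(rank, count) pairs sorted by decreasing count, for a Zipf plot."""
    ranked = sorted(ngram_counts(seq, n).values(), reverse=True)
    return list(enumerate(ranked, start=1))
```

    Plotting the output of `zipf_ranking` on log-log axes would show whether counts fall off roughly as a power of rank, which is the behavior the abstract reports for noncoding regions.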

  13. A Conserved Structural Signature of the Homeobox Coding DNA in HOX genes

    PubMed Central

    Fongang, Bernard; Kong, Fanping; Negi, Surendra; Braun, Werner; Kudlicki, Andrzej

    2016-01-01

    The homeobox encodes a DNA-binding domain found in transcription factors regulating key developmental processes. The most notable examples of homeobox containing genes are the Hox genes, arranged on chromosomes in the same order as their expression domains along the body axis. The mechanisms responsible for the synchronous regulation of Hox genes and the molecular function of their colinearity remain unknown. Here we report the discovery of a conserved structural signature of the 180-base pair DNA fragment comprising the homeobox. We demonstrate that the homeobox DNA has a characteristic 3-base-pair periodicity in the hydroxyl radical cleavage pattern. This periodic pattern is significant in most of the 39 mammalian Hox genes and in other homeobox-containing transcription factors. The signature is present in segmented bilaterian animals as evolutionarily distant as humans and flies. It remains conserved despite the fact that it would be disrupted by synonymous mutations, which raises the possibility of evolutionary selective pressure acting on the structure of the coding DNA. The homeobox coding DNA may therefore have a secondary function, possibly as a regulatory element. The existence of such element may have important consequences for understanding how these genes are regulated. PMID:27739488

  14. A Conserved Structural Signature of the Homeobox Coding DNA in HOX genes.

    PubMed

    Fongang, Bernard; Kong, Fanping; Negi, Surendra; Braun, Werner; Kudlicki, Andrzej

    2016-10-14

    The homeobox encodes a DNA-binding domain found in transcription factors regulating key developmental processes. The most notable examples of homeobox containing genes are the Hox genes, arranged on chromosomes in the same order as their expression domains along the body axis. The mechanisms responsible for the synchronous regulation of Hox genes and the molecular function of their colinearity remain unknown. Here we report the discovery of a conserved structural signature of the 180-base pair DNA fragment comprising the homeobox. We demonstrate that the homeobox DNA has a characteristic 3-base-pair periodicity in the hydroxyl radical cleavage pattern. This periodic pattern is significant in most of the 39 mammalian Hox genes and in other homeobox-containing transcription factors. The signature is present in segmented bilaterian animals as evolutionarily distant as humans and flies. It remains conserved despite the fact that it would be disrupted by synonymous mutations, which raises the possibility of evolutionary selective pressure acting on the structure of the coding DNA. The homeobox coding DNA may therefore have a secondary function, possibly as a regulatory element. The existence of such element may have important consequences for understanding how these genes are regulated.

  15. Free Energy Gap and Statistical Thermodynamic Fidelity of DNA Codes (Postprint)

    DTIC Science & Technology

    2007-01-01

    reverse-complement unless otherwise stated. For strand x, let Nx denote its complement. A (perfect) Watson - Crick duplex is the joining of complement...is possible for complementary sequences to form a non-perfectly aligned duplex, we will call any x W Nx duplex a Watson - Crick (WC) duplex. Two...DATES COVERED (From - To) 4. TITLE AND SUBTITLE FREE ENERGY GAP AND STATISTICAL THERMODYNAMIC FIDELITY OF DNA CODES 5a. CONTRACT NUMBER FA8750-07

  16. Junk DNA and the long non-coding RNA twist in cancer genetics

    PubMed Central

    Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

    2015-01-01

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNAs in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNAs, often via interaction with proteins, function at specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphisms (SNPs) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839

  17. HyDEn: a hybrid steganocryptographic approach for data encryption using randomized error-correcting DNA codes.

    PubMed

    Tulpan, Dan; Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach.

  18. HyDEn: A Hybrid Steganocryptographic Approach for Data Encryption Using Randomized Error-Correcting DNA Codes

    PubMed Central

    Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. PMID:23984392

  19. A molecular code dictates sequence-specific DNA recognition by homeodomains.

    PubMed Central

    Damante, G; Pellizzari, L; Esposito, G; Fogolari, F; Viglino, P; Fabbro, D; Tell, G; Formisano, S; Di Lauro, R

    1996-01-01

    Most homeodomains bind to DNA sequences containing the motif 5'-TAAT-3'. The homeodomain of thyroid transcription factor 1 (TTF-1HD) binds to sequences containing a 5'-CAAG-3' core motif, delineating a new mechanism for differential DNA recognition by homeodomains. We investigated the molecular basis of the DNA binding specificity of TTF-1HD by both structural and functional approaches. As already suggested by the three-dimensional structure of TTF-1HD, the DNA binding specificities of the TTF-1, Antennapedia and Engrailed homeodomains, either wild-type or mutants, indicated that the amino acid residue in position 54 is involved in the recognition of the nucleotide at the 3' end of the core motif 5'-NAAN-3'. The nucleotide at the 5' position of this core sequence is recognized by the amino acids located in position 6, 7 and 8 of the TTF-1 and Antennapedia homeodomains. These data, together with previous suggestions on the role of amino acids in position 50, indicate that the DNA binding specificity of homeodomains can be determined by a combinatorial molecular code. We also show that some specific combinations of the key amino acid residues involved in DNA recognition do not follow a simple, additive rule. PMID:8890172

  20. A novel DNA sequence similarity calculation based on simplified pulse-coupled neural network and Huffman coding

    NASA Astrophysics Data System (ADS)

    Jin, Xin; Nie, Rencan; Zhou, Dongming; Yao, Shaowen; Chen, Yanyan; Yu, Jiefu; Wang, Quan

    2016-11-01

    A novel method for the calculation of DNA sequence similarity is proposed based on simplified pulse-coupled neural network (S-PCNN) and Huffman coding. In this study, we propose a coding method based on Huffman coding, where the triplet code was used as a code bit to transform DNA sequence into numerical sequence. The proposed method uses the firing characters of S-PCNN neurons in DNA sequence to extract features. Besides, the proposed method can deal with different lengths of DNA sequences. First, according to the characteristics of S-PCNN and the DNA primary sequence, the latter is encoded using Huffman coding method, and then using the former, the oscillation time sequence (OTS) of the encoded DNA sequence is extracted. Simultaneously, relevant features are obtained, and finally the similarities or dissimilarities of the DNA sequences are determined by Euclidean distance. In order to verify the accuracy of this method, different data sets were used for testing. The experimental results show that the proposed method is effective.
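
    The encoding step in this abstract, Huffman coding with the triplet as the coding unit, can be sketched as follows. This is only an illustration under assumptions: the abstract does not say how code words are turned into the numerical sequence fed to the S-PCNN, so the sketch stops at a bit string, and the function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def codons(seq):
    """Split a DNA string into non-overlapping triplets, dropping any tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from observed frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:                 # degenerate case: a single symbol
        return {next(iter(freq)): "0"}
    tie = count()                      # tie-breaker so dicts are never compared
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode(seq):
    """Encode a DNA string codon by codon; returns (bit string, code table)."""
    units = codons(seq)
    table = huffman_code(units)
    return "".join(table[u] for u in units), table
```

    Frequent triplets receive shorter code words, so sequences of different lengths and compositions yield different encodings that a downstream feature extractor could compare.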

  1. Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Matsa, M. E.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^-10. We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
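
    Detrended fluctuation analysis, the second method named in this abstract, can be sketched in a few lines. The version below is a simplified illustration, not the authors' code: it assumes a purine/pyrimidine walk (any character other than A or G counts as a pyrimidine step), non-overlapping boxes, and linear detrending, and the box sizes are arbitrary.

```python
import math


def dna_walk(seq):
    """Cumulative purine/pyrimidine walk: +1 for A or G, -1 otherwise."""
    y, total = [], 0
    for base in seq.upper():
        total += 1 if base in "AG" else -1
        y.append(total)
    return y


def dfa_fluctuation(y, box):
    """RMS deviation of the walk from a least-squares line fitted per box."""
    n_boxes = len(y) // box
    xs = range(box)
    x_mean = (box - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum = 0.0
    for b in range(n_boxes):
        seg = y[b * box:(b + 1) * box]
        y_mean = sum(seg) / box
        slope = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, seg)) / sxx
        sq_sum += sum((v - (y_mean + slope * (x - x_mean))) ** 2
                      for x, v in zip(xs, seg))
    return math.sqrt(sq_sum / (n_boxes * box))


def dfa_exponent(seq, boxes=(4, 8, 16, 32)):
    """Slope of log F(n) versus log n over the given box sizes."""
    y = dna_walk(seq)
    pts = [(math.log(n), math.log(dfa_fluctuation(y, n))) for n in boxes]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    return (sum((px - mx) * (py - my) for px, py in pts)
            / sum((px - mx) ** 2 for px, _ in pts))
```

    An exponent near 0.5 corresponds to an uncorrelated walk, while larger values indicate long-range correlations. Each box size must be at least 2 and no larger than the sequence, and the sequence must not be perfectly linear within every box, otherwise the logarithm of a zero fluctuation is undefined.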

  2. Characterization of the cDNA and gene coding for the biotin synthase of Arabidopsis thaliana.

    PubMed Central

    Weaver, L M; Yu, F; Wurtele, E S; Nikolau, B J

    1996-01-01

    Biotin, an essential cofactor, is synthesized de novo only by plants and some microbes. An Arabidopsis thaliana expressed sequence tag that shows sequence similarity to the carboxyl end of biotin synthase from Escherichia coli was used to isolate a near-full-length cDNA. This cDNA was shown to code for the Arabidopsis biotin synthase by its ability to complement a bioB mutant of E. coli. Site-specific mutagenesis indicates that residue threonine-173, which is highly conserved in biotin synthases, is important for catalytic competence of the enzyme. The primary sequence of the Arabidopsis biotin synthase is most similar to biotin synthases from E. coli, Serratia marcescens, and Saccharomyces cerevisiae (about 50% sequence identity) and more distantly related to the Bacillus sphaericus enzyme (33% sequence identity). The primary sequence of the amino terminus of the Arabidopsis biotin synthase may represent an organelle-targeting transit peptide. The single Arabidopsis gene coding for biotin synthase, BIO2, was isolated and sequenced. The biotin synthase coding sequence is interrupted by five introns. The gene sequence upstream of the translation start site has several unusual features, including imperfect palindromes and polypyrimidine sequences, which may function in the transcriptional regulation of the BIO2 gene. PMID:8819873

  3. Context-dependent DNA recognition code for C2H2 zinc-finger transcription factors

    PubMed Central

    Liu, Jiajian; Stormo, Gary D.

    2008-01-01

    Motivation: Modeling and identifying the DNA-protein recognition code is one of the most challenging problems in computational biology. Several quantitative methods have been developed to model DNA-protein interactions with specific focus on the C2H2 zinc-finger proteins, the largest transcription factor family in eukaryotic genomes. In many cases, they performed well. But overall the predictive accuracy of these methods is still limited. One of the major reasons is that all these methods used weight matrix models to represent DNA-protein interactions, assuming all base-amino acid contacts contribute independently to the total free energy of binding. Results: We present a context-dependent model for DNA–zinc-finger protein interactions that allows us to identify inter-positional dependencies in the DNA recognition code for C2H2 zinc-finger proteins. The degree of non-independence was detected by comparing the linear perceptron model with the non-linear neural net (NN) model for their predictions of DNA–zinc-finger protein interactions. This dependency is supported by the complex base-amino acid contacts observed in DNA–zinc-finger interactions from structural analyses. Using extensive published qualitative and quantitative experimental data, we demonstrated that the context-dependent model developed in this study can significantly improve predictions of DNA binding profiles and free energies of binding for both individual zinc fingers and proteins with multiple zinc fingers when compared to previous positional-independent models. This approach can be extended to other protein families with complex base-amino acid residue interactions, which would help to further understand transcriptional regulation in eukaryotic genomes. Availability: The software is implemented as C programs and is available by request. http://ural.wustl.edu/softwares.html Contact: stormo@ural.wustl.edu PMID:18586699
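
    The independence assumption criticized in this abstract is easiest to see in code. The sketch below contrasts an additive weight matrix score with a score that adds correction terms for adjacent base pairs. It is a toy stand-in for inter-positional dependence, not the authors' neural net model, and the function names, matrix layout and `pair_terms` table are hypothetical.

```python
def pwm_score(site, pwm):
    """Additive score: each position contributes independently.

    pwm is a list with one dict per position, mapping base -> weight.
    """
    return sum(pwm[i][base] for i, base in enumerate(site))


def pairwise_score(site, pwm, pair_terms):
    """Additive score plus corrections for adjacent-base pairs.

    pair_terms maps (position, dinucleotide) to an extra energy-like term,
    a simple way to express that neighboring positions are not independent.
    """
    extra = sum(pair_terms.get((i, site[i:i + 2]), 0.0)
                for i in range(len(site) - 1))
    return pwm_score(site, pwm) + extra
```

    With an empty `pair_terms` table the two functions agree, so any gap between them on real data would measure how much the additive model misses.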

  4. DNA methylation patterns of protein coding genes and long noncoding RNAs in female schizophrenic patients.

    PubMed

    Liao, Qi; Wang, Yunliang; Cheng, Jia; Dai, Dongjun; Zhou, Xingyu; Zhang, Yuzheng; Gao, Shugui; Duan, Shiwei

    2015-02-01

    Schizophrenia (SCZ) is a complex mental disorder to which both genetic and epigenetic factors contribute. Long noncoding RNAs (lncRNAs) were recently found to play an important regulatory role in mental disorders. However, little was known about the DNA methylation of lncRNAs, although numerous SCZ studies have been performed on genetic polymorphisms or epigenetic marks in protein coding genes. We presented a comprehensive genome wide DNA methylation study of both protein coding genes and lncRNAs in female patients with paranoid and undifferentiated SCZ. Using methyl-CpG binding domain (MBD) protein-enriched genome sequencing (MBD-seq), 8,163 and 764 peaks were identified in paranoid and undifferentiated SCZ, respectively (p < 1 × 10^-5). Gene ontology analysis showed that the hypermethylated regions were enriched in the genes related to neuron system and brain for both paranoid and undifferentiated SCZ (p < 0.05). Among these peaks, 121 peaks were located in gene promoter regions that might affect gene expression and influence the SCZ related pathways. Interestingly, DNA methylation of 136 and 23 known lncRNAs in the Refseq database was identified in paranoid and undifferentiated SCZ, respectively. In addition, ∼20% of intergenic peaks annotated based on Refseq genes overlapped with lncRNAs in the UCSC and GENCODE databases. To make the results accessible to biological researchers, we created an online database to display and visualize the information of DNA methylation peaks in both types of SCZ (http://www.bioinfo.org/scz/scz.htm). Our results showed that the aberrant DNA methylation of lncRNAs might be another important epigenetic factor for SCZ.

  5. Classifying Microorganisms.

    ERIC Educational Resources Information Center

    Baker, William P.; Leyva, Kathryn J.; Lang, Michael; Goodmanis, Ben

    2002-01-01

    Focuses on an activity in which students sample air at school and generate ideas about how to classify the microorganisms they observe. The results are used to compare air quality among schools via the Internet. Supports the development of scientific inquiry and technology skills. (DDR)

  6. Particle classifier

    SciTech Connect

    Etkin, B.

    1987-04-14

    This patent describes a classifier for particulate material comprising a housing having an inlet to receive a classifying air flow flowing in a given direction, collection means downstream of the inlet to receive material classified by the air flow, and material introduction means intermediate the inlet and the collection means to introduce particles entrained in a secondary air stream into the housing in a direction other than the given direction. The material introduction means includes a material outlet aperture in a wall of the housing extending generally perpendicular to the given direction, conveying means to convey material and the secondary air stream to the material outlet and diverting means to divert the secondary air stream to a direction generally parallel to the classifying air flow flowing in the given direction. The diverting means includes a surface extending downstream from the outlet and adjacent thereto and being dimensioned to divert the secondary air stream by a Coanda effect generally parallel to the given direction and thereby segregate the secondary air stream from the particles and permit continued movement of the particles along predictable trajectories.

  7. DNA strand breaks induced by electrons simulated with Nanodosimetry Monte Carlo Simulation Code: NASIC.

    PubMed

    Li, Junli; Li, Chunyan; Qiu, Rui; Yan, Congchong; Xie, Wenzhang; Wu, Zhen; Zeng, Zhi; Tung, Chuanjong

    2015-09-01

    The method of Monte Carlo simulation is a powerful tool to investigate the details of radiation biological damage at the molecular level. In this paper, a Monte Carlo code called NASIC (Nanodosimetry Monte Carlo Simulation Code) was developed. It includes a physical module, a pre-chemical module, a chemical module, a geometric module and a DNA damage module. The physical module can simulate physical tracks of low-energy electrons in liquid water event-by-event. More than one set of inelastic cross sections were calculated by applying the dielectric function method of Emfietzoglou's optical-data treatments, with different optical data sets and dispersion models. In the pre-chemical module, the ionised and excited water molecules undergo dissociation processes. In the chemical module, the produced radiolytic chemical species diffuse and react. In the geometric module, an atomic model of 46 chromatin fibres in a spherical nucleus of human lymphocyte was established. In the DNA damage module, the direct damages induced by the energy depositions of the electrons and the indirect damages induced by the radiolytic chemical species were calculated. The parameters should be adjusted so that the simulation results agree with the experimental results. In this paper, the influence of the inelastic cross sections and the vibrational excitation reaction on the parameters and the DNA strand break yields was studied. Further work on NASIC is underway.

  8. DANIO-CODE: Toward an Encyclopedia of DNA Elements in Zebrafish.

    PubMed

    Tan, Haihan; Onichtchouk, Daria; Winata, Cecilia

    2016-02-01

    The zebrafish has emerged as a model organism for genomics studies. The symposium "Toward an encyclopedia of DNA elements in zebrafish" held in London in December 2014, was coorganized by Ferenc Müller and Fiona Wardle. This meeting is a follow-up of a similar previous workshop held 2 years earlier and represents a push toward the formalization of a community effort to annotate functional elements in the zebrafish genome. The meeting brought together zebrafish researchers, bioinformaticians, as well as members of established consortia, to exchange scientific findings and experience, as well as to discuss the initial steps toward the formation of a DANIO-CODE consortium. In this study, we provide the latest updates on the current progress of the consortium's efforts, opening up a broad invitation to researchers to join in and contribute to DANIO-CODE.

  9. DANIO-CODE: Toward an Encyclopedia of DNA Elements in Zebrafish

    PubMed Central

    2016-01-01

    The zebrafish has emerged as a model organism for genomics studies. The symposium “Toward an encyclopedia of DNA elements in zebrafish” held in London in December 2014, was coorganized by Ferenc Müller and Fiona Wardle. This meeting is a follow-up of a similar previous workshop held 2 years earlier and represents a push toward the formalization of a community effort to annotate functional elements in the zebrafish genome. The meeting brought together zebrafish researchers, bioinformaticians, as well as members of established consortia, to exchange scientific findings and experience, as well as to discuss the initial steps toward the formation of a DANIO-CODE consortium. In this study, we provide the latest updates on the current progress of the consortium's efforts, opening up a broad invitation to researchers to join in and contribute to DANIO-CODE. PMID:26671609

  10. Low mitochondrial DNA variation among American alligators and a novel non-coding region in crocodilians.

    PubMed

    Glenn, Travis C; Staton, Joseph L; Vu, Alex T; Davis, Lisa M; Bremer, Jaime R Alvarado; Rhodes, Walter E; Brisbin, I Lehr; Sawyer, Roger H

    2002-12-15

    We analyzed 1317-1823 base pairs (bp) of mitochondrial DNA sequence beginning in the 5' end of cytochrome b (cyt b) and ending in the central domain of the control region for 25 American alligators (Alligator mississippiensis) and compared these to a homologous sequence from a Chinese alligator (A. sinensis). Both species share a non-coding spacer between cyt b and tRNA(Thr). Chinese alligator cyt b differs from that of the American alligator by 17.5% at the nucleotide level and 13.8% for inferred amino acids, which is consistent with their presumed ancient divergence. Only two cyt b haplotypes were detected among the 25 American alligators (693-1199 bp surveyed), with one haplotype shared among 24 individuals. One alligator from Mississippi differed from all other alligators by a single silent substitution. The control region contained only slightly more variation among the 25 American alligators, with two variable positions (624 bp surveyed), yielding three haplotypes with 22, two, and one individuals in each of these groups. Previous genetic studies examining allozymes and the proportion of variable microsatellite DNA loci also found low levels of genetic diversity in American alligators. However, in contrast with allozymes, microsatellites, and morphology, the mtDNA data shows no evidence of differentiation among populations from the extremes of the species range. These results suggest that American alligators underwent a severe population bottleneck in the late Pleistocene, resulting in nearly homogenous mtDNA among all American alligators today.

  11. Quartz crystal microbalance detection of DNA single-base mutation based on monobase-coded cadmium tellurium nanoprobe.

    PubMed

    Zhang, Yuqin; Lin, Fanbo; Zhang, Youyu; Li, Haitao; Zeng, Yue; Tang, Hao; Yao, Shouzhuo

    2011-01-01

    A new method for the detection of point mutation in DNA based on the monobase-coded cadmium tellurium nanoprobes and the quartz crystal microbalance (QCM) technique is reported. A point mutation (single-base, adenine, thymine, cytosine, and guanine, namely, A, T, C and G, mutation in DNA strand, respectively) DNA QCM sensor was fabricated by immobilizing single-base mutation DNA modified magnetic beads onto the electrode surface with an external magnetic field near the electrode. The DNA-modified magnetic beads were obtained from the biotin-avidin affinity reaction of biotinylated DNA and streptavidin-functionalized core/shell Fe3O4/Au magnetic nanoparticles, followed by a DNA hybridization reaction. Single-base coded CdTe nanoprobes (A-CdTe, T-CdTe, C-CdTe and G-CdTe, respectively) were used as the detection probes. The mutation site in DNA was distinguished by detecting the decreases of the resonance frequency of the piezoelectric quartz crystal when the coded nanoprobe was added to the test system. This proposed detection strategy for point mutation in DNA proved to be sensitive, simple, repeatable and low-cost. Consequently, it has great potential for single nucleotide polymorphism (SNP) detection.

  12. Wide Host Ranges of Herbivorous Beetles? Insights from DNA Bar Coding

    PubMed Central

    Kishimoto-Yamada, Keiko; Kamiya, Koichi; Meleng, Paulus; Diway, Bibian; Kaliang, Het; Chong, Lucy; Itioka, Takao; Sakai, Shoko; Ito, Motomi

    2013-01-01

    There are very few studies that have investigated host-specificity among tropical herbivorous insects. Indeed, most of the trophic interactions of herbivorous insects in Southeast Asian tropical rainforests remain unknown, and whether polyphagous feeding is common in the herbivores of this ecosystem has not been determined. The present study employed DNA bar coding to reveal the trophic associations of adult leaf-chewing chrysomelid beetles in a Bornean rainforest. Plant material ingested by the adults was retrieved from the bodies of the insects, and a portion of the chloroplast rbcL sequence was then amplified from this material. The plants were identified at the family level using an existing reference database of chloroplast DNA. Our DNA-based diet analysis of eleven chrysomelid species successfully identified their host plant families and indicated that five beetle species fed on more than two families within the angiosperms, and four species fed on several families of gymnosperms and/or ferns together with multiple angiosperm families. These findings suggest that generalist chrysomelid beetles associated with ecologically and taxonomically distant plants constitute a part of the plant-insect network of the Bornean rainforest. PMID:24073210

  13. Temporal and spatial trends in prey composition of wahoo Acanthocybium solandri: a diet analysis from the central North Pacific Ocean using visual and DNA bar-coding techniques.

    PubMed

    Oyafuso, Z S; Toonen, R J; Franklin, E C

    2016-04-01

    A diet analysis was conducted on 444 wahoo Acanthocybium solandri caught in the central North Pacific Ocean longline fishery and a nearshore troll fishery surrounding the Hawaiian Islands from June to December 2014. In addition to traditional observational methods of stomach contents, a DNA bar-coding approach was integrated into the analysis by sequencing the cytochrome c oxidase subunit 1 (COI) region of the mtDNA genome to taxonomically identify individual prey items that could not be classified visually to species. For nearshore-caught A. solandri, juvenile pre-settlement reef fish species from various families dominated the prey composition during the summer months, followed primarily by Carangidae in autumn months. Gempylidae, Echeneidae and Scombridae were dominant prey taxa from the offshore fishery. Molidae was a common prey family found in stomachs collected north-east of the Hawaiian Archipelago while tetraodontiform reef fishes, known to have extended pelagic stages, were prominent prey items south-west of the Hawaiian Islands. The diet composition of A. solandri was indicative of an adaptive feeder and thus revealed dominant geographic and seasonal abundances of certain taxa from various ecosystems in the marine environment. The addition of molecular bar-coding to the traditional visual method of prey identifications allowed for a more comprehensive range of the prey field of A. solandri to be identified and should be used as a standard component in future diet studies.

  14. Molecular cloning of the cDNA coding for the (R)-(+)-mandelonitrile lyase of Prunus amygdalus: temporal and spatial expression patterns in flowers and mature seeds.

    PubMed

    Suelves, M; Puigdomènech, P

    1998-10-01

    A gene highly expressed in the floral organs of almond (Prunus amygdalus Batsch), and coding for the cyanogenic enzyme (R)-(+)-mandelonitrile lyase (EC 4.1.2.10), has been identified and the full-length cDNA sequenced. The temporal expression pattern in maturing seeds and during floral development was analyzed by RNA blot, and the highest mRNA levels were detected in floral tissues. The spatial mRNA accumulation pattern in almond flower buds was also analyzed by in-situ hybridization. The mRNA levels were compared during seed maturation and floral development in fruit and floral samples from cultivars classified as homozygous or heterozygous for the sweet-almond trait or homozygous for the bitter trait. No correlation was found between these characteristics and levels of mandelonitrile lyase mRNA, suggesting that the presence of this protein is not the limiting factor in the production of hydrogen cyanide.

  15. Comparison of Geant4-DNA simulation of S-values with other Monte Carlo codes

    NASA Astrophysics Data System (ADS)

    André, T.; Morini, F.; Karamitros, M.; Delorme, R.; Le Loirec, C.; Campos, L.; Champion, C.; Groetz, J.-E.; Fromm, M.; Bordage, M.-C.; Perrot, Y.; Barberet, Ph.; Bernal, M. A.; Brown, J. M. C.; Deleuze, M. S.; Francis, Z.; Ivanchenko, V.; Mascialino, B.; Zacharatou, C.; Bardiès, M.; Incerti, S.

    2014-01-01

    Monte Carlo simulations of S-values have been carried out with the Geant4-DNA extension of the Geant4 toolkit. The S-values have been simulated for monoenergetic electrons with energies ranging from 0.1 keV up to 20 keV, in liquid water spheres (for four radii, chosen between 10 nm and 1 μm), and for electrons emitted by five isotopes of iodine (131, 132, 133, 134 and 135), in liquid water spheres of varying radius (from 15 μm up to 250 μm). The results have been compared to those obtained from other Monte Carlo codes and from other published data. The Kolmogorov-Smirnov test confirmed the statistical compatibility of all simulation results.

  16. Humans and chimpanzees differ in their cellular response to DNA damage and non-coding sequence elements of DNA repair-associated genes.

    PubMed

    Weis, E; Galetzka, D; Herlyn, H; Schneider, E; Haaf, T

    2008-01-01

    Compared to humans, chimpanzees appear to be less susceptible to many types of cancer. Because DNA repair defects lead to accumulation of gene and chromosomal mutations, species differences in DNA repair are one plausible explanation. Here we analyzed the repair kinetics of human and chimpanzee cells after cisplatin treatment and irradiation. Dot blots for the quantification of single-stranded (ss) DNA repair intermediates revealed a biphasic response of human and chimpanzee lymphoblasts to cisplatin-induced damage. The early phase of DNA repair was identical in both species with a peak of ssDNA intermediates at 1 h after DNA damage induction. However, the late phase differed between species. Human cells showed a second peak of ssDNA intermediates at 6 h, chimpanzee cells at 5 h. One of four analyzed DNA repair-associated genes, UBE2A, was differentially expressed in human and chimpanzee cells at 5 h after cisplatin treatment. Immunofluorescent staining of gammaH2AX foci demonstrated equally high numbers of DNA strand breaks in human and chimpanzee cells at 30 min after irradiation and equally low numbers at 2 h. However, at 1 h chimpanzee cells had significantly fewer DNA breaks than human cells. Comparative sequence analyses of approximately 100 DNA repair-associated genes in human and chimpanzee revealed 13% and 32% of genes, respectively, with evidence for an accelerated evolution in promoter regions and introns. This contrasts strikingly with the 3% of DNA repair-associated genes with positive selection in the coding sequence. Compared to the rhesus macaque as an outgroup, chimpanzees have a higher accelerated evolution in non-coding sequences than humans. The TRF1-interacting, ankyrin-related ADP-ribose polymerase (TNKS) gene showed an accelerated intraspecific evolution among humans. Our results are consistent with the view that chimpanzee cells repair different types of DNA damage faster than human cells, whereas the overall repair capacity is similar in

  17. Detection of coding microsatellite frameshift mutations in DNA mismatch repair-deficient mouse intestinal tumors.

    PubMed

    Woerner, Stefan M; Tosti, Elena; Yuan, Yan P; Kloor, Matthias; Bork, Peer; Edelmann, Winfried; Gebert, Johannes

    2015-11-01

    Different DNA mismatch repair (MMR)-deficient mouse strains have been developed as models for the inherited cancer-predisposing Lynch syndrome. It remains unresolved whether coding mononucleotide repeat (cMNR) gene mutations in these mice can contribute to intestinal tumorigenesis and whether MMR-deficient mice are a suitable molecular model of human microsatellite instability (MSI)-associated intestinal tumorigenesis. A proof-of-principle study was performed to identify mouse cMNR-harboring genes affected by insertion/deletion mutations in MSI murine intestinal tumors. Bioinformatic algorithms were developed to establish a database of mouse cMNR-harboring genes. A panel of five mouse noncoding mononucleotide markers was used for MSI classification of intestinal matched normal/tumor tissues from MMR-deficient (Mlh1(-/-), Msh2(-/-), Msh2(LoxP/LoxP)) mice. cMNR frameshift mutations of candidate genes were determined by DNA fragment analysis. Murine MSI intestinal tumors but not normal tissues from MMR-deficient mice showed cMNR frameshift mutations in six candidate genes (Elavl3, Tmem107, Glis2, Sdccag1, Senp6, Rfc3). cMNRs of mouse Rfc3 and Elavl3 are conserved in type and length in their human orthologs that are known to be mutated in human MSI colorectal, endometrial and gastric cancer. We provide evidence for the utility of a mononucleotide marker panel for detection of MSI in murine tumors, the existence of cMNR instability in MSI murine tumors, the utility of mouse subspecies DNA for identification of polymorphic repeats, and repeat conservation among some orthologous human/mouse genes, two of them showing instability in human and mouse MSI intestinal tumors. MMR-deficient mice hence are a useful molecular model system for analyzing MSI intestinal carcinogenesis.
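
    As a rough illustration of how coding mononucleotide repeats might be located in a coding sequence, a minimal Python sketch follows. The abstract does not describe the authors' algorithms; the function name, the example sequence and the minimum run length of 8 are assumptions made for illustration only.

```python
import re


def find_mononucleotide_repeats(cds, min_len=8):
    """Return (start, base, length) for each run of one base >= min_len.

    Hypothetical helper: min_len=8 is an illustrative threshold,
    not a value taken from the cited study.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(cds.upper())]


# Made-up coding sequence with an A9 and a G8 tract.
example = "ATG" + "A" * 9 + "CC" + "G" * 8 + "TAA"
repeats = find_mononucleotide_repeats(example)
```

    An insertion or deletion of a single base in such a run shifts the reading frame, which is why these tracts are candidates for frameshift mutations.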

  18. Analysis of cDNA coding MHC class II beta chain of the chimpanzee (Pan troglodytes).

    PubMed

    Hatta, Yuki; Kanai, Tomoko; Matsumoto, Yoshitsugu; Kyuwa, Shigeru; Hayasaka, Ikuo; Yoshikawa, Yasuhiro

    2002-04-01

    The chimpanzee (Pan troglodytes, Patr) is the closest zoological living relative of humans and shares approximately 98.6% genetic homology to human beings. Although major histocompatibility complex (MHC) plays a critical role in T cell-mediated immune responses in vertebrates, the information on Patr MHC remains at a relatively poor level. Therefore, we attempted to isolate Patr MHC class II genes and determine their nucleotide sequences. The cDNAs encoding Patr MHC class II DP, DQ and DR beta chains were isolated from the cDNA library of a chimpanzee B lymphocyte cell line Bch261. As a result of screening, the clone 6-3-1 as a representative of Patr DP clone, clone 30-1 as a Patr DQ clone, and clones 4-7-1 and 55-1 having different sequences as Patr DR clones were detected. The clone 6-3-1 consisted of 1,062 nucleotides including an open reading frame (ORF) of 777 bp. In the same way, clone 30-1 consisted of 1,172 nucleotides including ORF of 786 bp, clones 4-7-1 and 55-1 consisted of 1,163 nucleotides including ORF of 801 bp. Except for five nucleotide changes, clones 4-7-1 and 55-1 were the same sequence. By comparison with the nucleotide sequences already reported on chimpanzee MHC class II beta 1 genes, clones 6-3-1, 30-1, 4-7-1 and 55-1 were classified as PatrDPB1*16, PatrDQB1*0302, PatrDRB1*0201 and PatrDRB1*0204, respectively. This is the first report to describe complete cDNA sequences of Patr DP and DQ molecules. The nucleotide sequence data of Patr MHC class II genes obtained in this study will be useful for the genotyping of Patr MHC class II genes in individual chimpanzees.

  19. The dnaN gene codes for the beta subunit of DNA polymerase III holoenzyme of Escherichia coli.

    PubMed Central

    Burgers, P M; Kornberg, A; Sakakibara, Y

    1981-01-01

    An Escherichia coli mutant, dnaN59, stops DNA synthesis promptly upon a shift to a high temperature; the wild-type dnaN gene carried in a transducing phage encodes a polypeptide of about 41,000 daltons [Sakakibara, Y. & Mizukami, T. (1980) Mol. Gen. Genet. 178, 541-553; Yuasa, S. & Sakakibara, Y. (1980) Mol. Gen. Genet. 180, 267-273]. We now find that the product of dnaN gene is the beta subunit of DNA polymerase III holoenzyme, the principal DNA synthetic multipolypeptide complex in E. coli. The conclusion is based on the following observations: (i) Extracts from dnaN59 cells were defective in phage phi X174 and G4 DNA synthesis after the mutant cells had been exposed to the increased temperature. (ii) The enzymatic defect was overcome by addition of purified beta subunit but not by other subunits of DNA polymerase III holoenzyme or by other replication proteins required for phi X174 DNA synthesis. (iii) Partially purified beta subunit from the dnaN mutant, unlike that from the wild type, was inactive in reconstituting the holoenzyme when mixed with the other purified subunits. (iv) Increased dosage of the dnaN gene provided by a plasmid carrying the gene raised cellular levels of the beta subunit 5- to 6-fold. PMID:6458041

  20. The dnaN gene codes for the beta subunit of DNA polymerase III holoenzyme of Escherichia coli.

    PubMed

    Burgers, P M; Kornberg, A; Sakakibara, Y

    1981-09-01

    An Escherichia coli mutant, dnaN59, stops DNA synthesis promptly upon a shift to a high temperature; the wild-type dnaN gene carried in a transducing phage encodes a polypeptide of about 41,000 daltons [Sakakibara, Y. & Mizukami, T. (1980) Mol. Gen. Genet. 178, 541-553; Yuasa, S. & Sakakibara, Y. (1980) Mol. Gen. Genet. 180, 267-273]. We now find that the product of dnaN gene is the beta subunit of DNA polymerase III holoenzyme, the principal DNA synthetic multipolypeptide complex in E. coli. The conclusion is based on the following observations: (i) Extracts from dnaN59 cells were defective in phage phi X174 and G4 DNA synthesis after the mutant cells had been exposed to the increased temperature. (ii) The enzymatic defect was overcome by addition of purified beta subunit but not by other subunits of DNA polymerase III holoenzyme or by other replication proteins required for phi X174 DNA synthesis. (iii) Partially purified beta subunit from the dnaN mutant, unlike that from the wild type, was inactive in reconstituting the holoenzyme when mixed with the other purified subunits. (iv) Increased dosage of the dnaN gene provided by a plasmid carrying the gene raised cellular levels of the beta subunit 5- to 6-fold.

  1. DNA sequence-based "bar codes" for tracking the origins of expressed sequence tags from a maize cDNA library constructed using multiple mRNA sources.

    PubMed

    Qiu, Fang; Guo, Ling; Wen, Tsui-Jung; Liu, Feng; Ashlock, Daniel A; Schnable, Patrick S

    2003-10-01

    To enhance gene discovery, expressed sequence tag (EST) projects often make use of cDNA libraries produced using diverse mixtures of mRNAs. As such, expression data are lost because the origins of the resulting ESTs cannot be determined. Alternatively, multiple libraries can be prepared, each from a more restricted source of mRNAs. Although this approach allows the origins of ESTs to be determined, it requires the production of multiple libraries. A hybrid approach is reported here. A cDNA library was prepared using 21 different pools of maize (Zea mays) mRNAs. DNA sequence "bar codes" were added during first-strand cDNA synthesis to uniquely identify the mRNA source pool from which individual cDNAs were derived. Using a decoding algorithm that included error correction, it was possible to identify the source mRNA pool of more than 97% of the ESTs. The frequency at which a bar code is represented in an EST contig should be proportional to the abundance of the corresponding mRNA in the source pool. Consistent with this, all ESTs derived from several genes (zein and adh1) that are known to be exclusively expressed in kernels or preferentially expressed under anaerobic conditions, respectively, were exclusively tagged with bar codes associated with mRNA pools prepared from kernel and anaerobically treated seedlings, respectively. Hence, by allowing for the retention of expression data, the bar coding of cDNA libraries can enhance the value of EST projects.
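
    The idea of decoding bar codes with error correction can be sketched in Python as nearest-match decoding by Hamming distance. This is only an illustration: the abstract does not give the authors' decoding algorithm, and the bar-code sequences, pool names and the one-mismatch tolerance below are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def decode_barcode(tag, barcodes, max_errors=1):
    """Return the pool whose bar code is the unique closest match to `tag`.

    Returns None when the best match has more than `max_errors`
    mismatches or when two pools tie for the best match.
    """
    ranked = sorted((hamming(tag, code), pool) for pool, code in barcodes.items())
    if not ranked or ranked[0][0] > max_errors:
        return None
    if len(ranked) > 1 and ranked[1][0] == ranked[0][0]:
        return None  # ambiguous: refuse to guess
    return ranked[0][1]


# Hypothetical 6-base bar codes, one per mRNA source pool.
POOLS = {
    "kernel": "ACGTAC",
    "anaerobic_seedling": "TGCATG",
    "root": "GATTCA",
}
```

    With bar codes that differ from one another at several positions, a read carrying a single sequencing error such as "ACGTAA" is still assigned to its source pool, while a tag far from every bar code is left unassigned.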

  2. DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome.

    PubMed

    Beh, Leslie Y; Müller, Manuel M; Muir, Tom W; Kaplan, Noam; Landweber, Laura F

    2015-11-01

    A conserved hallmark of eukaryotic chromatin architecture is the distinctive array of well-positioned nucleosomes downstream from transcription start sites (TSS). Recent studies indicate that trans-acting factors establish this stereotypical array. Here, we present the first genome-wide in vitro and in vivo nucleosome maps for the ciliate Tetrahymena thermophila. In contrast with previous studies in yeast, we find that the stereotypical nucleosome array is preserved in the in vitro reconstituted map, which is governed only by the DNA sequence preferences of nucleosomes. Remarkably, this average in vitro pattern arises from the presence of subsets of nucleosomes, rather than the whole array, in individual Tetrahymena genes. Variation in GC content contributes to the positioning of these sequence-directed nucleosomes and affects codon usage and amino acid composition in genes. Given that the AT-rich Tetrahymena genome is intrinsically unfavorable for nucleosome formation, we propose that these "seed" nucleosomes--together with trans-acting factors--may facilitate the establishment of nucleosome arrays within genes in vivo, while minimizing changes to the underlying coding sequences.

  3. Large-scale motif discovery using DNA Gray code and equiprobable oligomers

    PubMed Central

    Ichinose, Natsuhiro; Yada, Tetsushi; Gotoh, Osamu

    2012-01-01

    Motivation: How to find motifs from genome-scale functional sequences, such as all the promoters in a genome, is a challenging problem. Word-based methods count the occurrences of oligomers to detect excessively represented ones. This approach is known to be fast and accurate compared with other methods. However, two problems have hampered the application of such methods to large-scale data. One is the computational cost necessary for clustering similar oligomers, and the other is the bias in the frequency of fixed-length oligomers, which complicates the detection of significant words. Results: We introduce a method that uses a DNA Gray code and equiprobable oligomers, which solve the clustering problem and the oligomer bias, respectively. Our method can analyze 18 000 sequences of ~1 kbp long in 30 s. We also show that the accuracy of our method is superior to that of a leading method, especially for large-scale data and small fractions of motif-containing sequences. Availability: The online and stand-alone versions of the application, named Hegma, are available at our website: http://www.genome.ist.i.kyoto-u.ac.jp/~ichinose/hegma/ Contact: ichinose@i.kyoto-u.ac.jp; o.gotoh@i.kyoto-u.ac.jp PMID:22057160
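
    To illustrate what a Gray code over the DNA alphabet looks like, the Python sketch below builds a reflected base-4 Gray code in which consecutive oligomers differ at exactly one position. It is a generic textbook construction offered as an assumption; the abstract does not specify the DNA Gray code used in Hegma, and the function name is hypothetical.

```python
def dna_gray_code(k, alphabet="ACGT"):
    """List all oligomers of length k so that neighbours differ at one base.

    Generic reflected Gray code; not necessarily the construction
    used by the cited method.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    words = [""]
    for _ in range(k):
        extended = []
        for i, letter in enumerate(alphabet):
            # Reverse every second block so the suffix is unchanged
            # whenever the leading letter changes.
            block = words if i % 2 == 0 else words[::-1]
            extended.extend(letter + w for w in block)
        words = extended
    return words


dimers = dna_gray_code(2)
# dimers starts AA, AC, AG, AT, CT, CG, ...
```

    Because neighbouring words in this ordering are one substitution apart, similar oligomers sit close together in the list, which is the property that makes a Gray ordering attractive for grouping similar words.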

  4. Widespread selection across coding and noncoding DNA in the pea aphid genome.

    PubMed

    Bickel, Ryan D; Dunham, Joseph P; Brisson, Jennifer A

    2013-06-21

    Genome-wide patterns of diversity and selection are critical measures for understanding how evolution has shaped the genome. Yet, these population genomic estimates are available for only a limited number of model organisms. Here we focus on the population genomics of the pea aphid (Acyrthosiphon pisum). The pea aphid is an emerging model system that exhibits a range of intriguing biological traits not present in classic model systems. We performed low-coverage genome resequencing of 21 clonal pea aphid lines collected from alfalfa host plants in North America to characterize genome-wide patterns of diversity and selection. We observed an excess of low-frequency polymorphisms throughout coding and noncoding DNA, which we suggest is the result of a founding event and subsequent population expansion in North America. Most gene regions showed lower levels of Tajima's D than synonymous sites, suggesting that the majority of the genome is not evolving neutrally but rather exhibits significant constraint. Furthermore, we used the pea aphid's unique manner of X-chromosome inheritance to assign genomic scaffolds to either autosomes or the X chromosome. Comparing autosomal vs. X-linked sequence variation, we discovered that autosomal genes show an excess of low frequency variants indicating that purifying selection acts more efficiently on the X chromosome. Overall, our results provide a critical first step in characterizing the genetic diversity and evolutionary pressures on an aphid genome.
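
    For readers unfamiliar with Tajima's D, the following Python sketch computes the statistic from a small alignment using the standard formula (Tajima, 1989). It is a teaching example, not the authors' pipeline: the function name and the toy sequences are made up, and gaps or missing data are not handled.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences; None if no variation."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    seg_sites = 0          # S: number of segregating sites
    pairwise_diffs = 0.0   # total differences summed over all pairs
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            pairwise_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        return None
    pi = pairwise_diffs / (n * (n - 1) / 2)  # mean pairwise difference

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignment in which every variant is a singleton (low frequency).
singletons = ["AAAA", "CAAA", "ACAA", "AACA", "AAAC"]
d_low_freq = tajimas_d(singletons)
```

    An excess of low-frequency variants, as in the toy alignment, gives a negative D, whereas variants at intermediate frequency push D above zero.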

  5. Classifying Facial Actions

    PubMed Central

    Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.

    2010-01-01

    The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284

  6. Functional expression in primate cells of cloned DNA coding for the hemagglutinin surface glycoprotein of influenza virus.

    PubMed Central

    Sveda, M M; Lai, C J

    1981-01-01

    We have used simian virus 40 (SV40) DNA as a vector for expression of functional activity of a cloned influenza viral DNA segment in primate cells. Cloned full-length DNA sequences coding for the hemagglutinin of influenza A virus (Udorn/72/[H3N2]) were inserted into the late region of a viable deletion mutant of SV40, and the hybrid DNA was propagated in the presence of an early SV40 mutant (tsA28) helper. Infection of primate cells with the hybrid virus produced a polypeptide similar in molecular size to the hemagglutinin of influenza virus, as shown by immunoprecipitation and gel electrophoresis. The polypeptide was glycosylated, as shown by incorporation of radioactive sugars. The putative hemagglutinin exhibited functional activity, as shown by agglutination of erythrocytes. In addition, an indirect immunofluorescence assay showed that the hemagglutinin polypeptide of the hybrid virus could be detected on the surface of infected cells. PMID:6272305

  7. Natural selection on coding and noncoding DNA sequences is associated with virulence genes in a plant pathogenic fungus.

    PubMed

    Rech, Gabriel E; Sanz-Martín, José M; Anisimova, Maria; Sukno, Serenella A; Thon, Michael R

    2014-09-04

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5' untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen.

  8. Signalign: An Ontology of DNA as Signal for Comparative Gene Structure Prediction Using Information-Coding-and-Processing Techniques.

    PubMed

    Yu, Ning; Guo, Xuan; Gu, Feng; Pan, Yi

    2016-03-01

    Conventional character-analysis-based techniques in genome analysis have three main shortcomings: inefficiency, inflexibility, and incompatibility. In our previous research, a general framework called DNA As X was proposed for character-analysis-free techniques to overcome these shortcomings, where X is an intermediate representation such as digit, code, signal, vector, tree, graph network, and so on. In this paper, we further implement an ontology of DNA As Signal by designing a tool named Signalign for comparative gene structure analysis, in which DNA sequences are converted into signal series, processed by a modified method of dynamic time warping, and measured by signal-to-noise ratio (SNR). The ontology of DNA As Signal integrates the principles and concepts of other disciplines, including information coding theory and signal processing, into sequence analysis and processing. Compared with conventional character-analysis-based methods, Signalign not only achieves equivalent or superior performance but also enriches the tools and the knowledge library of computational biology by extending the domain from character/string to diverse areas. The evaluation results validate the success of the character-analysis-free technique for improved performance in comparative gene structure prediction.
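
    The general idea of treating DNA as a signal and comparing two sequences by dynamic time warping can be sketched in Python as below. This is the textbook DTW recurrence, not Signalign's modified method, and the numeric mapping of bases (A=1, C=2, G=3, T=4) is an arbitrary assumption chosen for illustration.

```python
# Arbitrary, hypothetical base-to-amplitude mapping.
BASE_TO_LEVEL = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def to_signal(seq):
    """Convert a DNA string into a list of numeric samples."""
    return [BASE_TO_LEVEL[base] for base in seq.upper()]


def dtw_distance(x, y):
    """Classic dynamic time warping cost between two numeric series."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: a match, or stretching either series.
            cost[i][j] = step + min(cost[i - 1][j - 1],
                                    cost[i - 1][j],
                                    cost[i][j - 1])
    return cost[n][m]


same = dtw_distance(to_signal("ACGT"), to_signal("ACGT"))
stretched = dtw_distance(to_signal("ACGT"), to_signal("AACCGGTT"))
changed = dtw_distance(to_signal("ACGT"), to_signal("ACGA"))
```

    Warping lets a series align with a stretched copy of itself at zero cost, which a position-by-position character comparison would not allow.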

  9. The molecular cloning and characterisation of cDNA coding for the alpha subunit of the acetylcholine receptor.

    PubMed Central

    Sumikawa, K; Houghton, M; Smith, J C; Bell, L; Richards, B M; Barnard, E A

    1982-01-01

    A rare cDNA coding for most of the alpha subunit of the Torpedo nicotinic acetylcholine receptor has been cloned into bacteria. The use of a mismatched oligonucleotide primer of reverse transcriptase facilitated the design of an efficient, specific probe for recombinant bacteria. DNA sequence analysis has enabled the elucidation of a large part of the polypeptide primary sequence which is discussed in relation to its acetylcholine binding activity and the location of receptor within the plasma membrane. When used as a radioactive probe, the cloned cDNA binds specifically to a single Torpedo mRNA species of about 2350 nucleotides in length but fails to show significant cross-hybridisation with alpha subunit mRNA extracted from cat muscle. PMID:6183641

  10. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system.

    PubMed

    Kawano, Tomonori

    2013-03-01

    There have been a wide variety of approaches for handling pieces of DNA as "unplugged" tools for digital information storage and processing, including a series of studies applied to security-related areas, such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes as media for storing digitally compressed image data are proposed for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given "passwords" and/or secret numbers using DNA sequences. The "passwords" of interest were decomposed into single letters and translated into font images coded on separate DNA chains with both coding regions, in which the images are encoded based on the novel run-length encoding rule, and non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original "passwords." The latter processes require molecular biological tools for digestion and ligation of the fragmented DNA molecules, targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interest over the image-coding DNA are also discussed.
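
    As a rough illustration of run-length encoding a bitmap row into DNA, a minimal Python sketch follows. The encoding rule here is hypothetical and is not the rule proposed in the article: runs alternate between 0 and 1 starting with 0, and each run length is written as three base-4 digits with A, C, G, T standing for 0, 1, 2, 3.

```python
BASES = "ACGT"  # hypothetical digit assignment: A=0, C=1, G=2, T=3


def run_lengths(bits):
    """Run lengths of a 0/1 string, beginning with the (possibly empty) 0-run."""
    runs, current, count = [], "0", 0
    for b in bits:
        if b not in "01":
            raise ValueError("bits must contain only '0' and '1'")
        if b == current:
            count += 1
        else:
            runs.append(count)
            current, count = b, 1
    runs.append(count)
    return runs


def runs_to_dna(runs, width=3):
    """Write each run length as `width` base-4 digits over A, C, G, T."""
    out = []
    for r in runs:
        if not 0 <= r < 4 ** width:
            raise ValueError("run length does not fit in the chosen width")
        digits = []
        for _ in range(width):
            digits.append(BASES[r % 4])
            r //= 4
        out.append("".join(reversed(digits)))
    return "".join(out)


def dna_to_bits(dna, width=3):
    """Invert runs_to_dna followed by run_lengths."""
    if len(dna) % width:
        raise ValueError("DNA length must be a multiple of width")
    bits, value = [], "0"
    for i in range(0, len(dna), width):
        r = 0
        for ch in dna[i:i + width]:
            r = r * 4 + BASES.index(ch)
        bits.append(value * r)
        value = "1" if value == "0" else "0"
    return "".join(bits)


row = "0011100"                        # one row of a made-up font bitmap
encoded = runs_to_dna(run_lengths(row))
```

    Decoding the resulting sequence with dna_to_bits recovers the original row, so the mapping is lossless within the run-length limit set by the digit width.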

  11. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    PubMed Central

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-01-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study investigated whether lncRNAs as novel expression signatures are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expression of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expression of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expression of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying cadmium toxicity and become a novel biomarker of cadmium toxicity. PMID:26472689

  12. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    NASA Astrophysics Data System (ADS)

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-10-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study investigated whether lncRNAs as novel expression signatures are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expression of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expression of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expression of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying cadmium toxicity and become a novel biomarker of cadmium toxicity.

  13. The vicilin gene family of pea (Pisum sativum L.): a complete cDNA coding sequence for preprovicilin.

    PubMed Central

    Lycett, G W; Delauney, A J; Gatehouse, J A; Gilroy, J; Croy, R R; Boulter, D

    1983-01-01

    A cDNA plasmid bank has been constructed using mRNA from developing pea seeds and three cDNAs coding for vicilin polypeptides have been selected. These cDNAs have been sequenced and between them cover the whole of the coding sequence plus part of the 5' and 3' untranslated regions. Comparison with amino acid sequence data from the protein indicates that vicilin is synthesised as preprovicilin with subsequent removal of a signal peptide and a C-terminal peptide as well as post translational endo-proteolytic cleavage. The cDNAs represent two different classes of vicilin genes whilst amino acid data show that there are at least three major classes of vicilin polypeptide. The vicilin sequences show extensive homology with conglycinin and phaseolin except in the regions of the internal proteolytic cleavages. The evolutionary significance of this relationship is discussed. PMID:6687941

  14. Bio-bar-code dendrimer-like DNA as signal amplifier for cancerous cells assay using ruthenium nanoparticle-based ultrasensitive chemiluminescence detection.

    PubMed

    Bi, Sai; Hao, Shuangyuan; Li, Li; Zhang, Shusheng

    2010-09-07

    Bio-bar-code dendrimer-like DNA (bbc-DL-DNA) is employed as a label for the amplification assay of cancer cells in combination with the newly explored chemiluminescence (CL) system of luminol-H(2)O(2)-Ru(3+) and specificity of structure-switching aptamers selected by cell-based SELEX.

  15. A novel non-coding RNA lncRNA-JADE connects DNA damage signalling to histone H4 acetylation.

    PubMed

    Wan, Guohui; Hu, Xiaoxiao; Liu, Yunhua; Han, Cecil; Sood, Anil K; Calin, George A; Zhang, Xinna; Lu, Xiongbin

    2013-10-30

    A prompt and efficient DNA damage response (DDR) eliminates the detrimental effects of DNA lesions in eukaryotic cells. Basic and preclinical studies suggest that the DDR is one of the primary anti-cancer barriers during tumorigenesis. The DDR involves a complex network of processes that detect and repair DNA damage, in which long non-coding RNAs (lncRNAs), a new class of regulatory RNAs, may play an important role. In the current study, we identified a novel lncRNA, lncRNA-JADE, that is induced after DNA damage in an ataxia-telangiectasia mutated (ATM)-dependent manner. LncRNA-JADE transcriptionally activates Jade1, a key component in the HBO1 (human acetylase binding to ORC1) histone acetylation complex. Consequently, lncRNA-JADE induces histone H4 acetylation in the DDR. Markedly higher levels of lncRNA-JADE were observed in human breast tumours in comparison with normal breast tissues. Knockdown of lncRNA-JADE significantly inhibited breast tumour growth in vivo. On the basis of these results, we propose that lncRNA-JADE is a key functional link that connects the DDR to histone H4 acetylation, and that dysregulation of lncRNA-JADE may contribute to breast tumorigenesis.

  16. Conservation of genetic information: a code for site-specific DNA recognition.

    PubMed Central

    Harris, L F; Sullivan, M R; Hickok, D F

    1993-01-01

    We present findings of genetic information conservation between the glucocorticoid response element (GRE) DNA and the cDNA encoding the glucocorticoid receptor (GR) DNA-binding domain (DBD). The regions of nucleotide sub-sequence similarity to the GRE in the GR DBD occur specifically at nucleotide sequences on the ends of exons 3, 4, and 5 at their splice junction sites. These sequences encode the DNA recognition helix on exon 3, a beta-strand on exon 4, and a putative alpha-helix on exon 5, respectively. The nucleotide sequence of exon 5 that encodes the putative alpha-helix located on the carboxyl terminus of the GR DBD shares sequence similarity with the flanking nucleotide regions of the GRE. We generated a computer model of the GR DBD using atomic coordinates derived from nuclear magnetic resonance spectroscopy to which we attached the exon 5-encoded putative alpha-helix. We docked this GR DBD structure at the 39-base-pair nucleotide sequence containing the GRE binding site and flanking nucleotides, which contained conserved genetic information. We observed that amino acids of the DNA recognition helix, the beta-strand, and the putative alpha-helix are spatially aligned with trinucleotides identical to their cognate codons within the GRE and its flanking nucleotides. PMID:8516297

  17. Characterization of EBV Promoters and Coding Regions by Sequencing PCR-Amplified DNA Fragments.

    PubMed

    Szenthe, Kalman; Bánáti, Ferenc

    2017-01-01

    DNA sequencing approaches originally developed in two directions, the chemical degradation method and the chain-termination method. The latter became more widespread, and a huge amount of sequencing data, including whole genome sequences, accumulated, based on the use of capillary sequencer systems and the application of a modified chain-termination method which proved to be relatively easy, fast, and reliable. In addition, relatively long sequences of up to 1000 bp could be obtained in a single read with high per-base accuracy. Although the recent appearance of next-generation DNA sequencing (NGS) technologies enabled high-throughput and low-cost analysis of DNA, the modified chain-termination methods are still often applied in research. In the following, we shall present the application of capillary sequencing for the sequence characterization of viral genomes in the case of partial and whole genome sequencing, and demonstrate it on the BARF1 promoter of Epstein Barr virus (EBV).

  18. RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA

    PubMed Central

    Wright, Imogen A.; Travers, Simon A.

    2014-01-01

    The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance. PMID:24861618

  19. iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples.

    PubMed

    Kabir, Muhammad; Hayat, Maqsood

    2016-02-01

    Meiotic recombination is vital for maintaining the sequence diversity in the human genome. Meiosis and recombination are considered the essential phases of cell division. In meiosis, the genome is divided into equal parts for sexual reproduction, whereas in recombination, the diverse genomes are combined to form new combinations of genetic variations. The recombination process does not occur randomly across the genome; it targets specific areas called recombination "hotspots" and "coldspots". Owing to the huge growth of polygenetic sequences in data banks, it is impossible to recognize the sequences through conventional methods. Given the significance of recombination spots, it is indispensable to develop an accurate, fast, robust, and high-throughput automated computational model. In this model, the numerical descriptors are extracted using two sequence representation schemes, namely dinucleotide composition and trinucleotide composition. The performances of seven classification algorithms were investigated. Finally, the predicted outcomes of individual classifiers are fused to form an ensemble classification, which is formed through majority voting and a genetic algorithm (GA). The performance of the GA-based ensemble model is quite promising compared to individual classifiers and the majority voting-based ensemble model. iRSpot-GAEnsC has achieved 84.46% accuracy. The empirical results revealed that the performance of iRSpot-GAEnsC is not only higher than the examined algorithms but also better than existing methods in the literature developed so far. It is anticipated that the proposed model might be helpful for the research community, academia and drug discovery.
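
    The two descriptors and the majority-voting fusion named in this abstract can be illustrated with a minimal Python sketch. It is not the iRSpot-GAEnsC implementation: the function names are hypothetical, concatenating the two compositions into one vector is an assumption, and the GA-based weighting of classifiers is not shown.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k):
    """Return normalized k-mer frequencies over the alphabet ACGT (4**k values)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def features(seq):
    """Concatenate dinucleotide (16) and trinucleotide (64) composition.

    Concatenation is an assumption made for this sketch.
    """
    return kmer_composition(seq, 2) + kmer_composition(seq, 3)


def majority_vote(labels):
    """Fuse per-classifier labels by simple majority; ties go to the label seen first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label
```

    Each base classifier would be trained on `features(seq)`, and `majority_vote` would then be applied to their predicted labels for a given sequence.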

  20. An Abundant Class of Non-coding DNA Can Prevent Stochastic Gene Silencing in the C. elegans Germline.

    PubMed

    Frøkjær-Jensen, Christian; Jain, Nimit; Hansen, Loren; Davis, M Wayne; Li, Yongbin; Zhao, Di; Rebora, Karine; Millet, Jonathan R M; Liu, Xiao; Kim, Stuart K; Dupuy, Denis; Jorgensen, Erik M; Fire, Andrew Z

    2016-07-14

    Cells benefit from silencing foreign genetic elements but must simultaneously avoid inactivating endogenous genes. Although chromatin modifications and RNAs contribute to maintenance of silenced states, the establishment of silenced regions will inevitably reflect underlying DNA sequence and/or structure. Here, we demonstrate that a pervasive non-coding DNA feature in Caenorhabditis elegans, characterized by 10-base pair periodic An/Tn-clusters (PATCs), can license transgenes for germline expression within repressive chromatin domains. Transgenes containing natural or synthetic PATCs are resistant to position effect variegation and stochastic silencing in the germline. Among endogenous genes, intron length and PATC-character undergo dramatic changes as orthologs move from active to repressive chromatin over evolutionary time, indicating a dynamic character to the An/Tn periodicity. We propose that PATCs form the basis of a cellular immune system, identifying certain endogenous genes in heterochromatic contexts as privileged while foreign DNA can be suppressed with no requirement for a cellular memory of prior exposure.

  1. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829

  2. The non-coding B2 RNA binds to the DNA cleft and active site region of RNA polymerase II

    PubMed Central

    Ponicsan, Steven L.; Houel, Stephane; Old, William M.; Ahn, Natalie G.; Goodrich, James A.; Kugel, Jennifer F.

    2013-01-01

    The B2 family of short interspersed elements is transcribed into non-coding RNA by RNA polymerase III. The ~180 nt B2 RNA has been shown to potently repress mRNA transcription by binding tightly to RNA polymerase II (Pol II) and assembling with it into complexes on promoter DNA, where it keeps the polymerase from properly engaging the promoter DNA. Mammalian Pol II is a ~500 kD complex that contains 12 different protein subunits, providing many possible surfaces for interaction with B2 RNA. We found that the carboxy-terminal domain of the largest Pol II subunit was not required for B2 RNA to bind Pol II and repress transcription in vitro. To identify the surface on Pol II to which the minimal functional region of B2 RNA binds, we coupled multi-step affinity purification, reversible formaldehyde crosslinking, peptide sequencing by mass spectrometry, and analysis of peptide enrichment. The Pol II peptides most highly recovered after crosslinking to B2 RNA mapped to the DNA binding cleft and active site region of Pol II. These studies determine the location of a defined nucleic acid binding site on a large, native, multi-subunit complex and provide insight into the mechanism of transcriptional repression by B2 RNA. PMID:23416138

  3. Population dynamics coded in DNA: genetic traces of the expansion of modern humans

    NASA Astrophysics Data System (ADS)

    Kimmel, Marek

    1999-12-01

    It has been proposed that modern humans evolved from a small ancestral population, which appeared several hundred thousand years ago in Africa. Descendants of the founder group migrated to Europe and then to Asia, not mixing with the pre-existing local populations but replacing them. Two demographic elements are present in this “out of Africa” hypothesis: numerical growth of the modern humans and their migration into Eurasia. Did these processes leave an imprint in our DNA? To address this question, we use the classical Fisher-Wright-Moran model of population genetics, assuming variable population size and two models of mutation: the infinite-sites model and the stepwise-mutation model. We use the coalescence theory, which amounts to tracing the common ancestors of contemporary genes. We obtain mathematical formulae expressing the distribution of alleles given the time changes of population size. In the framework of the infinite-sites model, simulations indicate that the pattern of past population size change leaves its signature on the pattern of DNA polymorphism. Application of the theory to the published mitochondrial DNA sequences indicates that the current mitochondrial DNA sequence variation is not inconsistent with the logistic growth of the modern human population. In the framework of the stepwise-mutation model, we demonstrate that population bottleneck followed by growth in size causes an imbalance between allele-size variance and heterozygosity. We analyze a set of data on tetranucleotide repeats which reveals the existence of this imbalance. The pattern of imbalance is consistent with the bottleneck being most ancient in Africans, most recent in Asians and intermediate in Europeans. These findings are consistent with the “out of Africa” hypothesis, although by no means do they constitute its proof.

  4. Fine-tuning the ubiquitin code at DNA double-strand breaks: deubiquitinating enzymes at work

    PubMed Central

    Citterio, Elisabetta

    2015-01-01

    Ubiquitination is a reversible protein modification broadly implicated in cellular functions. Signaling processes mediated by ubiquitin (ub) are crucial for the cellular response to DNA double-strand breaks (DSBs), one of the most dangerous types of DNA lesions. In particular, the DSB response critically relies on active ubiquitination by the RNF8 and RNF168 ub ligases at the chromatin, which is essential for proper DSB signaling and repair. How this pathway is fine-tuned and what the functional consequences of its deregulation are for genome integrity and tissue homeostasis are the subject of intense investigation. One important regulatory mechanism is the reversal of substrate ubiquitination through the activity of specific deubiquitinating enzymes (DUBs), as supported by the implication of a growing number of DUBs in DNA damage response processes. Here, we discuss the current knowledge of how ub-mediated signaling at DSBs is controlled by DUBs, with a main focus on DUBs targeting histone H2A and on their recent implication in stem cell biology and cancer. PMID:26442100

  5. African swine fever virus ORF P1192R codes for a functional type II DNA topoisomerase.

    PubMed

    Coelho, João; Martins, Carlos; Ferreira, Fernando; Leitão, Alexandre

    2015-01-01

    Topoisomerases modulate the topological state of DNA during processes, such as replication and transcription, that cause overwinding and/or underwinding of the DNA. African swine fever virus (ASFV) is a nucleo-cytoplasmic double-stranded DNA virus shown to contain an ORF (P1192R) with homology to type II topoisomerases. Here we observed that pP1192R is highly conserved among ASFV isolates but dissimilar from other viral, prokaryotic or eukaryotic type II topoisomerases. In both ASFV/Ba71V-infected Vero cells and ASFV/L60-infected pig macrophages we detected pP1192R at intermediate and late phases of infection, cytoplasmically localized and accumulating in the viral factories. Finally, we used a Saccharomyces cerevisiae temperature-sensitive strain in order to demonstrate, through complementation and in vitro decatenation assays, the functionality of P1192R, which we further confirmed by mutating its predicted catalytic residue. Overall, this work strengthens the idea that P1192R constitutes a target for studying, and possibly controlling, ASFV transcription and replication.

  6. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system

    PubMed Central

    Kawano, Tomonori

    2013-01-01

    There have been a wide variety of approaches for handling pieces of DNA as “unplugged” tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes as the media for storing digitally compressed image data are proposed for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given “passwords” and/or secret numbers using DNA sequences. The “passwords” of interest were decomposed into single letters and translated into font images coded on separate DNA chains with both the coding regions, in which the images are encoded based on the novel run-length encoding rule, and the non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original “passwords.” The latter processes require molecular biological tools for digestion and ligation of the fragmented DNA molecules targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interest over the image-coding DNA are also discussed. PMID:23750303
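
    The general idea of run-length encoding a bitmap row into nucleotides can be sketched in Python as follows. The abstract does not give the article's actual encoding rule, so everything here is a hypothetical illustration: the digit mapping A=0, C=1, G=2, T=3, the fixed two-base width per run, and the convention that the first run counts "0" pixels are all assumptions.

```python
BASES = "ACGT"  # hypothetical digit mapping: A=0, C=1, G=2, T=3


def run_lengths(bits):
    """Run-length encode a bit string; the first run counts '0' pixels (may be 0)."""
    runs, current, count = [], "0", 0
    for b in bits:
        if b == current:
            count += 1
        else:
            runs.append(count)
            current, count = b, 1
    runs.append(count)
    return runs


def runs_to_dna(runs, width=2):
    """Write each run length as `width` base-4 digits (maximum run 4**width - 1)."""
    out = []
    for r in runs:
        if r >= 4 ** width:
            raise ValueError("run too long for the chosen width")
        digits = []
        for _ in range(width):
            digits.append(BASES[r % 4])
            r //= 4
        out.append("".join(reversed(digits)))
    return "".join(out)


def dna_to_bits(dna, width=2):
    """Invert runs_to_dna: read run lengths and alternate pixel colours from '0'."""
    bits, colour = [], "0"
    for i in range(0, len(dna), width):
        r = 0
        for ch in dna[i:i + width]:
            r = r * 4 + BASES.index(ch)
        bits.append(colour * r)
        colour = "1" if colour == "0" else "0"
    return "".join(bits)
```

    For example, `runs_to_dna(run_lengths("0001111000"))` gives `"ATCAAT"` under these assumptions, and `dna_to_bits("ATCAAT")` restores the original row.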

  7. Isolation and identification of a cDNA clone coding for an HLA-DR transplantation antigen alpha-chain.

    PubMed

    Gustafsson, K; Bill, P; Larhammar, D; Wiman, K; Claesson, L; Schenning, L; Servenius, B; Sundelin, J; Rask, L; Peterson, P A

    1982-10-01

    Membrane-bound mRNA was isolated from Raji cells and enriched for message coding for the HLA-DR transplantation antigen alpha-chain by sucrose gradient centrifugation. Double-stranded cDNA was constructed from this mRNA fraction, ligated to plasmid pBR322, and cloned into Escherichia coli. By hybrid selection, a plasmid, pDR-alpha-1, able to hybridize with mRNA coding for the HLA-DR alpha-chain was identified. From the nucleotide sequence of one end of the insert an amino acid sequence was predicted which is identical to part of the amino-terminal sequence of an HLA-DR alpha-chain preparation isolated from Raji cells. This clearly shows that pDR-alpha-1 carries almost the complete message for an HLA-DR alpha-chain. From the nucleotide sequence of this plasmid it will be possible to predict the primary structure of an HLA-DR alpha-chain.

  8. Variable continental distribution of polymorphisms in the coding regions of DNA-repair genes.

    PubMed

    Mathonnet, Géraldine; Labuda, Damian; Meloche, Caroline; Wambach, Tina; Krajinovic, Maja; Sinnett, Daniel

    2003-01-01

    DNA-repair pathways are critical for maintaining the integrity of the genetic material by protecting against mutations due to exposure-induced damage or replication errors. Polymorphisms in the corresponding genes may be relevant in genetic epidemiology by modifying individual cancer susceptibility or therapeutic response. We report data on the population distribution of potentially functional variants in XRCC1, APEX1, ERCC2, ERCC4, hMLH1, and hMSH3 genes among groups representing individuals of European, Middle Eastern, African, Southeast Asian and North American descent. The data indicate little interpopulation differentiation in some of these polymorphisms and typical FST values ranging from 10 to 17% at others. Low FST was observed in APEX1 and hMSH3 exon 23 in spite of their relatively high minor allele frequencies, which could suggest the effect of balancing selection. In XRCC1, hMSH3 exon 21 and hMLH1, Africa clusters either with Middle East and Europe or with Southeast Asia, which could be related to the demographic history of human populations, whereby human migrations and genetic drift rather than selection would account for the observed differences.

  9. A positive detecting code and its decoding algorithm for DNA library screening.

    PubMed

    Uehara, Hiroaki; Jimbo, Masakazu

    2009-01-01

    The study of gene functions requires high-quality DNA libraries. However, a large number of tests and screenings are necessary for compiling such libraries. We describe an algorithm for extracting as much information as possible from pooling experiments for library screening. Collections of clones are called pools, and a pooling experiment is a group test for detecting all positive clones. The probability of positiveness for each clone is estimated according to the outcomes of the pooling experiments. Clones with a high chance of positiveness are subjected to confirmatory testing. In this paper, we introduce a new positive clone detecting algorithm, called the Bayesian network pool result decoder (BNPD). The performance of BNPD is compared, by simulation, with that of the Markov chain pool result decoder (MCPD) proposed by Knill et al. in 1996. Moreover, the combinatorial properties of pooling designs suitable for the proposed algorithm are discussed in conjunction with combinatorial designs and d-disjunct matrices. We also show the advantage of utilizing packing designs or BIB designs for the BNPD algorithm.
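
    The group-testing setting described here can be made concrete with a short Python sketch. It is neither BNPD nor MCPD: it is the naive combinatorial decoder, which assumes error-free pool outcomes and assigns no probabilities. The function name and the data layout (a dict from pool id to a set of clone ids) are hypothetical.

```python
def decode_pools(pools, positive_pools):
    """Naive decoding of pooling outcomes, assuming error-free tests.

    pools: dict mapping pool id -> set of clone ids in that pool.
    positive_pools: set of pool ids that tested positive.
    Returns (candidates, definite): clones not ruled out, and clones proven positive.
    """
    clones = set().union(*pools.values())
    # A clone that appears in any negative pool cannot be positive.
    cleared = set()
    for pool_id, members in pools.items():
        if pool_id not in positive_pools:
            cleared |= members
    candidates = clones - cleared
    # A candidate is proven positive if it is the only candidate left in a positive pool.
    definite = set()
    for pool_id in positive_pools:
        remaining = pools[pool_id] & candidates
        if len(remaining) == 1:
            definite |= remaining
    return candidates, definite
```

    Candidates that are not in `definite` are the ones that would go on to confirmatory testing; a probabilistic decoder such as the one in the abstract would instead rank them by estimated probability of being positive.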

  10. DNA-LCEB: a high-capacity and mutation-resistant DNA data-hiding approach by employing encryption, error correcting codes, and hybrid twofold and fourfold codon-based strategy for synonymous substitution in amino acids.

    PubMed

    Hafeez, Ibbad; Khan, Asifullah; Qadir, Abdul

    2014-11-01

    Data-hiding in deoxyribonucleic acid (DNA) sequences can be used to develop an organic memory and to track parent genes in an offspring as well as in genetically modified organisms. However, the main concerns regarding data-hiding in DNA sequences are the survival of the organism and successful extraction of the watermark from DNA. This implies that the organism should live and reproduce without any functional disorder even in the presence of the embedded data. Consequently, performing synonymous substitution in amino acids for watermarking becomes a primary option. In this regard, a hybrid watermark embedding strategy that employs synonymous substitution in both twofold and fourfold codons of amino acids is proposed. This work thus presents a high-capacity and mutation-resistant watermarking technique, DNA-LCEB, for hiding secret information in DNA of living organisms. By employing the different types of synonymous codons of amino acids, the data storage capacity has been significantly increased. It is further observed that the proposed DNA-LCEB employing a combination of synonymous substitution, lossless compression, encryption, and Bose-Chaudhuri-Hocquenghem coding is secure and performs better in terms of both capacity and robustness compared to existing DNA data-hiding schemes. The proposed DNA-LCEB is tested against different mutations, including silent, missense, and nonsense mutations, and provides substantial improvement in terms of mutation detection/correction rate and bits per nucleotide. A web application for DNA-LCEB is available at http://111.68.99.218/DNA-LCEB.
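
    Synonymous substitution in fourfold codons can be illustrated with a minimal Python sketch. It is not the DNA-LCEB scheme: the twofold strategy, compression, encryption and BCH coding are all omitted, and the bit-to-base mapping and function names are hypothetical. The sketch relies only on the fact that, in the standard genetic code, the eight codon families listed below encode the same amino acid for any third base.

```python
# First two bases of the eight fourfold-degenerate codon families in the
# standard genetic code (Ala, Gly, Pro, Thr, Val, Arg CGN, Leu CTN, Ser TCN).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}  # hypothetical mapping
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def embed(cds, bits):
    """Hide a bit string, two bits per fourfold codon, by rewriting the third base."""
    if len(cds) % 3 or len(bits) % 2:
        raise ValueError("need whole codons and an even number of bits")
    out, pos = [], 0
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and pos < len(bits):
            codon = codon[:2] + BITS_TO_BASE[bits[pos:pos + 2]]
            pos += 2
        out.append(codon)
    if pos < len(bits):
        raise ValueError("not enough fourfold codons for the payload")
    return "".join(out)


def extract(cds, n_bits):
    """Read back n_bits from the third bases of the leading fourfold codons."""
    bits = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and 2 * len(bits) < n_bits:
            bits.append(BASE_TO_BITS[codon[2]])
    return "".join(bits)
```

    With these assumptions, `embed("ATGGCTGGAAAATAA", "0110")` changes only the third bases of the Ala and Gly codons, so the encoded peptide is unchanged, and `extract` on the result with `n_bits=4` recovers `"0110"`.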

  11. Dynamic system classifier

    NASA Astrophysics Data System (ADS)

    Pumpe, Daniel; Greiner, Maksim; Müller, Ewald; Enßlin, Torsten A.

    2016-07-01

    Stochastic differential equations describe well many physical, biological, and sociological systems, despite the simplification often made in their derivation. Here the usage of simple stochastic differential equations to characterize and classify complex dynamical systems is proposed within a Bayesian framework. To this end, we develop a dynamic system classifier (DSC). The DSC first abstracts training data of a system in terms of time-dependent coefficients of the descriptive stochastic differential equation. Thereby the DSC identifies unique correlation structures within the training data. For definiteness we restrict the presentation of the DSC to oscillation processes with a time-dependent frequency ω(t) and damping factor γ(t). Although real systems might be more complex, this simple oscillator captures many characteristic features. The ω and γ time lines represent the abstract system characterization and permit the construction of efficient signal classifiers. Numerical experiments show that such classifiers perform well even in the low signal-to-noise regime.

  12. Dynamic system classifier.

    PubMed

    Pumpe, Daniel; Greiner, Maksim; Müller, Ewald; Enßlin, Torsten A

    2016-07-01

    Stochastic differential equations describe well many physical, biological, and sociological systems, despite the simplification often made in their derivation. Here the usage of simple stochastic differential equations to characterize and classify complex dynamical systems is proposed within a Bayesian framework. To this end, we develop a dynamic system classifier (DSC). The DSC first abstracts training data of a system in terms of time-dependent coefficients of the descriptive stochastic differential equation. Thereby the DSC identifies unique correlation structures within the training data. For definiteness we restrict the presentation of the DSC to oscillation processes with a time-dependent frequency ω(t) and damping factor γ(t). Although real systems might be more complex, this simple oscillator captures many characteristic features. The ω and γ time lines represent the abstract system characterization and permit the construction of efficient signal classifiers. Numerical experiments show that such classifiers perform well even in the low signal-to-noise regime.

  13. A non-coding plastid DNA phylogeny of Asian Begonia (Begoniaceae): evidence for morphological homoplasy and sectional polyphyly.

    PubMed

    Thomas, D C; Hughes, M; Phutthai, T; Rajbhandary, S; Rubite, R; Ardi, W H; Richardson, J E

    2011-09-01

    Maximum likelihood and Bayesian analyses of non-coding plastid DNA sequence data based on a broad sampling of all major Asian Begonia sections (ndhA intron, ndhF-rpl32 spacer, rpl32-trnL spacer, 3977 aligned characters, 84 species) were used to reconstruct the phylogeny of Asian Begonia and to test the monophyly of major Asian Begonia sections. Ovary and fruit characters which are crucial in current sectional circumscriptions were mapped on the phylogeny to assess their utility in infrageneric classifications. The results indicate that the strong systematic emphasis placed on single, homoplasious characters such as undivided placenta lamellae (section Reichenheimia) and fleshy pericarps (section Sphenanthera), and the recognition of sections primarily based on a suite of plesiomorphic characters including three-locular ovaries with axillary, bilamellate placentae and dry, dehiscent pericarps (section Diploclinium), has resulted in the circumscription of several polyphyletic sections. Moreover, sections Platycentrum and Petermannia were recovered as paraphyletic. Because of the homoplasy of systematically important characters, current classifications have a certain diagnostic, but only poor predictive value. The presented phylogeny provides for the first time a reasonably resolved and supported phylogenetic framework for Asian Begonia which has the power to inform future taxonomic, biogeographic and evolutionary studies.

  14. Hierarchical Pattern Classifier

    NASA Technical Reports Server (NTRS)

    Yates, Gigi L.; Eberlein, Susan J.

    1992-01-01

    Hierarchical pattern classifier reduces number of comparisons between input and memory vectors without reducing detail of final classification by dividing classification process into coarse-to-fine hierarchy that comprises first "grouping" step and second classification step. Three-layer neural network reduces computation further by reducing number of vector dimensions in processing. Concept applicable to pattern-classification problems with need to reduce amount of computation necessary to classify, identify, or match patterns to desired degree of resolution.
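
    The coarse-to-fine idea can be illustrated with a small Python sketch. It is not the neural-network design of the record: it uses plain nearest-neighbour matching, the group centroids are supplied by the caller, the dimension-reduction step is not shown, and all names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_groups(memory, centroids):
    """Grouping step: assign each stored (label, vector) to its nearest centroid."""
    groups = {i: [] for i in range(len(centroids))}
    for label, vec in memory:
        g = min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))
        groups[g].append((label, vec))
    return groups


def classify(x, centroids, groups):
    """Coarse step picks a group; fine step searches only inside that group.

    Returns (label, number_of_vector_comparisons).
    """
    g = min(range(len(centroids)), key=lambda i: sq_dist(x, centroids[i]))
    # Fall back to every stored vector if the chosen group happens to be empty.
    candidates = groups[g] or [item for items in groups.values() for item in items]
    label, _ = min(candidates, key=lambda item: sq_dist(x, item[1]))
    return label, len(centroids) + len(candidates)
```

    With G groups of roughly equal size over N stored vectors, the input is compared against about G + N/G vectors instead of N, which is where the saving comes from.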

  15. Genome defense against exogenous nucleic acids in eukaryotes by non-coding DNA occurs through CRISPR-like mechanisms in the cytosol and the bodyguard protection in the nucleus.

    PubMed

    Qiu, Guo-Hua

    2016-01-01

    In this review, the protective function of the abundant non-coding DNA in the eukaryotic genome is discussed from the perspective of genome defense against exogenous nucleic acids. Peripheral non-coding DNA has been proposed to act as a bodyguard that protects the genome and the central protein-coding sequences from ionizing radiation-induced DNA damage. In the proposed mechanism of protection, the radicals generated by water radiolysis in the cytosol and IR energy are absorbed, blocked and/or reduced by peripheral heterochromatin; then, the DNA damage sites in the heterochromatin are removed and expelled from the nucleus to the cytoplasm through nuclear pore complexes, most likely through the formation of extrachromosomal circular DNA. To strengthen this hypothesis, this review summarizes the experimental evidence supporting the protective function of non-coding DNA against exogenous nucleic acids. Based on these data, I hypothesize herein about the presence of an additional line of defense formed by small RNAs in the cytosol in addition to their bodyguard protection mechanism in the nucleus. Therefore, exogenous nucleic acids may be initially inactivated in the cytosol by small RNAs generated from non-coding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. Exogenous nucleic acids may enter the nucleus, where some are absorbed and/or blocked by heterochromatin and others integrate into chromosomes. The integrated fragments and the sites of DNA damage are removed by repetitive non-coding DNA elements in the heterochromatin and excluded from the nucleus. Therefore, the normal eukaryotic genome and the central protein-coding sequences are triply protected by non-coding DNA against invasion by exogenous nucleic acids. This review provides evidence supporting the protective role of non-coding DNA in genome defense.

  16. The Arabidopsis HOMOLOGY-DEPENDENT GENE SILENCING1 Gene Codes for an S-Adenosyl-l-Homocysteine Hydrolase Required for DNA Methylation-Dependent Gene Silencing

    PubMed Central

    Rocha, Pedro S.C.F.; Sheikh, Mazhar; Melchiorre, Rosalba; Fagard, Mathilde; Boutet, Stéphanie; Loach, Rebecca; Moffatt, Barbara; Wagner, Conrad; Vaucheret, Hervé; Furner, Ian

    2005-01-01

    Genes introduced into higher plant genomes can become silent (gene silencing) and/or cause silencing of homologous genes at unlinked sites (homology-dependent gene silencing or HDG silencing). Mutations of the HOMOLOGY-DEPENDENT GENE SILENCING1 (HOG1) locus relieve transcriptional gene silencing and methylation-dependent HDG silencing and result in genome-wide demethylation. The hog1 mutant plants also grow slowly and have low fertility and reduced seed germination. Three independent mutants of HOG1 were each found to have point mutations at the 3′ end of a gene coding for S-adenosyl-l-homocysteine (SAH) hydrolase, and hog1-1 plants show reduced SAH hydrolase activity. A transposon (hog1-4) and a T-DNA tag (hog1-5) in the HOG1 gene each behaved as zygotic embryo lethal mutants and could not be made homozygous. The results suggest that the homozygous hog1 point mutants are leaky and result in genome demethylation and poor growth and that homozygous insertion mutations result in zygotic lethality. Complementation of the hog1-1 point mutation with a T-DNA containing the gene coding for SAH hydrolase restored gene silencing, HDG silencing, DNA methylation, fast growth, and normal seed viability. The same T-DNA also complemented the zygotic embryo lethal phenotype of the hog1-4 tagged mutant. A model relating the HOG1 gene, DNA methylation, and methylation-dependent HDG silencing is presented. PMID:15659630

  17. New Insights into the Lake Chad Basin Population Structure Revealed by High-Throughput Genotyping of Mitochondrial DNA Coding SNPs

    PubMed Central

    Černý, Viktor; Carracedo, Ángel

    2011-01-01

    Background Located in the Sudan belt, the Chad Basin forms a remarkable ecosystem, where several unique agricultural and pastoral techniques have been developed. Both from an archaeological and a genetic point of view, this region has been interpreted to be the center of a bidirectional corridor connecting West and East Africa, as well as a meeting point for populations coming from North Africa through the Saharan desert. Methodology/Principal Findings Samples from twelve ethnic groups from the Chad Basin (n = 542) have been high-throughput genotyped for 230 coding region mitochondrial DNA (mtDNA) Single Nucleotide Polymorphisms (mtSNPs) using Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight (MALDI-TOF) mass spectrometry. This set of mtSNPs allowed for much better phylogenetic resolution than previous studies of this geographic region, enabling new insights into its population history. Notable haplogroup (hg) heterogeneity has been observed in the Chad Basin mirroring the different demographic histories of these ethnic groups. As estimated using a Bayesian framework, nomadic populations showed negative growth which was not always correlated to their estimated effective population sizes. Nomads also showed lower diversity values than sedentary groups. Conclusions/Significance Compared to sedentary populations, nomads showed signals of stronger genetic drift occurring in their ancestral populations. These populations, however, retained more haplotype diversity in their hypervariable segments I (HVS-I), but not their mtSNPs, suggesting a more ancestral ethnogenesis. Whereas the nomadic population showed a higher Mediterranean influence signaled mainly by sub-lineages of M1, R0, U6, and U5, the other populations showed a more consistent sub-Saharan pattern. Although lifestyle may have an influence on diversity patterns and hg composition, analysis of molecular variance has not identified these differences. The present study indicates that analysis of mt

  18. Recognition Using Hybrid Classifiers.

    PubMed

    Osadchy, Margarita; Keren, Daniel; Raviv, Dolev

    2016-04-01

    A canonical problem in computer vision is category recognition (e.g., find all instances of human faces, cars etc., in an image). Typically, the input for training a binary classifier is a relatively small sample of positive examples, and a huge sample of negative examples, which can be very diverse, consisting of images from a large number of categories. The difficulty of the problem sharply increases with the dimension and size of the negative example set. We propose to alleviate this problem by applying a "hybrid" classifier, which replaces the negative samples by a prior, and then finds a hyperplane which separates the positive samples from this prior. The method is extended to kernel space and to an ensemble-based approach. The resulting binary classifiers achieve an identical or better classification rate than SVM, while requiring far smaller memory and lower computational complexity to train and apply.

  19. Detecting Selection in the Blue Crab, Callinectes sapidus, Using DNA Sequence Data from Multiple Nuclear Protein-Coding Genes

    PubMed Central

    Yednock, Bree K.; Neigel, Joseph E.

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available. PMID:24896825
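
    The McDonald-Kreitman test mentioned above compares nonsynonymous and synonymous counts for fixed differences between species against the same counts for polymorphisms within a species. The following is a minimal Python sketch of that comparison, assuming a two-sided Fisher's exact test on the 2x2 table; the function names and the counts in the usage lines are hypothetical and are not taken from the blue crab study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mcdonald_kreitman(dn, ds, pn, ps):
    """Return (neutrality index, p-value) for an MK table.

    dn, ds: nonsynonymous / synonymous fixed differences between species.
    pn, ps: nonsynonymous / synonymous polymorphisms within a species.
    Zero counts in ps, dn or ds are not handled in this sketch.
    """
    neutrality_index = (pn / ps) / (dn / ds)
    return neutrality_index, fisher_exact_two_sided(dn, ds, pn, ps)


# Hypothetical counts, chosen only to illustrate the calculation.
ni, p_value = mcdonald_kreitman(dn=7, ds=17, pn=2, ps=42)
print(f"NI = {ni:.3f}, p = {p_value:.4f}")
```

    A neutrality index below 1 together with a small p-value indicates an excess of nonsynonymous fixed differences, the pattern usually read as positive selection on amino acid replacements.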

  20. Detecting selection in the blue crab, Callinectes sapidus, using DNA sequence data from multiple nuclear protein-coding genes.

    PubMed

    Yednock, Bree K; Neigel, Joseph E

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available.

  1. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-02-20

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  2. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-01-01

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  3. Number in Classifier Languages

    ERIC Educational Resources Information Center

    Nomoto, Hiroki

    2013-01-01

    Classifier languages are often described as lacking genuine number morphology and treating all common nouns, including those conceptually count, as an unindividuated mass. This study argues that neither of these popular assumptions is true, and presents new generalizations and analyses gained by abandoning them. I claim that no difference exists…

  4. Classifying Cereal Data

    Cancer.gov

    The DSQ includes questions about cereal intake and allows respondents up to two responses on which cereals they consume. We classified each cereal reported first by hot or cold, and then along four dimensions: density of added sugars, whole grains, fiber, and calcium.

  5. Classifying Adolescent Perfectionists

    ERIC Educational Resources Information Center

    Rice, Kenneth G.; Ashby, Jeffrey S.; Gilman, Rich

    2011-01-01

    A large school-based sample of 9th-grade adolescents (N = 875) completed the Almost Perfect Scale-Revised (APS-R; Slaney, Mobley, Trippi, Ashby, & Johnson, 1996). Decision rules and cut-scores were developed and replicated that classify adolescents as one of two kinds of perfectionists (adaptive or maladaptive) or as nonperfectionists. A…

  6. C.U.R.R.F. (Codon Usage regarding Restriction Finder): a free Java(®)-based tool to detect potential restriction sites in both coding and non-coding DNA sequences.

    PubMed

    Gatter, Michael; Gatter, Thomas; Matthäus, Falk

    2012-10-01

    The synthesis of complete genes is becoming an increasingly popular approach in heterologous gene expression. Reasons for this are the decreasing prices and the numerous advantages over classic molecular cloning methods. Two of these advantages are the possibility of adapting the codon usage to the host organism and the option to introduce restriction enzyme target sites of choice. C.U.R.R.F. (Codon Usage regarding Restriction Finder) is a free Java(®)-based software program that detects possible restriction sites in both coding and non-coding DNA sequences by introducing multiple silent or non-silent mutations, respectively. The deviation of an alternative sequence containing a desired restriction motif from the sequence with the optimal codon usage is considered during the search for potential restriction sites in coding DNA and mRNA sequences as well as protein sequences. C.U.R.R.F is available at http://www.zvm.tu-dresden.de/die_tu_dresden/fakultaeten/fakultaet_mathematik_und_naturwissenschaften/fachrichtung_biologie/mikrobiologie/allgemeine_mikrobiologie/currf.
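
    The idea of finding restriction sites that can be introduced into a coding sequence by silent mutations alone can be sketched in a few lines of Python. This is not the C.U.R.R.F. implementation: it assumes the standard genetic code and an in-frame coding sequence, and it ignores the codon usage weighting described in the abstract. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate an in-frame DNA sequence; trailing partial codons are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def silent_site_positions(cds, site):
    """List (position, number_of_base_changes) where `site` can be written
    into `cds` without changing the encoded protein."""
    protein = translate(cds)
    hits = []
    for i in range(len(cds) - len(site) + 1):
        window = cds[i:i + len(site)]
        if window == site:
            continue  # the site is already present; nothing to introduce
        mutated = cds[:i] + site + cds[i + len(site):]
        if translate(mutated) == protein:
            changes = sum(a != b for a, b in zip(window, site))
            hits.append((i, changes))
    return hits


# ATG GAG TTT TAA encodes M-E-F-stop; an EcoRI site (GAATTC) fits silently
# at position 3 because GAA/GAG both encode Glu and TTC/TTT both encode Phe.
print(silent_site_positions("ATGGAGTTTTAA", "GAATTC"))  # [(3, 2)]
```

    Overwriting the window with the site and retranslating the whole sequence keeps the check simple, at the cost of retranslating once per position; a tool for long sequences would only re-examine the codons that overlap the window.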

  7. Generalized classifier neural network.

    PubMed

    Ozyildirim, Buse Melis; Avci, Mutlu

    2013-03-01

    In this work, a new radial basis function based classification neural network, named the generalized classifier neural network, is proposed. Unlike other radial basis function based neural networks such as the generalized regression neural network and the probabilistic neural network, the proposed network has five layers: input, pattern, summation, normalization and output. In addition to this topological difference, the proposed neural network adds gradient descent based optimization of the smoothing parameter and a diverge effect term in its calculations. The diverge effect term is an improvement to the summation layer calculation that supplies additional separation ability and flexibility. The performance of the generalized classifier neural network is compared with that of the probabilistic neural network, multilayer perceptron algorithm and radial basis function neural network on 9 different data sets, and with that of the generalized regression neural network on 3 different data sets that include only two classes, in the MATLAB environment. Better classification performance, up to 89%, is observed. The improved classification performance demonstrates the effectiveness of the proposed neural network.

  8. A sandwich-hybridization assay for simultaneous determination of HIV and tuberculosis DNA targets based on signal amplification by quantum dots-PowerVision™ polymer coding nanotracers.

    PubMed

    Yan, Zhongdan; Gan, Ning; Zhang, Huairong; Wang, De; Qiao, Li; Cao, Yuting; Li, Tianhua; Hu, Futao

    2015-09-15

    A novel sandwich-hybridization assay for simultaneous electrochemical detection of multiple DNA targets related to human immunodeficiency virus (HIV) and tuberculosis (TB) was developed based on different quantum dots-PowerVision(TM) polymer nanotracers. The polymer nanotracers were fabricated by immobilizing SH-labeled oligonucleotides (s-HIV or s-TB), which can partially hybridize with the target DNA (HIV or TB), on gold nanoparticles (Au NPs), which were then modified with PowerVision(TM) (PV) polymer-encapsulated quantum dots (CdS or PbS) as signal tags. PV is a dendrimer enzyme-linked polymer that can immobilize abundant QDs to amplify the stripping voltammetry signals from the metal ions (Pb or Cd). The capture probes were prepared by immobilizing SH-labeled oligonucleotides, which are complementary to HIV and TB DNA, on magnetic Fe3O4@Au beads (GMPs). After sandwich hybridization, the polymer nanotracers together with the HIV and TB DNA targets were simultaneously introduced onto the surface of the GMPs. The two encoding metal ions (Cd(2+) and Pb(2+)) were then used to differentiate the two target DNAs through their distinct anodic stripping voltammetric peaks at -0.84 V (Cd) and -0.61 V (Pb). Because of the excellent signal amplification of the polymer nanotracers and the high specificity for the DNA targets, this assay could detect target DNA as low as 0.2 femtomolar and exhibited excellent selectivity, with a dynamic range from 0.5 fM to 500 pM. These results demonstrate that this electrochemical coding assay has great potential for screening DNA from more viruses by changing the probes.

  9. The Use and Effectiveness of Triple Multiplex System for Coding Region Single Nucleotide Polymorphism in Mitochondrial DNA Typing of Archaeologically Obtained Human Skeletons from Premodern Joseon Tombs of Korea.

    PubMed

    Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon

    2015-01-01

    A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we examined whether the same triple multiplex analysis of coding region SNPs could also be applied to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. From the Joseon skeleton samples, we found that the mtDNA haplogroup determined by coding region SNP markers falls within the same haplogroup assigned by sequence analysis of the control region. Considering that control region mtDNA sequencing of ancient samples in previous studies produced a considerable number of errors, coding region SNP analysis can serve as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods.

  10. Identification of a cDNA clone that contains the complete coding sequence for a 140-kD rat NCAM polypeptide

    PubMed Central

    1987-01-01

    Neural cell adhesion molecules (NCAMs) are cell surface glycoproteins that appear to mediate cell-cell adhesion. In vertebrates NCAMs exist in at least three different polypeptide forms of apparent molecular masses 180, 140, and 120 kD. The 180- and 140-kD forms span the plasma membrane whereas the 120-kD form lacks a transmembrane region. In this study, we report the isolation of NCAM clones from an adult rat brain cDNA library. Sequence analysis indicated that the longest isolate, pR18, contains a 2,574 nucleotide open reading frame flanked by 208 bases of 5' and 409 bases of 3' untranslated sequence. The predicted polypeptide encoded by clone pR18 contains a single membrane-spanning region and a small cytoplasmic domain (120 amino acids), suggesting that it codes for a full-length 140-kD NCAM form. In Northern analysis, probes derived from 5' sequences of pR18, which presumably code for extracellular portions of the molecule, hybridized to five discrete mRNA size classes (7.4, 6.7, 5.2, 4.3, and 2.9 kb) in adult rat brain but not to liver or muscle RNA. However, the 5.2- and 2.9-kb mRNA size classes did not hybridize to either a large restriction fragment or three oligonucleotides derived from the putative transmembrane coding region and regions that lie 3' to it. The 3' probes did hybridize to the 7.4-, 6.7-, and 4.3-kb message size classes. These combined results indicate that clone pR18 is derived from either the 7.4-, 6.7-, or 4.3-kb adult rat brain RNA size class. Comparison with chicken and mouse NCAM cDNA sequences suggests that pR18 represents the amino acid coding region of the 6.7- or 4.3-kb mRNA. The isolation of pR18, the first cDNA that contains the complete coding sequence of an NCAM polypeptide, unambiguously demonstrates the predicted linear amino acid sequence of this probable rat 140-kD polypeptide. This cDNA also contains a 30-base pair segment not found in NCAM cDNAs isolated from other species. The significance of this segment and other

  11. Quantum decision tree classifier

    NASA Astrophysics Data System (ADS)

    Lu, Songfeng; Braunstein, Samuel L.

    2013-11-01

    We study the quantum version of a decision tree classifier to fill the gap between quantum computation and machine learning. The quantum entropy impurity criterion which is used to determine which node should be split is presented in the paper. By using the quantum fidelity measure between two quantum states, we cluster the training data into subclasses so that the quantum decision tree can manipulate quantum states. We also propose algorithms constructing the quantum decision tree and searching for a target class over the tree for a new quantum object.

  12. Mitochondrial DNA of Clathrina clathrus (Calcarea, Calcinea): six linear chromosomes, fragmented rRNAs, tRNA editing, and a novel genetic code.

    PubMed

    Lavrov, Dennis V; Pett, Walker; Voigt, Oliver; Wörheide, Gert; Forget, Lise; Lang, B Franz; Kayal, Ehsan

    2013-04-01

    Sponges (phylum Porifera) are a large and ancient group of morphologically simple but ecologically important aquatic animals. Although their body plan and lifestyle are relatively uniform, sponges show extensive molecular and genetic diversity. In particular, mitochondrial genomes from three of the four previously studied classes of Porifera (Demospongiae, Hexactinellida, and Homoscleromorpha) have distinct gene contents, genome organizations, and evolutionary rates. Here, we report the mitochondrial genome of Clathrina clathrus (Calcinea, Clathrinidae), a representative of the fourth poriferan class, the Calcarea, which proves to be the most unusual. Clathrina clathrus mitochondrial DNA (mtDNA) consists of six linear chromosomes 7.6-9.4 kb in size and encodes at least 37 genes: 13 protein-coding genes, 2 ribosomal RNAs (rRNAs), and 24 transfer RNAs (tRNAs). Protein genes include atp9, which has now been found in all major sponge lineages, but no atp8. Our analyses further reveal the presence of a novel genetic code that involves unique reassignments of the UAG codons from termination to tyrosine and of the CGN codons from arginine to glycine. Clathrina clathrus mitochondrial rRNAs are encoded in three (srRNA) and ≥6 (lrRNA) fragments distributed out of order and on several chromosomes. The encoded tRNAs contain multiple mismatches in the aminoacyl acceptor stems that are repaired posttranscriptionally by 3'-end RNA editing. Although our analysis does not resolve the phylogenetic position of calcareous sponges, likely due to their high rates of mitochondrial sequence evolution, it confirms mtDNA as a promising marker for population studies in this group. The combination of unusual mitochondrial features in C. clathrus redefines the extremes of mtDNA evolution in animals and further argues against the idea of a "typical animal mtDNA."
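
    The effect of the two codon reassignments described above (UAG from termination to tyrosine, CGN from arginine to glycine) can be illustrated with a small translation sketch in Python. It assumes, purely for illustration, that every other codon follows the standard genetic code; the abstract does not say so, and the complete C. clathrus mitochondrial code may differ in further codons. The function names and the test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reassigned_code():
    """Standard code with only the two reassignments named in the abstract.
    Assumption: all other codons are left as in the standard code."""
    table = dict(STANDARD_CODE)
    table["TAG"] = "Y"           # UAG: termination -> tyrosine
    for n in BASES:
        table["CG" + n] = "G"    # CGN: arginine -> glycine
    return table


def translate(seq, table):
    """Translate an in-frame DNA or RNA sequence with the given codon table."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(table[seq[i:i + 3]] for i in range(0, usable, 3))


# A made-up sequence: AUG CGA UAG UAA
example = "AUGCGAUAGUAA"
print(translate(example, STANDARD_CODE))      # MR**
print(translate(example, reassigned_code()))  # MGY*
```

    Under the standard table the example stops at UAG, whereas with the reassignments the same reading frame continues through it, which is why annotating such genomes with the wrong table truncates predicted proteins.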

  13. High Performance Medical Classifiers

    NASA Astrophysics Data System (ADS)

    Fountoukis, S. G.; Bekakos, M. P.

    2009-08-01

    In this paper, parallelism methodologies for mapping rules derived by machine learning algorithms onto both software and hardware are investigated. When these algorithms are fed with patient disease data, they output medical diagnostic decision trees and their corresponding rules. These rules can be mapped onto multithreaded object oriented programs and hardware chips. The programs can simulate the working of the chips and can exhibit the inherent parallelism of the chip design. The circuit of a chip can consist of many blocks, which operate concurrently for various parts of the whole circuit. Threads and inter-thread communication can be used to simulate the blocks of the chips and the combination of block output signals. The chips and the corresponding parallel programs constitute medical classifiers, which can classify new patient instances. Measurements taken from patients can be fed into both the chips and the parallel programs and can be recognized according to the classification rules incorporated in the chip and program designs. The chips and the programs constitute medical decision support systems and can be incorporated into portable micro devices, assisting physicians in their everyday diagnostic practice.

  14. Molecular cloning and expression in photosynthetic bacteria of a soybean cDNA coding for phytoene desaturase, an enzyme of the carotenoid biosynthesis pathway.

    PubMed Central

    Bartley, G E; Viitanen, P V; Pecker, I; Chamovitz, D; Hirschberg, J; Scolnik, P A

    1991-01-01

    Carotenoids are orange, yellow, or red photo-protective pigments present in all plastids. The first carotenoid of the pathway is phytoene, a colorless compound that is converted into colored carotenoids through a series of desaturation reactions. Genes coding for carotenoid desaturases have been cloned from microbes but not from plants. We report the cloning of a cDNA for pds1, a soybean (Glycine max) gene that, based on a complementation assay using the photosynthetic bacterium Rhodobacter capsulatus, codes for an enzyme that catalyzes the two desaturation reactions that convert phytoene into zeta-carotene, a yellow carotenoid. The 2281-base-pair cDNA clone analyzed contains an open reading frame with the capacity to code for a 572-residue protein of predicted Mr 63,851. Alignment of the deduced Pds1 peptide sequence with the sequences of fungal and bacterial carotenoid desaturases revealed conservation of several amino acid residues, including a dinucleotide-binding motif that could mediate binding to FAD. The Pds1 protein is synthesized in vitro as a precursor that, upon import into isolated chloroplasts, is processed to a smaller mature form. Hybridization of the pds1 cDNA to genomic blots indicated that this gene is a member of a low-copy-number gene family. One of these loci was genetically mapped using restriction fragment length polymorphisms between Glycine max and Glycine soja. We conclude that pds1 is a nuclear gene encoding a phytoene desaturase enzyme that, like its microbial counterparts, contains sequence motifs characteristic of flavoproteins. PMID:1862081

  15. Stack filter classifiers

    SciTech Connect

    Porter, Reid B; Hush, Don

    2009-01-01

    Just as linear models generalize the sample mean and weighted average, weighted order statistic models generalize the sample median and weighted median. This analogy can be continued informally to generalized additive models in the case of the mean, and Stack Filters in the case of the median. Both of these model classes have been extensively studied for signal and image processing, but it is surprising to find that for pattern classification, their treatment has been significantly one-sided. Generalized additive models are now a major tool in pattern classification and many different learning algorithms have been developed to fit model parameters to finite data. However, Stack Filters remain largely confined to signal and image processing, and learning algorithms for classification are yet to be seen. This paper is a step towards Stack Filter Classifiers and it shows that the approach is interesting from both a theoretical and a practical perspective.
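
    The statement that weighted order statistic models generalize the sample median and weighted median can be made concrete with a short Python sketch. It assumes non-negative integer weights and the usual definition in which each sample is counted as many times as its weight; the function names are hypothetical, and the sketch illustrates only the order statistic, not the Stack Filter classifier of the paper.

```python
def weighted_order_statistic(values, weights, rank):
    """Return the rank-th smallest element (1-based) of the multiset in which
    values[i] occurs weights[i] times. Weights are non-negative integers."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and the sum of the weights")
    cumulative = 0
    # Walk the samples in ascending order until the cumulative weight
    # reaches the requested rank.
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= rank:
            return value


def weighted_median(values, weights):
    """Weighted median as the middle weighted order statistic
    (the lower median when the total weight is even)."""
    return weighted_order_statistic(values, weights, (sum(weights) + 1) // 2)


samples = [1, 2, 3, 4, 5]
print(weighted_median(samples, [1, 1, 1, 1, 1]))  # 3, the ordinary median
print(weighted_median(samples, [1, 1, 1, 1, 5]))  # 5, pulled toward the heavy sample
```

    With unit weights the middle rank recovers the sample median, other ranks give the minimum, maximum and intermediate order statistics, and unequal weights give the weighted median.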

  16. Transionospheric chirp event classifier

    SciTech Connect

    Argo, P.E.; Fitzgerald, T.J.; Freeman, M.J.

    1995-09-01

    In this paper we will discuss a project designed to provide computer recognition of the transionospheric chirps/pulses measured by the Blackbeard (BB) satellite, and expected to be measured by the upcoming FORTE satellite. The Blackbeard data has been perused by human means -- this has been satisfactory for the relatively small amount of data taken by Blackbeard. But with the advent of the FORTE system, which by some accounts might "see" thousands of events per day, it is important to provide a software/hardware method of accurately analyzing the data. In fact, we are providing an onboard DSP system for FORTE, which will test the usefulness of our Event Classifier techniques in situ. At present we are constrained to work with data from the Blackbeard satellite, and will discuss the progress made to date.

  17. Transionospheric chirp event classifier

    NASA Astrophysics Data System (ADS)

    Argo, P. E.; Fitzgerald, T. J.; Freeman, M. J.

    In this paper we will discuss a project designed to provide computer recognition of the transionospheric chirps/pulses measured by the Blackbeard (BB) satellite, and expected to be measured by the upcoming FORTE satellite. The Blackbeard data has been perused by human means - this has been satisfactory for the relatively small amount of data taken by Blackbeard. But with the advent of the FORTE system, which by some accounts might 'see' thousands of events per day, it is important to provide a software/hardware method of accurately analyzing the data. In fact, we are providing an onboard DSP system for FORTE, which will test the usefulness of our Event Classifier techniques in situ. At present we are constrained to work with data from the Blackbeard satellite, and will discuss the progress made to date.

  18. Classifying TDSS Stellar Variables

    NASA Astrophysics Data System (ADS)

    Amaro, Rachael Christina; Green, Paul J.; TDSS Collaboration

    2017-01-01

    The Time Domain Spectroscopic Survey (TDSS), a subprogram of SDSS-IV eBOSS, obtains classification/discovery spectra of point-source photometric variables selected from PanSTARRS and SDSS multi-color light curves regardless of object color or lightcurve shape. Tens of thousands of TDSS spectra are already available and have been spectroscopically classified both via pipeline and by visual inspection. About half of these spectra are quasars, half are stars. Our goal is to classify the stars with their correct variability types. We do this by acquiring public multi-epoch light curves for brighter stars (r<19.5mag) from the Catalina Sky Survey (CSS). We then run a number of light curve analyses from VARTOOLS, a program for analyzing astronomical time-series data, to constrain variable type both for broad statistics relevant to future surveys like the Transiting Exoplanet Survey Satellite (TESS) and the Large Synoptic Survey Telescope (LSST), and to find the inevitable exotic oddballs that warrant further follow-up. Specifically, the Lomb-Scargle Periodogram and the Box-Least Squares Method are being implemented and tested against their known variable classifications and parameters in the Catalina Surveys Periodic Variable Star Catalog. Variable star classifications include RR Lyr, close eclipsing binaries, CVs, pulsating white dwarfs, and other exotic systems. The key difference between our catalog and others is that along with the light curves, we will be using TDSS spectra to help in the classification of variable type, as spectra are rich with information allowing estimation of physical parameters like temperature, metallicity, gravity, etc. This work was supported by the SDSS Research Experience for Undergraduates program, which is funded by a grant from Sloan Foundation to the Astrophysical Research Consortium.

  19. Identification of an androgen-repressed mRNA in rat ventral prostate as coding for sulphated glycoprotein 2 by cDNA cloning and sequence analysis.

    PubMed Central

    Bettuzzi, S; Hiipakka, R A; Gilna, P; Liao, S T

    1989-01-01

    The concentrations of a small number of mRNAs in the rat ventral prostate increase after castration and then decrease upon androgen treatment. Since the repression of specific gene expression may be important in the regulation of organ growth, we have cloned a cDNA for an androgen-repressed mRNA, the concentration of which increased 17-fold 4 days after castration, and this increase was reversed rapidly by androgen treatment. By sequence analysis the androgen-repressed mRNA was identified as that coding for sulphated glycoprotein 2. PMID:2920020

  20. Cloning and characterization of a cDNA coding 3-hydroxy-3-methylglutary CoA reductase involved in glycyrrhizic acid biosynthesis in Glycyrrhiza uralensis.

    PubMed

    Liu, Ying; Xu, Qiao-Xian; Xi, Pei-Yu; Chen, Hong-Hao; Liu, Chun-Sheng

    2013-05-01

    The roots of Glycyrrhiza uralensis are widely used in Chinese medicine for clearing heat, detoxifying, relieving cough, dispelling sputum and tonifying the spleen and stomach. Glycyrrhiza uralensis owes these potent actions to its various active secondary metabolites, especially glycyrrhizic acid. In the present study, we cloned the cDNA coding for 3-hydroxy-3-methylglutaryl CoA reductase (HMGR), which is involved in glycyrrhizic acid biosynthesis in Glycyrrhiza uralensis. The corresponding cDNA was expressed in Escherichia coli as fusion proteins. Recombinant HMGR exhibited catalytic activity in the reduction of HMG-CoA to mevalonic acid (MVA), just like HMGR isolated from other species. Because the HMGR gene is very important in the biosynthesis of glycyrrhizic acid in Glycyrrhiza uralensis, this work is significant for further studies aimed at strengthening the efficacy of Glycyrrhiza uralensis by increasing glycyrrhizic acid content and at exploring the biosynthesis of glycyrrhizic acid in vitro.

  1. DNMT3B interacts with constitutive centromere protein CENP-C to modulate DNA methylation and the histone code at centromeric regions.

    PubMed

    Gopalakrishnan, Suhasni; Sullivan, Beth A; Trazzi, Stefania; Della Valle, Giuliano; Robertson, Keith D

    2009-09-01

    DNA methylation is an epigenetically imposed mark of transcriptional repression that is essential for maintenance of chromatin structure and genomic stability. Genome-wide methylation patterns are mediated by the combined action of three DNA methyltransferases: DNMT1, DNMT3A and DNMT3B. Compelling links exist between DNMT3B and chromosome stability as emphasized by the mitotic defects that are a hallmark of ICF syndrome, a disease arising from germline mutations in DNMT3B. Centromeric and pericentromeric regions are essential for chromosome condensation and the fidelity of segregation. Centromere regions contain distinct epigenetic marks, including dense DNA hypermethylation, yet the mechanisms by which DNA methylation is targeted to these regions remain largely unknown. In the present study, we used a yeast two-hybrid screen and identified a novel interaction between DNMT3B and constitutive centromere protein CENP-C. CENP-C is itself essential for mitosis. We confirm this interaction in mammalian cells and map the domains responsible. Using siRNA knock downs, bisulfite genomic sequencing and ChIP, we demonstrate for the first time that CENP-C recruits DNA methylation and DNMT3B to both centromeric and pericentromeric satellite repeats and that CENP-C and DNMT3B regulate the histone code in these regions, including marks characteristic of centromeric chromatin. Finally, we demonstrate that loss of CENP-C or DNMT3B leads to elevated chromosome misalignment and segregation defects during mitosis and increased transcription of centromeric repeats. Taken together, our data reveal a novel mechanism by which DNA methylation is targeted to discrete regions of the genome and contributes to chromosomal stability.

  2. Cloning and expression of a cDNA coding for the human platelet-derived growth factor receptor: Evidence for more than one receptor class

    SciTech Connect

    Gronwald, R.G.K.; Grant, F.J.; Haldeman, B.A.; Hart, C.E.; O'Hara, P.J.; Hagen, F.S.; Ross, R.; Bowen-Pope, D.F.; Murray, M.J.

    1988-05-01

    The complete nucleotide sequence of a cDNA encoding the human platelet-derived growth factor (PDGF) receptor is presented. The cDNA contains an open reading frame that codes for a protein of 1106 amino acids. Comparison to the mouse PDGF receptor reveals an overall amino acid sequence identity of 86%. This sequence identity rises to 98% in the cytoplasmic split tyrosine kinase domain. RNA blot hybridization analysis of poly(A)+ RNA from human dermal fibroblasts detects a major and a minor transcript using the cDNA as a probe. Baby hamster kidney cells, transfected with an expression vector containing the receptor cDNA, express an approximately 190-kDa cell surface protein that is recognized by an anti-human PDGF receptor antibody. The recombinant PDGF receptor is functional in the transfected baby hamster kidney cells as demonstrated by ligand-induced phosphorylation of the receptor. Binding properties of the recombinant PDGF receptor were also assessed with pure preparations of BB and AB isoforms of PDGF. Unlike human dermal fibroblasts, which bind both isoforms with high affinity, the transfected baby hamster kidney cells bind only the BB isoform of PDGF with high affinity. This observation is consistent with the existence of more than one PDGF receptor class.

  3. DNA.

    ERIC Educational Resources Information Center

    Felsenfeld, Gary

    1985-01-01

    Describes the structural form, bonding scheme, and chromatin structure of deoxyribonucleic acid (DNA), as well as gene-modification experiments with it. Indicates that DNA's double helix is variable and also flexible as it interacts with regulatory and other molecules to transfer hereditary messages. (DH)

  4. Electron slowing-down spectra in water for electron and photon sources calculated with the Geant4-DNA code.

    PubMed

    Vassiliev, Oleg N

    2012-02-21

    Recently, a very low energy extension was added to the Monte Carlo simulation toolkit Geant4. It is intended for radiobiological modeling and is referred to as Geant4-DNA. Its performance, however, has not been systematically benchmarked in terms of transport characteristics. This study reports on the electron slowing-down spectra and mean energy per ion pair, the W-value, in water for monoenergetic electron and photon sources calculated with Geant4-DNA. These quantities depend on electron energy, but not on spatial or angular variables which makes them a good choice for testing the model of energy transfer processes. The spectra also have a scientific value for radiobiological modeling as they describe the energy distribution of electrons entering small volumes, such as the cell nucleus. Comparisons of Geant4-DNA results with previous studies showed overall good agreement. Some differences in slowing-down spectra between Geant4-DNA and previous studies were found at 100 eV and at approximately 500 eV that were attributed to approximations in models of vibrational excitations and atomic de-excitation after ionization by electron impact. We also found that the high-energy part of the Geant4-DNA spectrum for a 1 keV electron source was higher, and the asymptotic high-energy W-value was lower than previous studies reported.

  5. Classifying partner femicide.

    PubMed

    Dixon, Louise; Hamilton-Giachritsis, Catherine; Browne, Kevin

    2008-01-01

    The heterogeneity of domestic violent men has long been established. However, research has failed to examine this phenomenon among men committing the most severe form of domestic violence. This study aims to use a multidimensional approach to empirically construct a classification system of men who are incarcerated for the murder of their female partner based on the Holtzworth-Munroe and Stuart (1994) typology. Ninety men who had been convicted and imprisoned for the murder of their female partner or spouse in England were identified from two prison samples. A content dictionary defining offense and offender characteristics associated with two dimensions of psychopathology and criminality was developed. These variables were extracted from institutional records via content analysis and analyzed for thematic structure using multidimensional scaling procedures. The resultant framework classified 80% (n = 72) of the sample into three subgroups of men characterized by (a) low criminality/low psychopathology (15%), (b) moderate-high criminality/high psychopathology (36%), and (c) high criminality/low-moderate psychopathology (49%). The latter two groups are akin to Holtzworth-Munroe and Stuart's (1994) generally violent/antisocial and dysphoric/borderline offender, respectively. The implications for intervention, developing consensus in research methodology across the field, and examining typologies of domestic violent men prospectively are discussed.

  6. A Framework for Identifying and Classifying Undergraduate Student Proof Errors

    ERIC Educational Resources Information Center

    Strickland, S.; Rand, B.

    2016-01-01

    This paper describes a framework for identifying, classifying, and coding student proofs, modified from existing proof-grading rubrics. The framework includes 20 common errors, as well as categories for interpreting the severity of the error. The coding scheme is intended for use in a classroom context, for providing effective student feedback. In…

  7. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) – Definition of a Distinct Class of Begomovirus-Associated Satellites

    PubMed Central

    Lozano, Gloria; Trenado, Helena P.; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W.; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem–loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem–loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037
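
    The abstract above names the nonanucleotide TAATATTAC as a hallmark of the predicted stem-loop. As a minimal sketch of how such a motif might be located in a cloned sequence, the following Python function scans a DNA string for every occurrence. The function name and the toy sequence are hypothetical illustrations and are not data from the study; the sketch treats the sequence as linear and does not predict stem-loop structure.

```python
NONANUCLEOTIDE = "TAATATTAC"  # motif quoted in the abstract


def find_motif(sequence, motif=NONANUCLEOTIDE):
    """Return the 0-based start positions of every occurrence of motif.

    Overlapping occurrences are reported; matching is case-insensitive.
    """
    seq = sequence.upper()
    target = motif.upper()
    positions = []
    start = seq.find(target)
    while start != -1:
        positions.append(start)
        # Resume one base further on so overlapping hits are not skipped.
        start = seq.find(target, start + 1)
    return positions


# Toy sequence (hypothetical, for illustration only).
toy_sequence = "GGTAATATTACCC"
print(find_motif(toy_sequence))
```

    Because these satellites are circular molecules, a fuller version would probably also search across the join between the end and the start of the linearized sequence; that step is left out here to keep the sketch short.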

  8. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) - Definition of a Distinct Class of Begomovirus-Associated Satellites.

    PubMed

    Lozano, Gloria; Trenado, Helena P; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem-loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem-loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed.

  9. Replication of a pathogenic non-coding RNA increases DNA methylation in plants associated with a bromodomain-containing viroid-binding protein

    PubMed Central

    Lv, Dian-Qiu; Liu, Shang-Wu; Zhao, Jian-Hua; Zhou, Bang-Jun; Wang, Shao-Peng; Guo, Hui-Shan; Fang, Yuan-Yuan

    2016-01-01

    Viroids are plant-pathogenic molecules made up of single-stranded circular non-coding RNAs. How replicating viroids interfere with host silencing remains largely unknown. In this study, we investigated the effects of a nuclear-replicating Potato spindle tuber viroid (PSTVd) on interference with plant RNA silencing. Using transient induction of silencing in GFP transgenic Nicotiana benthamiana plants (line 16c), we found that PSTVd replication accelerated GFP silencing and increased Virp1 mRNA, which encodes bromodomain-containing viroid-binding protein 1 and is required for PSTVd replication. DNA methylation was increased in the GFP transgene promoter of PSTVd-replicating plants, indicating involvement of transcriptional gene silencing. Consistently, accelerated GFP silencing and increased DNA methylation in the GFP transgene promoter were detected in plants transiently expressing Virp1. Virp1 mRNA was also increased upon PSTVd infection in natural host potato plants. Reduced transcript levels of certain endogenous genes were also consistent with increases in DNA methylation in related gene promoters in PSTVd-infected potato plants. Together, our data demonstrate that PSTVd replication interferes with the nuclear silencing pathway in that host plant, and this is at least partially attributable to Virp1. This study provides new insights into the plant-viroid interaction on viroid pathogenicity by subverting the plant cell silencing machinery. PMID:27767195

  10. Phylogeny of genetic codes and punctuation codes within genetic codes.

    PubMed

    Seligmann, Hervé

    2015-03-01

    Punctuation codons (starts, stops) delimit genes, reflect translation apparatus properties. Most codon reassignments involve punctuation. Here two complementary approaches classify natural genetic codes: (A) properties of amino acids assigned to codons (classical phylogeny), coding stops as X (A1, antitermination/suppressor tRNAs insert unknown residues), or as gaps (A2, no translation, classical stop); and (B) considering only punctuation status (start, stop and other codons coded as -1, 0 and 1 (B1); 0, -1 and 1 (B2, reflects ribosomal translational dynamics); and 1, -1, and 0 (B3, starts/stops as opposites)). All methods separate most mitochondrial codes from most nuclear codes; Gracilibacteria consistently cluster with metazoan mitochondria; mitochondria co-hosted with chloroplasts cluster with nuclear codes. Method A1 clusters the euplotid nuclear code with metazoan mitochondria; A2 separates euplotids from mitochondria. Firmicute bacteria Mycoplasma/Spiroplasma and Protozoan (and lower metazoan) mitochondria share codon-amino acid assignments. A1 clusters them with mitochondria, they cluster with the standard genetic code under A2: constraints on amino acid ambiguity versus punctuation-signaling produced the mitochondrial versus bacterial versions of this genetic code. Punctuation analysis B2 converges best with classical phylogenetic analyses, stressing the need for a unified theory of genetic code punctuation accounting for ribosomal constraints.
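
    The punctuation encodings B1, B2 and B3 described above can be sketched in a few lines of Python. Each genetic code becomes a 64-element vector in which start, stop and other codons receive the numeric values quoted in the abstract. Everything else here is an assumption made for illustration: the function names, the Hamming distance used to compare two codes (the abstract does not state the distance or clustering method), and the simplified start sets, which list only the canonical ATG and omit alternative start codons. The stop sets follow the standard code and the vertebrate mitochondrial code.

```python
BASES = "TCAG"
ALL_CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]  # 64 codons

# Numeric values for start, stop and other codons, as listed in the abstract.
SCHEMES = {
    "B1": {"start": -1, "stop": 0, "other": 1},
    "B2": {"start": 0, "stop": -1, "other": 1},
    "B3": {"start": 1, "stop": -1, "other": 0},
}

# Simplified codes (assumption: only the canonical ATG start is listed).
STANDARD = {"starts": {"ATG"}, "stops": {"TAA", "TAG", "TGA"}}
VERT_MITO = {"starts": {"ATG"}, "stops": {"TAA", "TAG", "AGA", "AGG"}}


def punctuation_vector(code, scheme):
    """Encode one genetic code as a list of 64 punctuation values."""
    values = SCHEMES[scheme]
    vector = []
    for codon in ALL_CODONS:
        if codon in code["stops"]:
            vector.append(values["stop"])
        elif codon in code["starts"]:
            vector.append(values["start"])
        else:
            vector.append(values["other"])
    return vector


def hamming(u, v):
    """Count the positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


for name in SCHEMES:
    d = hamming(punctuation_vector(STANDARD, name),
                punctuation_vector(VERT_MITO, name))
    print(name, d)
```

    With these simplified inputs the two codes differ at the same three codons (TGA, AGA and AGG) under every scheme, so the schemes only diverge once codes with different start sets are compared.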

  11. The Stat3/GR interaction code: predictive value of direct/indirect DNA recruitment for transcription outcome.

    PubMed

    Langlais, David; Couture, Catherine; Balsalobre, Aurélio; Drouin, Jacques

    2012-07-13

    Transcription factor recruitment to genomic sites of action is primarily due to direct protein:DNA interactions. The subsequent recruitment of coregulatory complexes leads to either transcriptional activation or repression. In contrast to this canonical scheme, some transcription factors, such as the glucocorticoid receptor (GR), behave as transcriptional repressors when recruited to target genes through protein tethering. We have investigated the genome-wide prevalence of tethering between GR and Stat3 and found nonreciprocal interactions, namely that GR tethering to DNA-bound Stat3 results in transcriptional repression, whereas Stat3 tethering to GR results in synergism. Further, other schemes of GR and Stat3 corecruitment to regulatory modules result in transcriptional synergism, including neighboring and composite binding sites. The results indicate extensive transcriptional interactions between Stat3 and GR; further, they provide a genome-wide assessment of transcriptional regulation by tethering and a molecular basis for integration of signals mediated by GR and Stats in health and disease.

  12. Effective Protective Immunity to Yersinia pestis Infection Conferred by DNA Vaccine Coding for Derivatives of the F1 Capsular Antigen

    PubMed Central

    Grosfeld, Haim; Cohen, Sara; Bino, Tamar; Flashner, Yehuda; Ber, Raphael; Mamroud, Emanuelle; Kronman, Chanoch; Shafferman, Avigdor; Velan, Baruch

    2003-01-01

    Three plasmids expressing derivatives of the Yersinia pestis capsular F1 antigen were evaluated for their potential as DNA vaccines. These included plasmids expressing the full-length F1, F1 devoid of its putative signal peptide (deF1), and F1 fused to the signal-bearing E3 polypeptide of Semliki Forest virus (E3/F1). Expression of these derivatives in transfected HEK293 cells revealed that deF1 is expressed in the cytosol, E3/F1 is targeted to the secretory cisternae, and the nonmodified F1 is rapidly eliminated from the cell. Intramuscular vaccination of mice with these plasmids revealed that the vector expressing deF1 was the most effective in eliciting anti-F1 antibodies. This response was not limited to specific mouse strains or to the mode of DNA administration, though gene gun-mediated vaccination was by far more effective than intramuscular needle injection. Vaccination of mice with deF1 DNA conferred protection against subcutaneous infection with the virulent Y. pestis Kimberley53 strain, even at challenge amounts as high as 4,000 50% lethal doses. Antibodies appear to play a major role in mediating this protection, as demonstrated by passive transfer of anti-deF1 DNA antiserum. Taken together, these observations indicate that a tailored genetic vaccine based on a bacterial protein can be used to confer protection against plague in mice without resorting to regimens involving the use of purified proteins. PMID:12496187

  13. Sequence of a novel cytochrome CYP2B cDNA coding for a protein which is expressed in a sebaceous gland, but not in the liver.

    PubMed Central

    Friedberg, T; Grassow, M A; Bartlomowicz-Oesch, B; Siegert, P; Arand, M; Adesnik, M; Oesch, F

    1992-01-01

    The major phenobarbital-inducible rat hepatic cytochromes P-450, CYP2B1 and CYP2B2, are the paradigmatic members of a cytochrome P-450 gene subfamily that contains at least seven additional members. Specific oligonucleotide probes for these genomic members of the CYP2B subfamily were used to assess their tissue-specific expression. In Northern-blot analysis a probe specific to gene 4 (which is designated now as CYP2B12) hybridized to a single mRNA present in the preputial gland, an organ which is used as a model for sebaceous glands, but did not hybridize to mRNA isolated from the liver or from five other tissues of untreated or Aroclor 1254-treated rats. The cDNA sequence for the CYP2B12 RNA was determined from overlapping cDNA clones and contained a long open reading frame of 1476 bp. The nucleotide sequence of the CYP2B12 cDNA was 85% similar to the sequence of the CYP2B1 cDNA in its coding region and was different from any CYP2B cDNA characterized until now. The cDNA-derived primary structure of the CYP2B12 protein contains a signal sequence for its insertion into the endoplasmic reticulum and the putative haem-binding site characteristic of cytochromes P-450. A part of the potential haem pocket of CYP2B12 was identical with a similar structure in a bacterial protocatechuate dioxygenase. In immunoblot analysis of preputial-gland microsomes, antibodies against CYP2B1 recognized a single abundant protein with a lower apparent molecular mass than that of CYP2B1. Our results demonstrate that the CYP2B12 protein has the potential to be enzymically active and are the first demonstration that a member of the CYP2B subfamily is expressed exclusively and at high levels in an extrahepatic organ. Images Fig. 1. Fig. 5. Fig. 6. PMID:1445240

  14. DNA

    ERIC Educational Resources Information Center

    Stent, Gunther S.

    1970-01-01

    This history for molecular genetics and its explanation of DNA begins with an analysis of the Golden Jubilee essay papers, 1955. The paper ends stating that the higher nervous system is the one major frontier of biological inquiry which still offers some romance of research. (Author/VW)

  15. Emergent behaviors of classifier systems

    SciTech Connect

    Forrest, S.; Miller, J.H.

    1989-01-01

    This paper discusses some examples of emergent behavior in classifier systems, describes some recently developed methods for studying them based on dynamical systems theory, and presents some initial results produced by the methodology. The goal of this work is to find techniques for noticing when interesting emergent behaviors of classifier systems emerge, to study how such behaviors might emerge over time, and make suggestions for designing classifier systems that exhibit preferred behaviors. 20 refs., 1 fig.

  16. Dose point kernels in liquid water: an intra-comparison between GEANT4-DNA and a variety of Monte Carlo codes.

    PubMed

    Champion, C; Incerti, S; Perrot, Y; Delorme, R; Bordage, M C; Bardiès, M; Mascialino, B; Tran, H N; Ivanchenko, V; Bernal, M; Francis, Z; Groetz, J-E; Fromm, M; Campos, L

    2014-01-01

    Modeling the radio-induced effects in biological medium still requires accurate physics models to describe the interactions induced by all the charged particles present in the irradiated medium in detail. These interactions include inelastic as well as elastic processes. To check the accuracy of the very low energy models recently implemented into the GEANT4 toolkit for modeling the electron slowing-down in liquid water, the simulation of electron dose point kernels remains the preferential test. In this context, we here report normalized radial dose profiles, for mono-energetic point sources, computed in liquid water by using the very low energy "GEANT4-DNA" physics processes available in the GEANT4 toolkit. In the present study, we report an extensive intra-comparison of profiles obtained by a large selection of existing and well-documented Monte-Carlo codes, namely, EGSnrc, PENELOPE, CPA100, FLUKA and MCNPX.

  17. Rare Failures of DNA Bar Codes to Separate Morphologically Distinct Species in a Biodiversity Survey of Iberian Leaf Beetles

    PubMed Central

    Baselga, Andrés; Gómez-Rodríguez, Carola; Novoa, Francisco; Vogler, Alfried P.

    2013-01-01

    During a survey of genetic and species diversity patterns of leaf beetle (Coleoptera: Chrysomelidae) assemblages across the Iberian Peninsula we found a broad congruence between morphologically delimited species and variation in the cytochrome oxidase (cox1) gene. However, one species pair each in the genera Longitarsus Berthold and Pachybrachis Chevrolat was inseparable using molecular methods, whereas diagnostic morphological characters (including male or female genitalia) unequivocally separated the named species. Parsimony haplotype networks and maximum likelihood trees built from cox1 showed high genetic structure within each species pair, but no correlation with either the morphological types or the geographic distributions. This contrasted with all analysed congeneric species, which were recovered as monophyletic. A limited number of specimens were sequenced for the nuclear 18S rRNA gene, which showed no or very limited variation within the species pair and no separation of morphological types. These results suggest that processes of lineage sorting for either group are lagging behind the clear morphological and presumably reproductive separation. In the Iberian chrysomelids, incongruence between DNA-based and morphological delimitations is a rare exception, but the discovery of these species pairs may be useful as an evolutionary model for studying the process of speciation in this ecological and geographical setting. In addition, the study of biodiversity patterns based on DNA requires an evolutionary understanding of these incongruences and their potential causes. PMID:24040352

  18. Application of DNA Bar Codes for Screening of Industrially Important Fungi: the Haplotype of Trichoderma harzianum Sensu Stricto Indicates Superior Chitinase Formation

    PubMed Central

    Nagy, Viviana; Seidl, Verena; Szakacs, George; Komoń-Zelazowska, Monika; Kubicek, Christian P.; Druzhinina, Irina S.

    2007-01-01

    Selection of suitable strains for biotechnological purposes is frequently a random process supported by high-throughput methods. Using chitinase production by Hypocrea lixii/Trichoderma harzianum as a model, we tested whether fungal strains with superior enzyme formation may be diagnosed by DNA bar codes. We analyzed sequences of two phylogenetic marker loci, internal transcribed spacer 1 (ITS1) and ITS2 of the rRNA-encoding gene cluster and the large intron of the elongation factor 1-alpha gene, tef1, from 50 isolates of H. lixii/T. harzianum, which were also tested to determine their ability to produce chitinases in solid-state fermentation (SSF). Statistically supported superior chitinase production was obtained for strains carrying one of the observed ITS1 and ITS2 and tef1 alleles corresponding to an allele of T. harzianum type strain CBS 226.95. A tef1-based DNA bar code tool, TrichoCHIT, for rapid identification of these strains was developed. The geographic origin of the strains was irrelevant for chitinase production. The improved chitinase production by strains containing this haplotype was not due to better growth on N-acetyl-β-d-glucosamine or glucosamine. Isoenzyme electrophoresis showed that neither the isoenzyme profile of N-acetyl-β-glucosaminidases or the endochitinases nor the intensity of staining of individual chitinase bands correlated with total chitinase in the culture filtrate. The superior chitinase producers did not exhibit similarly increased cellulase formation. Biolog Phenotype MicroArray analysis identified lack of N-acetyl-β-d-mannosamine utilization as a specific trait of strains with the chitinase-overproducing haplotype. This observation was used to develop a plate screening assay for rapid microbiological identification of the strains. The data illustrate that desired industrial properties may be an attribute of certain populations within a species, and screening procedures should thus include a balanced mixture of all

  19. Structure and expression of the gene coding for the alpha-subunit of DNA-dependent RNA polymerase from the chloroplast genome of Zea mays.

    PubMed Central

    Ruf, M; Kössel, H

    1988-01-01

    The rpoA gene coding for the alpha-subunit of DNA-dependent RNA polymerase located on the DNA of Zea mays chloroplasts has been characterized with respect to its position on the chloroplast genome and its nucleotide sequence. The amino acid sequence derived for a 39 Kd polypeptide shows strong homology with sequences derived from the rpoA genes of other chloroplast species and with the amino acid sequence of the alpha-subunit from E. coli RNA polymerase. Transcripts of the rpoA gene were identified by Northern hybridization and characterized by S1 mapping using total RNA isolated from maize chloroplasts. Antibodies raised against a synthetic C-terminal heptapeptide show cross reactivity with a 39 Kd polypeptide contained in the stroma fraction of maize chloroplasts. It is concluded that the rpoA gene is a functional gene and that therefore, at least the alpha-subunit of plastidic RNA polymerase, is expressed in chloroplasts. Images PMID:3399379

  20. Feature Selection and Effective Classifiers.

    ERIC Educational Resources Information Center

    Deogun, Jitender S.; Choubey, Suresh K.; Raghavan, Vijay V.; Sever, Hayri

    1998-01-01

    Develops and analyzes four algorithms for feature selection in the context of rough set methodology. Experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. When compared, results of upper classifiers perform better than lower…

  1. MScanner: a classifier for retrieving Medline citations

    PubMed Central

    Poulter, Graham L; Rubin, Daniel L; Altman, Russ B; Seoighe, Cathal

    2008-01-01

    retrieving topics for which many features may indicate relevance. Its web interface simplifies the task of classifying Medline citations, compared to building a pre-filter and classifier specific to the topic. The data sets and open source code used to obtain the results in this paper are available on-line and as supplementary material, and the web interface may be accessed at . PMID:18284683

  2. Lichenase and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2000-08-15

    The present invention provides a fungal lichenase, i.e., an endo-1,3-1,4-β-D-glucanohydrolase, its coding sequence, recombinant DNA molecules comprising the lichenase coding sequences, recombinant host cells and methods for producing same. The present lichenase is from Orpinomyces PC-2.

  3. Sequence analysis of coding DNA fragments of pfcrt and pfmdr-1 genes in Plasmodium falciparum isolates from Odisha, India.

    PubMed

    Sutar, Sasmita Kumari Das; Gupta, Bhavna; Ranjit, Manoranjan; Kar, Shantanu Kumar; Das, Aparup

    2011-02-01

    The global emergence and spread of malaria parasites resistant to antimalarial drugs is the major problem in malaria control. The genetic basis of the parasite's resistance to the antimalarial drug chloroquine (CQ) is well-documented, allowing for the analysis of field isolates of malaria parasites to address evolutionary questions concerning the origin and spread of CQ-resistance. Here, we present DNA sequence analyses of both the second exon of the Plasmodium falciparum CQ-resistance transporter (pfcrt) gene and the 5' end of the P. falciparum multidrug-resistance 1 (pfmdr-1) gene in 40 P. falciparum field isolates collected from eight different localities of Odisha, India. First, we genotyped the samples for the pfcrt K76T and pfmdr-1 N86Y mutations in these two genes, which are the mutations primarily implicated in CQ-resistance. We further analyzed amino acid changes in codons 72-76 of the pfcrt haplotypes. Interestingly, both the K76T and N86Y mutations were found to co-exist in 32 out of the total 40 isolates, which were of either the CVIET or SVMNT haplotype, while the remaining eight isolates were of the CVMNK haplotype. In total, eight nonsynonymous single nucleotide polymorphisms (SNPs) were observed, six in the pfcrt gene and two in the pfmdr-1 gene. One poorly studied SNP in the pfcrt gene (A97T) was found at a high frequency in many P. falciparum samples. Using population genetics to analyze these two gene fragments, we revealed comparatively higher nucleotide diversity in the pfcrt gene than in the pfmdr-1 gene. Furthermore, linkage disequilibrium was found to be tight between closely spaced SNPs of the pfcrt gene. Finally, both the pfcrt and the pfmdr-1 genes were found to evolve under the standard neutral model of molecular evolution.
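
    The genotyping step described above, reading the amino acids at pfcrt codons 72-76 and checking position 76 for the K76T mutation, can be sketched as a small Python helper. The function names and the returned labels are hypothetical, and the example list simply contains one of each haplotype named in the abstract; it is not the study's isolate data.

```python
from collections import Counter


def classify_pfcrt(haplotype):
    """Classify a pfcrt codon 72-76 haplotype by its residue at codon 76.

    The haplotype is a five-letter amino acid string such as 'CVMNK',
    where the last letter corresponds to codon 76.
    """
    hap = haplotype.upper()
    if len(hap) != 5:
        raise ValueError("expected five residues for codons 72-76")
    residue_76 = hap[4]
    if residue_76 == "K":
        return "K76 (wild type at codon 76)"
    if residue_76 == "T":
        return "K76T mutant"
    return "other residue at codon 76"


def tally(haplotypes):
    """Count how many haplotypes fall into each codon 76 class."""
    return Counter(classify_pfcrt(h) for h in haplotypes)


# One example of each haplotype named in the abstract (illustrative only).
examples = ["CVIET", "SVMNT", "CVMNK"]
for hap in examples:
    print(hap, "->", classify_pfcrt(hap))
print(tally(examples))
```

    Keying the classification on codon 76 alone mirrors the abstract's statement that K76T is the pfcrt mutation primarily implicated in CQ resistance; a fuller analysis would also record the pfmdr-1 N86Y status for each isolate.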

  4. Classifying Chondrules Based on Cathodoluminescence

    NASA Astrophysics Data System (ADS)

    Cristarela, T. C.; Sears, D. W.

    2011-03-01

    Sears et al. (1991) proposed a scheme to classify chondrules based on cathodoluminescence color and electron microprobe analysis. This research evaluates that scheme and the criticisms it received from Grossman and Brearley (2005).

  5. IAEA safeguards and classified materials

    SciTech Connect

    Pilat, J.F.; Eccleston, G.W.; Fearey, B.L.; Nicholas, N.J.; Tape, J.W.; Kratzer, M.

    1997-11-01

    The international community in the post-Cold War period has suggested that the International Atomic Energy Agency (IAEA) utilize its expertise in support of the arms control and disarmament process in unprecedented ways. The pledges of the US and Russian presidents to place excess defense materials, some of which are classified, under some type of international inspections raises the prospect of using IAEA safeguards approaches for monitoring classified materials. A traditional safeguards approach, based on nuclear material accountancy, would seem unavoidably to reveal classified information. However, further analysis of the IAEA's safeguards approaches is warranted in order to understand fully the scope and nature of any problems. The issues are complex and difficult, and it is expected that common technical understandings will be essential for their resolution. Accordingly, this paper examines and compares traditional safeguards item accounting of fuel at a nuclear power station (especially spent fuel) with the challenges presented by inspections of classified materials. This analysis is intended to delineate more clearly the problems as well as reveal possible approaches, techniques, and technologies that could allow the adaptation of safeguards to the unprecedented task of inspecting classified materials. It is also hoped that a discussion of these issues can advance ongoing political-technical debates on international inspections of excess classified materials.

  6. The effect of non-coding DNA variations on P53 and cMYC competitive inhibition at cis-overlapping motifs.

    PubMed

    Kin, Katherine; Chen, Xi; Gonzalez-Garay, Manuel; Fakhouri, Walid D

    2016-04-15

    Non-coding DNA variations play a critical role in increasing the risk for development of common complex diseases, and account for the majority of SNPs highly associated with cancer. However, it remains a challenge to identify etiologic variants and to predict their pathological effects on target gene expression for clinical purposes. Cis-overlapping motifs (COMs) are elements of enhancer regions that impact gene expression by enabling competitive binding and switching between transcription factors. Mutations within COMs are especially important when the involved transcription factors have opposing effects on gene regulation, like P53 tumor suppressor and cMYC proto-oncogene. In this study, genome-wide analysis of ChIP-seq data from human cancer and mouse embryonic cells identified a significant number of putative regulatory elements with signals for both P53 and cMYC. Each co-occupied element contains, on average, two COMs, and one common SNP every two COMs. Gene ontology of predicted target genes for COMs showed that the majority are involved in DNA damage, apoptosis, cell cycle regulation, and RNA processing. EMSA results showed that both cMYC and P53 bind to cis-overlapping motifs within a ChIP-seq co-occupied region in Chr12. In vitro functional analysis of selected co-occupied elements verified enhancer activity, and also showed that the occurrence of SNPs within three COMs significantly altered enhancer activity. We identified a list of COM-associated functional SNPs that are in close proximity to SNPs associated with common diseases in large population studies. These results suggest a potential molecular mechanism to identify etiologic regulatory mutations associated with common diseases.

  7. Cloning and characterization of a cDNA coding for Astacus embryonic astacin, a member of the astacin family of metalloproteases from the crayfish Astacus astacus.

    PubMed

    Geier, G; Zwilling, R

    1998-05-01

    The astacin family of zinc endopeptidases was named after the digestive enzyme astacin isolated from the crayfish Astacus astacus. Employing a reverse transcription/PCR strategy with degenerate oligonucleotide primers specific for two signature sequences of the astacin family, we have isolated a 1602-bp cDNA from embryos of developing A. astacus eggs, which was designated Astacus embryonic astacin (AEA). This cDNA was found to code for an astacin-like protease domain which accounts for the N-terminal half of the predicted protein. The C-terminal half mainly consists of two complement subcomponent C1r/C1s/embryonic sea urchin protein Uegf/bone morphogenetic protein 1 (CUB) domains. The metalloprotease domain displays an amino acid sequence identity of 42% with astacin. A higher sequence similarity was found to astacin family members that act as hatching enzymes in different species, e.g. chorioallantoic membrane protein 1 (CAM-1; from quail) and Xenopus hatching enzyme (formerly UVS.2), both of which show 54% identity, and high and low choriolytic enzymes (HCE and LCE) from the teleost Oryzias latipes (52% and 48% identity, respectively). A relationship to astacin-like hatching enzymes is further supported by a phylogenetic analysis of the protease domains. Expression of AEA mRNA in developing embryos was found to be restricted to unhatched juveniles (larvae) during the last 8 days before hatching. AEA transcripts could not be detected in various tissues of adult animals or in eggs and embryos from an earlier developmental stage. AEA expression starts about 8 days prior to hatching, followed by a strong (18-fold) induction with a maximum at day 4 before hatching. Newly hatched juveniles were found not to express the AEA mRNA.

  8. Polar Codes

    DTIC Science & Technology

    2014-12-01

    density parity check (LDPC) code, a Reed–Solomon code, and three convolutional codes. iii CONTENTS EXECUTIVE SUMMARY...the most common. Many civilian systems use low density parity check (LDPC) FEC codes, and the Navy is planning to use LDPC for some future systems...other forward error correction methods: a turbo code, a low density parity check (LDPC) code, a Reed–Solomon code, and three convolutional codes

  9. Building classifiers using Bayesian networks

    SciTech Connect

    Friedman, N.; Goldszmidt, M.

    1996-12-31

    Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state of the art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we examine and evaluate approaches for inducing classifiers from data, based on recent results in the theory of learning Bayesian networks. Bayesian networks are factored representations of probability distributions that generalize the naive Bayes classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness which are characteristic of naive Bayes. We experimentally tested these approaches using benchmark problems from the U. C. Irvine repository, and compared them against C4.5, naive Bayes, and wrapper-based feature selection methods.
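
    The naive Bayes baseline mentioned in this abstract can be illustrated with a minimal sketch for categorical features. The function names, the Laplace smoothing, and the toy data are assumptions made for illustration and do not come from the paper; the tree-augmented variant (TAN) is not implemented here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # Distinct values seen for each feature, used for Laplace smoothing.
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the class with the highest log posterior under feature independence."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Laplace-smoothed estimate of P(feature_i = value | label)
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("sunny", "no")]
labels = ["yes", "yes", "no", "no", "yes"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("sunny", "no")))
```

    Each feature contributes an independent factor to the class score, which is the strong independence assumption the abstract describes; TAN relaxes it by letting each feature depend on one other feature as well as the class.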

  10. Isolation and sequencing of cDNA clones coding for the catalytic unit of glucose-6-phosphatase from two haplochromine cichlid fishes.

    PubMed

    Nagl, S; Mayer, W E; Klein, J

    1999-01-01

    Complementary DNA clones coding for the catalytic unit of the enzyme glucose-6-phosphatase (G6Pase) were obtained from Haplochromis nubilus and Haplochromis xenognathus, two cichlid fish species from Lake Victoria. The translated sequence of these two cDNAs identifies a polypeptide consisting of 352 amino acid residues and showing a 54.4% similarity to the human form of G6Pase. The amino acid sequences of the two fish species are identical. The comparison of the fish amino acid sequence with the corresponding sequences of rat, mouse, and human G6Pase revealed that the amino acid residues, which are involved in G6Pase catalysis in humans, are also conserved in fish G6Pase. Northern blot analysis showed that G6Pase is expressed at the same level in 6- and 10-day-old fish. A three base pair insertion/deletion polymorphism was found in the 3'-untranslated region of the fish G6Pase gene. The polymorphism will be a useful marker in a phylogenetic study of Lake Victoria cichlids.

  11. Translator, Traitor, Source of Data: Classifying Translations of "Foreign Phrases" as an Awareness-Raising Exercise.

    ERIC Educational Resources Information Center

    Parkinson, Brian

    1998-01-01

    A system for classifying (coding) translations of sentence-length or similar material is presented and illustrated with codings of entries in the "Dictionary of Foreign Phrases and Classical Quotations." Problems in coding are discussed, relating especially to intertextuality, intention, and ownership. The system is intended for pedagogic use, and…

  12. A region of the polyoma virus genome between the replication origin and late protein coding sequences is required in cis for both early gene expression and viral DNA replication.

    PubMed Central

    Tyndall, C; La Mantia, G; Thacker, C M; Favaloro, J; Kamen, R

    1981-01-01

    Deletion mutants within the Py DNA region between the replication origin and the beginning of late protein coding sequences have been constructed and analysed for viability, early gene expression and viral DNA replication. Assay of replicative competence was facilitated by the use of Py transformed mouse cells (COP lines) which express functional large T-protein but contain no free viral DNA. Viable mutants defined three new nonessential regions of the genome. Certain deletions spanning the PvuII site at nt 5130 (67.4 mu) were unable to express early genes and had a cis-acting defect in DNA replication. Other mutants had intermediate phenotypes. Relevance of these results to eucaryotic "enhancer" elements is discussed. PMID:6275353

  13. Clinical coding. Code breakers.

    PubMed

    Mathieson, Steve

    2005-02-24

    --The advent of payment by results has seen the role of the clinical coder pushed to the fore in England. --Examinations for a clinical coding qualification began in 1999. In 2004, approximately 200 people took the qualification. --Trusts are attracting people to the role by offering training from scratch or through modern apprenticeships.

  14. Maximum margin Bayesian network classifiers.

    PubMed

    Pernkopf, Franz; Wohlmayr, Michael; Tschiatschek, Sebastian

    2012-03-01

    We present a maximum margin parameter learning algorithm for Bayesian network classifiers using a conjugate gradient (CG) method for optimization. In contrast to previous approaches, we maintain the normalization constraints on the parameters of the Bayesian network during optimization, i.e., the probabilistic interpretation of the model is not lost. This enables us to handle missing features in discriminatively optimized Bayesian networks. In experiments, we compare the classification performance of maximum margin parameter learning to conditional likelihood and maximum likelihood learning approaches. Discriminative parameter learning significantly outperforms generative maximum likelihood estimation for naive Bayes and tree augmented naive Bayes structures on all considered data sets. Furthermore, maximizing the margin dominates the conditional likelihood approach in terms of classification performance in most cases. We provide results for a recently proposed maximum margin optimization approach based on convex relaxation. While the classification results are highly similar, our CG-based optimization is computationally up to orders of magnitude faster. Margin-optimized Bayesian network classifiers achieve classification performance comparable to support vector machines (SVMs) using fewer parameters. Moreover, we show that unanticipated missing feature values during classification can be easily processed by discriminatively optimized Bayesian network classifiers, a case where discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

  15. Classifying Cereal Data (Earlier Methods)

    Cancer.gov

    The DSQ includes questions about cereal intake and allows respondents up to two responses on which cereals they consume. We classified each cereal reported first by hot or cold, and then along four dimensions: density of added sugars, whole grains, fiber, and calcium.

  16. A Framework for Classifying Decision Support Systems

    PubMed Central

    Sim, Ida; Berlin, Amy

    2003-01-01

    Background Computer-based clinical decision support systems (CDSSs) vary greatly in design and function. A taxonomy for classifying CDSS structure and function would help efforts to describe and understand the variety of CDSSs in the literature, and to explore predictors of CDSS effectiveness and generalizability. Objective To define and test a taxonomy for characterizing the contextual, technical, and workflow features of CDSSs. Methods We retrieved and analyzed 150 English language articles published between 1975 and 2002 that described computer systems designed to assist physicians and/or patients with clinical decision making. We identified aspects of CDSS structure or function and iterated our taxonomy until additional article reviews did not result in any new descriptors or taxonomic modifications. Results Our taxonomy comprises 95 descriptors along 24 descriptive axes. These axes are in 5 categories: Context, Knowledge and Data Source, Decision Support, Information Delivery, and Workflow. The axes had an average of 3.96 coded choices each. 75% of the descriptors had an inter-rater agreement kappa of greater than 0.6. Conclusions We have defined and tested a comprehensive, multi-faceted taxonomy of CDSSs that shows promising reliability for classifying CDSSs reported in the literature. PMID:14728243
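
    The inter-rater agreement statistic reported in this abstract can be illustrated with a short sketch. It assumes two raters and Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e); the abstract does not say which kappa variant was used, and the ratings below are invented for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # p_o: observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # p_e: agreement expected by chance from each rater's label frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten articles for one descriptor
rater_a = ["y", "y", "y", "y", "n", "n", "n", "n", "y", "n"]
rater_b = ["y", "y", "y", "n", "n", "n", "n", "n", "y", "y"]
print(round(cohen_kappa(rater_a, rater_b), 3))
```

    In this toy case the raters agree on 8 of 10 items while chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6, the threshold quoted in the abstract.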

  17. Explosive Formulation Code Naming SOP

    SciTech Connect

    Martz, H. E.

    2014-09-19

    The purpose of this SOP is to provide a procedure for giving individual HME formulations code names. A code name for an individual HME formulation consists of an explosive family code, given by the classified guide, followed by a dash, -, and a number. If the formulation requires preparation such as packing or aging, these add additional groups of symbols to the X-ray specimen name.

  18. 76 FR 34761 - Classified National Security Information

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-14

    ... Classified National Security Information AGENCY: Marine Mammal Commission. ACTION: Notice. SUMMARY: This... information, as directed by Information Security Oversight Office regulations. FOR FURTHER INFORMATION CONTACT..., ``Classified National Security Information,'' and 32 CFR part 2001, ``Classified National Security...

  19. Energy-Efficient Neuromorphic Classifiers.

    PubMed

    Martí, Daniel; Rigotti, Mattia; Seok, Mingoo; Fusi, Stefano

    2016-10-01

    Neuromorphic engineering combines the architectural and computational principles of systems neuroscience with semiconductor electronics, with the aim of building efficient and compact devices that mimic the synaptic and neural machinery of the brain. The energy consumptions promised by neuromorphic engineering are extremely low, comparable to those of the nervous system. Until now, however, the neuromorphic approach has been restricted to relatively simple circuits and specialized functions, thereby obfuscating a direct comparison of their energy consumption to that used by conventional von Neumann digital machines solving real-world tasks. Here we show that a recent technology developed by IBM can be leveraged to realize neuromorphic circuits that operate as classifiers of complex real-world stimuli. Specifically, we provide a set of general prescriptions to enable the practical implementation of neural architectures that compete with state-of-the-art classifiers. We also show that the energy consumption of these architectures, realized on the IBM chip, is typically two or more orders of magnitude lower than that of conventional digital machines implementing classifiers with comparable performance. Moreover, the spike-based dynamics display a trade-off between integration time and accuracy, which naturally translates into algorithms that can be flexibly deployed for either fast and approximate classifications, or more accurate classifications at the mere expense of longer running times and higher energy costs. This work finally proves that the neuromorphic approach can be efficiently used in real-world applications and has significant advantages over conventional digital devices when energy consumption is considered.

  20. An enhanced MITOMAP with a global mtDNA mutational phylogeny

    PubMed Central

    Ruiz-Pesini, Eduardo; Lott, Marie T.; Procaccio, Vincent; Poole, Jason C.; Brandon, Marty C.; Mishmar, Dan; Yi, Christina; Kreuziger, James; Baldi, Pierre; Wallace, Douglas C.

    2007-01-01

    The MITOMAP data system for the human mitochondrial genome has been greatly enhanced by the addition of a navigable mutational mitochondrial DNA (mtDNA) phylogenetic tree of ∼3000 mtDNA coding region sequences plus expanded pathogenic mutation tables and a nuclear-mtDNA pseudogene (NUMT) data base. The phylogeny reconstructs the entire mutational history of the human mtDNA, thus defining the mtDNA haplogroups and differentiating ancient from recent mtDNA mutations. Pathogenic mutations are classified by both genotype and phenotype, and the NUMT sequences permit detection of spurious inclusion of pseudogene variants during mutation analysis. These additions position MITOMAP for the implementation of our automated mtDNA sequence analysis system, Mitomaster. PMID:17178747

  1. Learning to classify species with barcodes

    PubMed Central

    Bertolazzi, Paola; Felici, Giovanni; Weitschek, Emanuel

    2009-01-01

    Background According to many field experts, specimen classification based on morphological keys needs to be supported with automated techniques based on the analysis of DNA fragments. The most successful results in this area are those obtained from a particular fragment of mitochondrial DNA, the gene cytochrome c oxidase I (COI) (the "barcode"). Since 2004 the Consortium for the Barcode of Life (CBOL) has promoted the collection of barcode specimens and the development of methods to analyze the barcode for several tasks, among them the identification of rules to correctly classify an individual into its species by reading its barcode. Results We adopt a Logic Mining method based on two optimization models and present the results obtained on two datasets where a number of COI fragments are used to describe the individuals that belong to different species. The method proposed exhibits high correct recognition rates on a training-testing split of the available data using a small proportion of the information available (e.g., correct recognition approx. 97% when only 20 sites of the 648 available are used). The method is able to provide compact formulas on the values (A, C, G, T) at the selected sites that synthesize the characteristics of each species, which is relevant information for taxonomists. Conclusion We have presented a Logic Mining technique designed to analyze barcode data and to provide detailed output of interest to the taxonomists and the barcode community represented in the CBOL Consortium. The method has proven to be effective, efficient and precise. PMID:19900303
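
    To make the idea of "compact formulas on the values (A, C, G, T) at the selected sites" concrete, the sketch below evaluates such formulas against a barcode. It assumes each species formula is a disjunction of clauses, where each clause requires specific bases at specific sites. The species names, sites, and rules are hypothetical, and the optimization models that learn the formulas are not shown.

```python
def matches(barcode, formula):
    """True if any clause of the formula holds.

    A formula is a list of clauses (OR); each clause maps a site index
    to the base required at that site (AND).
    """
    return any(
        all(barcode[site] == base for site, base in clause.items())
        for clause in formula
    )


def classify(barcode, formulas):
    """Return the single species whose formula matches, else None."""
    hits = [species for species, formula in formulas.items()
            if matches(barcode, formula)]
    return hits[0] if len(hits) == 1 else None


# Hypothetical formulas over a 6-site barcode fragment
formulas = {
    "species_A": [{0: "A", 3: "T"}],
    "species_B": [{0: "G"}, {3: "C", 5: "C"}],
}
print(classify("ACGTAA", formulas))
```

    Returning None when zero or several formulas match keeps ambiguous specimens visible rather than forcing a label, which is a design choice of this sketch and not a claim about the published method.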

  2. Dimensionality Reduction Through Classifier Ensembles

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.; Tumer, Kagan; Norwig, Peter (Technical Monitor)

    1999-01-01

    In data mining, one often needs to analyze datasets with a very large number of attributes. Performing machine learning directly on such data sets is often impractical because of extensive run times, excessive complexity of the fitted model (often leading to overfitting), and the well-known "curse of dimensionality." In practice, to avoid such problems, feature selection and/or extraction are often used to reduce data dimensionality prior to the learning step. However, existing feature selection/extraction algorithms either evaluate features by their effectiveness across the entire data set or simply disregard class information altogether (e.g., principal component analysis). Furthermore, feature extraction algorithms such as principal components analysis create new features that are often meaningless to human users. In this article, we present input decimation, a method that provides "feature subsets" that are selected for their ability to discriminate among the classes. These features are subsequently used in ensembles of classifiers, yielding results superior to single classifiers, ensembles that use the full set of features, and ensembles based on principal component analysis on both real and synthetic datasets.
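
    The feature-subset step described in this abstract can be sketched as follows. The sketch assumes features are scored per class by the absolute Pearson correlation between each feature and a one-vs-rest class indicator; the abstract only says the subsets are chosen for their ability to discriminate among classes, so this criterion, the function names, and the data are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def decimate_features(rows, labels, k):
    """For each class, keep the k features most correlated with membership."""
    n_features = len(rows[0])
    subsets = {}
    for cls in sorted(set(labels)):
        # One-vs-rest indicator for the current class
        indicator = [1.0 if y == cls else 0.0 for y in labels]
        scores = [abs(pearson([row[j] for row in rows], indicator))
                  for j in range(n_features)]
        ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        subsets[cls] = sorted(ranked[:k])
    return subsets


# Hypothetical data: feature 0 tracks class "a", feature 1 tracks class "b",
# and feature 2 is constant noise.
rows = [(1.0, 0.0, 0.5), (0.9, 0.1, 0.5), (0.0, 1.0, 0.5),
        (0.1, 0.9, 0.5), (0.0, 0.0, 0.5), (0.1, 0.1, 0.5)]
labels = ["a", "a", "b", "b", "c", "c"]
print(decimate_features(rows, labels, k=1))
```

    In an ensemble, each class-specific subset would then feed its own base classifier, with the members' outputs combined afterwards; that training and combination step is omitted here.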

  3. Classifying sex biased congenital anomalies

    SciTech Connect

    Lubinsky, M.S.

    1997-03-31

    The reasons for sex biases in congenital anomalies that arise before structural or hormonal dimorphisms are established have long been unclear. A review of such disorders shows that patterning and tissue anomalies are female biased, and structural findings are more common in males. This suggests different gender dependent susceptibilities to developmental disturbances, with female vulnerabilities focused on early blastogenesis/determination, while males are more likely to involve later organogenesis/morphogenesis. A dual origin for some anomalies explains paradoxical reductions of sex biases with greater severity (i.e., multiple rather than single malformations), presumably as more severe events increase the involvement of an otherwise minor process with opposite biases to those of the primary mechanism. The cause for these sex differences is unknown, but early dimorphisms, such as differences in growth or presence of H-Y antigen, may be responsible. This model provides a useful rationale for understanding and classifying sex-biased congenital anomalies. 42 refs., 7 tabs.

  4. Ethical coding.

    PubMed

    Resnik, Barry I

    2009-01-01

    It is ethical, legal, and proper for a dermatologist to maximize income through proper coding of patient encounters and procedures. The overzealous physician can misinterpret reimbursement requirements or receive bad advice from other physicians and cross the line from aggressive coding to coding fraud. Several of the more common problem areas are discussed.

  5. 75 FR 707 - Classified National Security Information

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-05

    ... National Security Information Memorandum of December 29, 2009--Implementation of the Executive Order ``Classified National Security Information'' Order of December 29, 2009--Original Classification Authority #0... 13526 of December 29, 2009 Classified National Security Information This order prescribes a...

  6. Efficient DNA barcode regions for classifying Piper species (Piperaceae).

    PubMed

    Chaveerach, Arunrat; Tanee, Tawatchai; Sanubol, Arisa; Monkheang, Pansa; Sudmoon, Runglawan

    2016-01-01

    Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, Piper betle had the highest values at 0.386 for the matK region. This finding may be due to Piper betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, Piper kraense and Piper dominantinervium, Piper magnibaccum and Piper kraense, Piper phuwuaense and Piper dominantinervium, Piper phuwuaense and Piper kraense, Piper pilobracteatum and Piper dominantinervium, Piper pilobracteatum and Piper kraense, Piper pilobracteatum and Piper phuwuaense and Piper sylvestre and Piper polysyphonum, presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species.
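
    As an illustration of how a genetic distance can be read off two aligned barcode sequences, the sketch below computes the uncorrected p-distance, the fraction of compared sites at which the sequences differ. The abstract does not state which distance model the authors used, so the choice of p-distance, the gap handling, and the example sequences are assumptions.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned fragments that differ at 2 of 10 sites
print(p_distance("ACGTACGTAC", "ACGAACGTTC"))
```

    A distance of 0.000 between two species, as reported for eight pairs in the matK region, means the aligned fragments are identical at every compared site, which is why a second region is needed to tell those species apart.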

  7. Efficient DNA barcode regions for classifying Piper species (Piperaceae)

    PubMed Central

    Chaveerach, Arunrat; Tanee, Tawatchai; Sanubol, Arisa; Monkheang, Pansa; Sudmoon, Runglawan

    2016-01-01

    Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, Piper betle had the highest values at 0.386 for the matK region. This finding may be due to Piper betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, Piper kraense and Piper dominantinervium, Piper magnibaccum and Piper kraense, Piper phuwuaense and Piper dominantinervium, Piper phuwuaense and Piper kraense, Piper pilobracteatum and Piper dominantinervium, Piper pilobracteatum and Piper kraense, Piper pilobracteatum and Piper phuwuaense and Piper sylvestre and Piper polysyphonum, presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species. PMID:27829794

  8. 32 CFR 775.5 - Classified actions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Air Act (42 U.S.C. 7609 et seq.). (b) It should be noted that a classified EA/EIS serves the same “informed decisionmaking” purpose as does a published unclassified EA/EIS. Even though the classified EA/EIS... be considered by the decisionmaker for the proposed action. The content of a classified EA/EIS...

  9. 15 CFR 4.8 - Classified Information.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Classified Information. 4.8 Section 4... INFORMATION Freedom of Information Act § 4.8 Classified Information. In processing a request for information..., the information shall be reviewed to determine whether it should remain classified. Ordinarily...

  10. 32 CFR 1602.8 - Classifying authority.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Classifying authority. 1602.8 Section 1602.8 National Defense Other Regulations Relating to National Defense SELECTIVE SERVICE SYSTEM DEFINITIONS § 1602.8 Classifying authority. The term classifying authority refers to any official or board who...

  11. Sharing code.

    PubMed

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing.

  12. Measuring Diagnoses: ICD Code Accuracy

    PubMed Central

    O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M

    2005-01-01

    Objective To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. Data Sources/Study Setting The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. Study Design/Methods We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Principal Findings Main error sources along the “patient trajectory” include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the “paper trail” include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. Conclusions By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways. PMID:16178999

  13. DNA polymorphism in morels: complete sequences of the internal transcribed spacer of genes coding for rRNA in Morchella esculenta (yellow morel) and Morchella conica (black morel).

    PubMed

    Wipf, D; Munch, J C; Botton, B; Buscot, F

    1996-09-01

    The internal transcribed spacer (ITS) of the gene coding for rRNA was sequenced in both directions with the gene walking technique in a black morel (Morchella conica) and a yellow morel (M. esculenta) to elucidate the ITS length discrepancy between the two species groups (750-bp ITS in black morels and 1,150-bp ITS in yellow morels).

  14. Isolation and expression of a novel chick G-protein cDNA coding for a G alpha i3 protein with a G alpha 0 N-terminus.

    PubMed Central

    Kilbourne, E J; Galper, J B

    1994-01-01

    We have cloned cDNAs coding for G-protein alpha subunits from a chick brain cDNA library. Based on sequence similarity to G-protein alpha subunits from other eukaryotes, one clone was designated G alpha i3. A second clone, G alpha i3-o, was identical to the G alpha i3 clone over 932 bases on the 3' end. The 5' end of G alpha i3-o, however, contained an alternative sequence in which the first 45 amino acids coded for are 100% identical to the conserved N-terminus of G alpha o from species such as rat, mouse, human, bovine and hamster. Both clones were found to be expressed in all tissues studied. The unusual alpha o-alpha i3-like G-protein chimera, G alpha i3-o, was found to be expressed at significantly lower levels than G alpha i3. In vitro transcription and translation of the G alpha i3-o cDNA clone gave a protein of approx. 41 kDa which stably bound guanosine 5'-[gamma-thio]triphosphate. G alpha i3-o appears to be the first G-protein alpha subunit cloned which contains ends that are homologous to two different alpha subunit isoforms, G alpha o and G alpha i3. PMID:8297335

  15. 22 CFR 125.3 - Exports of classified technical data and classified defense articles.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Exports of classified technical data and... IN ARMS REGULATIONS LICENSES FOR THE EXPORT OF TECHNICAL DATA AND CLASSIFIED DEFENSE ARTICLES § 125.3 Exports of classified technical data and classified defense articles. (a) A request for authority...

  16. Monitoring tool wear using classifier fusion

    NASA Astrophysics Data System (ADS)

    Kannatey-Asibu, Elijah; Yum, Juil; Kim, T. H.

    2017-02-01

    Real time monitoring of manufacturing processes using a single sensor often poses significant challenge. Sensor fusion has thus been extensively investigated in recent years for process monitoring with significant improvement in performance. This paper presents the results for a monitoring system based on the concept of classifier fusion, and class-weighted voting is investigated to further enhance the system performance. Classifier weights are based on the overall performances of individual classifiers, and majority voting is used in decision making. Acoustic emission monitoring of tool wear during the coroning process is used to illustrate the concept. A classification rate of 87.7% was obtained for classifier fusion with unity weighting. When weighting was based on overall performance of the respective classifiers, the classification rate improved to 95.6%. Further using state performance weighting resulted in a 98.5% classification. Finally, the classifier fusion performance further increased to 99.7% when a penalty vote was applied on the weighting factor.
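
    The voting scheme outlined in this abstract can be sketched as a weighted majority vote. With unity weights it reduces to plain majority voting; with weights set to each classifier's overall accuracy it corresponds to the performance-weighted case. The wear-state labels and accuracy values are hypothetical, and the state-performance weighting and penalty vote mentioned in the abstract are not modelled.

```python
from collections import defaultdict


def fuse_votes(predictions, weights=None):
    """Combine one predicted label per classifier by weighted voting.

    Ties are broken in favour of the label that was voted for first.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if weights is None:
        weights = [1.0] * len(predictions)  # unity weighting
    if len(weights) != len(predictions):
        raise ValueError("one weight per classifier is required")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three classifiers for one acoustic-emission sample
votes = ["sharp", "worn", "worn"]
print(fuse_votes(votes))                     # plain majority vote
print(fuse_votes(votes, [0.95, 0.40, 0.40])) # accuracy-weighted vote
```

    The second call shows how weighting can overturn a simple majority when one classifier is much more reliable than the others.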

  17. Error minimizing algorithms for nearest neighbor classifiers

    SciTech Connect

    Porter, Reid B; Hush, Don; Zimmer, G. Beate

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  18. Classifying the Quantum Phases of Matter

    DTIC Science & Technology

    2015-01-01

    CLASSIFYING THE QUANTUM PHASES OF MATTER CALIFORNIA INSTITUTE OF TECHNOLOGY JANUARY 2015 FINAL TECHNICAL REPORT...REPORT 3. DATES COVERED (From - To) JAN 2012 – AUG 2014 4. TITLE AND SUBTITLE CLASSIFYING THE QUANTUM PHASES OF MATTER 5a. CONTRACT NUMBER FA8750-12-2...16 Jan 09. 13. SUPPLEMENTARY NOTES 14. ABSTRACT This is the final report for "Classifying the Quantum Phases of Matter," FA8750-12-2-0308. Among

  19. Sharing code

    PubMed Central

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing. PMID:25165519

  20. The changing epitome of species identification – DNA barcoding

    PubMed Central

    Ajmal Ali, M.; Gyulai, Gábor; Hidvégi, Norbert; Kerti, Balázs; Al Hemaid, Fahad M.A.; Pandey, Arun K.; Lee, Joongku

    2014-01-01

    The discipline taxonomy (the science of naming and classifying organisms, the original bioinformatics and a basis for all biology) is fundamentally important in ensuring the quality of life of future human generations on the earth; yet over the past few decades, the teaching and research funding in taxonomy have declined because of its classical way of practice, which many times led the discipline to become a subject of opinion, and this ultimately gave birth to several problems and challenges, and therefore the taxonomist became an endangered race in the era of genomics. Now taxonomy has suddenly become fashionable again due to a revolutionary approach in taxonomy called DNA barcoding (a novel technology to provide rapid, accurate, and automated species identifications using short orthologous DNA sequences). In DNA barcoding, a complete data set can be obtained from a single specimen irrespective of morphological or life stage characters. The core idea of DNA barcoding is based on the fact that the highly conserved stretches of DNA, either coding or non-coding regions, vary to a very minor degree during evolution within the species. Sequences suggested to be useful in DNA barcoding include cytoplasmic mitochondrial DNA (e.g. cox1) and chloroplast DNA (e.g. rbcL, trnL-F, matK, ndhF, and atpB rbcL), and nuclear DNA (ITS, and housekeeping genes e.g. gapdh). Plant DNA barcoding is now transitioning the epitome of species identification; and thus, ultimately helping in the molecularization of taxonomy, a need of the hour. The ‘DNA barcodes’ show promise in providing a practical, standardized, species-level identification tool that can be used for biodiversity assessment, life history and ecological studies, forensic analysis, and many more. PMID:24955007

  1. The changing epitome of species identification - DNA barcoding.

    PubMed

    Ajmal Ali, M; Gyulai, Gábor; Hidvégi, Norbert; Kerti, Balázs; Al Hemaid, Fahad M A; Pandey, Arun K; Lee, Joongku

    2014-07-01

    The discipline taxonomy (the science of naming and classifying organisms, the original bioinformatics and a basis for all biology) is fundamentally important in ensuring the quality of life of future human generations on the earth; yet over the past few decades, the teaching and research funding in taxonomy have declined because of its classical way of practice, which many times led the discipline to become a subject of opinion, and this ultimately gave birth to several problems and challenges, and therefore the taxonomist became an endangered race in the era of genomics. Now taxonomy has suddenly become fashionable again due to a revolutionary approach in taxonomy called DNA barcoding (a novel technology to provide rapid, accurate, and automated species identifications using short orthologous DNA sequences). In DNA barcoding, a complete data set can be obtained from a single specimen irrespective of morphological or life stage characters. The core idea of DNA barcoding is based on the fact that the highly conserved stretches of DNA, either coding or non-coding regions, vary to a very minor degree during evolution within the species. Sequences suggested to be useful in DNA barcoding include cytoplasmic mitochondrial DNA (e.g. cox1) and chloroplast DNA (e.g. rbcL, trnL-F, matK, ndhF, and atpB rbcL), and nuclear DNA (ITS, and housekeeping genes e.g. gapdh). Plant DNA barcoding is now transitioning the epitome of species identification; and thus, ultimately helping in the molecularization of taxonomy, a need of the hour. The 'DNA barcodes' show promise in providing a practical, standardized, species-level identification tool that can be used for biodiversity assessment, life history and ecological studies, forensic analysis, and many more.

  2. 48 CFR 927.207 - Classified contracts.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Classified contracts. 927.207 Section 927.207 Federal Acquisition Regulations System DEPARTMENT OF ENERGY GENERAL CONTRACTING REQUIREMENTS PATENTS, DATA, AND COPYRIGHTS Patents 927.207 Classified contracts....

  3. A fuzzy classifier system for process control

    NASA Technical Reports Server (NTRS)

    Karr, C. L.; Phillips, J. C.

    1994-01-01

    A fuzzy classifier system that discovers rules for controlling a mathematical model of a pH titration system was developed by researchers at the U.S. Bureau of Mines (USBM). Fuzzy classifier systems successfully combine the strengths of learning classifier systems and fuzzy logic controllers. Learning classifier systems resemble familiar production rule-based systems, but they represent their IF-THEN rules by strings of characters rather than in the traditional linguistic terms. Fuzzy logic is a tool that allows for the incorporation of abstract concepts into rule based-systems, thereby allowing the rules to resemble the familiar 'rules-of-thumb' commonly used by humans when solving difficult process control and reasoning problems. Like learning classifier systems, fuzzy classifier systems employ a genetic algorithm to explore and sample new rules for manipulating the problem environment. Like fuzzy logic controllers, fuzzy classifier systems encapsulate knowledge in the form of production rules. The results presented in this paper demonstrate the ability of fuzzy classifier systems to generate a fuzzy logic-based process control system.

  4. The coding region of the UFGT gene is a source of diagnostic SNP markers that allow single-locus DNA genotyping for the assessment of cultivar identity and ancestry in grapevine (Vitis vinifera L.)

    PubMed Central

    2013-01-01

    Background Vitis vinifera L. is one of society’s most important agricultural crops with a broad genetic variability. The difficulty in recognizing grapevine genotypes based on ampelographic traits and secondary metabolites prompted the development of molecular markers suitable for achieving variety genetic identification. Findings Here, we propose a comparison between a multi-locus barcoding approach based on six chloroplast markers and a single-copy nuclear gene sequencing method using five coding regions combined with a character-based system with the aim of reconstructing cultivar-specific haplotypes and genotypes to be exploited for the molecular characterization of 157 V. vinifera accessions. The analysis of the chloroplast target regions proved the inadequacy of the DNA barcoding approach at the subspecies level, and hence further DNA genotyping analyses were targeted on the sequences of five nuclear single-copy genes amplified across all of the accessions. The sequencing of the coding region of the UFGT nuclear gene (UDP-glucose: flavonoid 3-O-glucosyltransferase, the key enzyme for the accumulation of anthocyanins in berry skins) enabled the discovery of discriminant SNPs (1/34 bp) and the reconstruction of 130 V. vinifera distinct genotypes. Most of the genotypes proved to be cultivar-specific, and only a few genotypes were shared by more than one cultivar, and then only by closely related ones. Conclusion On the whole, this technique was successful for inferring SNP-based genotypes of grapevine accessions suitable for assessing the genetic identity and ancestry of international cultivars and also useful for corroborating some hypotheses regarding the origin of local varieties, suggesting several issues of misidentification (synonymy/homonymy). PMID:24298902

  5. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. Hence from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques and is often used interchangeably with speech coding is the term voice coding. This term is more generic in the sense that the

  6. Logarithmic learning for generalized classifier neural network.

    PubMed

    Ozyildirim, Buse Melis; Avci, Mutlu

    2014-12-01

    Generalized classifier neural network is introduced as an efficient classifier among the others. Unless the initial smoothing parameter value is close to the optimal one, generalized classifier neural network suffers from convergence problem and requires quite a long time to converge. In this work, to overcome this problem, a logarithmic learning approach is proposed. The proposed method uses logarithmic cost function instead of squared error. Minimization of this cost function reduces the number of iterations used for reaching the minima. The proposed method is tested on 15 different data sets and performance of logarithmic learning generalized classifier neural network is compared with that of standard one. Thanks to operation range of radial basis function included by generalized classifier neural network, proposed logarithmic approach and its derivative has continuous values. This makes it possible to adopt the advantage of logarithmic fast convergence by the proposed learning method. Due to fast convergence ability of logarithmic cost function, training time is decreased by up to 99.2%. In addition to decrease in training time, classification performance may also be improved by up to 60%. According to the test results, while the proposed method provides a solution for time requirement problem of generalized classifier neural network, it may also improve the classification accuracy. The proposed method can be considered as an efficient way for reducing the time requirement problem of generalized classifier neural network.
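    A small Python sketch may help show why a logarithmic cost can converge faster than squared error. It uses the common cross-entropy form as a stand-in, which is an assumption: the abstract does not give the exact cost function used by the authors, and all function names here are hypothetical.

```python
import math

def squared_error(y, t):
    """Squared-error cost for output y in (0, 1) and target t."""
    return (t - y) ** 2

def log_cost(y, t):
    """Cross-entropy cost, one common logarithmic cost (assumed form)."""
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

def squared_error_grad(y, t):
    """Derivative of the squared error with respect to y."""
    return -2 * (t - y)

def log_cost_grad(y, t):
    """Derivative of the cross-entropy cost with respect to y."""
    return -(t / y) + (1 - t) / (1 - y)

# A badly wrong output: the target is 1 but the network outputs 0.01.
y, t = 0.01, 1.0
sq_step = abs(squared_error_grad(y, t))
log_step = abs(log_cost_grad(y, t))
```

    For this badly wrong output the logarithmic cost has a far steeper gradient than the squared error, so a gradient step moves the output toward the target much more quickly.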

  7. Phylogenetic footprinting of non-coding RNA: hammerhead ribozyme sequences in a satellite DNA family of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae)

    PubMed Central

    2010-01-01

    Background The great variety in sequence, length, complexity, and abundance of satellite DNA has made it difficult to ascribe any function to this genome component. Recent studies have shown that satellite DNA can be transcribed and be involved in regulation of chromatin structure and gene expression. Some satellite DNAs, such as the pDo500 sequence family in Dolichopoda cave crickets, have a catalytic hammerhead (HH) ribozyme structure and activity embedded within each repeat. Results We assessed the phylogenetic footprints of the HH ribozyme within the pDo500 sequences from 38 different populations representing 12 species of Dolichopoda. The HH region was significantly more conserved than the non-hammerhead (NHH) region of the pDo500 repeat. In addition, stems were more conserved than loops. In stems, several compensatory mutations were detected that maintain base pairing. The core region of the HH ribozyme was affected by very few nucleotide substitutions and the cleavage position was altered only once among 198 sequences. RNA folding of the HH sequences revealed that a potentially active HH ribozyme can be found in most of the Dolichopoda populations and species. Conclusions The phylogenetic footprints suggest that the HH region of the pDo500 sequence family is selected for function in Dolichopoda cave crickets. However, the functional role of HH ribozymes in eukaryotic organisms is unclear. The possible functions have been related to trans cleavage of an RNA target by a ribonucleoprotein and regulation of gene expression. Whether the HH ribozyme in Dolichopoda is involved in similar functions remains to be investigated. Future studies need to demonstrate how the observed nucleotide changes and evolutionary constraint have affected the catalytic efficiency of the hammerhead. PMID:20047671

  8. Epigenetic DNA-methylation regulation of genes coding for lipid raft-associated components: a role for raft proteins in cell transformation and cancer progression (review).

    PubMed

    Patra, Samir K; Bettuzzi, Saverio

    2007-06-01

    Metastatic progression is the cause of most cancer deaths. Host tumour cell separation (fission) is accompanied by simultaneous acquisition of migrating capability of cancer cells, remodeling of cellular architecture and effective 'homing' in body host environment. Cell remodeling involves cytoskeletal protein-protein and lipid-protein interaction together with altered signaling. Alteration of signaling in tumour cells may affect expression of many genes also by DNA-methylation/demethylation. This would alter the steady-state intracellular level of structural proteins or metabolic enzymes, and notably enzymes involved in the biosynthesis of lipids, affecting the composition of membranes. Lipid rafts are small, heterogeneous, highly dynamic, sterol- and sphingolipid-enriched domains that compartmentalize cellular processes. Small rafts can be stabilized to form larger platforms through protein-protein and protein-lipid interactions. Lipid rafts play an important role in intracellular protein transport, membrane fusion and trans-cytosis, also being platforms for cell surface antigens and adhesion molecules which are crucial for cell activation, polarization and signaling. Detachment of individual tumour cells from the host tumour lump requires lipid-protein-lipid raft (LPLR) reordering. Lipid rafts are also involved in angiogenesis and local invasion, which occurs within the host tumour vicinity by exchange of enzymes, cytokines and motility factors that modify the surrounding extracellular matrix (ECM). Many cell surface adhesion, ECM, and signaling proteins (such as E-cadherin, catenin, CD44, MMP-9 and caveolin-1) are known to be absent or reduced following gene promoter-CpG-island hypermethylation in mid-stage growing tumours, but re-expressed (by gene promoter-mCpG-DNA demethylation) in carcinomas such as metastasized lung, prostate and sarcomas. The recent research acquisitions on lipid rafts have tremendous implications in understanding the genetic and

  9. Haplogrouping mitochondrial DNA sequences in Legal Medicine/Forensic Genetics.

    PubMed

    Bandelt, Hans-Jürgen; van Oven, Mannis; Salas, Antonio

    2012-11-01

    Haplogrouping refers to the classification of (partial) mitochondrial DNA (mtDNA) sequences into haplogroups using the current knowledge of the worldwide mtDNA phylogeny. Haplogroup assignment of mtDNA control-region sequences assists in the focused comparison with closely related complete mtDNA sequences and thus serves two main goals in forensic genetics: first is the a posteriori quality analysis of sequencing results and second is the prediction of relevant coding-region sites for confirmation or further refinement of haplogroup status. The latter may be important in forensic casework where discrimination power needs to be as high as possible. However, most articles published in forensic genetics perform haplogrouping only in a rudimentary or incorrect way. The present study features PhyloTree as the key tool for assigning control-region sequences to haplogroups and elaborates on additional Web-based searches for finding near-matches with complete mtDNA genomes in the databases. In contrast, none of the automated haplogrouping tools available can yet compete with manual haplogrouping using PhyloTree plus additional Web-based searches, especially when confronted with artificial recombinants still present in forensic mtDNA datasets. We review and classify the various attempts at haplogrouping by using a multiplex approach or relying on automated haplogrouping. Furthermore, we re-examine a few articles in forensic journals providing mtDNA population data where appropriate haplogrouping following PhyloTree immediately highlights several kinds of sequence errors.
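    As a rough illustration of motif-based haplogroup assignment, the Python sketch below picks the most specific haplogroup whose defining variants are all present in a sequence. The haplogroup names and variant labels are invented placeholders, not entries from PhyloTree, and the logic ignores back-mutations, recombinants, and near-matches, which the abstract indicates still call for manual review.

```python
# Hypothetical defining variants; real assignments should follow PhyloTree.
HAPLOGROUP_MOTIFS = {
    "HgX": {"pos100T", "pos200C"},
    "HgX1": {"pos100T", "pos200C", "pos300A"},
    "HgY": {"pos150G", "pos250T"},
}

def assign_haplogroup(variants, motifs=HAPLOGROUP_MOTIFS):
    """Return the haplogroup with the largest fully matched motif, or None."""
    observed = set(variants)
    # Keep only haplogroups whose whole motif is present in the sample.
    matches = [hg for hg, motif in motifs.items() if motif <= observed]
    if not matches:
        return None
    # Prefer the most specific match, i.e. the longest motif.
    return max(matches, key=lambda hg: len(motifs[hg]))

sample = ["pos100T", "pos200C", "pos300A", "pos777G"]
assigned = assign_haplogroup(sample)
```

    Variants in the sample that belong to no matched motif (here `pos777G`) are simply ignored by this sketch; in practice such private variants are the ones worth checking as possible sequence errors.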

  10. How Is Acute Lymphocytic Leukemia Classified?

    MedlinePlus

    ... Adults Early Detection, Diagnosis, and Types How Is Acute Lymphocytic Leukemia Classified? Most types of cancers are assigned numbered ... ALL are now named as follows: B-cell ALL Early pre-B ALL (also called pro-B ...

  11. 14 CFR 1216.310 - Classified actions.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... actions. (a) Classification does not relieve NASA of the requirement to assess, document, and consider the environmental impacts of a proposed action. (b) When classified information can reasonably be separated...

  12. Cascaded multiple classifiers for secondary structure prediction.

    PubMed Central

    Ouali, M.; King, R. D.

    2000-01-01

    We describe a new classifier for protein secondary structure prediction that is formed by cascading together different types of classifiers using neural networks and linear discrimination. The new classifier achieves an accuracy of 76.7% (assessed by a rigorous full Jack-knife procedure) on a new nonredundant dataset of 496 nonhomologous sequences (obtained from G.J. Barton and J.A. Cuff). This database was especially designed to train and test protein secondary structure prediction methods, and it uses a more stringent definition of homologous sequence than in previous studies. We show that it is possible to design classifiers that can highly discriminate the three classes (H, E, C) with an accuracy of up to 78% for beta-strands, using only a local window and resampling techniques. This indicates that the importance of long-range interactions for the prediction of beta-strands has been probably previously overestimated. PMID:10892809

  13. 32 CFR 1633.1 - Classifying authority.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... reclassify a registrant other than a volunteer for induction, into Class 1-A out of another class prior to... issuing an induction order to a registrant, appropriately classify him if the Secretary of Defense...

  14. Adaptive Bayes classifiers for remotely sensed data

    NASA Technical Reports Server (NTRS)

    Raulston, H. S.; Pace, M. O.; Gonzalez, R. C.

    1975-01-01

    An algorithm is developed for a learning, adaptive, statistical pattern classifier for remotely sensed data. The estimation procedure consists of two steps: (1) an optimal stochastic approximation of the parameters of interest, and (2) a projection of the parameters in time and space. The results reported are for Gaussian data in which the mean vector of each class may vary with time or position after the classifier is trained.
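    The first step described above, a stochastic approximation of a drifting class mean, can be sketched in Python as follows. This is only an illustration under simplifying assumptions (equal priors, identity covariance, a constant gain, decision-directed updates); the class names, the gain value, and the data are hypothetical, and the projection step in time and space is not modelled.

```python
def update_mean(mean, sample, gain):
    """One stochastic-approximation step: move the mean toward the sample."""
    return [m + gain * (x - m) for m, x in zip(mean, sample)]

def classify(sample, means):
    """Pick the class whose mean is closest (equal priors, identity covariance)."""
    def sq_dist(mean):
        return sum((x - m) ** 2 for x, m in zip(sample, mean))
    return min(means, key=lambda label: sq_dist(means[label]))

# Class means as they stood after initial training (hypothetical values).
means = {"water": [0.0, 0.0], "crop": [4.0, 4.0]}

# The "crop" class drifts; each classified sample nudges its class mean.
stream = [[4.5, 4.5], [5.0, 5.0], [5.5, 5.5], [6.0, 6.0]]
for sample in stream:
    label = classify(sample, means)
    means[label] = update_mean(means[label], sample, gain=0.5)
```

    A constant gain lets the estimate keep tracking a moving mean, whereas a gain that shrinks as 1/k would converge to the ordinary sample mean and stop adapting.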

  15. Classification and Coding, An Introduction and Review of Classification and Coding Systems. Management Guide No. 1.

    ERIC Educational Resources Information Center

    MacConnell, W.

    Nearly all organizations are faced with problems of classifying and coding financial data, management and technical information, components, stores, etc. and need to apply some logical and meaningful system of identification. This report examines the objectives and applications of classification and coding systems and reviews eight systems…

  16. The Rocchio classifier and second generation wavelets

    NASA Astrophysics Data System (ADS)

    Carter, Patricia H.

    2007-04-01

    Classification and characterization of text is of ever growing importance in defense and national security. The text classification task is an instance of classification using sparse features residing in a high dimensional feature space. Two standard (of a wide selection of available) algorithms for this task are the naive Bayes classifier and the Rocchio linear classifier. Naive Bayes classifiers are widely applied; the Rocchio algorithm is primarily used in document classification and information retrieval. Both these classifiers are popular because of their simplicity and ease of application, computational speed and reasonable performance. One aspect of the Rocchio approach, inherited from its information retrieval origin, is that it explicitly uses both positive and negative models. Parameters have been introduced which make it adaptive to the particulars of the corpora of interest and thereby improve its performance. The ideas inherent in these classifiers and in second generation wavelets can be recombined into new algorithms for classification. An example is a classifier using second generation wavelet-like functions for class probes that mimic the Rocchio positive template - negative template approach.
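    The positive template minus negative template idea behind the Rocchio classifier can be sketched in a few lines of Python. This is a minimal illustration, not the paper's wavelet-based variant: the `beta` and `gamma` values, the toy vocabulary, and the term counts are all assumptions.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def rocchio_template(positive, negative, beta=1.0, gamma=0.25):
    """Class template: weighted positive centroid minus negative centroid."""
    pos_c = centroid(positive)
    neg_c = centroid(negative)
    return [beta * p - gamma * q for p, q in zip(pos_c, neg_c)]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(doc, templates):
    """Return the label whose template is most similar to the document."""
    return max(templates, key=lambda label: cosine(doc, templates[label]))

# Toy term counts over the vocabulary ["missile", "radar", "recipe", "flour"].
defense_docs = [[3, 2, 0, 0], [2, 4, 0, 1]]
cooking_docs = [[0, 0, 4, 3], [0, 1, 2, 5]]
templates = {
    "defense": rocchio_template(defense_docs, cooking_docs),
    "cooking": rocchio_template(cooking_docs, defense_docs),
}
label = classify([1, 3, 0, 0], templates)
```

    Subtracting the negative centroid pushes each template away from terms typical of the other class, which is the explicit use of negative models that the abstract attributes to the Rocchio approach.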

  17. QR Codes

    ERIC Educational Resources Information Center

    Lai, Hsin-Chih; Chang, Chun-Yen; Li, Wen-Shiane; Fan, Yu-Lin; Wu, Ying-Tien

    2013-01-01

    This study presents an m-learning method that incorporates Integrated Quick Response (QR) codes. This learning method not only achieves the objectives of outdoor education, but it also increases applications of Cognitive Theory of Multimedia Learning (CTML) (Mayer, 2001) in m-learning for practical use in a diverse range of outdoor locations. When…

  18. Orthopedics coding and funding.

    PubMed

    Baron, S; Duclos, C; Thoreux, P

    2014-02-01

    The French tarification à l'activité (T2A) prospective payment system is a financial system in which a health-care institution's resources are based on performed activity. Activity is described via the PMSI medical information system (programme de médicalisation du système d'information). The PMSI classifies hospital cases by clinical and economic categories known as diagnosis-related groups (DRG), each with an associated price tag. Coding a hospital case involves giving as realistic a description as possible so as to categorize it in the right DRG and thus ensure appropriate payment. For this, it is essential to understand what determines the pricing of inpatient stay: namely, the code for the surgical procedure, the patient's principal diagnosis (reason for admission), codes for comorbidities (everything that adds to management burden), and the management of the length of inpatient stay. The PMSI is used to analyze the institution's activity and dynamism: change on previous year, relation to target, and comparison with competing institutions based on indicators such as the mean length of stay performance indicator (MLS PI). The T2A system improves overall care efficiency. Quality of care, however, is not presently taken account of in the payment made to the institution, as there are no indicators for this; work needs to be done on this topic.

  19. Double-coding nucleic acids: introduction of a nucleobase sequence in the major groove of the DNA duplex using double-headed nucleotides.

    PubMed

    Kumar, Pawan; Sorinas, Antoni Figueras; Nielsen, Lise J; Slot, Maria; Skytte, Kirstine; Nielsen, Annie S; Jensen, Michael D; Sharma, Pawan K; Vester, Birte; Petersen, Michael; Nielsen, Poul

    2014-09-05

    A series of double-headed nucleosides were synthesized using the Sonogashira cross-coupling reaction. In the reactions, additional nucleobases (thymine, cytosine, adenine, or guanine) were attached to the 5-position of 2'-deoxyuridine or 2'-deoxycytidine through a propyne linker. The modified nucleosides were incorporated into oligonucleotides, and these were combined in different duplexes that were analyzed by thermal denaturation studies. All of the monomers were well tolerated in the DNA duplexes and induced only small changes in the thermal stability. Consecutive incorporations of the monomers led to increases in duplex stability owing to increased stacking interactions. The modified nucleotide monomers maintained the Watson-Crick base pair fidelity. Stable duplexes were observed with heavily modified oligonucleotides featuring 14 consecutive incorporations of different double-headed nucleotide monomers. Thus, modified duplexes with an array of nucleobases on the exterior of the duplex were designed. Molecular dynamics simulations demonstrated that the additional nucleobases could expose their Watson-Crick and/or Hoogsteen faces for recognition in the major groove. This presentation of nucleobases may find applications in providing molecular information without unwinding the duplex.

  20. Isolation and characterization of an atypical LEA protein coding cDNA and its promoter from drought-tolerant plant Prosopis juliflora.

    PubMed

    George, Suja; Usha, B; Parida, Ajay

    2009-05-01

    Plant growth and productivity are adversely affected by various abiotic and biotic stress factors. Despite the wealth of information on abiotic stress and stress tolerance in plants, many aspects still remain unclear. Prosopis juliflora is a hardy plant reported to be tolerant to drought, salinity, extremes of soil pH, and heavy metal stress. In this paper, we report the isolation and characterization of the complementary DNA clone for an atypical late embryogenesis abundant (LEA) protein (Pj LEA3) and its putative promoter sequence from P. juliflora. Unlike typical LEA proteins, rich in glycine, Pj LEA3 has alanine as the most abundant amino acid followed by serine and shows an average negative hydropathy. Pj LEA3 is significantly different from other LEA proteins in the NCBI database and shows high similarity to indole-3 acetic-acid-induced protein ARG2 from Vigna radiata. Northern analysis for Pj LEA3 in P. juliflora leaves under 90 mM H2O2 stress revealed up-regulation of transcript at 24 and 48 h. A 1.5-kb fragment upstream the 5' UTR of this gene (putative promoter) was isolated and analyzed in silico. The possible reasons for changes in gene expression during stress in relation to the host plant's stress tolerance mechanisms are discussed.

  1. Breaking the DNA-binding code of Ralstonia solanacearum TAL effectors provides new possibilities to generate plant resistance genes against bacterial wilt disease.

    PubMed

    de Lange, Orlando; Schreiber, Tom; Schandry, Niklas; Radeck, Jara; Braun, Karl Heinz; Koszinowski, Julia; Heuer, Holger; Strauß, Annett; Lahaye, Thomas

    2013-08-01

    Ralstonia solanacearum is a devastating bacterial phytopathogen with a broad host range. Ralstonia solanacearum injected effector proteins (Rips) are key to the successful invasion of host plants. We have characterized Brg11(hrpB-regulated 11), the first identified member of a class of Rips with high sequence similarity to the transcription activator-like (TAL) effectors of Xanthomonas spp., collectively termed RipTALs. Fluorescence microscopy of in planta expressed RipTALs showed nuclear localization. Domain swaps between Brg11 and Xanthomonas TAL effector (TALE) AvrBs3 (avirulence protein triggering Bs3 resistance) showed the functional interchangeability of DNA-binding and transcriptional activation domains. PCR was used to determine the sequence of brg11 homologs from strains infecting phylogenetically diverse host plants. Brg11 localizes to the nucleus and activates promoters containing a matching effector-binding element (EBE). Brg11 and homologs preferentially activate promoters containing EBEs with a 5' terminal guanine, contrasting with the TALE preference for a 5' thymine. Brg11 and other RipTALs probably promote disease through the transcriptional activation of host genes. Brg11 and the majority of homologs identified in this study were shown to activate similar or identical target sequences, in contrast to TALEs, which generally show highly diverse target preferences. This information provides new options for the engineering of plants resistant to R. solanacearum.

  2. Sequencing of the coding exons of the LRP1 and LDLR genes on individual DNA samples reveals novel mutations in both genes.

    PubMed

    Van Leuven, F; Thiry, E; Lambrechts, M; Stas, L; Boon, T; Bruynseels, K; Muls, E; Descamps, O

    2001-02-15

    Five coding polymorphisms in the LRP1 gene, i.e. A217V, A775P, D2080N, D2632E and G4379S were discovered by sequencing its 89 exons in three test-groups of 22 healthy individuals, 29 Alzheimer patients and 18 individuals with different clinical and molecularly uncharacterized lipid metabolism problems. No genetic defect was evident in the LRP1 gene of any of the Alzheimer's disease (AD) patients, further excluding LRP1 as a major genetic problem in AD. Lipoprotein receptor related protein (LRP) A217V (exon 6) was clearly present in all groups as a polymorphism, while D2632E was observed only once in a healthy volunteer. On the other hand, LRP1 alleles A775P, D2080N, and G4379S were encountered only in patients with FH or with undefined problems of lipid metabolism. This finding forced one to also analyze the LDL receptor (LDLR) gene, for which a method was devised to sequence the entire region comprising LDLR exons 2-18. The resulting sequence contig of 33567 nucleotides yielded finally an exact physical map that corrects published and listed LDLR gene maps in many positions. In addition, next to known mutations in LDLR that cause FH, four novel LDLR defects were defined, i.e. del e7-10, exon 9 mutation N407T, a 20 bp insertion in exon 4, and a double mutation C292W/K290R in exon 6. No evidence for pathology connected to the LRP1 'mutations' was obtained by subsequent screening for the five LRP1 variants in larger groups of 110 FH patients and 118 patients with molecularly undefined, clinical problems of cholesterol and/or lipid metabolism. In three individuals with a mutant LDLR gene a variant LRP1 allele was also present, but without direct, obvious clinical compound effects, indicating that the variant LRP1 alleles must, for the present, be considered polymorphisms.

  3. A Method for the Annotation of Functional Similarities of Coding DNA Sequences: the Case of a Populated Cluster of Transmembrane Proteins.

    PubMed

    Fuertes, Miguel Angel; Rodrigo, José Ramón; Alonso, Carlos

    2017-01-01

    The analysis of a large number of human and mouse genes codifying for a populated cluster of transmembrane proteins revealed that some of the genes significantly vary in their primary nucleotide sequence inter-species and also intra-species. In spite of that divergence and of the fact that all these genes share a common parental function we asked the question of whether at DNA level they have some kind of common compositional structure, not evident from the analysis of their primary nucleotide sequence. To reveal the existence of gene clusters not based on primary sequence relationships we have analyzed 13574 human and 14047 mouse genes by the composon-clustering methodology. The data presented show that most of the genes from each one of the samples are distributed in 18 clusters sharing the common compositional features between the particular human and mouse clusters. It was observed, in addition, that between particular human and mouse clusters having similar composon-profiles large variations in gene population were detected as an indication that a significant amount of orthologs between both species differs in compositional features. A gene cluster containing exclusively genes codifying for transmembrane proteins, an important fraction of which belongs to the Rhodopsin G-protein coupled receptor superfamily, was also detected. This indicates that even though some of them display low sequence similarity, all of them, in both species, participate with similar compositional features in terms of composons. We conclude that in this family of transmembrane proteins in general and in the Rhodopsin G-protein coupled receptor in particular, the composon-clustering reveals the existence of a type of common compositional structure underlying the primary nucleotide sequence closely correlated to function.

  4. Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.

    PubMed

    Hua, Wei; Wang, Jiasong; Zhao, Jian

    2014-01-01

    Based on the study of the Ramanujan sum and Ramanujan coefficient, this paper suggests the concepts of discrete Ramanujan transform and spectrum. Using the Voss numerical representation, one maps a symbolic DNA strand to a numerical DNA sequence and deduces the discrete Ramanujan spectrum of the numerical DNA sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence has an important feature of 3-base periodicity, which is widely used for DNA sequence analysis by the technique of discrete Fourier transform. It is performed by testing the signal-to-noise ratio at frequency N/3 as a criterion for the analysis, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can be only identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have analyzed the algorithm theoretically and concluded that calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that the technique of using the discrete Ramanujan spectrum for classifying different DNA sequences is a fast and effective method.
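
    The 3-base periodicity test described in this abstract can be illustrated with a minimal Python sketch. It builds the Voss indicator sequences, evaluates the Fourier power at frequency index N/3, and forms a signal-to-noise ratio as peak power over mean power, which is one common definition and an assumption here rather than the paper's exact formula. The function names are hypothetical. `ramanujan_sum` evaluates the classical Ramanujan sum c_q(n) directly from its definition; it does not reproduce the paper's transform.

```python
import cmath
from math import gcd


def voss_indicators(seq):
    """Voss representation: one 0/1 indicator sequence per nucleotide."""
    seq = seq.upper()
    return {base: [1 if ch == base else 0 for ch in seq] for base in "ACGT"}


def dft_power(seq, k):
    """DFT power at frequency index k, summed over the four indicator sequences."""
    n = len(seq)
    total = 0.0
    for x in voss_indicators(seq).values():
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(x))
        total += abs(coeff) ** 2
    return total


def snr_at_period3(seq):
    """Power at k = N/3 divided by the mean power over k = 1..N-1 (assumed definition)."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    powers = [dft_power(seq, k) for k in range(1, n)]
    mean_power = sum(powers) / len(powers)
    if mean_power < 1e-12:  # flat spectrum, e.g. a homopolymer
        return 0.0
    return powers[n // 3 - 1] / mean_power


def ramanujan_sum(q, n):
    """c_q(n): sum of exp(2*pi*i*k*n/q) over 1 <= k <= q with gcd(k, q) = 1."""
    total = sum(cmath.exp(2j * cmath.pi * k * n / q)
                for k in range(1, q + 1) if gcd(k, q) == 1)
    return round(total.real)  # the sum is always an integer
```

    A strictly periodic toy sequence such as `"ATG" * 20` gives a large ratio, while a homopolymer gives 0.0. The direct summation is quadratic in N and is meant only to show the idea.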

  5. FY05 LDRD Final Report Investigation of AAA+ protein machines that participate in DNA replication, recombination, and in response to DNA damage LDRD Project Tracking Code: 04-LW-049

    SciTech Connect

    Sawicka, D; de Carvalho-Kavanagh, M S; Barsky, D; Venclovas, C

    2006-12-04

    The AAA+ proteins are remarkable macromolecules that are able to self-assemble into nanoscale machines. These protein machines play critical roles in many cellular processes, including the processes that manage a cell's genetic material, but the mechanism at the molecular level has remained elusive. We applied computational molecular modeling, combined with advanced sequence analysis and available biochemical and genetic data, to structurally characterize eukaryotic AAA+ proteins and the protein machines they form. With these models we have examined intermolecular interactions in three-dimensions (3D), including both interactions between the components of the AAA+ complexes and the interactions of these protein machines with their partners. These computational studies have provided new insights into the molecular structure and the mechanism of action for AAA+ protein machines, thereby facilitating a deeper understanding of processes involved in DNA metabolism.

  6. Pairwise Classifier Ensemble with Adaptive Sub-Classifiers for fMRI Pattern Analysis.

    PubMed

    Kim, Eunwoo; Park, HyunWook

    2017-02-01

    The multi-voxel pattern analysis technique is applied to fMRI data for classification of high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification to a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class-pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
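
    The conversion of a multiclass problem into pairwise classifiers can be sketched as a one-vs-one majority vote. This is a minimal Python sketch and not the paper's method: the adaptive sub-classifiers and per-pair feature sets are not modeled, and the nearest-centroid rule and all names are hypothetical stand-ins.

```python
from collections import Counter
from itertools import combinations


def make_pairwise_classifiers(centroids):
    """Build one binary classifier per class pair (hypothetical nearest-centroid rule)."""
    classifiers = {}
    for a, b in combinations(sorted(centroids), 2):
        # Bind a and b as defaults so each lambda keeps its own pair.
        classifiers[(a, b)] = (
            lambda x, a=a, b=b: a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b
        )
    return classifiers


def predict_one_vs_one(classifiers, x):
    """Each pairwise classifier casts one vote; the class with most votes wins."""
    votes = Counter(clf(x) for clf in classifiers.values())
    return votes.most_common(1)[0][0]
```

    With centroids `{"a": 0.0, "b": 5.0, "c": 10.0}`, the input 4.0 collects two votes for `"b"` and one for `"a"`.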

  7. Rudolph Focke and the Theory of the Classified Catalog. Occasional Paper No. 145.

    ERIC Educational Resources Information Center

    Stevenson, Gordon

    Between 1900 and 1905, Rudolph Focke published a series of papers on classification theory and a draft of a code for the construction of classified catalogs. His work was the direct result of the reform of librarianship during the last decades of the nineteenth century. The large number of classification systems used by German university and…

  8. Role of classifiers in multimedia content management

    NASA Astrophysics Data System (ADS)

    Naphade, Milind R.; Smith, John R.

    2003-01-01

    Enabling semantic detection and indexing is an important task in multimedia content management. Learning and classification techniques are increasingly relevant to the state of the art content management systems. From relevance feedback to semantic detection, there is a shift in the amount of supervision that precedes retrieval from light weight classifiers to heavy weight classifiers. In this paper we compare the performance of some popular classifiers for semantic video indexing. We mainly compare among other techniques, one technique for generative modeling and one for discriminant learning and show how they behave depending on the number of examples that the user is willing to provide to the system. We report results using the NIST TREC Video Corpus.

  9. Ranked Multi-Label Rules Associative Classifier

    NASA Astrophysics Data System (ADS)

    Thabtah, Fadi

    Associative classification is a promising approach in data mining, which integrates association rule discovery and classification. In this paper, we present a novel associative classification technique called Ranked Multilabel Rule (RMR) that derives rules with multiple class labels. Rules derived by current associative classification algorithms overlap in their training data records, resulting in many redundant and useless rules. However, RMR removes the overlapping between rules using a pruning heuristic and ensures that rules in the final classifier do not share training records, resulting in more accurate classifiers. Experimental results obtained on twenty data sets show that the classifiers produced by RMR are highly competitive if compared with those generated by decision trees and other popular associative techniques such as CBA, with respect to prediction accuracy.

  10. On Asymmetric Classifier Training for Detector Cascades

    SciTech Connect

    Gee, Timothy Felix

    2006-01-01

    This paper examines the Asymmetric AdaBoost algorithm introduced by Viola and Jones for cascaded face detection. The Viola and Jones face detector uses cascaded classifiers to successively filter, or reject, non-faces. In this approach most non-faces are easily rejected by the earlier classifiers in the cascade, thus reducing the overall number of computations. This requires earlier cascade classifiers to very seldom reject true instances of faces. To reflect this training goal, Viola and Jones introduce a weighting parameter for AdaBoost iterations and show it enforces a desirable bound. During their implementation, a modification to the proposed weighting was introduced, while enforcing the same bound. The goal of this paper is to examine their asymmetric weighting by putting AdaBoost in the form of Additive Regression as was done by Friedman, Hastie, and Tibshirani. The author believes this helps to explain the approach and adds another connection between AdaBoost and Additive Regression.
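
    The early-rejection behavior of a detector cascade can be sketched in a few lines of Python. This is an illustration only: the stage score functions and thresholds are hypothetical, and the asymmetric AdaBoost weighting itself is not implemented.

```python
def run_cascade(stages, x):
    """Evaluate stages in order; reject at the first stage whose score is below its threshold.

    stages: list of (score_function, threshold) pairs (hypothetical).
    Returns (accepted, number_of_stages_evaluated).
    """
    for index, (score, threshold) in enumerate(stages, start=1):
        if score(x) < threshold:
            return False, index  # rejected early, later stages are skipped
    return True, len(stages)


# Toy stages over a feature tuple: cheap test first, costlier tests later.
example_stages = [
    (lambda f: f[0], 0.2),
    (lambda f: f[0] + f[1], 0.8),
    (lambda f: sum(f), 1.5),
]
```

    An input that fails the first stage costs one evaluation, which is why the early stages must rarely reject true faces while still discarding most non-faces.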

  11. Reinforcement learning based artificial immune classifier.

    PubMed

    Karakose, Mehmet

    2013-01-01

    One of the widely used methods for classification that is a decision-making process is artificial immune systems. Artificial immune systems based on natural immunity system can be successfully applied for classification, optimization, recognition, and learning in real-world problems. In this study, a reinforcement learning based artificial immune classifier is proposed as a new approach. This approach uses reinforcement learning to find better antibody with immune operators. The proposed new approach has many contributions according to other methods in the literature such as effectiveness, less memory cell, high accuracy, speed, and data adaptability. The performance of the proposed approach is demonstrated by simulation and experimental results using real data in Matlab and FPGA. Some benchmark data and remote image data are used for experimental results. The comparative results with supervised/unsupervised based artificial immune system, negative selection classifier, and resource limited artificial immune classifier are given to demonstrate the effectiveness of the proposed new method.

  12. Reinforcement Learning Based Artificial Immune Classifier

    PubMed Central

    Karakose, Mehmet

    2013-01-01

    One of the widely used methods for classification that is a decision-making process is artificial immune systems. Artificial immune systems based on natural immunity system can be successfully applied for classification, optimization, recognition, and learning in real-world problems. In this study, a reinforcement learning based artificial immune classifier is proposed as a new approach. This approach uses reinforcement learning to find better antibody with immune operators. The proposed new approach has many contributions according to other methods in the literature such as effectiveness, less memory cell, high accuracy, speed, and data adaptability. The performance of the proposed approach is demonstrated by simulation and experimental results using real data in Matlab and FPGA. Some benchmark data and remote image data are used for experimental results. The comparative results with supervised/unsupervised based artificial immune system, negative selection classifier, and resource limited artificial immune classifier are given to demonstrate the effectiveness of the proposed new method. PMID:23935424

  13. A survey of decision tree classifier methodology

    NASA Technical Reports Server (NTRS)

    Safavian, S. Rasoul; Landgrebe, David

    1990-01-01

    Decision Tree Classifiers (DTC's) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps, the most important feature of DTC's is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTC's over single stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  14. A survey of decision tree classifier methodology

    NASA Technical Reports Server (NTRS)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  15. Use of robust estimators in parametric classifiers

    NASA Technical Reports Server (NTRS)

    Safavian, S. Rasoul; Landgrebe, David A.

    1989-01-01

    The parametric approach to density estimation and classifier design is a well studied subject. The parametric approach is desirable because basically it reduces the problem of classifier design to that of estimating a few parameters for each of the pattern classes. The class parameters are usually estimated using maximum-likelihood (ML) estimators. ML estimators are, however, very sensitive to the presence of outliers. Several robust estimators of mean and covariance matrix and their effect on the probability of error in classification are examined. Comments are made about alpha-ranked (alpha-trimmed) estimators.
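
    The sensitivity of the sample mean to outliers, and the effect of trimming, can be shown with a minimal Python sketch of an alpha-trimmed mean. The function name and the rounding convention for the number of trimmed samples are assumptions, not taken from the report.

```python
def alpha_trimmed_mean(values, alpha):
    """Mean after discarding the lowest and highest alpha fraction of the sorted samples."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must be in [0, 0.5)")
    xs = sorted(values)
    k = int(alpha * len(xs))      # samples trimmed from each end (rounded down)
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)
```

    For the samples `[1, 2, 3, 4, 100]`, `alpha = 0` gives the ordinary mean 22.0, while `alpha = 0.2` drops one sample from each end and gives 3.0.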

  16. Improved method for predicting protein fold patterns with ensemble classifiers.

    PubMed

    Chen, W; Liu, X; Huang, Y; Jiang, Y; Zou, Q; Lin, C

    2012-01-27

    Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physical-chemical property of proteins and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five cross-validations. The accuracy rose to 77% when we used a 20-dimensional feature vector. These methods were used on recent data, with 54.2% accuracy. Source codes and dataset, together with web server and software tools for prediction, are available at: http://datamining.xmu.edu.cn/main/~cwc/ProteinPredict.html.

  17. The classifier problem in Chinese aphasia.

    PubMed

    Tzeng, O J; Chen, S; Hung, D L

    1991-08-01

    In recent years, research on the relationship between brain organization and language processing has benefited tremendously from cross-linguistic comparisons of language disorders among different types of aphasic patients. Results from these cross-linguistic studies have shown that the same aphasic syndromes often look very different from one language to another, suggesting that language-specific knowledge is largely preserved in Broca's and Wernicke's aphasics. In this paper, Chinese aphasic patients were examined with respect to their (in)ability to use classifiers in a noun phrase. The Chinese language, in addition to its lack of verb conjugation and an absence of noun declension, is exceptional in yet another respect: articles, numerals, and other such modifiers cannot directly precede their associated nouns, there has to be an intervening morpheme called a classifier. The appropriate usage of nominal classifiers is considered to be one of the most difficult aspects of Chinese grammar. Our examination of Chinese aphasic patients revealed two essential points. First, Chinese aphasic patients experience difficulty in the production of nominal classifiers, committing a significant number of errors of omission and/or substitution. Second, two different kinds of substitution errors are observed in Broca's and Wernicke's patients, and the detailed analysis of the difference demands a rethinking of the distinction between agrammatism and paragrammatism. The result adds to a growing body of evidence suggesting that grammar is impaired in fluent as well as nonfluent aphasia.

  18. Large margin classifier-based ensemble tracking

    NASA Astrophysics Data System (ADS)

    Wang, Yuru; Liu, Qiaoyuan; Yin, Minghao; Wang, ShengSheng

    2016-07-01

    In recent years, many studies consider visual tracking as a two-class classification problem. The key problem is to construct a classifier with sufficient accuracy in distinguishing the target from its background and sufficient generalization ability in handling new frames. However, variable tracking conditions challenge the existing methods. The difficulty mainly comes from the confused boundary between the foreground and background. This paper handles this difficulty by generalizing the classifier's learning step. By introducing the distribution data of samples, the classifier learns more essential characteristics in discriminating the two classes. Specifically, the samples are represented in a multiscale visual model. For features with different scales, several large margin distribution machines (LDMs) with adaptive kernels are combined in a Bayesian way as a strong classifier. In order to improve the accuracy and generalization ability, not only the margin distance but also the sample distribution is optimized in the learning step. Comprehensive experiments are performed on several challenging video sequences; through parameter analysis and field comparison, the proposed LDM combined ensemble tracker is demonstrated to perform with sufficient accuracy and generalization ability in handling various typical tracking difficulties.

  19. Visual Classifier Training for Text Document Retrieval.

    PubMed

    Heimerl, F; Koch, S; Bosch, H; Ertl, T

    2012-12-01

    Performing exhaustive searches over a large number of text documents can be tedious, since it is very hard to formulate search queries or define filter criteria that capture an analyst's information need adequately. Classification through machine learning has the potential to improve search and filter tasks encompassing either complex or very specific information needs, individually. Unfortunately, analysts who are knowledgeable in their field are typically not machine learning specialists. Most classification methods, however, require a certain expertise regarding their parametrization to achieve good results. Supervised machine learning algorithms, in contrast, rely on labeled data, which can be provided by analysts. However, the effort for labeling can be very high, which shifts the problem from composing complex queries or defining accurate filters to another laborious task, in addition to the need for judging the trained classifier's quality. We therefore compare three approaches for interactive classifier training in a user study. All of the approaches are potential candidates for the integration into a larger retrieval system. They incorporate active learning to various degrees in order to reduce the labeling effort as well as to increase effectiveness. Two of them encompass interactive visualization for letting users explore the status of the classifier in context of the labeled documents, as well as for judging the quality of the classifier in iterative feedback loops. We see our work as a step towards introducing user controlled classification methods in addition to text search and filtering for increasing recall in analytics scenarios involving large corpora.

  20. Performance of a 20-target MSE classifier

    NASA Astrophysics Data System (ADS)

    Novak, Leslie M.; Owirka, Gregory J.; Brower, William S.

    1998-08-01

    MIT Lincoln Laboratory is responsible for developing the ATR system for the DARPA/DARO/NIMA/OSD-sponsored SAIP program; the baseline ATR system recognizes 10 GOB targets; the enhanced version of SAIP requires the ATR system to recognize 20 GOB targets. This paper compares ATR performance results for 10- and 20-target MSE classifiers using high-resolution SAR imagery.

  1. Classifying and quantifying basins of attraction

    SciTech Connect

    Sprott, J. C.; Xiong, Anda

    2015-08-15

    A scheme is proposed to classify the basins for attractors of dynamical systems in arbitrary dimensions. There are four basic classes depending on their size and extent, and each class can be further quantified to facilitate comparisons. The calculation uses a Monte Carlo method and is applied to numerous common dissipative chaotic maps and flows in various dimensions.

  2. Performance Evaluation of a Semantic Perception Classifier

    DTIC Science & Technology

    2013-09-01

    Performance Evaluation of a Semantic Perception Classifier by Craig Lennon, Barry Bodt, Marshal Childers, Rick Camden, Arne Suppe, Luis...Camden and Nicoleta Florea Engility Corporation Luis Navarro-Serment and Arne Suppe Carnegie Mellon University...Lennon, Barry Bodt, Marshal Childers, Rick Camden,* Arne Suppe, † Luis Navarro-Serment, † and Nicoleta Florea* 5d. PROJECT NUMBER 5e. TASK

  3. 32 CFR 148.2 - Classified programs.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 1 2010-07-01 2010-07-01 false Classified programs. 148.2 Section 148.2 National Defense Department of Defense OFFICE OF THE SECRETARY OF DEFENSE PERSONNEL, MILITARY AND CIVILIAN NATIONAL POLICY AND IMPLEMENTATION OF RECIPROCITY OF FACILITIES National Policy on Reciprocity of Use...

  4. 32 CFR 148.2 - Classified programs.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 1 2011-07-01 2011-07-01 false Classified programs. 148.2 Section 148.2 National Defense Department of Defense OFFICE OF THE SECRETARY OF DEFENSE PERSONNEL, MILITARY AND CIVILIAN NATIONAL POLICY AND IMPLEMENTATION OF RECIPROCITY OF FACILITIES National Policy on Reciprocity of Use...

  5. Shape and Function in Hmong Classifier Choices

    ERIC Educational Resources Information Center

    Sakuragi, Toshiyuki; Fuller, Judith W.

    2013-01-01

    This study examined classifiers in the Hmong language with a particular focus on gaining insights into the underlying cognitive process of categorization. Forty-three Hmong speakers participated in three experiments. In the first experiment, designed to verify the previously postulated configurational (saliently one-dimensional, saliently…

  6. 5 CFR 1312.4 - Classified designations.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ..., DOWNGRADING, DECLASSIFICATION AND SAFEGUARDING OF NATIONAL SECURITY INFORMATION Classification and Declassification of National Security Information § 1312.4 Classified designations. (a) Except as provided by the Atomic Energy Act of 1954, as amended, (42 U.S.C. 2011) or the National Security Act of 1947, as...

  7. 5 CFR 1312.4 - Classified designations.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ..., DOWNGRADING, DECLASSIFICATION AND SAFEGUARDING OF NATIONAL SECURITY INFORMATION Classification and Declassification of National Security Information § 1312.4 Classified designations. (a) Except as provided by the Atomic Energy Act of 1954, as amended, (42 U.S.C. 2011) or the National Security Act of 1947, as...

  8. 5 CFR 1312.4 - Classified designations.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ..., DOWNGRADING, DECLASSIFICATION AND SAFEGUARDING OF NATIONAL SECURITY INFORMATION Classification and Declassification of National Security Information § 1312.4 Classified designations. (a) Except as provided by the Atomic Energy Act of 1954, as amended, (42 U.S.C. 2011) or the National Security Act of 1947, as...

  9. 5 CFR 1312.4 - Classified designations.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ..., DOWNGRADING, DECLASSIFICATION AND SAFEGUARDING OF NATIONAL SECURITY INFORMATION Classification and Declassification of National Security Information § 1312.4 Classified designations. (a) Except as provided by the Atomic Energy Act of 1954, as amended, (42 U.S.C. 2011) or the National Security Act of 1947, as...

  10. 32 CFR 651.13 - Classified actions.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) National Environmental Policy Act and the Decision Process..., AR 380-5 (Department of the Army Information Security Program) will be followed. (b) Classification... makers in accordance with AR 380-5. (d) When classified information is such an integral part of...

  11. 32 CFR 651.13 - Classified actions.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) National Environmental Policy Act and the Decision Process..., AR 380-5 (Department of the Army Information Security Program) will be followed. (b) Classification... makers in accordance with AR 380-5. (d) When classified information is such an integral part of...

  12. 32 CFR 651.13 - Classified actions.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) National Environmental Policy Act and the Decision Process..., AR 380-5 (Department of the Army Information Security Program) will be followed. (b) Classification... makers in accordance with AR 380-5. (d) When classified information is such an integral part of...

  13. 32 CFR 651.13 - Classified actions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) National Environmental Policy Act and the Decision Process..., AR 380-5 (Department of the Army Information Security Program) will be followed. (b) Classification... makers in accordance with AR 380-5. (d) When classified information is such an integral part of...

  14. 32 CFR 651.13 - Classified actions.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) National Environmental Policy Act and the Decision Process..., AR 380-5 (Department of the Army Information Security Program) will be followed. (b) Classification... makers in accordance with AR 380-5. (d) When classified information is such an integral part of...

  15. Bayes Error Rate Estimation Using Classifier Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2003-01-01

    The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes, unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
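
    One way to see how an averaging ensemble can yield a Bayes error estimate is the following Python sketch. It rests on an assumed linear model that is not quoted from the abstract: each individual classifier has error E_ind = E_bayes + E_add, and averaging n classifiers with mean output correlation delta shrinks the added error by the factor (1 + delta (n - 1)) / n. Solving the two equations for E_bayes gives the function below; all names are hypothetical.

```python
def bayes_error_estimate(e_individual, e_ensemble, n, delta):
    """Solve E_ind = E_b + E_add and E_ens = E_b + c * E_add for E_b,
    where c = (1 + delta * (n - 1)) / n (assumed added-error model)."""
    if n < 2:
        raise ValueError("need at least two classifiers")
    if not delta < 1:
        raise ValueError("delta must be below 1, otherwise averaging gives no reduction")
    numerator = n * e_ensemble - (1 + delta * (n - 1)) * e_individual
    return numerator / ((n - 1) * (1 - delta))
```

    As a consistency check, with a true Bayes error of 0.10, added error 0.05, n = 5 and delta = 0.2, the model predicts an individual error of 0.15 and an ensemble error of 0.118, and the function recovers 0.10 from those two numbers.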

  16. Evolving Coevolutionary Classifiers Under Large Attribute Spaces

    NASA Astrophysics Data System (ADS)

    Doucette, John; Lichodzijewski, Peter; Heywood, Malcolm

    Model-building under the supervised learning domain potentially faces a dual learning problem of identifying both the parameters of the model and the subset of (domain) attributes necessary to support the model, thus using an embedded as opposed to wrapper or filter based design. Genetic Programming (GP) has always addressed this dual problem; however, further implicit assumptions are made which potentially increase the complexity of the resulting solutions. In this work we are specifically interested in the case of classification under very large attribute spaces. As such it might be expected that multiple independent/overlapping attribute subspaces support the mapping to class labels, whereas GP approaches to classification generally assume a single binary classifier per class, forcing the model to provide a solution in terms of a single attribute subspace and single mapping to class labels. Supporting the more general goal is considered as a requirement for identifying a 'team' of classifiers with non-overlapping classifier behaviors, in which each classifier responds to different subsets of exemplars. Moreover, the subsets of attributes associated with each team member might utilize a unique 'subspace' of attributes. This work investigates the utility of coevolutionary model building for the case of classification problems with attribute vectors consisting of 650 to 100,000 dimensions. The resulting team-based coevolutionary method, Symbiotic Bid-based (SBB) GP, is compared to alternative embedded classifier approaches of C4.5 and Maximum Entropy Classification (MaxEnt). SBB solutions demonstrate up to an order of magnitude lower attribute count relative to C4.5 and up to two orders of magnitude lower attribute count than MaxEnt while retaining comparable or better classification performance. Moreover, relative to the attribute count of individual models participating within a team, no more than six attributes are ever utilized; adding a further

  17. DNA rearrangements located over 100 kb 5' of the Steel (Sl)-coding region in Steel-panda and Steel-contrasted mice deregulate Sl expression and cause female sterility by disrupting ovarian follicle development.

    PubMed

    Bedell, M A; Brannan, C I; Evans, E P; Copeland, N G; Jenkins, N A; Donovan, P J

    1995-02-15

    The Steel (Sl) locus is essential for the development of germ cells, hematopoietic cells, and melanocytes and encodes a growth factor (Mgf) that is the ligand for c-kit, a receptor tyrosine kinase encoded by the W locus. We have identified the molecular and germ cell defects in two mutant Sl alleles, Steel-panda (Slpan) and Steel-contrasted (Slcon), that cause sterility only in females. Unexpectedly, both mutant alleles are shown to contain DNA rearrangements, located > 100 kb 5' of Mgf-coding sequences, that lead to tissue-specific effects on Mgf mRNA expression. In Slpan embryos, decreased Mgf mRNA expression in the gonads causes a reduced number of primordial germ cells in both sexes. However, Mgf expression and spermatogenesis in the postnatal mutant testis are normal, and spermatogonial proliferation compensates for deficiencies in germ cell numbers. In Slpan and Slcon homozygous females, decreased Mgf mRNA expression causes sterility by affecting the initiation and maintenance of ovarian follicle development. Thus, regulated expression of Mgf is required for multiple stages of embryonic and postnatal germ cell development. Surprisingly, other areas of the Slcon female reproductive tract displayed ectopic expression of Mgf mRNA. We propose that the Slpan and Slcon rearrangements alter Mgf mRNA abundance through position effects on expression that act at a distance from the Sl gene.

  18. DNA fingerprinting of Chinese melon provides evidentiary support of seed quality appraisal.

    PubMed

    Gao, Peng; Ma, Hongyan; Luan, Feishi; Song, Haibin

    2012-01-01

    Melon (Cucumis melo L.) is an important vegetable crop worldwide. At present, homonyms and synonyms are common in the melon seed markets of China, which could cause variety authenticity issues affecting melon breeding, production, marketing and other aspects. Molecular markers, especially microsatellites or simple sequence repeats (SSRs), are playing increasingly important roles in cultivar identification. The aim of this study was to construct a DNA fingerprinting database of major melon cultivars, which could provide a basis for establishing a technical standard system for purity and authenticity identification of melon seeds. In this study, to develop the core set of SSR markers, 470 polymorphic SSRs were selected as candidate markers from 1219 SSRs using 20 representative melon varieties (lines). Eighteen SSR markers, evenly distributed across the genome and with the highest polymorphism information content (PIC), were identified as the core marker set for melon DNA fingerprinting analysis. Fingerprint codes for 471 melon varieties (lines) were established. There were 51 materials which were classified into 17 groups based on sharing the same fingerprint code, while field trait survey results showed that the plants in the same group were synonyms because of the same or similar field characters. Furthermore, DNA fingerprinting quick response (QR) codes of 471 melon varieties (lines) were constructed. Due to its fast readability and large storage capacity, QR coding of melon DNA fingerprints favors convenient reading and commercial applications.

  19. Evolutionary design of a fuzzy classifier from data.

    PubMed

    Chang, Xiaoguang; Lilly, John H

    2004-08-01

    Genetic algorithms show powerful capabilities for automatically designing fuzzy systems from data, but many proposed methods must be subjected to some minimal structure assumptions, such as rule base size. In this paper, we also address the design of fuzzy systems from data. A new evolutionary approach is proposed for deriving a compact fuzzy classification system directly from data without any a priori knowledge or assumptions on the distribution of the data. At the beginning of the algorithm, the fuzzy classifier is empty with no rules in the rule base and no membership functions assigned to fuzzy variables. Then, rules and membership functions are automatically created and optimized in an evolutionary process. To accomplish this, parameters of the variable input spread inference training (VISIT) algorithm are used to code fuzzy systems on the training data set. Therefore, we can derive each individual fuzzy system via the VISIT algorithm, and then search the best one via genetic operations. To evaluate the fuzzy classifier, a fuzzy expert system acts as the fitness function. This fuzzy expert system can effectively evaluate the accuracy and compactness at the same time. In the application section, we consider four benchmark classification problems: the iris data, wine data, Wisconsin breast cancer data, and Pima Indian diabetes data. Comparisons of our method with others in the literature show the effectiveness of the proposed method.

  20. Transcriptome-based functional classifiers for direct immunotoxicity.

    PubMed

    Shao, Jia; Berger, Laura F; Hendriksen, Peter J M; Peijnenburg, Ad A C M; van Loveren, Henk; Volger, Oscar L

    2014-03-01

    Current screening methods for direct immunotoxic chemicals are mainly based on general toxicity studies with rodents. The present study aimed to identify transcriptome-based functional classifiers that can eventually be exploited for the development of in vitro screening assays for direct immunotoxicity. To this end, a toxicogenomics approach was applied in which gene expression changes in human Jurkat lymphoblastic T cells were investigated in response to a wide range of compounds, including direct immunotoxicants, immunosuppressive drugs, and non-immunotoxic control chemicals. On the basis of DNA microarray data previously obtained by the exposure of Jurkat cells to 31 test compounds (Shao et al. in Toxicol Sci 135(2):328-346, 2013), we identified a set of 93 genes, of which 80 were significantly regulated (|numerical ratio| ≥1.62) by at least three compounds and the other 13 genes were significantly regulated by either one single compound or compound class. A total of 28 most differentially regulated genes were selected for qRT-PCR verification using a training set of 44 compounds consisting of the above-mentioned 31 compounds (23 immunotoxic and 8 non-immunotoxic) and 13 additional immunotoxicants. Good correlation between the results of microarray and qRT-PCR (Pearson's correlation, R ≥ 0.69) was found for 27 out of the 28 genes. Redundancy analysis of these 27 potential classifiers led to a final set of 25 genes. To assess the performance of these genes, Jurkat cells were exposed to 20 additional compounds (external verification set) followed by qRT-PCR. The classifier set of 25 genes gave a good performance in the external verification: accuracy 85 %, true positive rate (sensitivity) 88 %, and true negative rate (specificity) 67 %. Furthermore, on the basis of the gene ontology annotation of the 25 classifier genes, the immunotoxicants examined in this study could be categorized into distinct functional subclasses. In conclusion, we have identified and

  1. What Advances Are Being Made in DNA Sequencing?

    MedlinePlus

    ... DNA building blocks (nucleotides) in an individual's genetic code, called DNA sequencing, has advanced the study of ... breakthrough that helped scientists determine the human genetic code, but it is time-consuming and expensive. The ...

  2. Maximal dinucleotide comma-free codes.

    PubMed

    Fimmel, Elena; Strüngmann, Lutz

    2016-01-21

    The problem of retrieval and maintenance of the correct reading frame plays a significant role in RNA transcription. Circular codes, and especially comma-free codes, can help to understand the underlying mechanisms of error-detection in this process. In recent years much attention has been paid to the investigation of trinucleotide circular codes (see, for instance, Fimmel et al., 2014; Fimmel and Strüngmann, 2015a; Michel and Pirillo, 2012; Michel et al., 2012, 2008), while dinucleotide codes had been touched on only marginally, even though dinucleotides are associated with important biological functions. Recently, all maximal dinucleotide circular codes were classified (Fimmel et al., 2015; Michel and Pirillo, 2013). The present paper studies maximal dinucleotide comma-free codes and their close connection to maximal dinucleotide circular codes. We give a construction principle for such codes and provide a graphical representation that allows them to be visualized geometrically. Moreover, we compare the results for dinucleotide codes with the corresponding situation for trinucleotide maximal self-complementary C(3)-codes. Finally, the results obtained are discussed with respect to Crick's hypothesis about frame-shift-detecting codes without commas.
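    As a rough illustration of the comma-free property mentioned in this abstract, the sketch below checks a set of dinucleotides against the standard definition: for any two code words x and y, the shifted window inside the concatenation xy must not itself be a code word. The function name `is_comma_free` and the example sets are hypothetical and are not taken from the paper.

```python
from itertools import product


def is_comma_free(code):
    """Return True if a set of dinucleotides is comma-free.

    Assumed definition: for every pair of code words x, y (x == y allowed),
    the only out-of-frame dinucleotide in the 4-letter string x + y,
    namely x[1] + y[0], must not belong to the code.
    """
    words = set(code)
    for x, y in product(words, repeat=2):
        if x[1] + y[0] in words:
            return False
    return True


# Hypothetical example sets, chosen only to exercise the check.
print(is_comma_free({"AC", "AG", "AT"}))  # no shifted window is a code word
print(is_comma_free({"AC", "CA"}))        # "AC" + "AC" contains "CA" out of frame
```

    Under this definition a periodic word such as "AA" can never belong to a comma-free code, because "AA" + "AA" contains "AA" in the shifted frame.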

  3. Letter identification and the neural image classifier.

    PubMed

    Watson, Andrew B; Ahumada, Albert J

    2015-02-12

    Letter identification is an important visual task for both practical and theoretical reasons. To extend and test existing models, we have reviewed published data for contrast sensitivity for letter identification as a function of size and have also collected new data. Contrast sensitivity increases rapidly from the acuity limit but slows and asymptotes at a symbol size of about 1 degree. We recast these data in terms of contrast difference energy: the average of the squared distances between the letter images and the average letter image. In terms of sensitivity to contrast difference energy, and thus visual efficiency, there is a peak around ¼ degree, followed by a marked decline at larger sizes. These results are explained by a Neural Image Classifier model that includes optical filtering and retinal neural filtering, sampling, and noise, followed by an optimal classifier. As letters are enlarged, sensitivity declines because of the increasing size and spacing of the midget retinal ganglion cell receptive fields in the periphery.
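    The contrast difference energy described in this abstract (the average of the squared distances between the letter images and the average letter image) can be sketched as follows. This is a minimal illustration that assumes each image is a flat list of contrast values and ignores any pixel-area or duration scaling the authors may apply; the function name is hypothetical.

```python
def contrast_difference_energy(images):
    """Average squared distance between each image and the mean image.

    images: list of equal-length flat lists of contrast values
    (an assumed representation; unit scaling is deliberately omitted).
    """
    n_images = len(images)
    n_pixels = len(images[0])
    # Pixel-wise average over all letter images.
    mean_image = [
        sum(img[i] for img in images) / n_images for i in range(n_pixels)
    ]
    # Squared Euclidean distance of each image from the average image.
    squared_distances = [
        sum((img[i] - mean_image[i]) ** 2 for i in range(n_pixels))
        for img in images
    ]
    return sum(squared_distances) / n_images


# Two toy two-pixel "images"; the mean image is [1, 0].
print(contrast_difference_energy([[0.0, 0.0], [2.0, 0.0]]))
```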

  4. Comparing cosmic web classifiers using information theory

    NASA Astrophysics Data System (ADS)

    Leclercq, Florent; Lavaux, Guilhem; Jasche, Jens; Wandelt, Benjamin

    2016-08-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  5. Classifying objects in LWIR imagery via CNNs

    NASA Astrophysics Data System (ADS)

    Rodger, Iain; Connor, Barry; Robertson, Neil M.

    2016-10-01

    The aim of the presented work is to demonstrate enhanced target recognition and improved false alarm rates for a mid to long range detection system, utilising a Long Wave Infrared (LWIR) sensor. By exploiting high quality thermal image data and recent techniques in machine learning, the system can provide automatic target recognition capabilities. A Convolutional Neural Network (CNN) is trained and the classifier achieves an overall accuracy of > 95% for 6 object classes related to land defence. While the highly accurate CNN struggles to recognise long range target classes, due to low signal quality, robust target discrimination is achieved for challenging candidates. The overall performance of the methodology presented is assessed using human ground truth information, generating classifier evaluation metrics for thermal image sequences.

  6. Classifying Land Cover Using Spectral Signature

    NASA Astrophysics Data System (ADS)

    Alawiye, F. S.

    2012-12-01

    Studying land cover has become increasingly important as countries try to overcome the destruction of wetlands and its impact on local climate due to seasonal variation, radiation balance, and deteriorating environmental quality. In this investigation, we have been studying the spectral signatures of the Jamaica Bay wetland area based on remotely sensed satellite input data from LANDSAT TM and ASTER. We applied various remote sensing techniques to generate classified land cover output maps. Our classifiers relied on input from both the remote sensing and in-situ spectral field data. Based upon spectral separability and data collected in the field, supervised and unsupervised classifications were carried out. First results suggest good agreement between the land cover units mapped and those observed in the field.

  7. Classification Studies in an Advanced Air Classifier

    NASA Astrophysics Data System (ADS)

    Routray, Sunita; Bhima Rao, R.

    2016-10-01

    In the present paper, experiments are carried out using a VSK separator, an advanced air classifier, to recover heavy minerals from beach sand. In the classification experiments, the cage wheel speed and the feed rate are set, and the material is fed to the air cyclone and split into fine and coarse particles, which are collected in separate bags. The size distribution of each fraction was measured by sieve analysis. A model is developed to predict the performance of the air classifier. The objective of the present model is to predict the grade efficiency curve for a given set of operating parameters such as cage wheel speed and feed rate. The overall experimental data with all variables studied in this investigation are fitted to several models. It is found that the present model shows a good fit to the logistic model.
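    A logistic grade efficiency curve of the kind this abstract refers to might look like the sketch below. The functional form, the cut size `d50`, the sharpness `k`, and the numeric values are all assumptions made for illustration; the abstract does not give the fitted equation or how the operating parameters map onto it.

```python
import math


def grade_efficiency(d, d50, k):
    """Assumed logistic grade efficiency: fraction of particles of size d
    reporting to the coarse stream.

    d50 is the cut size (efficiency 0.5) and k controls the sharpness of
    separation. Both are hypothetical parameters, not values from the paper.
    """
    return 1.0 / (1.0 + math.exp(-k * (d - d50)))


# Hypothetical cut size of 100 (in whatever size unit is used) and sharpness 0.05.
for size in (50, 100, 150):
    print(size, round(grade_efficiency(size, d50=100.0, k=0.05), 3))
```

    In a fuller model, `d50` and `k` would presumably be expressed as functions of cage wheel speed and feed rate, but that relationship is not stated in the abstract.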

  8. Semantic Features for Classifying Referring Search Terms

    SciTech Connect

    May, Chandler J.; Henry, Michael J.; McGrath, Liam R.; Bell, Eric B.; Marshall, Eric J.; Gregory, Michelle L.

    2012-05-11

    When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors' needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests' countries of origin. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods, which are susceptible to the obfuscation of dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.

  9. Classifying bed inclination using pressure images.

    PubMed

    Baran Pouyan, M; Ostadabbas, S; Nourani, M; Pompeo, M

    2014-01-01

    Pressure ulcers are one of the most prevalent problems for bed-bound patients in hospitals and nursing homes. Pressure ulcers are painful for patients and costly for healthcare systems. Accurate in-bed posture analysis can significantly help in preventing pressure ulcers. Specifically, bed inclination (back angle) is a factor contributing to pressure ulcer development. In this paper, an efficient methodology is proposed to classify bed inclination. Our approach uses pressure values collected from a commercial pressure mat system. Then, by applying a number of image processing and machine learning techniques, the approximate bed inclination angle is estimated and classified. The proposed algorithm was tested on 15 subjects of various sizes and weights. The experimental results indicate that our method predicts bed inclination in three classes with 80.3% average accuracy.

  10. Perfect teleportation and superdense coding with W states

    SciTech Connect

    Agrawal, Pankaj; Pati, Arun

    2006-12-15

    True tripartite entanglement of the state of a system of three qubits can be classified on the basis of stochastic local operations and classical communications. Such states can be classified into two categories: GHZ states and W states. It is known that GHZ states can be used for teleportation and superdense coding, but the prototype W state cannot be. However, we show that there is a class of W states that can be used for perfect teleportation and superdense coding.

  11. Bayes classifiers for imbalanced traffic accidents datasets.

    PubMed

    Mujalli, Randa Oqab; López, Griselda; Garach, Laura

    2016-03-01

    Traffic accidents data sets are usually imbalanced, where the number of instances classified under the killed or severe injuries class (minority) is much lower than those classified under the slight injuries class (majority). This, however, poses a challenging problem for classification algorithms and may yield a model that covers the slight injuries instances well whereas the killed or severe injuries instances are frequently misclassified. Based on traffic accidents data collected on urban and suburban roads in Jordan for three years (2009-2011), three different data balancing techniques were used: under-sampling, which removes some instances of the majority class; oversampling, which creates new instances of the minority class; and a mixed technique that combines both. In addition, different Bayes classifiers were compared for the different imbalanced and balanced data sets: Averaged One-Dependence Estimators, Weightily Average One-Dependence Estimators, and Bayesian networks, in order to identify factors that affect the severity of an accident. The results indicated that using the balanced data sets, especially those created using oversampling techniques, with Bayesian networks improved classifying a traffic accident according to its severity and reduced the misclassification of killed and severe injuries instances. On the other hand, the following variables were found to contribute to the occurrence of a killed casualty or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. This work, to the knowledge of the authors, is the first that aims at analyzing historical data records for traffic accidents occurring in Jordan and the first to apply balancing techniques to analyze injury severity of traffic accidents.
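    The under-sampling and oversampling ideas named in this abstract can be sketched with plain random resampling, as below. This is only the simplest variant: oversampling here duplicates existing minority rows, which may differ from the technique the authors used, and the function names, labels and toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Toy imbalanced data: 8 "slight" records and 2 "severe" records.
rows = list(range(10))
labels = ["slight"] * 8 + ["severe"] * 2
print(Counter(random_oversample(rows, labels)[1]))
print(Counter(random_undersample(rows, labels)[1]))
```

    A mixed strategy, as mentioned in the abstract, could combine the two by under-sampling the majority class part of the way and oversampling the minority class for the remainder.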

  12. Double Ramp Loss Based Reject Option Classifier

    DTIC Science & Technology

    2015-05-22

    choose 10% of these points uniformly at random and flip their labels. 2. Ionosphere Dataset [2] : This dataset describes the problem of discrimi- nating...good versus bad radars based on whether they send some useful infor- mation about the Ionosphere . There are 34 variables and 351 observations. 3... Ionosphere dataset (nonlinear classifiers using RBF kernel for both the approaches) d LDR (C = 2, γ = 0.125) LDH (C = 16, γ = 0.125) Risk RR Acc(unrej

  13. Characterizing and Classifying Acoustical Ambient Sound Profiles

    DTIC Science & Technology

    2015-03-26

    of sound . The value for the speed of sound varies depending on the medium which the sound wave travels through as well as the temperature and...Characterizing and Classifying Acoustical Ambient Sound Profiles THESIS MARCH 2015 Paul T. Gaski, Second Lieutenant, USAF AFIT-ENS-MS-15-M-122... SOUND PROFILES THESIS Presented to the Faculty Department of Operational Sciences Graduate School of Engineering and Management Air Force Institute of

  14. Detecting non-coding selective pressure in coding regions

    PubMed Central

    Chen, Hui; Blanchette, Mathieu

    2007-01-01

    Background Comparative genomics approaches, where orthologous DNA regions are compared and inter-species conserved regions are identified, have proven extremely powerful for identifying non-coding regulatory regions located in intergenic or intronic regions. However, non-coding functional elements can also be located within coding regions, as is common for exonic splicing enhancers, some transcription factor binding sites, and RNA secondary structure elements affecting mRNA stability, localization, or translation. Since these functional elements are located in regions that are themselves highly conserved because they are coding for a protein, they have generally escaped detection by comparative genomics approaches. Results We introduce a comparative genomics approach for detecting non-coding functional elements located within coding regions. Codon evolution is modeled as a mixture of codon substitution models, where each component of the mixture describes the evolution of codons under a specific type of coding selective pressure. We show how to compute the posterior distribution of the entropy and parsimony scores under this null model of codon evolution. The method is applied to a set of growth hormone 1 orthologous mRNA sequences and a known exonic splicing element is detected. The analysis of a set of CORTBP2 orthologous genes reveals a region of several hundred base pairs under strong non-coding selective pressure whose function remains unknown. Conclusion Non-coding functional elements, in particular those involved in post-transcriptional regulation, are likely to be much more prevalent than is currently known. With the numerous genome sequencing projects underway, comparative genomics approaches like that proposed here are likely to become increasingly powerful at detecting such elements. PMID:17288582

  15. Error-correction coding

    NASA Technical Reports Server (NTRS)

    Hinds, Erold W. (Principal Investigator)

    1996-01-01

    This report describes the progress made towards the completion of a specific task on error-correcting coding. The proposed research consisted of investigating the use of modulation block codes as the inner code of a concatenated coding system in order to improve the overall space link communications performance. The study proposed to identify and analyze candidate codes that will complement the performance of the overall coding system which uses the interleaved RS (255,223) code as the outer code.
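    For orientation, the basic parameters of the RS (255,223) outer code mentioned in this abstract follow from standard Reed-Solomon arithmetic: an (n, k) code carries n - k parity symbols and corrects up to (n - k) / 2 symbol errors per codeword. The helper name below is hypothetical, and the sketch says nothing about the inner modulation block codes or the interleaving depth, which the abstract does not specify.

```python
def rs_parameters(n, k):
    """Basic parameters of an (n, k) Reed-Solomon code."""
    parity = n - k
    return {
        "parity_symbols": parity,
        # A Reed-Solomon code corrects up to floor((n - k) / 2) symbol errors.
        "correctable_symbol_errors": parity // 2,
        "code_rate": k / n,
    }


print(rs_parameters(255, 223))
```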

  16. Robust Framework to Combine Diverse Classifiers Assigning Distributed Confidence to Individual Classifiers at Class Level

    PubMed Central

    Arshad, Sannia; Rho, Seungmin

    2014-01-01

    We have presented a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modeling is presented that generates models of various classes whilst identifying and filtering noisy training data. This noise-free data is further used to learn models for other classifiers such as GMM and SVM. A weight learning method is then introduced to learn weights on each class for different classifiers to construct an ensemble. For this purpose, we applied a genetic algorithm to search for an optimal weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets. It is also compared with existing standard ensemble techniques such as Adaboost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method as compared to its competitors, especially in the presence of class label noise and imbalanced classes. PMID:25295302
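    One plausible reading of "weights on each class for different classifiers" is a per-classifier, per-class weight applied to each classifier's confidence before summing. The sketch below shows only that combination step under this assumption; the weighted-sum rule, the function name and the numbers are hypothetical, and the genetic-algorithm search for the weights described in the abstract is not reproduced.

```python
def combine_class_weighted(probabilities, weights):
    """Combine classifier outputs using class-level weights.

    probabilities[k][c]: confidence of classifier k for class c.
    weights[k][c]: weight given to classifier k when voting for class c
    (assumed to be learned elsewhere, e.g. by a genetic algorithm).
    Returns (predicted_class_index, per_class_scores).
    """
    n_classes = len(probabilities[0])
    scores = [
        sum(w[c] * p[c] for p, w in zip(probabilities, weights))
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=scores.__getitem__)
    return predicted, scores


# Two hypothetical classifiers, two classes.
probs = [[0.6, 0.4], [0.3, 0.7]]
wts = [[1.0, 0.5], [0.5, 1.0]]
print(combine_class_weighted(probs, wts))
```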

  17. Reconfiguration-based implementation of SVM classifier on FPGA for Classifying Microarray data.

    PubMed

    Hussain, Hanaa M; Benkrid, Khaled; Seker, Huseyin

    2013-01-01

    Classifying Microarray data, which are of high dimensional nature, requires high computational power. Support Vector Machines-based classifier (SVM) is among the most common and successful classifiers used in the analysis of Microarray data but also requires high computational power due to its complex mathematical architecture. Implementing SVM on hardware exploits the parallelism available within the algorithm kernels to accelerate the classification of Microarray data. In this work, a flexible, dynamically and partially reconfigurable implementation of the SVM classifier on Field Programmable Gate Array (FPGA) is presented. The SVM architecture achieved up to 85× speed-up over equivalent general purpose processor (GPP) showing the capability of FPGAs in enhancing the performance of SVM-based analysis of Microarray data as well as future bioinformatics applications.

  18. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    NASA Astrophysics Data System (ADS)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by miRNAs, or micro RNAs, which can be 21-23 nucleotides in length. They are non-coding RNAs which control gene expression either by translation repression or mRNA degradation. Plants and animals both contain miRNAs which have been classified by wet lab techniques. These techniques are highly expensive, labour intensive and time consuming. Hence faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and it gives results with 91% accuracy.

  19. cDNA cloning and sequence analysis of human pancreatic procarboxypeptidase A1.

    PubMed Central

    Catasús, L; Villegas, V; Pascual, R; Avilés, F X; Wicker-Planquart, C; Puigserver, A

    1992-01-01

    Using polyclonal antibodies raised against human pancreatic procarboxypeptidases, a full-length cDNA coding for an A-type proenzyme was isolated from a lambda gt11 human pancreatic library. This cDNA contains standard 3' and 5' flanking regions, a poly(A)+ tail and a central region of 1260 nucleotides coding for a protein of 419 amino acids. On the basis of sequence comparisons, the human protein was classified as a procarboxypeptidase A1 which is very similar to the previously described A1 forms from rat and bovine pancreatic glands. The presence of the amino acid sequences assumed to be of importance for the zymogen inhibition by its activation segment, primarily on the basis of the recently reported crystal structure of the B form, further supports the proposed classification. PMID:1417781

  20. DNA Nanotechnology-- Architectures Designed with DNA

    NASA Astrophysics Data System (ADS)

    Han, Dongran

    As the genetic information storage vehicle, deoxyribonucleic acid (DNA) molecules are essential to all known living organisms and many viruses. It is amazing that such a large amount of information about how life develops can be stored in these tiny molecules. Countless scientists, especially some biologists, are trying to decipher the genetic information stored in these captivating molecules. Meanwhile, another group of researchers, nanotechnologists in particular, have discovered that the unique and concise structural features of DNA together with its information coding ability can be utilized for nano-construction efforts. This idea culminated in the birth of the field of DNA nanotechnology which is the main topic of this dissertation. The ability of rationally designed DNA strands to self-assemble into arbitrary nanostructures without external direction is the basis of this field. A series of novel design principles for DNA nanotechnology are presented here, from topological DNA nanostructures to complex and curved DNA nanostructures, from pure DNA nanostructures to hybrid RNA/DNA nanostructures. As one of the most important and pioneering fields in controlling the assembly of materials (both DNA and other materials) at the nanoscale, DNA nanotechnology is developing at a dramatic speed and as more and more construction approaches are invented, exciting advances will emerge in ways that we may or may not predict.

  1. Chilean Pitavia more closely related to Oceania and Old World Rutaceae than to Neotropical groups: evidence from two cpDNA non-coding regions, with a new subfamilial classification of the family

    PubMed Central

    Groppo, Milton; Kallunki, Jacquelyn A.; Pirani, José Rubens; Antonelli, Alexandre

    2012-01-01

    Abstract The position of the plant genus Pitavia within an infrafamilial phylogeny of Rutaceae (rue, or orange family) was investigated with the use of two non-coding regions from cpDNA, the trnL-trnF region and the rps16 intron. The only species of the genus, Pitavia punctata Molina, is restricted to the temperate forests of the Coastal Cordillera of Central-Southern Chile and threatened by loss of habitat. The genus traditionally has been treated as part of tribe Zanthoxyleae (subfamily Rutoideae) where it constitutes the monogeneric tribe Pitaviinae. This tribe and genus are characterized by fruits of 1 to 4 fleshy drupelets, unlike the dehiscent fruits typical of the subfamily. Fifty-five taxa of Rutaceae, representing 53 genera (nearly one-third of those in the family) and all subfamilies, tribes, and almost all subtribes of the family were included. Parsimony and Bayesian inference were used to infer the phylogeny; six taxa of Meliaceae, Sapindaceae, and Simaroubaceae, all members of Sapindales, were also used as out-groups. Results from both analyses were congruent and showed Pitavia as sister to Flindersia and Lunasia, both genera with species scattered through Australia, Philippines, Moluccas, New Guinea and the Malayan region, and phylogenetically far from other Neotropical Rutaceae, such as the Galipeinae (Galipeeae, Rutoideae) and Pteleinae (Toddalieae, former Toddalioideae). Additionally, a new circumscription of the subfamilies of Rutaceae is presented and discussed. Only two subfamilies (both monophyletic) are recognized: Cneoroideae (including Dictyolomatoideae, Spathelioideae, Cneoraceae, and Ptaeroxylaceae) and Rutoideae (including not only traditional Rutoideae but also Aurantioideae, Flindersioideae, and Toddalioideae). As a consequence, Aurantioideae (Citrus and allies) is reduced to tribal rank as Aurantieae. PMID:23717188

  2. TU-EF-304-10: Efficient Multiscale Simulation of the Proton Relative Biological Effectiveness (RBE) for DNA Double Strand Break (DSB) Induction and Bio-Effective Dose in the FLUKA Monte Carlo Radiation Transport Code

    SciTech Connect

    Moskvin, V; Tsiamas, P; Axente, M; Farr, J; Stewart, R

    2015-06-15

    Purpose: One of the more critical initiating events for reproductive cell death is the creation of a DNA double strand break (DSB). In this study, we present a computationally efficient way to determine spatial variations in the relative biological effectiveness (RBE) of proton therapy beams within the FLUKA Monte Carlo (MC) code. Methods: We used the independently tested Monte Carlo Damage Simulation (MCDS) developed by Stewart and colleagues (Radiat. Res. 176, 587–602, 2011) to estimate the RBE for DSB induction of monoenergetic protons, tritium, deuterium, helium-3, helium-4 ions and delta-electrons. The dose-weighted (RBE) coefficients were incorporated into FLUKA to determine the equivalent ⁶⁰Co γ-ray dose for representative proton beams incident on cells in an aerobic and anoxic environment. Results: We found that the proton beam RBE for DSB induction at the tip of the Bragg peak, including primary and secondary particles, is close to 1.2. Furthermore, the RBE increases laterally to the beam axis in the area of the Bragg peak. At the distal edge, the RBE is in the range from 1.3–1.4 for cells irradiated under aerobic conditions and may be as large as 1.5–1.8 for cells irradiated under anoxic conditions. Across the plateau region, the recorded RBE for DSB induction is 1.02 for aerobic cells and 1.05 for cells irradiated under anoxic conditions. The contribution to total effective dose from secondary heavy ions decreases with depth and is higher at shallow depths (e.g., at the surface of the skin). Conclusion: Multiscale simulation of the RBE for DSB induction provides useful insights into spatial variations in proton RBE within pristine Bragg peaks. This methodology is potentially useful for the biological optimization of proton therapy for the treatment of cancer. The study highlights the need to incorporate spatial variations in proton RBE into proton therapy treatment plans.

  3. Intelligent neural network classifier for automatic testing

    NASA Astrophysics Data System (ADS)

    Bai, Baoxing; Yu, Heping

    1996-10-01

    This paper is concerned with an application of a multilayer feedforward neural network to the visual inspection of industrial images, and introduces a high-performance image processing and recognition system that can be used for real-time detection of blemishes, streaks, cracks, etc. on the inner walls of high-accuracy pipes. To take full advantage of the capabilities of the artificial neural network, such as distributed information storage, large-scale self-adapting parallel processing, and high fault tolerance, this system uses a multilayer perceptron as a regular detector to extract features of the images to be inspected and classify them.

  4. Classifying Bugs is a Tricky Business.

    DTIC Science & Technology

    1983-08-01

    Classifying Bugs is a Tricky Business. Technical ... WRITELN('BAD INPUT. TRY AGAIN'); READ(RAINFALL) END; IF RAINFALL <> 99999 THEN BEGIN TOTAL := TOTAL + RAINFALL; DAYS := DAYS + 1; READ(RAINFALL) END; END ... this last question. READ(RAINFALL) WHILE RAINFALL <> 99999 DO BEGIN WHILE RAINFALL < 0 DO BEGIN WRITELN('BAD INPUT. TRY AGAIN'); READ(RAINFALL) END

  5. 46 CFR 503.59 - Safeguarding classified information.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Information Security Program § 503.59 Safeguarding classified information. (a) All classified information... security; (2) Takes appropriate steps to protect classified information from unauthorized disclosure or... security check; (2) To protect the classified information in accordance with the provisions of...

  6. 70. PRIMARY MILL AND CLASSIFIER No. 2 FROM NORTHWEST. MILL ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    70. PRIMARY MILL AND CLASSIFIER No. 2 FROM NORTHWEST. MILL DISCHARGED INTO LAUNDER WHICH PIERCED THE SIDE OF THE CLASSIFIER PAN. WOOD LAUNDER WITHIN CLASSIFIER VISIBLE (FILLED WITH DEBRIS). HORIZONTAL WOOD PLANKING BEHIND MILL IS FEED BOX. MILL SOLUTION PIPING RUNS ALONG BASE OF WEST SIDE OF CLASSIFIER. - Bald Mountain Gold Mill, Nevada Gulch at head of False Bottom Creek, Lead, Lawrence County, SD

  7. 49 CFR 1280.6 - Storage of classified documents.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... SECURITY INFORMATION AND CLASSIFIED MATERIAL § 1280.6 Storage of classified documents. All classified... 49 Transportation 9 2010-10-01 2010-10-01 false Storage of classified documents. 1280.6 Section 1280.6 Transportation Other Regulations Relating to Transportation (Continued) SURFACE...

  8. 46 CFR 503.59 - Safeguarding classified information.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... Information Security Program § 503.59 Safeguarding classified information. (a) All classified information... security; (2) Takes appropriate steps to protect classified information from unauthorized disclosure or... security check; (2) To protect the classified information in accordance with the provisions of...

  9. 46 CFR 503.59 - Safeguarding classified information.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... Information Security Program § 503.59 Safeguarding classified information. (a) All classified information... security; (2) Takes appropriate steps to protect classified information from unauthorized disclosure or... security check; (2) To protect the classified information in accordance with the provisions of...

  10. 36 CFR 1256.46 - National security-classified information.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 3 2011-07-01 2011-07-01 false National security-classified... Restrictions § 1256.46 National security-classified information. In accordance with 5 U.S.C. 552(b)(1), NARA... properly classified under the provisions of the pertinent Executive Order on Classified National...

  11. 36 CFR 1256.46 - National security-classified information.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 3 2013-07-01 2012-07-01 true National security-classified... Restrictions § 1256.46 National security-classified information. In accordance with 5 U.S.C. 552(b)(1), NARA... properly classified under the provisions of the pertinent Executive Order on Classified National...

  12. 36 CFR 1256.46 - National security-classified information.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 3 2012-07-01 2012-07-01 false National security-classified... Restrictions § 1256.46 National security-classified information. In accordance with 5 U.S.C. 552(b)(1), NARA... properly classified under the provisions of the pertinent Executive Order on Classified National...

  13. 5 CFR 1312.23 - Access to classified information.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 5 Administrative Personnel 3 2010-01-01 2010-01-01 false Access to classified information. 1312.23... Classified Information § 1312.23 Access to classified information. Classified information may be made... “need to know” and the access is essential to the accomplishment of official government duties....

  14. Intelligent query by humming system based on score level fusion of multiple classifiers

    NASA Astrophysics Data System (ADS)

    Pyo Nam, Gi; Thu Trang Luong, Thi; Ha Nam, Hyun; Ryoung Park, Kang; Park, Sung-Joo

    2011-12-01

    Recently, the necessity for content-based music retrieval that can return results even if a user does not know information such as the title or singer has increased. Query-by-humming (QBH) systems have been introduced to address this need, as they allow the user to simply hum snatches of the tune to find the right song. Even though there have been many studies on QBH, few have combined multiple classifiers based on various fusion methods. Here we propose a new QBH system based on the score level fusion of multiple classifiers. This research is novel in the following three respects: three local classifiers [quantized binary (QB) code-based linear scaling (LS), pitch-based dynamic time warping (DTW), and LS] are employed; local maximum and minimum point-based LS and pitch distribution feature-based LS are used as global classifiers; and the combination of local and global classifiers based on the score level fusion by the PRODUCT rule is used to achieve enhanced matching accuracy. Experimental results with the 2006 MIREX QBSH and 2009 MIR-QBSH corpus databases show that the performance of the proposed method is better than that of single classifier and other fusion methods.
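
As a minimal sketch of score-level fusion by the PRODUCT rule, the snippet below multiplies the scores that several classifiers assign to each candidate song. It assumes that every classifier outputs a similarity score in [0, 1] where higher means a better match; the abstract does not specify the score scale, and the function name and values are hypothetical.

```python
def product_fusion(score_lists):
    """Fuse per-candidate scores from several classifiers by multiplication.

    score_lists: one list per classifier, each holding one score per candidate.
    """
    n = len(score_lists[0])
    if any(len(scores) != n for scores in score_lists):
        raise ValueError("every classifier must score every candidate")
    fused = [1.0] * n
    for scores in score_lists:
        for i, s in enumerate(scores):
            fused[i] *= s
    return fused


# Two hypothetical classifiers scoring two candidate songs.
fused = product_fusion([[0.5, 0.2], [0.4, 0.9]])
best = max(range(len(fused)), key=fused.__getitem__)
```

A property of the product rule is that a candidate scored near zero by any single classifier is strongly penalized, so it favors candidates on which the classifiers agree.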

  15. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
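
The detrended fluctuation analysis named in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' code: it assumes the sequence is first mapped to a "DNA walk" (+1 for a pyrimidine, -1 for a purine, a mapping not stated in the abstract), then split into non-overlapping boxes in which a least-squares line is subtracted before the root-mean-square fluctuation is computed.

```python
import math


def dna_walk(seq):
    """Cumulative sum of +1 (pyrimidine C/T) and -1 (purine A/G) steps."""
    walk, total = [], 0
    for base in seq.upper():
        total += 1 if base in "CT" else -1
        walk.append(total)
    return walk


def dfa_fluctuation(walk, box):
    """RMS deviation of the walk from a local linear trend, per box size (box >= 2)."""
    n_boxes = len(walk) // box
    if box < 2 or n_boxes == 0:
        raise ValueError("box must be >= 2 and no longer than the walk")
    xs = list(range(box))
    x_mean = sum(xs) / box
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum = 0.0
    for k in range(n_boxes):
        ys = walk[k * box:(k + 1) * box]
        y_mean = sum(ys) / box
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        # Subtract the fitted line and accumulate squared residuals.
        sq_sum += sum((y - (y_mean + slope * (x - x_mean))) ** 2
                      for x, y in zip(xs, ys))
    return math.sqrt(sq_sum / (n_boxes * box))
```

In use, one would evaluate `dfa_fluctuation` for a range of box sizes and estimate the scaling exponent from the slope of log F(box) against log(box); an exponent near 0.5 indicates no long-range correlation.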

  16. Classifying multispectral data by neural networks

    NASA Technical Reports Server (NTRS)

    Telfer, Brian A.; Szu, Harold H.; Kiang, Richard K.

    1993-01-01

    Several energy functions for synthesizing neural networks are tested on 2-D synthetic data and on Landsat-4 Thematic Mapper data. These new energy functions, designed specifically for minimizing misclassification error, in some cases yield significant improvements in classification accuracy over the standard least mean squares energy function. In addition to operating on networks with one output unit per class, a new energy function is tested for binary encoded outputs, which result in smaller network sizes. The Thematic Mapper data (four bands were used) is classified on a single pixel basis, to provide a starting benchmark against which further improvements will be measured. Improvements are underway to make use of both subpixel and superpixel (i.e. contextual or neighborhood) information in the processing. For single pixel classification, the best neural network result is 78.7 percent, compared with 71.7 percent for a classical nearest neighbor classifier. The 78.7 percent result also improves on several earlier neural network results on this data.

  17. Mercury⊕: An evidential reasoning image classifier

    NASA Astrophysics Data System (ADS)

    Peddle, Derek R.

    1995-12-01

    MERCURY⊕ is a multisource evidential reasoning classification software system based on the Dempster-Shafer theory of evidence. The design and implementation of this software package are described for improving the classification and analysis of multisource digital image data necessary for addressing advanced environmental and geoscience applications. In the remote-sensing context, the approach provides a more appropriate framework for classifying modern, multisource, and ancillary data sets which may contain a large number of disparate variables with different statistical properties, scales of measurement, and levels of error which cannot be handled using conventional Bayesian approaches. The software uses a nonparametric, supervised approach to classification, and provides a more objective and flexible interface to the evidential reasoning framework using a frequency-based method for computing support values from training data. The MERCURY⊕ software package has been implemented efficiently in the C programming language, with extensive use made of dynamic memory allocation procedures and compound linked list and hash-table data structures to optimize the storage and retrieval of evidence in a Knowledge Look-up Table. The software is complete with a full user interface and runs under Unix, Ultrix, VAX/VMS, MS-DOS, and Apple Macintosh operating systems. An example of classifying alpine land cover and permafrost active layer depth in northern Canada is presented to illustrate the use and application of these ideas.
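
For readers unfamiliar with Dempster-Shafer evidence combination, the sketch below shows Dempster's rule for two sources. It is an illustration in Python, not the MERCURY⊕ implementation (which the abstract says is written in C). The class labels and mass values are hypothetical.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels to its mass;
    the masses of one source are assumed to sum to 1.
    """
    fused, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            common = a & b
            if common:
                fused[common] = fused.get(common, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in fused.items()}


# Hypothetical two-class frame: forest versus water.
FOREST, WATER = frozenset({"forest"}), frozenset({"water"})
EITHER = FOREST | WATER
source1 = {FOREST: 0.6, EITHER: 0.4}
source2 = {WATER: 0.5, EITHER: 0.5}
combined = combine(source1, source2)
```

Mass left on the whole frame (`EITHER`) expresses ignorance, which is one way the evidential framework differs from assigning a probability to every class.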

  18. Rotational Study of Ambiguous Taxonomic Classified Asteroids

    NASA Astrophysics Data System (ADS)

    Linder, Tyler R.; Sanchez, Rick; Wuerker, Wolfgang; Clayson, Timothy; Giles, Tucker

    2017-01-01

    The Sloan Digital Sky Survey (SDSS) moving object catalog (MOC4) provided the largest ever catalog of asteroid spectrophotometry observations. Carvano et al. (2010), while analyzing MOC4, discovered that individual observations of asteroids which were observed multiple times did not classify into the same photometric-based taxonomic class. A small subset of those asteroids were classified as having both the presence and absence of a 1 μm silicate absorption feature. If these variations are linked to differences in surface mineralogy, the prevailing assumption that an asteroid’s surface composition is predominantly homogenous would need to be reexamined. Furthermore, our understanding of the evolution of the asteroid belt, as well as the linkage between certain asteroids and meteorite types, may need to be modified. This research is an investigation to determine the rotational rates of these taxonomically ambiguous asteroids. Initial questions to be answered: Do these asteroids have unique or nonstandard rotational rates? Is there any evidence in their light curve to suggest an abnormality? Observations were taken using PROMPT6, a 0.41-m telescope that is part of the SKYNET network at Cerro Tololo Inter-American Observatory (CTIO). Observations were calibrated and analyzed using Canopus software. Initial results will be presented at AAS.

  19. Adaptive classifier for steel strip surface defects

    NASA Astrophysics Data System (ADS)

    Jiang, Mingming; Li, Guangyao; Xie, Li; Xiao, Mang; Yi, Li

    2017-01-01

    Surface defect detection systems have been receiving increased attention for their precision, speed, and low cost. One of the main challenges is reacting to the deterioration of accuracy over time as equipment ages and processes change. These variables make only a tiny change to the real-world model but have a big impact on the classification result. In this paper, we propose a new adaptive classifier with a Bayes kernel (BYEC), which updates the model with a small sample to adapt to accuracy deterioration. First, abundant features were introduced to cover a wide range of information about the defects. Second, we constructed a series of SVMs on random subspaces of the features. Then, a Bayes classifier was trained as an evolutionary kernel to fuse the results from the base SVMs. Finally, we proposed a method to update the Bayes evolutionary kernel. The proposed algorithm is compared experimentally with different algorithms; the results demonstrate that the proposed method can be updated with a small sample and fits the changed model well. Robustness, a low requirement for samples, and adaptivity are demonstrated in the experiments.

  20. Just-in-time adaptive classifiers-part II: designing the classifier.

    PubMed

    Alippi, Cesare; Roveri, Manuel

    2008-12-01

    Aging effects, environmental changes, thermal drifts, and soft and hard faults affect physical systems by changing their nature and behavior over time. To cope with a process evolution adaptive solutions must be envisaged to track its dynamics; in this direction, adaptive classifiers are generally designed by assuming the stationary hypothesis for the process generating the data with very few results addressing nonstationary environments. This paper proposes a methodology based on k-nearest neighbor (NN) classifiers for designing adaptive classification systems able to react to changing conditions just-in-time (JIT), i.e., exactly when it is needed. k-NN classifiers have been selected for their computational-free training phase, the possibility to easily estimate the model complexity k and keep under control the computational complexity of the classifier through suitable data reduction mechanisms. A JIT classifier requires a temporal detection of a (possible) process deviation (aspect tackled in a companion paper) followed by an adaptive management of the knowledge base (KB) of the classifier to cope with the process change. The novelty of the proposed approach resides in the general framework supporting the real-time update of the KB of the classification system in response to novel information coming from the process both in stationary conditions (accuracy improvement) and in nonstationary ones (process tracking) and in providing a suitable estimate of k. It is shown that the classification system grants consistency once the change targets the process generating the data in a new stationary state, as it is the case in many real applications.
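
The knowledge-base management described in this abstract can be sketched loosely as below. This is a hypothetical illustration only: it assumes a plain k-NN classifier with squared Euclidean distance, that new labelled samples are appended while the process is stationary, and that on a detected change the knowledge base is replaced by samples from the new state. The change detector belongs to the companion paper and is not modelled here; the class and method names are invented.

```python
from collections import Counter


class JITKnn:
    """Toy k-NN classifier with an updatable knowledge base (KB)."""

    def __init__(self, k=3):
        self.k = k
        self.kb = []  # list of (feature tuple, label)

    def add(self, x, label):
        """Stationary case: grow the KB to improve accuracy."""
        self.kb.append((tuple(x), label))

    def reset(self, recent_samples):
        """Change detected: keep only samples from the new state."""
        self.kb = [(tuple(x), y) for x, y in recent_samples]

    def predict(self, x):
        """Majority vote among the k nearest KB samples."""
        nearest = sorted(
            self.kb,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)),
        )[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
```

Because k-NN has no training phase, both updates are cheap: adding or discarding samples changes the classifier immediately, which is the property the abstract highlights.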

  1. Diagnostic Coding for Epilepsy.

    PubMed

    Williams, Korwyn; Nuwer, Marc R; Buchhalter, Jeffrey R

    2016-02-01

    Accurate coding is an important function of neurologic practice. This contribution to Continuum is part of an ongoing series that presents helpful coding information along with examples related to the issue topic. Tips for diagnosis coding, Evaluation and Management coding, procedure coding, or a combination are presented, depending on which is most applicable to the subject area of the issue.

  2. Model Children's Code.

    ERIC Educational Resources Information Center

    New Mexico Univ., Albuquerque. American Indian Law Center.

    The Model Children's Code was developed to provide a legally correct model code that American Indian tribes can use to enact children's codes that fulfill their legal, cultural and economic needs. Code sections cover the court system, jurisdiction, juvenile offender procedures, minor-in-need-of-care, and termination. Almost every Code section is…

  3. Learning algorithms for stack filter classifiers

    SciTech Connect

    Porter, Reid B; Hush, Don; Zimmer, Beate G

    2009-01-01

    Stack Filters define a large class of increasing filter that is used widely in image and signal processing. The motivations for using an increasing filter instead of an unconstrained filter have been described as: (1) fast and efficient implementation, (2) the relationship to mathematical morphology and (3) more precise estimation with finite sample data. This last motivation is related to methods developed in machine learning and the relationship was explored in an earlier paper. In this paper we investigate this relationship by applying Stack Filters directly to classification problems. This provides a new perspective on how monotonicity constraints can help control estimation and approximation errors, and also suggests several new learning algorithms for Boolean function classifiers when they are applied to real-valued inputs.
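
To make the stack filter idea concrete, the sketch below uses the standard threshold-decomposition view: the window is binarized at every threshold level, an increasing Boolean function is applied to each binary slice, and the outputs are summed. It assumes integer samples in the range 0..levels. The majority function is chosen only because it reproduces the median for an odd-length window; it is not a method from the paper.

```python
def majority(bits):
    """Increasing Boolean function: 1 if more than half of the bits are set."""
    return 1 if 2 * sum(bits) > len(bits) else 0


def stack_filter(window, boolean_fn, levels):
    """Apply boolean_fn to each thresholded slice of the window and stack the results."""
    return sum(
        boolean_fn([1 if v >= t else 0 for v in window])
        for t in range(1, levels + 1)
    )


# With the majority function, the stack filter returns the window median.
median_of_window = stack_filter([3, 1, 2], majority, levels=3)
```

Replacing `majority` with another increasing Boolean function gives a different member of the stack filter class, which is what makes the choice of that function a learning problem.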

  4. Classifying prion and prion-like phenomena.

    PubMed

    Harbi, Djamel; Harrison, Paul M

    2014-01-01

    The universe of prion and prion-like phenomena has expanded significantly in the past several years. Here, we overview the challenges in classifying this data informatically, given that terms such as "prion-like", "prion-related" or "prion-forming" do not have a stable meaning in the scientific literature. We examine the spectrum of proteins that have been described in the literature as forming prions, and discuss how "prion" can have a range of meaning, with a strict definition being for demonstration of infection with in vitro-derived recombinant prions. We suggest that although prion/prion-like phenomena can largely be apportioned into a small number of broad groups dependent on the type of transmissibility evidence for them, as new phenomena are discovered in the coming years, a detailed ontological approach might be necessary that allows for subtle definition of different "flavors" of prion / prion-like phenomena.

  5. [Ne V] Emission in Optically Classified Starbursts

    NASA Astrophysics Data System (ADS)

    Abel, N. P.; Satyapal, S.

    2008-05-01

    Detecting active galactic nuclei (AGNs) in galaxies dominated by powerful nuclear star formation and extinction effects poses a unique challenge. Due to the longer wavelength emission and the ionization potential of Ne4+, infrared [Ne V] emission lines are thought to be excellent AGN diagnostics. However, stellar evolution models predict that Wolf-Rayet stars in young stellar clusters emit significant numbers of photons capable of creating Ne4+. Recent observations of [Ne V] emission in optically classified starburst galaxies require us to investigate whether [Ne V] can arise from star formation activity and not an AGN. In this work, we calculate the optical and IR spectrum of gas exposed to a young starburst and AGN SED. We find: (1) a range of parameters where [Ne V] emission can be explained solely by star formation and (2) a range of relative AGN to starburst luminosities that reproduces the [Ne V] observations, yet leaves the optical spectrum looking like a starburst. We also find that infrared emission-line diagnostics are much more sensitive to the AGNs than optical diagnostics, particularly for weak AGNs. We apply our model to the optically classified, yet [Ne V] emitting, starburst galaxy NGC 3621. We find, when taking the infrared and optical spectrum into account, ~30%-50% of the galaxy's total luminosity is due to an AGN. Our calculations show that [Ne V] emission is almost always the result of AGN activity. The models presented in this work can be used to determine the AGN contribution to a galaxy's power output.

  6. MLgsc: A Maximum-Likelihood General Sequence Classifier

    PubMed Central

    Junier, Thomas; Hervé, Vincent; Wunderlin, Tina; Junier, Pilar

    2015-01-01

    We present a software package for classifying protein or nucleotide sequences to user-specified sets of reference sequences. The software trains a model using a multiple sequence alignment and a phylogenetic tree, both supplied by the user. The latter is used to guide model construction and as a decision tree to speed up the classification process. The software was evaluated on all the 16S rRNA gene sequences of the reference dataset found in the GreenGenes database. On this dataset, the software was shown to achieve an error rate of around 1% at genus level. Examples of applications based on the nitrogenase subunit NifH gene and a protein-coding gene found in endospore-forming Firmicutes are also presented. The programs in the package have a simple, straightforward command-line interface for the Unix shell, and are free and open-source. The package has minimal dependencies and thus can be easily integrated in command-line based classification pipelines. PMID:26148002

  7. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    PubMed

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  8. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    PubMed Central

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions. PMID:26089862

  9. Monocular precrash vehicle detection: features and classifiers.

    PubMed

    Sun, Zehang; Bebis, George; Miller, Ronald

    2006-07-01

    Robust and reliable vehicle detection from images acquired by a moving vehicle (i.e., on-road vehicle detection) is an important problem with applications to driver assistance systems and autonomous, self-guided vehicles. The focus of this work is on the issues of feature extraction and classification for rear-view vehicle detection. Specifically, by treating the problem of vehicle detection as a two-class classification problem, we have investigated several different feature extraction methods such as principal component analysis, wavelets, and Gabor filters. To evaluate the extracted features, we have experimented with two popular classifiers, neural networks and support vector machines (SVMs). Based on our evaluation results, we have developed an on-board real-time monocular vehicle detection system that is capable of acquiring grey-scale images, using Ford's proprietary low-light camera, achieving an average detection rate of 10 Hz. Our vehicle detection algorithm consists of two main steps: a multiscale driven hypothesis generation step and an appearance-based hypothesis verification step. During the hypothesis generation step, image locations where vehicles might be present are extracted. This step uses multiscale techniques not only to speed up detection, but also to improve system robustness. The appearance-based hypothesis verification step verifies the hypotheses using Gabor features and SVMs. The system has been tested in Ford's concept vehicle under different traffic conditions (e.g., structured highway, complex urban streets, and varying weather conditions), illustrating good performance.

  10. Classifying auroras using artificial neural networks

    NASA Astrophysics Data System (ADS)

    Rydesater, Peter; Brandstrom, Urban; Steen, Ake; Gustavsson, Bjorn

    1999-03-01

    In the Auroral Large Imaging System (ALIS) there is a need for stable methods for the analysis and classification of auroral images and of images containing, for example, mother-of-pearl clouds. This part of ALIS is called Selective Imaging Techniques (SIT) and is intended to sort out images of scientific interest. It is also used to find out what is in the images, and where, for example different auroral phenomena. We will discuss some of the SIT unit's main functionality, but this work concentrates mainly on how to find auroral arcs and how they are placed in images. Special care has been taken to make the algorithm robust, since it is going to be implemented in a SIT unit that will work automatically and often unsupervised, and that will to some extent control the data taking of ALIS. The method for finding auroral arcs is based on a local operator that detects intensity differences. This gives arc orientation values as a preprocessing step, which are fed to a neural network classifier. We will show some preliminary results and possibilities to use and improve this algorithm for use in the future SIT unit.

  11. Identifying and classifying juvenile stalking behavior.

    PubMed

    Evans, Thomas M; Reid Meloy, J

    2011-01-01

    Despite the growing research in the area of stalking, the focus has been on adults who engage in this behavior. Unfortunately, almost no studies investigate the prevalence of this behavior in adolescents. Two cases are presented demonstrating not only that stalking occurs during the period of adolescence, but also that there is a significant difference in the motivation underlying this behavior that can be classified similarly to that of adult stalkers. Further, a suggested classification based on these two cases as well as our experience with other juveniles who have exhibited stalking behaviors is proposed. The first case involves a narcissistic youth who also possesses psychopathic traits, while the second involves a lonely, severely socially awkward teen. Juvenile stalking is a societal problem that has not yet garnered the attention it deserves, and all systems that deal with juvenile delinquency (juvenile court, law enforcement, and mental health personnel) as well as the school system must be educated to the prevalence and severity of this yet-to-be-recognized problem.

  12. 14 CFR § 1216.310 - Classified actions.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... Classified actions. (a) Classification does not relieve NASA of the requirement to assess, document, and consider the environmental impacts of a proposed action. (b) When classified information can reasonably...

  13. Procedures to cover Spillage of Classified Information Onto Unclassified Systems

    EPA Pesticide Factsheets

    The purpose of this is to implement the security control requirements and outline actions required when responding to electronic spillage of classified national security information (classified information) onto unclassified information systems or devices.

  14. 41 CFR 105-62.102 - Authority to originally classify.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... originally classify. (a) Top secret, secret, and confidential. The authority to originally classify information as Top Secret, Secret, or Confidential may be exercised only by the Administrator and is...

  15. 69. VIEW FROM ABOVE OF PRIMARY MILL AND CLASSIFIER No. ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    69. VIEW FROM ABOVE OF PRIMARY MILL AND CLASSIFIER No. 2. PRIMARY CLASSIFIER No. 1 AT RIGHT EDGE OF VIEW. - Bald Mountain Gold Mill, Nevada Gulch at head of False Bottom Creek, Lead, Lawrence County, SD

  16. DNA structure and function.

    PubMed

    Travers, Andrew; Muskhelishvili, Georgi

    2015-06-01

    The proposal of a double-helical structure for DNA over 60 years ago provided an eminently satisfying explanation for the heritability of genetic information. But why is DNA, and not RNA, now the dominant biological information store? We argue that, in addition to its coding function, the ability of DNA, unlike RNA, to adopt a B-DNA structure confers advantages both for information accessibility and for packaging. The information encoded by DNA is both digital - the precise base specifying, for example, amino acid sequences - and analogue. The latter determines the sequence-dependent physicochemical properties of DNA, for example, its stiffness and susceptibility to strand separation. Most importantly, DNA chirality enables the formation of supercoiling under torsional stress. We review recent evidence suggesting that DNA supercoiling, particularly that generated by DNA translocases, is a major driver of gene regulation and patterns of chromosomal gene organization, and in its guise as a promoter of DNA packaging enables DNA to act as an energy store to facilitate the passage of translocating enzymes such as RNA polymerase.

  17. Accumulate repeat accumulate codes

    NASA Technical Reports Server (NTRS)

    Abbasfar, Aliazam; Divsalar, Dariush; Yao, Kung

    2004-01-01

    In this paper we propose an innovative channel coding scheme called 'Accumulate Repeat Accumulate codes' (ARA). This class of codes can be viewed as serial turbo-like codes, or as a subclass of Low Density Parity Check (LDPC) codes, thus belief propagation can be used for iterative decoding of ARA codes on a graph. The structure of the encoder for this class can be viewed as a precoded Repeat Accumulate (RA) code or as a precoded Irregular Repeat Accumulate (IRA) code, where simply an accumulator is chosen as a precoder. Thus ARA codes have a simple and very fast encoder structure when they represent LDPC codes. Based on density evolution for LDPC codes through some examples for ARA codes, we show that for maximum variable node degree 5 a minimum bit SNR as low as 0.08 dB from channel capacity for rate 1/2 can be achieved as the block size goes to infinity. Thus based on fixed low maximum variable node degree, its threshold outperforms not only the RA and IRA codes but also the best known LDPC codes with the same maximum node degree. Furthermore by puncturing the accumulators any desired high rate codes close to code rate 1 can be obtained with thresholds that stay close to the channel capacity thresholds uniformly. Iterative decoding simulation results are provided. The ARA codes also have projected graph or protograph representation that allows for high speed decoder implementation.
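
The encoder chain named in the title (accumulate, repeat, accumulate) can be sketched in a heavily simplified form. The snippet assumes a plain running-XOR accumulator as precoder, a fixed repetition factor, and a seeded pseudo-random interleaver; puncturing and the specific precoding pattern of real ARA codes are omitted, and all names and parameters are hypothetical.

```python
import random


def accumulate(bits):
    """Running XOR (mod-2 sum) of the input bits."""
    out, acc = [], 0
    for b in bits:
        acc ^= b
        out.append(acc)
    return out


def ara_encode(bits, repeat=3, seed=0):
    """Simplified accumulate -> repeat -> interleave -> accumulate chain."""
    precoded = accumulate(bits)                          # precoder
    repeated = [b for b in precoded for _ in range(repeat)]
    order = list(range(len(repeated)))
    random.Random(seed).shuffle(order)                   # fixed pseudo-random interleaver
    interleaved = [repeated[i] for i in order]
    return accumulate(interleaved)                       # inner accumulator


codeword = ara_encode([1, 0, 1, 1])
```

Every stage is a linear operation over GF(2) that runs in time proportional to the block length, which is consistent with the abstract's point that the encoder is simple and fast.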

  18. The validity of administrative data to classify patients with spinal column and cord injuries.

    PubMed

    Noonan, Vanessa K; Thorogood, Nancy P; Fingas, Matthew; Batke, Juliet; Bélanger, Lise; Kwon, Brian K; Dvorak, Marcel F

    2013-02-01

    International Classification of Diseases (ICD) codes are used to document patient morbidity in administrative databases. Although administrative data are used for research purposes, the validity of the data to accurately describe clinical diagnostic information is uncertain. We compared the clinical diagnoses for spinal cord and column injuries from a longitudinal patient registry, the Rick Hansen Spinal Cord Injury Registry (RHSCIR), to the ICD-10 spinal injury codes from the Discharge Abstract Database (DAD) at one institution. There were 603 RHSCIR participants with data describing the spinal cord injury, and 341 had data on the spinal column injury. The validity of DAD data to describe spinal injuries was evaluated using the sensitivity and positive predictive values of specific ICD-10 codes; 5.3% of the spinal column injuries and 10.9% of the spinal cord injuries documented in RHSCIR were missed in data from the DAD using ICD-10 codes. The most problematic spinal column ICD-10 code was the dislocation of the cervical vertebra (S13.1); only 14.0% of the dislocations of the cervical vertebrae in RHSCIR were correctly coded in the DAD. The most problematic spinal cord injury ICD-10 code was the incomplete lesion of the lumbar spinal cord (S34.1X); 66.7% of incomplete lesions of the lumbar spinal cord in RHSCIR were correctly coded in the DAD. The validity of DAD data to code spinal injuries is variable, and cannot be reliably used to classify all types of spinal injuries. Patient registries, such as RHSCIR, should be used if accurate detailed diagnostic data are required.
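
    The two validity measures named above, sensitivity and positive predictive value, are simple ratios of counts. The sketch below shows the arithmetic; the function names are illustrative and the counts are made up for the example, not taken from the RHSCIR or DAD data.

```python
def sensitivity(true_pos, false_neg):
    """Share of registry-confirmed injuries that the administrative code captured."""
    return true_pos / (true_pos + false_neg)

def positive_predictive_value(true_pos, false_pos):
    """Share of administratively coded injuries that the registry confirmed."""
    return true_pos / (true_pos + false_pos)

# Hypothetical counts, for illustration only (not study data).
tp, fn, fp = 45, 5, 15
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"PPV = {positive_predictive_value(tp, fp):.2f}")
```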

  19. Concatenated Coding Using Trellis-Coded Modulation

    NASA Technical Reports Server (NTRS)

    Thompson, Michael W.

    1997-01-01

    In the late seventies and early eighties a technique known as Trellis Coded Modulation (TCM) was developed for providing spectrally efficient error correction coding. Instead of adding redundant information in the form of parity bits, redundancy is added at the modulation stage thereby increasing bandwidth efficiency. A digital communications system can be designed to use bandwidth-efficient multilevel/phase modulation such as Amplitude Shift Keying (ASK), Phase Shift Keying (PSK), Differential Phase Shift Keying (DPSK) or Quadrature Amplitude Modulation (QAM). Performance gain can be achieved by increasing the number of signals over the corresponding uncoded system to compensate for the redundancy introduced by the code. A considerable amount of research and development has been devoted toward developing good TCM codes for severely bandlimited applications. More recently, the use of TCM for satellite and deep space communications applications has received increased attention. This report describes the general approach of using a concatenated coding scheme that features TCM and RS coding. Results have indicated that substantial (6-10 dB) performance gains can be achieved with this approach with comparatively little bandwidth expansion. Since all of the bandwidth expansion is due to the RS code we see that TCM based concatenated coding results in roughly 10-50% bandwidth expansion compared to 70-150% expansion for similar concatenated schemes that use convolutional codes. We stress that combined coding and modulation optimization is important for achieving performance gains while maintaining spectral efficiency.

  20. Coset Codes Viewed as Terminated Convolutional Codes

    NASA Technical Reports Server (NTRS)

    Fossorier, Marc P. C.; Lin, Shu

    1996-01-01

    In this paper, coset codes are considered as terminated convolutional codes. Based on this approach, three new general results are presented. First, it is shown that the iterative squaring construction can equivalently be defined from a convolutional code whose trellis terminates. This convolutional code determines a simple encoder for the coset code considered, and the state and branch labelings of the associated trellis diagram become straightforward. Also, from the generator matrix of the code in its convolutional code form, much information about the trade-off between the state connectivity and complexity at each section, and the parallel structure of the trellis, is directly available. Based on this generator matrix, it is shown that the parallel branches in the trellis diagram of the convolutional code represent the same coset code C(sub 1), of smaller dimension and shorter length. Utilizing this fact, a two-stage optimum trellis decoding method is devised. The first stage decodes C(sub 1), while the second stage decodes the associated convolutional code, using the branch metrics delivered by stage 1. Finally, a bidirectional decoding of each received block starting at both ends is presented. If about the same number of computations is required, this approach remains very attractive from a practical point of view as it roughly doubles the decoding speed. This fact is particularly interesting whenever the second half of the trellis is the mirror image of the first half, since the same decoder can be implemented for both parts.

  1. 46 CFR 108.185 - Ventilation for enclosed classified locations.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Ventilation § 108.185 Ventilation for enclosed classified locations. (a) The ventilation system for each enclosed classified location must be... 46 Shipping 4 2013-10-01 2013-10-01 false Ventilation for enclosed classified locations....

  2. 46 CFR 108.185 - Ventilation for enclosed classified locations.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Ventilation § 108.185 Ventilation for enclosed classified locations. (a) The ventilation system for each enclosed classified location must be... 46 Shipping 4 2012-10-01 2012-10-01 false Ventilation for enclosed classified locations....

  3. 46 CFR 108.185 - Ventilation for enclosed classified locations.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Ventilation § 108.185 Ventilation for enclosed classified locations. (a) The ventilation system for each enclosed classified location must be... 46 Shipping 4 2014-10-01 2014-10-01 false Ventilation for enclosed classified locations....

  4. Mental Representation and Cognitive Consequences of Chinese Individual Classifiers

    ERIC Educational Resources Information Center

    Gao, Ming Y.; Malt, Barbara C.

    2009-01-01

    Classifier languages are spoken by a large portion of the world's population, but psychologists have only recently begun to investigate the psychological reality of classifier categories and their potential for influencing non-linguistic thought. The current work evaluates both the mental representation of classifiers and potential cognitive…

  5. Method of generating features optimal to a dataset and classifier

    DOEpatents

    Bruillard, Paul J.; Gosink, Luke J.; Jarman, Kenneth D.

    2016-10-18

    A method of generating features optimal to a particular dataset and classifier is disclosed. A dataset of messages is inputted and a classifier is selected. An algebra of features is encoded. Computable features that are capable of describing the dataset from the algebra of features are selected. Irredundant features that are optimal for the classifier and the dataset are selected.

  6. 14 CFR 1203.402 - Classifying material other than documentation.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ..., application or use. The overall classification assigned to equipment or objects shall be at least as high as... INFORMATION SECURITY PROGRAM Guides for Original Classification § 1203.402 Classifying material other than documentation. Items of equipment or other physical objects may be classified only where classified...

  7. 14 CFR 1203.402 - Classifying material other than documentation.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ..., application or use. The overall classification assigned to equipment or objects shall be at least as high as... INFORMATION SECURITY PROGRAM Guides for Original Classification § 1203.402 Classifying material other than documentation. Items of equipment or other physical objects may be classified only where classified...

  8. 14 CFR 1203.402 - Classifying material other than documentation.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ..., application or use. The overall classification assigned to equipment or objects shall be at least as high as... INFORMATION SECURITY PROGRAM Guides for Original Classification § 1203.402 Classifying material other than documentation. Items of equipment or other physical objects may be classified only where classified...

  9. 14 CFR 1203.402 - Classifying material other than documentation.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ..., application or use. The overall classification assigned to equipment or objects shall be at least as high as... INFORMATION SECURITY PROGRAM Guides for Original Classification § 1203.402 Classifying material other than documentation. Items of equipment or other physical objects may be classified only where classified...

  10. 29 CFR 1910.307 - Hazardous (classified) locations.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 29 Labor 5 2010-07-01 2010-07-01 false Hazardous (classified) locations. 1910.307 Section 1910.307... Electrical Systems § 1910.307 Hazardous (classified) locations. (a) Scope—(1) Applicability. This section covers the requirements for electric equipment and wiring in locations that are classified depending...

  11. Discussion on LDPC Codes and Uplink Coding

    NASA Technical Reports Server (NTRS)

    Andrews, Ken; Divsalar, Dariush; Dolinar, Sam; Moision, Bruce; Hamkins, Jon; Pollara, Fabrizio

    2007-01-01

    This slide presentation reviews the progress of the workgroup on Low-Density Parity-Check (LDPC) codes for space link coding. The workgroup is tasked with developing and recommending new error correcting codes for near-Earth, Lunar, and deep space applications. Included in the presentation is a summary of the technical progress of the workgroup. Charts that show the LDPC decoder sensitivity to symbol scaling errors are reviewed, as well as a chart showing the performance of several frame synchronizer algorithms compared to that of some good codes and LDPC decoder tests at ESTL. Also reviewed is a study on Coding, Modulation, and Link Protocol (CMLP), and the recommended codes. A design for the Pseudo-Randomizer with LDPC Decoder and CRC is also reviewed. A chart that summarizes the three proposed coding systems is also presented.

  12. Development of a combined GIS, neural network and Bayesian classifier methodology for classifying remotely sensed data

    NASA Astrophysics Data System (ADS)

    Schneider, Claudio Albert

    This research is aimed at the solution of two common but still largely unsolved problems in the classification of remotely sensed data: (1) Classification accuracy of remotely sensed data decreases significantly in mountainous terrain, where topography strongly influences the spectral response of the features on the ground; and (2) when attempting to obtain more detailed classifications, e.g. forest cover types or species, rather than just broad categories of forest such as coniferous or deciduous, the accuracy of the classification generally decreases significantly. The main objective of the study was to develop a widely applicable and efficient classification procedure for mapping forest and other cover types in mountainous terrain, using an integrated GIS/neural network/Bayesian classification approach. The performance of this new technique was compared to a standard supervised Maximum Likelihood classification technique, a "conventional" Bayesian/Maximum Likelihood classification, and to a "conventional" neural network classifier. Results indicate a considerable improvement of the new technique over the standard Maximum Likelihood classification technique, as well as a better accuracy than the "conventional" Bayesian/Maximum Likelihood classifier (13.08 percent improvement in overall accuracy), but the "conventional" neural network classifiers outperformed all the techniques compared in this study, with an overall accuracy improvement of 15.94 percent as compared to the standard Maximum Likelihood classifier (from 46.77 percent to 62.71 percent). However, the overall accuracies of all the classification techniques compared in this study were relatively low. It is believed that this was caused by problems related to the inadequacy of the reference data. On the other hand, the results also indicate the need to develop a different sampling design to more effectively cover the variability across all the parameters needed by the neural network classification technique.

  13. Bar Codes for Libraries.

    ERIC Educational Resources Information Center

    Rahn, Erwin

    1984-01-01

    Discusses the evolution of standards for bar codes (series of printed lines and spaces that represent numbers, symbols, and/or letters of alphabet) and describes the two types most frequently adopted by libraries--Code-A-Bar and CODE 39. Format of the codes is illustrated. Six references and definitions of terminology are appended. (EJS)

  14. Manually operated coded switch

    DOEpatents

    Barnette, Jon H.

    1978-01-01

    The disclosure relates to a manually operated recodable coded switch in which a code may be inserted, tried and used to actuate a lever controlling an external device. After attempting a code, the switch's code wheels must be returned to their zero positions before another try is made.

  15. 22 CFR 125.7 - Procedures for the export of classified technical data and other classified defense articles.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... DSP-85. (b) An application for the export of classified technical data or other classified defense articles must be accompanied by seven copies of the data and a completed Form DSP-83 (see § 123.10 of...

  16. 22 CFR 125.7 - Procedures for the export of classified technical data and other classified defense articles.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... DSP-85. (b) An application for the export of classified technical data or other classified defense articles must be accompanied by seven copies of the data and a completed Form DSP-83 (see § 123.10 of...

  17. 22 CFR 125.7 - Procedures for the export of classified technical data and other classified defense articles.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... DSP-85. (b) An application for the export of classified technical data or other classified defense articles must be accompanied by seven copies of the data and a completed Form DSP-83 (see § 123.10 of...

  18. 22 CFR 125.7 - Procedures for the export of classified technical data and other classified defense articles.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... DSP-85. (b) An application for the export of classified technical data or other classified defense articles must be accompanied by seven copies of the data and a completed Form DSP-83 (see § 123.10 of...

  19. 22 CFR 125.7 - Procedures for the export of classified technical data and other classified defense articles.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... DSP-85. (b) An application for the export of classified technical data or other classified defense articles must be accompanied by seven copies of the data and a completed Form DSP-83 (see § 123.10 of...

  20. The Genomic Code for Nucleosome Positioning

    NASA Astrophysics Data System (ADS)

    Widom, Jonathan

    2008-03-01

    Eukaryotic genomes encode an additional layer of genetic information, superimposed on top of the regulatory and coding information, that controls the organization of the genomic DNA into arrays of nucleosomes. We have developed a partial ability to read this nucleosome positioning code and predict the in vivo locations of nucleosomes. Our results suggest that genomes utilize the nucleosome positioning code to facilitate specific chromosome functions including to delineate functional versus nonfunctional binding sites for key gene regulatory proteins, and to define the next higher level of chromosome structure itself.

  1. Categories of Code-Switching in Hispanic Communities: Untangling the Terminology. Sociolinguistic Working Paper Number 76.

    ERIC Educational Resources Information Center

    Baker, Opal Ruth

    Research on Spanish/English code switching is reviewed and the definitions and categories set up by the investigators are examined. Their methods of locating, limiting, and classifying true code switches, and the terms used and results obtained, are compared. It is found that in these studies, conversational (intra-discourse) code switching is…

  2. QR Codes 101

    ERIC Educational Resources Information Center

    Crompton, Helen; LaFrance, Jason; van 't Hooft, Mark

    2012-01-01

    A QR (quick-response) code is a two-dimensional scannable code, similar in function to a traditional bar code that one might find on a product at the supermarket. The main difference between the two is that, while a traditional bar code can hold a maximum of only 20 digits, a QR code can hold up to 7,089 characters, so it can contain much more…

  3. ARA type protograph codes

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush (Inventor); Abbasfar, Aliazam (Inventor); Jones, Christopher R. (Inventor); Dolinar, Samuel J. (Inventor); Thorpe, Jeremy C. (Inventor); Andrews, Kenneth S. (Inventor); Yao, Kung (Inventor)

    2008-01-01

    An apparatus and method for encoding low-density parity check codes. Together with a repeater, an interleaver and an accumulator, the apparatus comprises a precoder, thus forming accumulate-repeat-accumulate (ARA codes). Protographs representing various types of ARA codes, including AR3A, AR4A and ARJA codes, are described. High performance is obtained when compared to the performance of current repeat-accumulate (RA) or irregular-repeat-accumulate (IRA) codes.

  4. A mathematical formulation of DNA computation.

    PubMed

    Zhang, Mingjun; Cheng, Maggie X; Tarn, Tzyh-Jong

    2006-03-01

    DNA computation is to use DNA molecules for information storing and processing. The task is accomplished by encoding and interpreting DNA molecules in suspended solutions before and after the complementary binding reactions. DNA computation is attractive, due to its fast parallel information processing, remarkable energy efficiency, and high storing capacity. Challenges currently faced by DNA computation are: 1) lack of theoretical computational models for applications and 2) high error rate for implementation. This paper attempts to address these problems from mathematical modeling and genetic coding aspects. The first part of this paper presents a mathematical formulation of DNA computation. The model may serve as a theoretical framework for DNA computation. In the second part, a genetic code based DNA computation approach is presented to reduce error rate for implementation, which has been a major concern for DNA computation. The method provides a promising alternative to reduce error rate for DNA computation.

  5. Group-specific amplification of cDNA from DRB1 genes. Complete coding sequences of partially defined alleles and identification of the new alleles DRB1*040602, DRB1*111102, DRB1*080103, and DRB1*0113.

    PubMed

    Balas, Antonio; Vilches, Carlos; Rodríguez, Miguel A; Fernández, Begoña; Martinez, Maria Paz; de Pablo, Rosario; García-Sánchez, Félix; Vicario, Jose L

    2006-12-01

    We present here the complete coding sequences, previously unavailable, of the DRB1 alleles DRB1*030102, *0306, *040701, *0408, *1327, *1356, *1411, *1446, *1503, *1504, *0806, *0813, and *0818. For cDNA isolation, new group-specific primers located at the 5'UT and 3'UT regions were used to carry out allele-specific amplification and a convenient method for determining full-length sequences for DRB1 alleles. Complete coding sequencing of samples previously typed as DRB1*0406, DRB1*080101, and DRB1*1111 revealed new alleles with noncoding nucleotide changes at exons 1 and 3. In addition, we found a novel allele, DRB1*0113, whose second exon carries a sequence motif characteristic of DRB1*07 alleles. The predicted class II haplotypic associations of all alleles are reported and discussed.

  6. cncRNAs: Bi-functional RNAs with protein coding and non-coding functions.

    PubMed

    Kumari, Pooja; Sampath, Karuna

    2015-12-01

    For many decades, the major function of mRNA was thought to be to provide protein-coding information embedded in the genome. The advent of high-throughput sequencing has led to the discovery of pervasive transcription of eukaryotic genomes and opened the world of RNA-mediated gene regulation. Many regulatory RNAs have been found to be incapable of protein coding and are hence termed non-coding RNAs (ncRNAs). However, studies in recent years have shown that several previously annotated non-coding RNAs have the potential to encode proteins, and conversely, some coding RNAs have regulatory functions independent of the protein they encode. Such bi-functional RNAs, with both protein coding and non-coding functions, which we term 'cncRNAs', have emerged as new players in cellular systems. Here, we describe the functions of some cncRNAs identified from bacteria to humans. Because the functions of many RNAs across genomes remain unclear, we propose that RNAs be classified as coding, non-coding or both only after careful analysis of their functions.

  7. Identification of Lactobacillus UFV H2B20 (probiotic strain) using DNA-DNA hybridization

    PubMed Central

    de Magalhães, J.T.; Uetanabaro, A.P. T.; de Moraes, C.A.

    2008-01-01

    Sequence analyses of the 16S rDNA gene and DNA-DNA hybridization tests were performed for identification of the species of the probiotic Lactobacillus UFV H2b20 strain. Using these two tests, we concluded that this strain, originally considered Lact. acidophilus, should be classified as Lact. delbrueckii. PMID:24031263

  8. CRITICA: coding region identification tool invoking comparative analysis

    NASA Technical Reports Server (NTRS)

    Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

    1999-01-01

    Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).

  9. Classifying aging as a disease in the context of ICD-11

    PubMed Central

    Zhavoronkov, Alex; Bhullar, Bhupinder

    2015-01-01

    Aging is a complex continuous multifactorial process leading to loss of function and crystalizing into the many age-related diseases. Here, we explore the arguments for classifying aging as a disease in the context of the upcoming World Health Organization’s 11th International Statistical Classification of Diseases and Related Health Problems (ICD-11), expected to be finalized in 2018. We hypothesize that classifying aging as a disease with a “non-garbage” set of codes will result in new approaches and business models for addressing aging as a treatable condition, which will lead to both economic and healthcare benefits for all stakeholders. Actionable classification of aging as a disease may lead to more efficient allocation of resources by enabling funding bodies and other stakeholders to use quality-adjusted life years (QALYs) and healthy-years equivalent (HYE) as metrics when evaluating both research and clinical programs. We propose forming a Task Force to interface with the WHO in order to develop a multidisciplinary framework for classifying aging as a disease with multiple disease codes facilitating therapeutic interventions and preventative strategies. PMID:26583032

  10. Analysis of mitochondrial DNA polymorphisms in Guangdong Han Chinese.

    PubMed

    Chen, Feng; Wang, Sha-Yan; Zhang, Ruan-Zhang; Hu, Yu-Hua; Gao, Guo-Feng; Liu, Yan-Hui; Kong, Qing-Peng

    2008-03-01

    Previous investigations on Chinese mitochondrial DNA (mtDNA) variation revealed that the matrilineal gene pool of southern Han Chinese is rather complex, with much higher genetic diversity and more basal/ancient lineages than the northern Hans. The extreme case is Guangdong Han populations, among which pronounced (matrilineal) differentiation has been observed, indicative of complex demography of the region. To get more insights into the maternal makeup of southern Han Chinese, mtDNA variation of a total of 106 individuals sampled from Dongguan, Guangdong Province, China, was analyzed in this study. With the aid of the information from control-region hypervariable segments I and II (HVS-I and -II) as well as some necessary coding-region segments, the phylogenetic status of all mtDNAs under examination was determined according to the reconstructed East Asian mtDNA tree. In this way, the mtDNAs have been classified into various haplogroups or sub-haplogroups. The southern-prevalent haplogroups, such as R9 (20.8%), B (17.9%), M7b (14.2%), show relatively high distribution frequencies in Dongguan Hans; whereas the frequencies of northern-prevalent haplogroups (with the exception of D) are quite low: C (1.9%), G2 (1.9%) and Z (1.9%), indicating the southern origin of Dongguan Hans.

  11. Efficient entropy coding for scalable video coding

    NASA Astrophysics Data System (ADS)

    Choi, Woong Il; Yang, Jungyoup; Jeon, Byeungwoo

    2005-10-01

    The standardization for the scalable extension of H.264 has called for additional functionality based on H.264 standard to support the combined spatio-temporal and SNR scalability. For the entropy coding of H.264 scalable extension, Context-based Adaptive Binary Arithmetic Coding (CABAC) scheme is considered so far. In this paper, we present a new context modeling scheme by using inter layer correlation between the syntax elements. As a result, it improves coding efficiency of entropy coding in H.264 scalable extension. In simulation results of applying the proposed scheme to encoding the syntax element mb_type, it is shown that improvement in coding efficiency of the proposed method is up to 16% in terms of bit saving due to estimation of more adequate probability model.

  12. DNA-based watermarks using the DNA-Crypt algorithm

    PubMed Central

    Heider, Dominik; Barnekow, Angelika

    2007-01-01

    Background: The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information; however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results: The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information; however, an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae show that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion: The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms. PMID:17535434
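
    The abstract names the Hamming code as one of the mutation correction codes. As a generic illustration of how such a code repairs a single flipped bit, here is a minimal sketch of the textbook Hamming(7,4) code. It is not DNA-Crypt's implementation: the function names are hypothetical, and how DNA-Crypt maps bits onto bases is not described in this record.

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3   # 1-based position of the error, 0 if none
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

word = hamming74_encode([1, 0, 1, 1])
mutated = list(word)
mutated[4] ^= 1                        # simulate a single "mutation"
print(hamming74_decode(mutated))       # recovers [1, 0, 1, 1]
```

    A code of this kind corrects one error per 7-bit block, which is consistent with the abstract's point that a correction code is only worth using for some combinations of sequence length, mutation rate and number of generations.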

  13. Numerical classification of coding sequences

    NASA Technical Reports Server (NTRS)

    Collins, D. W.; Liu, C. C.; Jukes, T. H.

    1992-01-01

    DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
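
    The two descriptors proposed above, a base-count label such as A76C158G121T74 and a 64-entry codon table, are easy to compute from a reading frame. The sketch below is one possible rendering; the function names and the short example sequence are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import product

def base_count_label(seq):
    """Return a label such as 'A5C2G3T2' from the counts of A, C, G and T."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts.get(base, 0)}" for base in "ACGT")

def codon_table(seq):
    """Count in-frame codons; all 64 codons are listed, including zero counts."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3           # drop any trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    all_codons = ("".join(c) for c in product("ACGT", repeat=3))
    return {codon: counts.get(codon, 0) for codon in all_codons}

orf = "ATGGCCAAGTAA"                            # made-up reading frame
print(base_count_label(orf))                    # A5C2G3T2
table = codon_table(orf)
print(table["AAG"], table["TTT"])               # 1 0
```

    Because both descriptors are plain counts, two database entries can be compared by testing the labels or tables for equality, which is the cross-referencing use the abstract suggests.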

  14. EMdeCODE: a novel algorithm capable of reading words of epigenetic code to predict enhancers and retroviral integration sites and to identify H3R2me1 as a distinctive mark of coding versus non-coding genes

    PubMed Central

    Santoni, Federico Andrea

    2013-01-01

    Existence of some extra-genetic (epigenetic) codes has been postulated since the discovery of the primary genetic code. Evident effects of histone post-translational modifications or DNA methylation over the efficiency and the regulation of DNA processes are supporting this postulation. EMdeCODE is an original algorithm that approximates the genomic distribution of given DNA features (e.g. promoter, enhancer, viral integration) by identifying relevant ChIPSeq profiles of post-translational histone marks or DNA binding proteins and combining them in a supermark. The EMdeCODE kernel is essentially a two-step procedure: (i) an expectation-maximization process calculates the mixture of epigenetic factors that maximizes the sensitivity (recall) of the association with the feature under study; (ii) the approximated density is then recursively trimmed with respect to a control dataset to increase the precision by reducing the number of false positives. EMdeCODE densities improve significantly the prediction of enhancer loci and retroviral integration sites with respect to previous methods. Importantly, it can also be used to extract distinctive factors between two arbitrary conditions. Indeed EMdeCODE identifies unexpected epigenetic profiles specific for coding versus non-coding RNA, pointing towards a new role for H3R2me1 in coding regions. PMID:23234700

  15. CLASSIFYING X-RAY BINARIES: A PROBABILISTIC APPROACH

    SciTech Connect

    Gopalan, Giri; Bornn, Luke; Vrtilek, Saeqa Dil

    2015-08-10

    In X-ray binary star systems consisting of a compact object that accretes material from an orbiting secondary star, there is no straightforward means to decide whether the compact object is a black hole or a neutron star. To assist in this process, we develop a Bayesian statistical model that makes use of the fact that X-ray binary systems appear to cluster based on their compact object type when viewed from a three-dimensional coordinate system derived from X-ray spectral data where the first coordinate is the ratio of counts in the mid- to low-energy band (color 1), the second coordinate is the ratio of counts in the high- to low-energy band (color 2), and the third coordinate is the sum of counts in all three bands. We use this model to estimate the probabilities of an X-ray binary system containing a black hole, non-pulsing neutron star, or pulsing neutron star. In particular, we utilize a latent variable model in which the latent variables follow a Gaussian process prior distribution, and hence we are able to induce the spatial correlation which we believe exists between systems of the same type. The utility of this approach is demonstrated by the accurate prediction of system types using Rossi X-ray Timing Explorer All Sky Monitor data, but it is not flawless. In particular, non-pulsing neutron systems containing “bursters” that are close to the boundary demarcating systems containing black holes tend to be classified as black hole systems. As a byproduct of our analyses, we provide the astronomer with the public R code which can be used to predict the compact object type of XRBs given training data.
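
    The three-dimensional coordinate system described in this abstract reduces to simple arithmetic on the band counts. The sketch below shows only that transformation. The function name and argument names are hypothetical, and the Gaussian process classifier itself (distributed by the authors as R code) is not reproduced here.

```python
def color_color_intensity(low: float, mid: float, high: float) -> tuple:
    """Map counts in three energy bands to (color 1, color 2, intensity).

    color 1   = mid-band counts / low-band counts
    color 2   = high-band counts / low-band counts
    intensity = sum of counts in all three bands
    """
    if low <= 0:
        raise ValueError("low-band counts must be positive to form the ratios")
    return (mid / low, high / low, low + mid + high)


# Toy band counts, not real observations.
print(color_color_intensity(low=2.0, mid=4.0, high=1.0))
```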

  16. Classifying X-Ray Binaries: A Probabilistic Approach

    NASA Astrophysics Data System (ADS)

    Gopalan, Giri; Dil Vrtilek, Saeqa; Bornn, Luke

    2015-08-01

    In X-ray binary star systems consisting of a compact object that accretes material from an orbiting secondary star, there is no straightforward means to decide whether the compact object is a black hole or a neutron star. To assist in this process, we develop a Bayesian statistical model that makes use of the fact that X-ray binary systems appear to cluster based on their compact object type when viewed from a three-dimensional coordinate system derived from X-ray spectral data where the first coordinate is the ratio of counts in the mid- to low-energy band (color 1), the second coordinate is the ratio of counts in the high- to low-energy band (color 2), and the third coordinate is the sum of counts in all three bands. We use this model to estimate the probabilities of an X-ray binary system containing a black hole, non-pulsing neutron star, or pulsing neutron star. In particular, we utilize a latent variable model in which the latent variables follow a Gaussian process prior distribution, and hence we are able to induce the spatial correlation which we believe exists between systems of the same type. The utility of this approach is demonstrated by the accurate prediction of system types using Rossi X-ray Timing Explorer All Sky Monitor data, but it is not flawless. In particular, non-pulsing neutron systems containing “bursters” that are close to the boundary demarcating systems containing black holes tend to be classified as black hole systems. As a byproduct of our analyses, we provide the astronomer with the public R code which can be used to predict the compact object type of XRBs given training data.

  17. Exons, Introns, and DNA Thermodynamics

    NASA Astrophysics Data System (ADS)

    Carlon, Enrico; Malki, Mehdi Lejard; Blossey, Ralf

    2005-05-01

    The genes of eukaryotes are characterized by protein coding fragments, the exons, interrupted by introns, i.e., stretches of DNA which do not carry useful information for protein synthesis. We have analyzed the melting behavior of randomly selected human cDNA sequences obtained from genomic DNA by removing all introns. A clear correspondence is observed between exons and melting domains. This finding may provide new insights into the physical mechanisms underlying the evolution of genes.

  18. Revisiting the Physico-Chemical Hypothesis of Code Origin: An Analysis Based on Code-Sequence Coevolution in a Finite Population

    NASA Astrophysics Data System (ADS)

    Bandhu, Ashutosh Vishwa; Aggarwal, Neha; Sengupta, Supratim

    2013-12-01

    The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets, while in the other, new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.

  19. Stochastic margin-based structure learning of Bayesian network classifiers.

    PubMed

    Pernkopf, Franz; Wohlmayr, Michael

    2013-02-01

    The margin criterion for parameter learning in graphical models has gained significant impact in recent years. We use the maximum margin score for discriminatively optimizing the structure of Bayesian network classifiers. Furthermore, greedy hill-climbing and simulated annealing search heuristics are applied to determine the classifier structures. In the experiments, we demonstrate the advantages of maximum margin optimized Bayesian network structures in terms of classification performance compared to traditionally used discriminative structure learning methods. Stochastic simulated annealing requires fewer score evaluations than greedy heuristics. Additionally, we compare generative and discriminative parameter learning on both generatively and discriminatively structured Bayesian network classifiers. Margin-optimized Bayesian network classifiers achieve classification performance similar to that of support vector machines. Moreover, missing feature values during classification can be handled by discriminatively optimized Bayesian network classifiers, a case where purely discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

  20. Stochastic margin-based structure learning of Bayesian network classifiers

    PubMed Central

    Pernkopf, Franz; Wohlmayr, Michael

    2013-01-01

    The margin criterion for parameter learning in graphical models has gained significant impact in recent years. We use the maximum margin score for discriminatively optimizing the structure of Bayesian network classifiers. Furthermore, greedy hill-climbing and simulated annealing search heuristics are applied to determine the classifier structures. In the experiments, we demonstrate the advantages of maximum margin optimized Bayesian network structures in terms of classification performance compared to traditionally used discriminative structure learning methods. Stochastic simulated annealing requires fewer score evaluations than greedy heuristics. Additionally, we compare generative and discriminative parameter learning on both generatively and discriminatively structured Bayesian network classifiers. Margin-optimized Bayesian network classifiers achieve classification performance similar to that of support vector machines. Moreover, missing feature values during classification can be handled by discriminatively optimized Bayesian network classifiers, a case where purely discriminative classifiers usually require mechanisms to complete unknown feature values in the data first. PMID:24511159

  1. Bayesian classifiers applied to the Tennessee Eastman process.

    PubMed

    Dos Santos, Edimilson Batista; Ebecken, Nelson F F; Hruschka, Estevam R; Elkamel, Ali; Madhuranthakam, Chandra M R

    2014-03-01

    Fault diagnosis includes the main task of classification. Bayesian networks (BNs) present several advantages in the classification task, and previous works have suggested their use as classifiers. Because a classifier is often only one part of a larger decision process, this article proposes, for industrial process diagnosis, the use of a Bayesian method called dynamic Markov blanket classifier that has as its main goal the induction of accurate Bayesian classifiers having dependable probability estimates and revealing actual relationships among the most relevant variables. In addition, a new method, named variable ordering multiple offspring sampling capable of inducing a BN to be used as a classifier, is presented. The performance of these methods is assessed on the data of a benchmark problem known as the Tennessee Eastman process. The obtained results are compared with naive Bayes and tree augmented network classifiers, and confirm that both proposed algorithms can provide good classification accuracies as well as knowledge about relevant variables.

  2. Structural diversity of supercoiled DNA

    PubMed Central

    Irobalieva, Rossitza N.; Fogg, Jonathan M.; Catanese, Daniel J.; Sutthibutpong, Thana; Chen, Muyuan; Barker, Anna K.; Ludtke, Steven J.; Harris, Sarah A.; Schmid, Michael F.; Chiu, Wah; Zechiedrich, Lynn

    2015-01-01

    By regulating access to the genetic code, DNA supercoiling strongly affects DNA metabolism. Despite its importance, however, much about supercoiled DNA (positively supercoiled DNA, in particular) remains unknown. Here we use electron cryo-tomography together with biochemical analyses to investigate structures of individual purified DNA minicircle topoisomers with defined degrees of supercoiling. Our results reveal that each topoisomer, negative or positive, adopts a unique and surprisingly wide distribution of three-dimensional conformations. Moreover, we uncover striking differences in how the topoisomers handle torsional stress. As negative supercoiling increases, bases are increasingly exposed. Beyond a sharp supercoiling threshold, we also detect exposed bases in positively supercoiled DNA. Molecular dynamics simulations independently confirm the conformational heterogeneity and provide atomistic insight into the flexibility of supercoiled DNA. Our integrated approach reveals the three-dimensional structures of DNA that are essential for its function. PMID:26455586

  3. Mitochondrial DNA haplogroup phylogeny of the dog: Proposal for a cladistic nomenclature.

    PubMed

    Fregel, Rosa; Suárez, Nicolás M; Betancor, Eva; González, Ana M; Cabrera, Vicente M; Pestano, José

    2015-05-01

    Canis lupus familiaris mitochondrial DNA analysis has increased in recent years, not only for the purpose of deciphering dog domestication but also for forensic genetic studies or breed characterization. The resultant accumulation of data has increased the need for a normalized and phylogenetic-based nomenclature like those provided for human maternal lineages. Although a standardized classification has been proposed, haplotype names within clades have been assigned gradually without considering the evolutionary history of dog mtDNA. Moreover, this classification is based only on the D-loop region, proven to be insufficient for phylogenetic purposes due to its high number of recurrent mutations and the lack of relevant information present in the coding region. In this study, we design 1) a refined mtDNA cladistic nomenclature from a phylogenetic tree based on complete sequences, classifying dog maternal lineages into haplogroups defined by specific diagnostic mutations, and 2) a coding region SNP analysis that allows a more accurate classification into haplogroups when combined with D-loop sequencing, thus improving the phylogenetic information obtained in dog mitochondrial DNA studies.

  4. One-hot vector hybrid associative classifier for medical data classification.

    PubMed

    Uriarte-Arcia, Abril Valeria; López-Yáñez, Itzamá; Yáñez-Márquez, Cornelio

    2014-01-01

    Pattern recognition and classification are two of the key topics in computer science. In this paper a novel method for the task of pattern classification is presented. The proposed method combines a hybrid associative classifier (Clasificador Híbrido Asociativo con Traslación, CHAT, in Spanish), a coding technique for output patterns called one-hot vector, and majority voting during the classification step. The method is termed CHAT One-Hot Majority (CHAT-OHM). The performance of the method is validated by comparing the accuracy of CHAT-OHM with other well-known classification algorithms. During the experimental phase, the classifier was applied to four datasets related to the medical field. The results also show that the proposed method outperforms the original CHAT in classification accuracy.
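
    Two of the ingredients named in this abstract, one-hot output coding and majority voting, are generic techniques that can be illustrated on their own. The sketch below is not the CHAT-OHM algorithm: the associative-memory step is omitted, the helper names are hypothetical, and ties are broken in favor of the lowest class index as an arbitrary assumption.

```python
def one_hot(label, classes):
    """Encode a class label as a one-hot vector over the ordered class list."""
    return [1 if c == label else 0 for c in classes]


def majority_vote(votes):
    """Return the index of the class with the most one-hot votes.

    `votes` is a list of one-hot vectors of equal length. Ties go to the
    lowest class index (an assumption made only for this illustration).
    """
    totals = [sum(column) for column in zip(*votes)]
    return totals.index(max(totals))


classes = ["benign", "malignant", "normal"]  # toy class names
votes = [one_hot("malignant", classes),
         one_hot("malignant", classes),
         one_hot("benign", classes)]
print(classes[majority_vote(votes)])
```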

  5. One-Hot Vector Hybrid Associative Classifier for Medical Data Classification

    PubMed Central

    Uriarte-Arcia, Abril Valeria; López-Yáñez, Itzamá; Yáñez-Márquez, Cornelio

    2014-01-01

    Pattern recognition and classification are two of the key topics in computer science. In this paper a novel method for the task of pattern classification is presented. The proposed method combines a hybrid associative classifier (Clasificador Híbrido Asociativo con Traslación, CHAT, in Spanish), a coding technique for output patterns called one-hot vector, and majority voting during the classification step. The method is termed CHAT One-Hot Majority (CHAT-OHM). The performance of the method is validated by comparing the accuracy of CHAT-OHM with other well-known classification algorithms. During the experimental phase, the classifier was applied to four datasets related to the medical field. The results also show that the proposed method outperforms the original CHAT in classification accuracy. PMID:24752287

  6. Honesty and Honor Codes.

    ERIC Educational Resources Information Center

    McCabe, Donald; Trevino, Linda Klebe

    2002-01-01

    Explores the rise in student cheating and evidence that students cheat less often at schools with an honor code. Discusses effective use of such codes and creation of a peer culture that condemns dishonesty. (EV)

  7. QR Code Mania!

    ERIC Educational Resources Information Center

    Shumack, Kellie A.; Reilly, Erin; Chamberlain, Nik

    2013-01-01

    space, has error-correction capacity, and can be read from any direction. These codes are used in manufacturing, shipping, and marketing, as well as in education. QR codes can be created to produce…

  8. DIANE multiparticle transport code

    NASA Astrophysics Data System (ADS)

    Caillaud, M.; Lemaire, S.; Ménard, S.; Rathouit, P.; Ribes, J. C.; Riz, D.

    2014-06-01

    DIANE is the general Monte Carlo code developed at CEA-DAM. DIANE is a 3D multiparticle multigroup code. DIANE includes automated biasing techniques and is optimized for massive parallel calculations.

  9. Genetic algorithms and classifier systems: Foundations and future directions

    SciTech Connect

    Holland, J.H.

    1987-01-01

    Theoretical questions about classifier systems, with rare exceptions, apply equally to other adaptive nonlinear networks (ANNs) such as the connectionist models of cognitive psychology, the immune system, economic systems, ecologies, and genetic systems. This paper discusses pervasive properties of ANNs and the kinds of mathematics relevant to questions about these properties. It discusses relevant functional extensions of the basic classifier system and extensions of the extant mathematical theory. An appendix briefly reviews some of the key theorems about classifier systems. 6 refs.

  10. 6 CFR 7.23 - Emergency release of classified information.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... Classified Information Non-disclosure Form. In emergency situations requiring immediate verbal release of... information through approved communication channels by the most secure and expeditious method possible, or...

  11. Microfilariae Classification Using Multiple Classifiers for Color and Shape Features

    NASA Astrophysics Data System (ADS)

    AL-Tam, Faroq; dos Anjos, António; Pion, Sébastien; Boussinesq, Michel; Shahbazkia, Hamid Reza

    2016-12-01

    This paper presents a multi-classifier approach for classifying microfilariae in 2-D images. A shape descriptor based on the quench function is described. This descriptor is represented as a feature vector that encodes the shape information. The color feature vector is calculated as a histogram. Two classifiers were used to train both color and shape feature vectors, one for each vector. The posterior probabilities calculated from the scores of each classifier are then used to calculate the final classification decision. The experimental results show that, although the proposed approach is simple, it is efficient when compared to various approaches.

  12. 32 CFR 2700.52 - Classified Review Committee.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... NEGOTIATIONS SECURITY INFORMATION REGULATIONS Implementation and Review § 2700.52 Classified Review Committee... President's Personal Representative, Department of Defense/Legal Advisor and Political/Economic Advisor....

  13. 32 CFR 2700.52 - Classified Review Committee.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... NEGOTIATIONS SECURITY INFORMATION REGULATIONS Implementation and Review § 2700.52 Classified Review Committee... President's Personal Representative, Department of Defense/Legal Advisor and Political/Economic Advisor....

  14. 32 CFR 2700.52 - Classified Review Committee.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... NEGOTIATIONS SECURITY INFORMATION REGULATIONS Implementation and Review § 2700.52 Classified Review Committee... President's Personal Representative, Department of Defense/Legal Advisor and Political/Economic Advisor....

  15. 32 CFR 2700.52 - Classified Review Committee.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... NEGOTIATIONS SECURITY INFORMATION REGULATIONS Implementation and Review § 2700.52 Classified Review Committee... President's Personal Representative, Department of Defense/Legal Advisor and Political/Economic Advisor....

  16. 32 CFR 2700.52 - Classified Review Committee.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... NEGOTIATIONS SECURITY INFORMATION REGULATIONS Implementation and Review § 2700.52 Classified Review Committee... President's Personal Representative, Department of Defense/Legal Advisor and Political/Economic Advisor....

  17. Decision boundary feature selection for non-parametric classifier

    NASA Technical Reports Server (NTRS)

    Lee, Chulhee; Landgrebe, David A.

    1991-01-01

    Feature selection has been one of the most important topics in pattern recognition. Although many authors have studied feature selection for parametric classifiers, few algorithms are available for feature selection for nonparametric classifiers. In this paper we propose a new feature selection algorithm based on decision boundaries for nonparametric classifiers. We first note that feature selection for pattern recognition is equivalent to retaining 'discriminantly informative features', and a discriminantly informative feature is related to the decision boundary. A procedure to extract discriminantly informative features based on a decision boundary for nonparametric classification is proposed. Experiments show that the proposed algorithm finds effective features for the nonparametric classifier with Parzen density estimation.

  18. EMF wire code research

    SciTech Connect

    Jones, T.

    1993-11-01

    This paper examines the results of previous wire code research to determine the relationship among childhood cancer, wire codes, and electromagnetic fields. The paper suggests that, in the original Savitz study, biases toward producing a false positive association between high wire codes and childhood cancer were created by the selection procedure.

  19. Universal Noiseless Coding Subroutines

    NASA Technical Reports Server (NTRS)

    Schlutsmeyer, A. P.; Rice, R. F.

    1986-01-01

    Software package consists of FORTRAN subroutines that perform universal noiseless coding and decoding of integer and binary data strings. Purpose of this type of coding to achieve data compression in sense that coded data represents original data perfectly (noiselessly) while taking fewer bits to do so. Routines universal because they apply to virtually any "real-world" data source.
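
    As an illustration of what lossless ("noiseless") coding of integers looks like, the sketch below implements a basic Golomb-Rice code with a fixed parameter k. It is a simplified assumption-laden example, not the interface or the adaptive code-selection logic of the FORTRAN package described above; the function names and the bit-string representation are chosen only for readability.

```python
def rice_encode(values, k):
    """Encode non-negative integers as a bit string with Rice parameter k.

    Each value n is split into a quotient (n >> k), written in unary as
    that many 1s followed by a 0, and a k-bit binary remainder.
    """
    pieces = []
    for n in values:
        if n < 0:
            raise ValueError("Rice coding here assumes non-negative integers")
        quotient, remainder = n >> k, n & ((1 << k) - 1)
        pieces.append("1" * quotient + "0")
        if k:
            pieces.append(format(remainder, f"0{k}b"))
    return "".join(pieces)


def rice_decode(bits, k, count):
    """Recover `count` integers from a bit string produced by rice_encode."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":   # read the unary quotient
            quotient += 1
            pos += 1
        pos += 1                  # skip the terminating 0
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append((quotient << k) | remainder)
    return values


data = [0, 3, 5, 2, 9]
encoded = rice_encode(data, k=2)
assert rice_decode(encoded, k=2, count=len(data)) == data  # lossless round trip
print(len(encoded), "bits for", len(data), "values")
```

    Small values dominate the cost when k is well matched to the data, which is why practical coders of this family choose k (or a whole code option) per block rather than fixing it in advance.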

  20. Genetic coding and gene expression - new Quadruplet genetic coding model

    NASA Astrophysics Data System (ADS)

    Shankar Singh, Rama

    2012-07-01

    Successful demonstration of the human genome project has opened the door not only to developing personalized medicine and cures for genetic diseases, but may also answer the complex and difficult question of the origin of life. It may also make the 21st century a century of biological sciences. Based on the central dogma of biology, genetic codons in conjunction with tRNA play a key role in translating the RNA bases into a sequence of amino acids leading to a synthesized protein. This is the most critical step in synthesizing the right protein needed for personalized medicine and curing genetic diseases. So far, only triplet codons involving three bases of RNA, transcribed from DNA bases, have been used. Since this approach has several inconsistencies and limitations, even the promise of personalized medicine has not been realized. The new Quadruplet genetic coding model proposed and developed here involves all four RNA bases, which in conjunction with tRNA will synthesize the right protein. The transcription and translation process used will be the same, but the Quadruplet codons will help overcome most of the inconsistencies and limitations of the triplet codes. Details of this new Quadruplet genetic coding model and its subsequent potential applications, including relevance to the origin of life, will be presented.

  1. Mapping Local Codes to Read Codes.

    PubMed

    Bonney, Wilfred; Galloway, James; Hall, Christopher; Ghattas, Mikhail; Tramma, Leandro; Nind, Thomas; Donnelly, Louise; Jefferson, Emily; Doney, Alexander

    2017-01-01

    Background & Objectives: Legacy laboratory test codes make it difficult to use clinical datasets for meaningful translational research, where populations are followed for disease risk and outcomes over many years. The Health Informatics Centre (HIC) at the University of Dundee hosts continuous biochemistry data from the clinical laboratories in Tayside and Fife dating back as far as 1987. However, the HIC-managed biochemistry dataset is coupled with incoherent sample types and unstandardised legacy local test codes, which increases the complexity of using the dataset for reasonable population health outcomes. The objective of this study was to map the legacy local test codes to the Scottish 5-byte Version 2 Read Codes using biochemistry data extracted from the repository of the Scottish Care Information (SCI) Store.

  2. Molecular classifiers for gastric cancer and nonmalignant diseases of the gastric mucosa.

    PubMed

    Meireles, Sibele I; Cristo, Elier B; Carvalho, Alex F; Hirata, Roberto; Pelosof, Adriane; Gomes, Luciana I; Martins, Waleska K; Begnami, Maria D; Zitron, Cláudia; Montagnini, André L; Soares, Fernando A; Neves, E Jordão; Reis, Luiz F L

    2004-02-15

    High incidence of gastric cancer-related death is mainly due to diagnosis at an advanced stage in addition to the lack of adequate neoadjuvant therapy. Hence, new tools aimed at early diagnosis would have a positive impact in the outcome of the disease. Using cDNA arrays having 376 genes either identified previously as altered in gastric tumors or known to be altered in human cancer, we determined expression signature of 99 tissue fragments representing normal gastric mucosa, gastritis, intestinal metaplasia, and adenocarcinomas. We first validated the array by identifying molecular markers that are associated with intestinal metaplasia, considered as a transition stage of gastric adenocarcinomas of the intestinal type as well as markers that are associated with diffuse type of gastric adenocarcinomas. Next, we applied Fisher's linear discriminant analysis in an exhaustive search of trios of genes that could be used to build classifiers for class distinction. Many classifiers could distinguish between normal and tumor samples, whereas, for the distinction of gastritis from tumor and for metaplasia from tumor, fewer classifiers were identified. Statistical validations showed that trios that discriminate between normal and tumor samples are powerful classifiers to distinguish between tumor and nontumor samples. More relevant, it was possible to identify samples of intestinal metaplasia that have expression signature resembling that of an adenocarcinoma and can now be used for follow-up of patients to determine their potential as a prognostic test for malignant transformation.

  3. Software Certification - Coding, Code, and Coders

    NASA Technical Reports Server (NTRS)

    Havelund, Klaus; Holzmann, Gerard J.

    2011-01-01

    We describe a certification approach for software development that has been adopted at our organization. JPL develops robotic spacecraft for the exploration of the solar system. The flight software that controls these spacecraft is considered to be mission critical. We argue that the goal of a software certification process cannot be the development of "perfect" software, i.e., software that can be formally proven to be correct under all imaginable and unimaginable circumstances. More realistically, the goal is to guarantee a software development process that is conducted by knowledgeable engineers, who follow generally accepted procedures to control known risks, while meeting agreed upon standards of workmanship. We target three specific issues that must be addressed in such a certification procedure: the coding process, the code that is developed, and the skills of the coders. The coding process is driven by standards (e.g., a coding standard) and tools. The code is mechanically checked against the standard with the help of state-of-the-art static source code analyzers. The coders, finally, are certified in on-site training courses that include formal exams.

  4. Gene and genon concept: coding versus regulation

    PubMed Central

    2007-01-01

    We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. Rather, it is assembled by RNA processing, including differential splicing, from various

  5. Transcription of mitochondrial DNA.

    PubMed

    Tabak, H F; Grivell, L A; Borst, P

    1983-01-01

    While mitochondrial DNA (mtDNA) is the simplest DNA in nature, coding for rRNAs and tRNAs, results of DNA sequence and transcript analysis have demonstrated that both the synthesis and processing of mitochondrial RNAs involve remarkably intricate events. At one extreme, genes in animal mtDNAs are tightly packed, both DNA strands are completely transcribed (symmetric transcription), and the appearance of specific mRNAs is entirely dependent on processing at sites signalled by the sequences of the tRNAs, which abut virtually every gene. At the other extreme, gene organization in yeast (Saccharomyces) is anything but compact, with long stretches of AT-rich DNA interspaced between coding sequences and no obvious logic to the order of genes. Transcription is asymmetric and several RNAs are initiated de novo. Nevertheless, extensive RNA processing occurs, due largely to the presence of split genes. RNA splicing is complex, is controlled by both mitochondrial and nuclear genes, and in some cases is accompanied by the formation of RNAs that behave as covalently closed circles. The present article reviews current knowledge of mitochondrial transcription and RNA processing in relation to possible mechanisms for the regulation of mitochondrial gene expression.

  6. Multi-input distributed classifiers for synthetic genetic circuits.

    PubMed

    Kanakov, Oleg; Kotelnikov, Roman; Alsaedi, Ahmed; Tsimring, Lev; Huerta, Ramón; Zaikin, Alexey; Ivanchenko, Mikhail

    2015-01-01

    For practical construction of complex synthetic genetic networks able to perform elaborate functions it is important to have a pool of relatively simple modules with different functionality which can be compounded together. To complement engineering of very different existing synthetic genetic devices such as switches, oscillators or logical gates, we propose and develop here a design of synthetic multi-input classifier based on a recently introduced distributed classifier concept. A heterogeneous population of cells acts as a single classifier, whose output is obtained by summarizing the outputs of individual cells. The learning ability is achieved by pruning the population, instead of tuning parameters of an individual cell. The present paper is focused on evaluating two possible schemes of multi-input gene classifier circuits. We demonstrate their suitability for implementing a multi-input distributed classifier capable of separating data which are inseparable for single-input classifiers, and characterize performance of the classifiers by analytical and numerical results. The simpler scheme implements a linear classifier in a single cell and is targeted at separable classification problems with simple class borders. A hard learning strategy is used to train a distributed classifier by removing from the population any cell answering incorrectly to at least one training example. The other scheme implements a circuit with a bell-shaped response in a single cell to allow potentially arbitrary shape of the classification border in the input space of a distributed classifier. Inseparable classification problems are addressed using soft learning strategy, characterized by probabilistic decision to keep or discard a cell at each training iteration. 
We expect that our classifier design contributes to the development of robust and predictable synthetic biosensors, which have the potential to affect applications in many fields, including medicine and industry.
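
    The hard learning strategy described in this abstract can be illustrated with a small numerical sketch. It is not the authors' circuit model: the "cells" here are hypothetical two-input linear threshold units with randomly drawn parameters, and the names `make_cell`, `hard_prune`, and `ensemble_output` are invented for illustration. The sketch only shows the idea that learning happens by pruning the population rather than by tuning any single cell.

```python
import random

def make_cell(rng):
    """Hypothetical cell: a linear threshold unit with random weights and threshold."""
    return (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))

def cell_output(cell, x):
    """Return 1 if the weighted input exceeds the cell's threshold, else 0."""
    w1, w2, theta = cell
    return 1 if w1 * x[0] + w2 * x[1] > theta else 0

def hard_prune(population, training_set):
    """Hard learning: discard any cell that answers at least one example incorrectly."""
    return [cell for cell in population
            if all(cell_output(cell, x) == label for x, label in training_set)]

def ensemble_output(population, x):
    """Output of the distributed classifier: the sum of individual cell outputs."""
    return sum(cell_output(cell, x) for cell in population)

rng = random.Random(0)
population = [make_cell(rng) for _ in range(5000)]
training_set = [((0.9, 0.8), 1), ((0.7, 0.9), 1), ((0.1, 0.2), 0), ((0.2, 0.1), 0)]
survivors = hard_prune(population, training_set)
print(len(survivors), "of", len(population), "cells survive pruning")
```

    After pruning, every surviving cell agrees with all training examples, so the summed output is maximal on positive training points and zero on negative ones. The soft learning strategy mentioned in the abstract would replace the deterministic filter with a probabilistic keep-or-discard decision.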

  7. The EB Factory Project. I. A Fast, Neural-net-based, General Purpose Light Curve Classifier Optimized for Eclipsing Binaries

    NASA Astrophysics Data System (ADS)

    Paegert, Martin; Stassun, Keivan G.; Burger, Dan M.

    2014-08-01

    We describe a new neural-net-based light curve classifier and provide it with documentation as a ready-to-use tool for the community. While optimized for identification and classification of eclipsing binary stars, the classifier is general purpose, and has been developed for speed in the context of upcoming massive surveys such as the Large Synoptic Survey Telescope. A challenge for classifiers in the context of neural-net training and massive data sets is to minimize the number of parameters required to describe each light curve. We show that a simple and fast geometric representation that encodes the overall light curve shape, together with a chi-square parameter to capture higher-order morphology information results in efficient yet robust light curve classification, especially for eclipsing binaries. Testing the classifier on the ASAS light curve database, we achieve a retrieval rate of 98% and a false-positive rate of 2% for eclipsing binaries. We achieve similarly high retrieval rates for most other periodic variable-star classes, including RR Lyrae, Mira, and delta Scuti. However, the classifier currently has difficulty discriminating between different sub-classes of eclipsing binaries, and suffers a relatively low (~60%) retrieval rate for multi-mode delta Cepheid stars. We find that it is imperative to train the classifier's neural network with exemplars that include the full range of light curve quality to which the classifier will be expected to perform; the classifier performs well on noisy light curves only when trained with noisy exemplars. The classifier source code, ancillary programs, a trained neural net, and a guide for use, are provided.

  8. The EB factory project. I. A fast, neural-net-based, general purpose light curve classifier optimized for eclipsing binaries

    SciTech Connect

    Paegert, Martin; Stassun, Keivan G.; Burger, Dan M.

    2014-08-01

    We describe a new neural-net-based light curve classifier and provide it with documentation as a ready-to-use tool for the community. While optimized for identification and classification of eclipsing binary stars, the classifier is general purpose, and has been developed for speed in the context of upcoming massive surveys such as the Large Synoptic Survey Telescope. A challenge for classifiers in the context of neural-net training and massive data sets is to minimize the number of parameters required to describe each light curve. We show that a simple and fast geometric representation that encodes the overall light curve shape, together with a chi-square parameter to capture higher-order morphology information results in efficient yet robust light curve classification, especially for eclipsing binaries. Testing the classifier on the ASAS light curve database, we achieve a retrieval rate of 98% and a false-positive rate of 2% for eclipsing binaries. We achieve similarly high retrieval rates for most other periodic variable-star classes, including RR Lyrae, Mira, and delta Scuti. However, the classifier currently has difficulty discriminating between different sub-classes of eclipsing binaries, and suffers a relatively low (∼60%) retrieval rate for multi-mode delta Cepheid stars. We find that it is imperative to train the classifier's neural network with exemplars that include the full range of light curve quality to which the classifier will be expected to perform; the classifier performs well on noisy light curves only when trained with noisy exemplars. The classifier source code, ancillary programs, a trained neural net, and a guide for use, are provided.

  9. XSOR codes users manual

    SciTech Connect

    Jow, Hong-Nian; Murfin, W.B.; Johnson, J.D.

    1993-11-01

    This report describes the source term estimation codes, XSORs. The codes are written for three pressurized water reactors (Surry, Sequoyah, and Zion) and two boiling water reactors (Peach Bottom and Grand Gulf). The ensemble of codes has been named "XSOR". The purpose of XSOR codes is to estimate the source terms which would be released to the atmosphere in severe accidents. A source term includes the release fractions of several radionuclide groups, the timing and duration of releases, the rates of energy release, and the elevation of releases. The codes have been developed by Sandia National Laboratories for the US Nuclear Regulatory Commission (NRC) in support of the NUREG-1150 program. The XSOR codes are fast running parametric codes and are used as surrogates for detailed mechanistic codes. The XSOR codes also provide the capability to explore the phenomena and their uncertainty which are not currently modeled by the mechanistic codes. The uncertainty distributions of input parameters may be used by an XSOR code to estimate the uncertainty of source terms.

  10. Remote-Handled Transuranic Content Codes

    SciTech Connect

    Washington TRU Solutions

    2001-08-01

    The Remote-Handled Transuranic (RH-TRU) Content Codes (RH-TRUCON) document represents the development of a uniform content code system for RH-TRU waste to be transported in the 72-B cask. It will be used to convert existing waste form numbers, content codes, and site-specific identification codes into a system that is uniform across the U.S. Department of Energy (DOE) sites. The existing waste codes at the sites can be grouped under uniform content codes without any loss of waste characterization information. The RH-TRUCON document provides an all-encompassing description for each content code and compiles this information for all DOE sites. Compliance with waste generation, processing, and certification procedures at the sites (outlined in this document for each content code) ensures that prohibited waste forms are not present in the waste. The content code gives an overall description of the RH-TRU waste material in terms of processes and packaging, as well as the generation location. This helps to provide cradle-to-grave traceability of the waste material so that the various actions required to assess its qualification as payload for the 72-B cask can be performed. The content codes also impose restrictions and requirements on the manner in which a payload can be assembled. The RH-TRU Waste Authorized Methods for Payload Control (RH-TRAMPAC), Appendix 1.3.7 of the 72-B Cask Safety Analysis Report (SAR), describes the current governing procedures applicable for the qualification of waste as payload for the 72-B cask. The logic for this classification is presented in the 72-B Cask SAR. Together, these documents (RH-TRUCON, RH-TRAMPAC, and relevant sections of the 72-B Cask SAR) present the foundation and justification for classifying RH-TRU waste into content codes. Only content codes described in this document can be considered for transport in the 72-B cask. Revisions to this document will be made as additional waste qualifies for transport. Each content code uniquely

  11. DLLExternalCode

    SciTech Connect

    Greg Flach, Frank Smith

    2014-05-14

    DLLExternalCode is a general dynamic-link library (DLL) interface for linking GoldSim (www.goldsim.com) with external codes. The overall concept is to use GoldSim as top-level modeling software with interfaces to external codes for specific calculations. The DLLExternalCode DLL that performs the linking function is designed to take a list of code inputs from GoldSim, create an input file for the external application, run the external code, and return a list of outputs, read from files created by the external application, back to GoldSim. Instructions for creating the input file, running the external code, and reading the output are contained in an instructions file that is read and interpreted by the DLL.
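
    The write-inputs, run, read-outputs cycle described above can be sketched in a few lines. This is not the DLLExternalCode implementation, which is a DLL driven by an instructions file, and it does not use any GoldSim API. The function `run_external_code`, the file names `model.in` and `model.out`, and the stand-in "external code" (a one-line Python script that doubles each value) are all assumptions made for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_external_code(inputs, command, workdir):
    """Write inputs to a file, run an external program, and read its outputs back.

    `command` is the external program's argument list; the input and output
    file paths are appended to it (a hypothetical calling convention).
    """
    workdir = Path(workdir)
    in_path = workdir / "model.in"    # hypothetical input file name
    out_path = workdir / "model.out"  # hypothetical output file name
    in_path.write_text(" ".join(str(value) for value in inputs))
    subprocess.run(command + [str(in_path), str(out_path)], check=True)
    return [float(token) for token in out_path.read_text().split()]

# Stand-in for an external code: reads numbers, writes each one doubled.
DOUBLER = (
    "import sys; from pathlib import Path; "
    "vals = Path(sys.argv[1]).read_text().split(); "
    "Path(sys.argv[2]).write_text(' '.join(str(2 * float(v)) for v in vals))"
)

with tempfile.TemporaryDirectory() as tmp:
    outputs = run_external_code([1.0, 2.5], [sys.executable, "-c", DOUBLER], tmp)

print(outputs)
```

    Using `check=True` makes a failed external run raise an exception instead of silently returning stale output files, which is one reasonable design choice for a wrapper of this kind.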

  12. Defeating the coding monsters.

    PubMed

    Colt, Ross

    2007-02-01

    Accuracy in coding is rapidly becoming a required skill for military health care providers. Clinic staffing, equipment purchase decisions, and even reimbursement will soon be based on the coding data that we provide. Learning the complicated myriad of rules to code accurately can seem overwhelming. However, the majority of clinic visits in a typical outpatient clinic generally fall into two major evaluation and management codes, 99213 and 99214. If health care providers can learn the rules required to code a 99214 visit, then this will provide a 90% solution that can enable them to accurately code the majority of their clinic visits. This article demonstrates a step-by-step method to code a 99214 visit, by viewing each of the three requirements as a monster to be defeated.

  13. Named entity recognition and classification in biomedical text using classifier ensemble.

    PubMed

    Saha, Sriparna; Ekbal, Asif; Sikdar, Utpal Kumar

    2015-01-01

    Named Entity Recognition and Classification (NERC) is an important task in information extraction for the biomedicine domain. Biomedical Named Entities include mentions of proteins, genes, DNA, RNA, etc. which, in general, have complex structures and are difficult to recognise. In this paper, we propose a Single Objective Optimisation based classifier ensemble technique using the search capability of Genetic Algorithm (GA) for NERC in biomedical texts. Here, GA is used to quantify the amount of voting for each class in each classifier. We use diverse classification methods like Conditional Random Field and Support Vector Machine to build a number of models depending upon the various representations of the set of features and/or feature templates. The proposed technique is evaluated with two benchmark datasets, namely JNLPBA 2004 and GENETAG. Experiments yield overall F-measure values of 75.97% and 95.90%, respectively. Comparisons with the existing systems show that our proposed system achieves state-of-the-art performance.
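
    The phrase "amount of voting for each class in each classifier" can be read as a per-classifier, per-class weight table used in weighted voting. The sketch below shows only that combination step, with hand-picked weights; in the paper the weights are found by a genetic algorithm, which is not reproduced here. The classifier names, class labels, and weight values are hypothetical.

```python
def weighted_vote(predictions, weights, classes):
    """Combine classifier outputs using per-classifier, per-class vote weights.

    predictions: {classifier_name: predicted_class}
    weights:     {classifier_name: {class_label: vote_weight}}
    """
    scores = {label: 0.0 for label in classes}
    for name, label in predictions.items():
        scores[label] += weights[name][label]
    # The class with the largest accumulated vote wins.
    return max(classes, key=lambda label: scores[label])

classes = ["protein", "DNA"]
weights = {  # hypothetical values; a GA would search for these
    "crf":   {"protein": 0.9, "DNA": 0.3},
    "svm_a": {"protein": 0.5, "DNA": 0.4},
    "svm_b": {"protein": 0.5, "DNA": 0.4},
}
predictions = {"crf": "protein", "svm_a": "DNA", "svm_b": "DNA"}
winner = weighted_vote(predictions, weights, classes)
print(winner)
```

    In this example a plain majority vote would choose "DNA", but the weighted vote chooses "protein" because the one classifier predicting it carries a weight of 0.9 for that class against a combined 0.8 for "DNA".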

  14. Mining Mammalian Transcript Data for Functional Long Non-Coding RNAs

    PubMed Central

    Khachane, Amit N.; Harrison, Paul M.

    2010-01-01

    Background: The role of long non-coding RNAs (lncRNAs) in controlling gene expression has garnered increased interest in recent years. Sequencing projects, such as Fantom3 for mouse and H-InvDB for human, have generated abundant data on transcribed components of mammalian cells, the majority of which appear not to be protein-coding. However, much of the non-protein-coding transcriptome could merely be a consequence of ‘transcription noise’. It is therefore essential to use bioinformatic approaches to identify the likely functional candidates in a high-throughput manner. Principal Findings: We derived a scheme for classifying and annotating likely functional lncRNAs in mammals. Using the available experimental full-length cDNA data sets for human and mouse, we identified 78 lncRNAs that are either syntenically conserved between human and mouse, or that originate from the same protein-coding genes. Of these, 11 have significant sequence homology. We found that these lncRNAs exhibit: (i) patterns of codon substitution typical of non-coding transcripts; (ii) preservation of sequences in distant mammals such as dog and cow; (iii) significant sequence conservation relative to their corresponding flanking regions (in 50% of cases, flanking regions do not have homology at all; and in the remaining cases, the degree of conservation is significantly less); (iv) existence mostly as single-exon forms (8/11); and (v) presence of conserved and stable secondary structure motifs within them. We further identified orthologous protein-coding genes that are contributing to the pool of lncRNAs, among which genes implicated in carcinogenesis are significantly over-represented. Conclusion: Our comparative mammalian genomics approach coupled with evolutionary analysis identified a small population of conserved long non-protein-coding RNAs (lncRNAs) that are potentially functional across Mammalia. Additionally, our analysis indicates that amongst the orthologous protein-coding genes that produce

  15. Inductive Selectivity in Children's Cross-Classified Concepts

    ERIC Educational Resources Information Center

    Nguyen, Simone P.

    2012-01-01

    Cross-classified items pose an interesting challenge to children's induction as these items belong to many different categories, each of which may serve as a basis for a different type of inference. Inductive selectivity is the ability to appropriately make different types of inferences about a single cross-classifiable item based on its different…

  16. 21 CFR 1402.4 - Information classified by another agency.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 21 Food and Drugs 9 2013-04-01 2013-04-01 false Information classified by another agency. 1402.4 Section 1402.4 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY MANDATORY DECLASSIFICATION REVIEW § 1402.4 Information classified by another agency. When a request is received for information that...

  17. 21 CFR 1402.4 - Information classified by another agency.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 21 Food and Drugs 9 2012-04-01 2012-04-01 false Information classified by another agency. 1402.4 Section 1402.4 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY MANDATORY DECLASSIFICATION REVIEW § 1402.4 Information classified by another agency. When a request is received for information that...

  18. 21 CFR 1402.4 - Information classified by another agency.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 9 2011-04-01 2011-04-01 false Information classified by another agency. 1402.4 Section 1402.4 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY MANDATORY DECLASSIFICATION REVIEW § 1402.4 Information classified by another agency. When a request is received for information that...

  19. 21 CFR 1402.4 - Information classified by another agency.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 21 Food and Drugs 9 2014-04-01 2014-04-01 false Information classified by another agency. 1402.4 Section 1402.4 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY MANDATORY DECLASSIFICATION REVIEW § 1402.4 Information classified by another agency. When a request is received for information that...

  20. Verb-raising and Numeral Classifiers in Japanese: Incompatible Bedfellows.

    ERIC Educational Resources Information Center

    Fukushima, Kazuhiko

    2003-01-01

    Examines verb raising in Japanese and looks at Koizumi's (2000) evidence for verb-raising based on data involving, among other things, numeral classifiers. Demonstrates that Koizumi's evidence based on numeral classifiers does not support his claim that verb-raising occurs in Japanese. (Author/VWL)

  1. 41 CFR 105-62.102 - Authority to originally classify.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... originally classify. (a) Top secret, secret, and confidential. The authority to originally classify information as Top Secret, Secret, or Confidential may be exercised only by the Administrator and is delegable only to the Director, Information Security Oversight Office. (b) Limitations on delegation...

  2. Consistency results for the ROC curves of fused classifiers

    NASA Astrophysics Data System (ADS)

    Bjerkaas, Kristopher S.; Oxley, Mark E.; Bauer, Kenneth W., Jr.

    2004-08-01

    The U.S. Air Force is researching the fusion of multiple sensors and classifiers. Given a finite collection of classifiers to be fused one seeks a new classifier with improved performance. An established performance quantifier is the Receiver Operating Characteristic (ROC) curve. This curve allows one to view the probability of detection versus probability of false alarm in one graph. In reality only finite data is available so only an approximate ROC curve can be constructed. Previous research shows that one does not have to perform an experiment for this new fused classifier to determine its ROC curve. If the ROC curve for each individual classifier has been determined, then formulas for the ROC curve of the fused classifier exist for certain fusion rules. This will be an enormous saving in time and money since the performance of many fused classifiers will be determined without having to perform tests on each one. But, again, these will be approximate ROC curves, since they are based on finite data. We show that if the individual approximate ROC curves are consistent then the approximate ROC curve for the fused classifier is also consistent under certain circumstances. We give the details for these circumstances, as well as some examples related to sensor fusion.
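
    As an illustration of how a fused operating point can be computed from individual ones without a new experiment, the sketch below uses two textbook fusion rules, logical AND and logical OR, under the assumption that the two classifiers' decisions are statistically independent. The abstract does not state which fusion rules or assumptions the authors use, so these formulas are an assumed example rather than the paper's results. An operating point is written as a (probability of false alarm, probability of detection) pair.

```python
def fuse_and(point_a, point_b):
    """AND rule: declare a detection only if both classifiers do (independence assumed)."""
    pfa_a, pd_a = point_a
    pfa_b, pd_b = point_b
    return (pfa_a * pfa_b, pd_a * pd_b)

def fuse_or(point_a, point_b):
    """OR rule: declare a detection if either classifier does (independence assumed)."""
    pfa_a, pd_a = point_a
    pfa_b, pd_b = point_b
    return (1 - (1 - pfa_a) * (1 - pfa_b), 1 - (1 - pd_a) * (1 - pd_b))

# Hypothetical operating points taken from two individual ROC curves.
classifier_a = (0.10, 0.80)
classifier_b = (0.20, 0.90)

and_point = fuse_and(classifier_a, classifier_b)  # approx (0.02, 0.72)
or_point = fuse_or(classifier_a, classifier_b)    # approx (0.28, 0.98)
print(and_point, or_point)
```

    Sweeping both classifiers' thresholds and applying a rule to every pair of operating points traces out a region whose upper boundary approximates the fused ROC curve. Because the input points come from finite data, the fused points are approximate as well, which is the consistency question the abstract addresses.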

  3. Increasing Children's ASL Classifier Production: A Multicomponent Intervention

    ERIC Educational Resources Information Center

    Beal-Alvarez, Jennifer S.; Easterbrooks, Susan R.

    2013-01-01

    The authors examined classifier production during narrative retells by 10 deaf and hard of hearing students in grades 2-4 at a day school for the deaf following a 6-week intervention of repeated viewings of stories in American Sign Language (ASL) paired with scripted teacher mediation. Classifier production, documented through a…

  4. 45 CFR 601.8 - Access to classified materials.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ....8 Public Welfare Regulations Relating to Public Welfare (Continued) NATIONAL SCIENCE FOUNDATION CLASSIFICATION AND DECLASSIFICATION OF NATIONAL SECURITY INFORMATION § 601.8 Access to classified materials. No person may be given access to classified information unless that person has been determined to...

  5. 18 CFR 367.18 - Criteria for classifying leases.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 18 Conservation of Power and Water Resources 1 2013-04-01 2013-04-01 false Criteria for classifying leases. 367.18 Section 367.18 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... ACT General Instructions § 367.18 Criteria for classifying leases. (a) If, at its inception, a...

  6. 18 CFR 367.18 - Criteria for classifying leases.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 18 Conservation of Power and Water Resources 1 2011-04-01 2011-04-01 false Criteria for classifying leases. 367.18 Section 367.18 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... ACT General Instructions § 367.18 Criteria for classifying leases. (a) If, at its inception, a...

  7. 18 CFR 367.18 - Criteria for classifying leases.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 18 Conservation of Power and Water Resources 1 2012-04-01 2012-04-01 false Criteria for classifying leases. 367.18 Section 367.18 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... ACT General Instructions § 367.18 Criteria for classifying leases. (a) If, at its inception, a...

  8. 18 CFR 367.18 - Criteria for classifying leases.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 18 Conservation of Power and Water Resources 1 2014-04-01 2014-04-01 false Criteria for classifying leases. 367.18 Section 367.18 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... ACT General Instructions § 367.18 Criteria for classifying leases. (a) If, at its inception, a...

  9. Fisher classifier and its probability of error estimation

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.
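
    For orientation, the sketch below estimates the leave-one-out error of a two-class Fisher classifier by brute force, refitting the classifier once per held-out sample. It does not implement the report's computationally efficient expressions, and it uses a simple midpoint threshold between the projected class means rather than the optimal threshold the report derives. The synthetic data and function names are assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Fisher's direction w = Sw^{-1} (m1 - m0) and a midpoint threshold."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, pooled over both classes.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = 0.5 * (w @ m0 + w @ m1)  # midpoint; an assumption, not the optimal threshold
    return w, threshold

def loo_error(X, y):
    """Brute-force leave-one-out error: refit without sample i, then classify it."""
    errors = 0
    for i in range(len(y)):
        keep = np.ones(len(y), dtype=bool)
        keep[i] = False
        X_train, y_train = X[keep], y[keep]
        w, threshold = fisher_direction(X_train[y_train == 0], X_train[y_train == 1])
        predicted = int(w @ X[i] > threshold)
        errors += int(predicted != y[i])
    return errors / len(y)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(4.0, 1.0, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
err = loo_error(X, y)
print("leave-one-out error estimate:", err)
```

    The brute-force loop costs one linear solve per sample, which is the expense that closed-form leave-one-out expressions are meant to avoid.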

  10. 10 CFR 820.12 - Classified, confidential, and controlled information

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 Energy 4 2012-01-01 2012-01-01 false Classified, confidential, and controlled information 820.12 Section 820.12 Energy DEPARTMENT OF ENERGY PROCEDURAL RULES FOR DOE NUCLEAR ACTIVITIES General § 820.12 Classified, confidential, and controlled information (a) General rule. The DOE Official...

  11. A Proposed Methodology to Classify Frontier Capital Markets

    DTIC Science & Technology

    2011-07-31

    Technical Report 11-003 A Proposed Methodology to Classify Frontier Capital Markets Daniel Evans Margaret Moten U.S... Through large-scale quasi-experiments, we are modeling how Frontier Markets succeed or fail. This research will provide quantitative analysis to senior...

  12. 16 CFR 1610.4 - Requirements for classifying textiles.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... REGULATIONS STANDARD FOR THE FLAMMABILITY OF CLOTHING TEXTILES The Standard § 1610.4 Requirements for classifying textiles. (a) Class 1, Normal Flammability. Class 1 textiles exhibit normal flammability and are...), when tested as described in § 1610.6 shall be classified as Class 1, Normal flammability, when the...

  13. 16 CFR 1610.4 - Requirements for classifying textiles.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... REGULATIONS STANDARD FOR THE FLAMMABILITY OF CLOTHING TEXTILES The Standard § 1610.4 Requirements for classifying textiles. (a) Class 1, Normal Flammability. Class 1 textiles exhibit normal flammability and are...), when tested as described in § 1610.6 shall be classified as Class 1, Normal flammability, when the...

  14. 10 CFR 1016.24 - Special handling of classified material.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Special handling of classified material. 1016.24 Section... Security § 1016.24 Special handling of classified material. When the Restricted Data contained in material is not ascertainable by observation or examination at the place where the material is located...

  15. Hunt for Federal Funds Gives Classified Research a Lift

    ERIC Educational Resources Information Center

    Basken, Paul

    2012-01-01

    For some colleges and professors, classified research promises prestige and money. Powerhouses like the Massachusetts Institute of Technology and the Johns Hopkins University have for decades run large classified laboratories. But most other universities either do not allow such research or conduct it quietly, and in small doses. The…

  16. Faster P300 Classifier Training Using Spatiotemporal Beamforming.

    PubMed

    Wittevrongel, Benjamin; Van Hulle, Marc M

    2016-05-01

    The linearly-constrained minimum-variance (LCMV) beamformer is traditionally used as a spatial filter for source localization, but here we consider its spatiotemporal extension for P300 classification. We compare two variants and show that the spatiotemporal LCMV beamformer is on par with state-of-the-art P300 classifiers, but several orders of magnitude faster to train.
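
    The standard LCMV solution minimizes the filter's output variance subject to a unit response to a chosen pattern a, which gives the weights w = C^{-1} a / (a^T C^{-1} a) for data covariance C. The sketch below applies that formula to epochs flattened into channel-by-time vectors, which is one way to read "spatiotemporal extension". It is not the authors' pipeline: the template, the noise-only covariance estimate, the dimensions, and the synthetic data are assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def lcmv_weights(cov, template):
    """LCMV weights: minimize w^T C w subject to w^T a = 1."""
    cinv_a = np.linalg.solve(cov, template)  # C^{-1} a without forming the inverse
    return cinv_a / (template @ cinv_a)

rng = np.random.default_rng(1)
n_channels, n_samples, n_epochs = 4, 5, 200
dim = n_channels * n_samples                 # each epoch flattened to one vector

template = rng.normal(size=dim)              # hypothetical P300 activation pattern
noise = rng.normal(size=(n_epochs, dim))     # hypothetical background activity
cov = np.cov(noise, rowvar=False)            # covariance over flattened epochs

w = lcmv_weights(cov, template)

# Score one target-like epoch and one non-target epoch with the same filter.
target_score = w @ (template + noise[0])
nontarget_score = w @ noise[0]
print(target_score, nontarget_score)
```

    Training here amounts to one covariance estimate and one linear solve, which gives a sense of why a beamformer of this kind can be fast to train. By construction the filter returns exactly 1 for the template, so the two scores above differ by 1.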

  17. Deep learning classifier based on NPCA and orthogonal feature selection

    NASA Astrophysics Data System (ADS)

    Jankowski, Stanisław; Szymański, Zbigniew; Dziomin, Uladzimir; Golovko, Vladimir; Barcz, Aleksy

    2016-09-01

    This paper develops the idea of a deep learning classifier. The effectiveness of a discriminative classifier, such as a multilayer perceptron or a support vector machine, can be improved by adding data preprocessing blocks: orthogonal feature selection (the Gram-Schmidt method) and nonlinear principal component analysis. We present a case study of various structures of deep learning systems (scenarios).

  18. 21 CFR 1402.4 - Information classified by another agency.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 9 2010-04-01 2010-04-01 false Information classified by another agency. 1402.4 Section 1402.4 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY MANDATORY DECLASSIFICATION REVIEW § 1402.4 Information classified by another agency. When a request is received for information that...

  19. 32 CFR 2001.55 - Foreign disclosure of classified information.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    .... 2001.55 Section 2001.55 National Defense Other Regulations Relating to National Defense INFORMATION... INFORMATION Safeguarding § 2001.55 Foreign disclosure of classified information. Classified information... accordance with § 2001.24(j). With respect to the Intelligence Community, the Director of...

  20. 32 CFR 2001.55 - Foreign disclosure of classified information.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    .... 2001.55 Section 2001.55 National Defense Other Regulations Relating to National Defense INFORMATION... INFORMATION Safeguarding § 2001.55 Foreign disclosure of classified information. Classified information... accordance with § 2001.24(j). With respect to the Intelligence Community, the Director of...

  1. 32 CFR 2001.55 - Foreign disclosure of classified information.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    .... 2001.55 Section 2001.55 National Defense Other Regulations Relating to National Defense INFORMATION... INFORMATION Safeguarding § 2001.55 Foreign disclosure of classified information. Classified information... accordance with § 2001.24(j). With respect to the Intelligence Community, the Director of...

  2. 32 CFR 2001.55 - Foreign disclosure of classified information.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    .... 2001.55 Section 2001.55 National Defense Other Regulations Relating to National Defense INFORMATION... INFORMATION Safeguarding § 2001.55 Foreign disclosure of classified information. Classified information... accordance with § 2001.24(j). With respect to the Intelligence Community, the Director of...

  3. 25 CFR 304.3 - Classifying and marking of silver.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 25 Indians 2 2012-04-01 2012-04-01 false Classifying and marking of silver. 304.3 Section 304.3 Indians INDIAN ARTS AND CRAFTS BOARD, DEPARTMENT OF THE INTERIOR NAVAJO, PUEBLO, AND HOPI SILVER, USE OF GOVERNMENT MARK § 304.3 Classifying and marking of silver. For the present the Indian Arts and Crafts...

  4. 25 CFR 304.3 - Classifying and marking of silver.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 25 Indians 2 2013-04-01 2013-04-01 false Classifying and marking of silver. 304.3 Section 304.3 Indians INDIAN ARTS AND CRAFTS BOARD, DEPARTMENT OF THE INTERIOR NAVAJO, PUEBLO, AND HOPI SILVER, USE OF GOVERNMENT MARK § 304.3 Classifying and marking of silver. For the present the Indian Arts and Crafts...

  5. 25 CFR 304.3 - Classifying and marking of silver.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 25 Indians 2 2011-04-01 2011-04-01 false Classifying and marking of silver. 304.3 Section 304.3 Indians INDIAN ARTS AND CRAFTS BOARD, DEPARTMENT OF THE INTERIOR NAVAJO, PUEBLO, AND HOPI SILVER, USE OF GOVERNMENT MARK § 304.3 Classifying and marking of silver. For the present the Indian Arts and Crafts...

  6. 25 CFR 304.3 - Classifying and marking of silver.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 25 Indians 2 2014-04-01 2014-04-01 false Classifying and marking of silver. 304.3 Section 304.3 Indians INDIAN ARTS AND CRAFTS BOARD, DEPARTMENT OF THE INTERIOR NAVAJO, PUEBLO, AND HOPI SILVER, USE OF GOVERNMENT MARK § 304.3 Classifying and marking of silver. For the present the Indian Arts and Crafts...

  7. DETAIL VIEW OF THREE CONCENTRATION TABLES, LOADING RAMP, AND CLASSIFIER, ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL VIEW OF THREE CONCENTRATION TABLES, LOADING RAMP, AND CLASSIFIER, LOOKING EST. THE RAKE THAT WAS ORIGINALLY INSIDE THE CLASSIFIER IS AT CENTER RIGHT ON TOP OF THE LOADING RAMP. - Gold Hill Mill, Warm Spring Canyon Road, Death Valley Junction, Inyo County, CA

  8. 18 CFR 1301.69 - Safeguarding classified information.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 18 Conservation of Power and Water Resources 2 2014-04-01 2014-04-01 false Safeguarding classified information. 1301.69 Section 1301.69 Conservation of Power and Water Resources TENNESSEE VALLEY AUTHORITY PROCEDURES Protection of National Security Classified Information § 1301.69 Safeguarding...

  9. 32 CFR 2400.30 - Reproduction of classified information.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 6 2011-07-01 2011-07-01 false Reproduction of classified information. 2400.30... SECURITY PROGRAM Safeguarding § 2400.30 Reproduction of classified information. Documents or portions of... the originator or higher authority. Any stated prohibition against reproduction shall be...

  10. 32 CFR 2400.30 - Reproduction of classified information.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 32 National Defense 6 2012-07-01 2012-07-01 false Reproduction of classified information. 2400.30... SECURITY PROGRAM Safeguarding § 2400.30 Reproduction of classified information. Documents or portions of... the originator or higher authority. Any stated prohibition against reproduction shall be...

  11. 32 CFR 2400.30 - Reproduction of classified information.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Reproduction of classified information. 2400.30... SECURITY PROGRAM Safeguarding § 2400.30 Reproduction of classified information. Documents or portions of... the originator or higher authority. Any stated prohibition against reproduction shall be...

  12. 32 CFR 2400.30 - Reproduction of classified information.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 32 National Defense 6 2013-07-01 2013-07-01 false Reproduction of classified information. 2400.30... SECURITY PROGRAM Safeguarding § 2400.30 Reproduction of classified information. Documents or portions of... the originator or higher authority. Any stated prohibition against reproduction shall be...

  13. 32 CFR 2400.30 - Reproduction of classified information.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 32 National Defense 6 2014-07-01 2014-07-01 false Reproduction of classified information. 2400.30... SECURITY PROGRAM Safeguarding § 2400.30 Reproduction of classified information. Documents or portions of... the originator or higher authority. Any stated prohibition against reproduction shall be...

  14. 45 CFR 601.8 - Access to classified materials.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 45 Public Welfare 3 2014-10-01 2014-10-01 false Access to classified materials. 601.8 Section 601.8 Public Welfare Regulations Relating to Public Welfare (Continued) NATIONAL SCIENCE FOUNDATION CLASSIFICATION AND DECLASSIFICATION OF NATIONAL SECURITY INFORMATION § 601.8 Access to classified materials....

  15. 45 CFR 601.8 - Access to classified materials.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 45 Public Welfare 3 2012-10-01 2012-10-01 false Access to classified materials. 601.8 Section 601.8 Public Welfare Regulations Relating to Public Welfare (Continued) NATIONAL SCIENCE FOUNDATION CLASSIFICATION AND DECLASSIFICATION OF NATIONAL SECURITY INFORMATION § 601.8 Access to classified materials....

  16. 45 CFR 601.8 - Access to classified materials.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 45 Public Welfare 3 2013-10-01 2013-10-01 false Access to classified materials. 601.8 Section 601.8 Public Welfare Regulations Relating to Public Welfare (Continued) NATIONAL SCIENCE FOUNDATION CLASSIFICATION AND DECLASSIFICATION OF NATIONAL SECURITY INFORMATION § 601.8 Access to classified materials....

  17. 16 CFR 1610.4 - Requirements for classifying textiles.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 16 Commercial Practices 2 2013-01-01 2013-01-01 false Requirements for classifying textiles. 1610.4 Section 1610.4 Commercial Practices CONSUMER PRODUCT SAFETY COMMISSION FLAMMABLE FABRICS ACT REGULATIONS STANDARD FOR THE FLAMMABILITY OF CLOTHING TEXTILES The Standard § 1610.4 Requirements for classifying textiles. (a) Class 1,...

  18. 16 CFR 1610.4 - Requirements for classifying textiles.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 16 Commercial Practices 2 2014-01-01 2014-01-01 false Requirements for classifying textiles. 1610.4 Section 1610.4 Commercial Practices CONSUMER PRODUCT SAFETY COMMISSION FLAMMABLE FABRICS ACT REGULATIONS STANDARD FOR THE FLAMMABILITY OF CLOTHING TEXTILES The Standard § 1610.4 Requirements for classifying textiles. (a) Class 1,...

  19. 16 CFR 1610.4 - Requirements for classifying textiles.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 16 Commercial Practices 2 2012-01-01 2012-01-01 false Requirements for classifying textiles. 1610.4 Section 1610.4 Commercial Practices CONSUMER PRODUCT SAFETY COMMISSION FLAMMABLE FABRICS ACT REGULATIONS STANDARD FOR THE FLAMMABILITY OF CLOTHING TEXTILES The Standard § 1610.4 Requirements for classifying textiles. (a) Class 1,...

  20. 45 CFR 601.8 - Access to classified materials.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... CLASSIFICATION AND DECLASSIFICATION OF NATIONAL SECURITY INFORMATION § 601.8 Access to classified materials. No person may be given access to classified information unless that person has been determined to be trustworthy and unless access is essential to the accomplishment of lawful and authorized Government purposes....

  1. 46 CFR 108.177 - Electrical equipment in classified locations.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 46 Shipping 4 2010-10-01 2010-10-01 false Electrical equipment in classified locations. 108.177 Section 108.177 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) A-MOBILE OFFSHORE DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Classified Locations § 108.177...

  2. 46 CFR 108.177 - Electrical equipment in classified locations.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 46 Shipping 4 2011-10-01 2011-10-01 false Electrical equipment in classified locations. 108.177 Section 108.177 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) A-MOBILE OFFSHORE DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Classified Locations § 108.177...

  3. 33 CFR 149.405 - How are fire extinguishers classified?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 33 Navigation and Navigable Waters 2 2010-07-01 2010-07-01 false How are fire extinguishers classified? 149.405 Section 149.405 Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND... Fire Protection Equipment Firefighting Requirements § 149.405 How are fire extinguishers classified?...

  4. Population coding of affect across stimuli, modalities and individuals

    PubMed Central

    Chikazoe, Junichi; Lee, Daniel H.; Kriegeskorte, Nikolaus; Anderson, Adam K.

    2014-01-01

    It remains unclear how the brain represents external objective sensory events alongside our internal subjective impressions of them: affect. Representational mapping of population level activity evoked by complex scenes and basic tastes uncovered a neural code supporting a continuous axis of pleasant-to-unpleasant valence. This valence code was distinct from low-level physical and high-level object properties. While ventral temporal and anterior insular cortices supported valence codes specific to vision and taste, both the medial and lateral orbitofrontal cortices (OFC) maintained a valence code independent of sensory origin. Further, only the OFC code could classify experienced affect across participants. The entire valence spectrum is represented as a collective pattern in regional neural activity as sensory-specific and abstract codes, whereby the subjective quality of affect can be objectively quantified across stimuli, modalities, and people. PMID:24952643

  5. Just-in-time classifiers for recurrent concepts.

    PubMed

    Alippi, Cesare; Boracchi, Giacomo; Roveri, Manuel

    2013-04-01

    Just-in-time (JIT) classifiers operate in evolving environments by classifying instances and reacting to concept drift. In stationary conditions, a JIT classifier improves its accuracy over time by exploiting additional supervised information coming from the field. In nonstationary conditions, however, the classifier reacts as soon as concept drift is detected; the current classification setup is discarded and a suitable one activated to keep the accuracy high. We present a novel generation of JIT classifiers able to deal with recurrent concept drift by means of a practical formalization of the concept representation and the definition of a set of operators working on such representations. The concept-drift detection activity, which is crucial in promptly reacting to changes exactly when needed, is advanced by considering change-detection tests monitoring both input and class distributions.

  6. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    NASA Astrophysics Data System (ADS)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on gonads extracted from female and male mackerel (fresh and thawed) to identify differences between the sexes. Several linear and nonlinear classifiers, such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA), can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect the sample proximities accurately. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. Diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve on classifiers based on a single dissimilarity.
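
    The combination idea in this abstract can be sketched as follows. This is a minimal illustration only: the 1-NN base classifiers, the three dissimilarities (Euclidean, Manhattan, Chebyshev), the majority vote, and the colour values are all assumptions made for the example and are not taken from the paper.

```python
from collections import Counter

# Three complementary dissimilarities (illustrative choices, not from the paper).
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def nearest_label(sample, train, dissimilarity):
    """1-NN: return the label of the training point closest to `sample`."""
    _, label = min(train, key=lambda item: dissimilarity(sample, item[0]))
    return label

def ensemble_predict(sample, train,
                     dissimilarities=(euclidean, manhattan, chebyshev)):
    """Majority vote over one 1-NN classifier per dissimilarity."""
    votes = Counter(nearest_label(sample, train, d) for d in dissimilarities)
    return votes.most_common(1)[0][0]

# Made-up colour measurements (three channels per sample) with sex labels.
train = [
    ((60.0, 12.0, 20.0), "female"),
    ((62.0, 11.0, 22.0), "female"),
    ((45.0, 5.0, 10.0), "male"),
    ((47.0, 6.0, 9.0), "male"),
]
prediction = ensemble_predict((59.0, 11.5, 19.0), train)
```

    Because each base classifier sees the data through a different dissimilarity, their errors need not coincide, which is the property a vote can exploit.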

  7. The expression level of small non-coding RNAs derived from the first exon of protein-coding genes is predictive of cancer status.

    PubMed

    Zovoilis, Athanasios; Mungall, Andrew J; Moore, Richard; Varhol, Richard; Chu, Andy; Wong, Tina; Marra, Marco; Jones, Steven J M

    2014-04-01

    Small non-coding RNAs (smRNAs) are known to be significantly enriched near the transcriptional start sites of genes. However, the functional relevance of these smRNAs remains unclear, and they have not been associated with human disease. Within The Cancer Genome Atlas (TCGA) project, we have generated small RNA datasets for many tumor types. In prior cancer studies, these RNAs have been regarded as transcriptional "noise," due to their apparent chaotic distribution. In contrast, we demonstrate their striking potential to distinguish efficiently between cancer and normal tissues and classify patients with cancer into subgroups of distinct survival outcomes. This potential to predict cancer status is restricted to a subset of these smRNAs, which is encoded within the first exon of genes, highly enriched within CpG islands and negatively correlated with DNA methylation levels. Thus, our data show that genome-wide changes in the expression levels of small non-coding RNAs within first exons are associated with cancer.

  8. Disparity in coding concordance: do physicians and coders agree?

    PubMed

    Lorence, Daniel P; Ibrahim, Ibrahim Awad

    2003-01-01

    Increasing demands for large-scale comparative analysis of health care costs have led to a similar demand for consistently classified data. Evidence-based medicine demands evidence that can be trusted. This study sought to assess managers' observed levels of agreement with physician code selections when classifying patient data. Using a non-sampled research design of both mailed and telephone surveys, we employ a nationwide cross-section of over 16,000 accredited US medical record managers. As a main outcome measure, we evaluate reported levels of agreement between physician and information manager code selections made when classifying patient data. Results indicate about 19 percent of respondents report that coder-physician classification disagreement occurred on more than 5 percent of all patient encounters. In some cases, disagreement occurred in 20 percent or more instances of code selection. This phenomenon shows significant variation across key demographic and market indicators. With the growing practice of measuring coded data quality as an outcome of health care financial performance, along with adoption of electronic classification and patient record systems, the accuracy of coded data is likely to remain uncertain in the absence of more consistent classification and coding practices.

  9. The place of 'codes' in nonlinear neurodynamics.

    PubMed

    Freeman, Walter J

    2007-01-01

    A key problem in cognitive science is to explain the neural mechanisms of the rapid transposition between stimulus energy and abstract concept--between the specific and the generic--in both material and conceptual aspects, not between neural and psychic aspects. Three approaches by researchers to a solution in terms of neural codes are considered. Materialists seek rate and frequency codes in the interspike intervals of trains of action potentials induced by stimuli and carried by topologically organized axonal lines. Cognitivists refer to the symbol grounding problem and search for symbolic codes in firings of hierarchically organized feature-detector neurons of phonemes, lines, odorants, pressures, etc., that object-detector neurons bind into representations of probabilities of stimulus occurrence. Dynamicists seek neural correlates of stimuli and associated behaviors in spatial patterns of oscillatory fields of dendritic activity that self-organize and evolve as trajectories through high-dimensional brain state space; the codes are landscapes of chaotic attractors. Unlike codes in DNA and the periodic table, these codes have neither alphabet nor syntax. They are epistemological metaphors required by experimentalists to measure neural activity and by engineers to model brain functions. Here I review the central neural mechanisms of olfaction as a paradigm for use of codes to explain how brains create cortical activities that mediate sensation, perception, comprehension, prediction, decision, and action or inaction.

  10. Do plant cell walls have a code?

    PubMed

    Tavares, Eveline Q P; Buckeridge, Marcos S

    2015-12-01

    A code is a set of rules that establish correspondence between two worlds, signs (consisting of encrypted information) and meaning (of the decrypted message). A third element, the adaptor, connects both worlds, assigning meaning to a code. We propose that a Glycomic Code exists in plant cell walls where signs are represented by monosaccharides and phenylpropanoids and meaning is cell wall architecture with its highly complex association of polymers. Cell wall biosynthetic mechanisms, structure, architecture and properties are addressed from a Code Biology perspective, focusing on how they oppose cell wall deconstruction. Cell wall hydrolysis is mainly treated as a mechanism of decryption of the Glycomic Code. Evidence for encoded information in the fine structure of cell wall polymers is highlighted and the implications of the existence of the Glycomic Code are discussed. Aspects related to fine structure are responsible for polysaccharide packing and polymer-polymer interactions, affecting the final cell wall architecture. The question of whether polymers assembled within a wall display properties similar to those of other biological macromolecules (i.e. proteins, DNA, histones) is addressed, i.e. do they display a code?

  11. Mechanisms of immunity to Leishmania major infection in mice: the contribution of DNA vaccines coding for two novel sets of histones (H2A-H2B or H3-H4).

    PubMed

    Carrión, Javier

    2011-09-01

    The immune phenotype conferred by two different sets of histone genes (H2A-H2B or H3-H4) was assessed. BALB/c mice vaccinated with pcDNA3H2AH2B succumbed to progressive cutaneous leishmaniosis (CL), whereas vaccination with pcDNA3H3H4 resulted in partial resistance to Leishmania major challenge associated with the development of a mixed T helper 1 (Th1)/Th2-type response and a reduction in the number of parasite-specific Treg cells at the site of infection. Therefore, the presence of histones H3 and H4 may be considered essential in the development of vaccine strategies against CL based on the Leishmania histones.

  12. Lectin cDNA and transgenic plants derived therefrom

    DOEpatents

    Raikhel, Natasha V.

    2000-10-03

    Transgenic plants containing cDNA encoding Gramineae lectin are described. The plants preferably contain cDNA coding for barley lectin and store the lectin in the leaves. The transgenic plants, particularly the leaves, exhibit insecticidal and fungicidal properties.

  13. Class-specific Error Bounds for Ensemble Classifiers

    SciTech Connect

    Prenger, R; Lemmond, T; Varshney, K; Chen, B; Hanley, W

    2009-10-06

    The generalization error, or probability of misclassification, of ensemble classifiers has been shown to be bounded above by a function of the mean correlation between the constituent (i.e., base) classifiers and their average strength. This bound suggests that increasing the strength and/or decreasing the correlation of an ensemble's base classifiers may yield improved performance under the assumption of equal error costs. However, this and other existing bounds do not directly address application spaces in which error costs are inherently unequal. For applications involving binary classification, Receiver Operating Characteristic (ROC) curves, performance curves that explicitly trade off false alarms and missed detections, are often utilized to support decision making. To address performance optimization in this context, we have developed a lower bound for the entire ROC curve that can be expressed in terms of the class-specific strength and correlation of the base classifiers. We present empirical analyses demonstrating the efficacy of these bounds in predicting relative classifier performance. In addition, we specify performance regions of the ROC curve that are naturally delineated by the class-specific strengths of the base classifiers and show that each of these regions can be associated with a unique set of guidelines for performance optimization of binary classifiers within unequal error cost regimes.

  14. Evaluating the fusion of multiple classifiers via ROC curves

    NASA Astrophysics Data System (ADS)

    Hill, Justin M.; Oxley, Mark E.; Bauer, Kenneth W., Jr.

    2003-08-01

    Given a finite collection of classifiers one might wish to combine, or fuse, the classifiers in hopes that the multiple classifier system (MCS) will perform better than the individuals. One method of fusing classifiers is to combine their final decision using Boolean rules (e.g., a logical OR, AND, or a majority vote of the classifiers in the system). An established method for evaluating a classifier is measuring some aspect of its Receiver Operating Characteristic (ROC) curve, which graphs the trade-off between the conditional probabilities of detection and false alarm. This work presents a unique method of estimating the performance of an MCS in which Boolean rules are used to combine individual decisions. The method requires performance data similar to the data available in the ROC curves for each of the individual classifiers, and the method can be used to estimate the ROC curve for the entire system. A consequence of this result is that one can save time and money by effectively evaluating the performance of an MCS without performing experiments.
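
    The effect of Boolean fusion rules on a system's operating point can be sketched as below. This is not the estimation method of the paper; it is a minimal illustration under the added assumption that the individual classifiers' decisions are conditionally independent given the true class. The function names and example rates are hypothetical.

```python
from itertools import product
from math import prod

def fuse_or(rates):
    """Probability that at least one classifier fires (independence assumed)."""
    return 1.0 - prod(1.0 - r for r in rates)

def fuse_and(rates):
    """Probability that every classifier fires (independence assumed)."""
    return prod(rates)

def fuse_majority(rates):
    """Probability that more than half of the classifiers fire."""
    n = len(rates)
    total = 0.0
    # Enumerate every joint decision pattern and keep those with a majority.
    for decisions in product((0, 1), repeat=n):
        if 2 * sum(decisions) > n:
            total += prod(r if d else 1.0 - r for r, d in zip(rates, decisions))
    return total

# One operating point per classifier: detection and false-alarm probabilities.
p_detect = [0.9, 0.8]
p_false_alarm = [0.1, 0.2]

# The same rule is applied to both axes of the ROC plane.
or_point = (fuse_or(p_false_alarm), fuse_or(p_detect))
and_point = (fuse_and(p_false_alarm), fuse_and(p_detect))
```

    Under this assumption the OR rule raises both detection and false-alarm probability, while the AND rule lowers both; sweeping the individual operating points traces an estimated ROC curve for the fused system.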

  15. Mechanical code comparator

    DOEpatents

    Peter, Frank J.; Dalton, Larry J.; Plummer, David W.

    2002-01-01

    A new class of mechanical code comparators is described which have broad potential for application in safety, surety, and security applications. These devices can be implemented as micro-scale electromechanical systems that isolate a secure or otherwise controlled device until an access code is entered. This access code is converted into a series of mechanical inputs to the mechanical code comparator, which compares the access code to a pre-input combination, entered previously into the mechanical code comparator by an operator at the system security control point. These devices provide extremely high levels of robust security. Being totally mechanical in operation, an access control system properly based on such devices cannot be circumvented by software attack alone.

  16. More box codes

    NASA Technical Reports Server (NTRS)

    Solomon, G.

    1992-01-01

    A new investigation shows that, starting from the BCH (21,15;3) code represented as a 7 x 3 matrix and adding a row and column to add even parity, one obtains an 8 x 4 matrix (32,15;8) code. An additional dimension is obtained by specifying odd parity on the rows and even parity on the columns, i.e., adjoining to the 8 x 4 matrix, the matrix, which is zero except for the fourth column (of all ones). Furthermore, any seven rows and three columns will form the BCH (21,15;3) code. This box code has the same weight structure as the quadratic residue and BCH codes of the same dimensions. Whether there exists an algebraic isomorphism to either code is as yet unknown.
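
    The first construction step, adding a row and a column so that every row and column has even parity, can be sketched as follows. The 7 x 3 bit pattern below is arbitrary and is not claimed to be a BCH (21,15;3) codeword; the function name is hypothetical.

```python
def extend_with_even_parity(matrix):
    """Append a parity column, then a parity row, to a binary matrix.

    After the extension every row and every column sums to an even number.
    """
    # Each row gains one bit equal to the parity of that row.
    rows = [row + [sum(row) % 2] for row in matrix]
    # The new bottom row holds the parity of each (extended) column.
    parity_row = [sum(column) % 2 for column in zip(*rows)]
    return rows + [parity_row]

# Arbitrary 7 x 3 bit pattern used only to exercise the function.
word = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
]
extended = extend_with_even_parity(word)  # 8 rows by 4 columns
```

    The bottom-right bit is consistent for both directions because the parity of the row parities equals the parity of the column parities: both equal the parity of the whole original matrix.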

  17. Generating code adapted for interlinking legacy scalar code and extended vector code

    DOEpatents

    Gschwind, Michael K

    2013-06-04

    Mechanisms for intermixing code are provided. Source code is received for compilation using an extended Application Binary Interface (ABI) that extends a legacy ABI and uses a different register configuration than the legacy ABI. First compiled code is generated based on the source code, the first compiled code comprising code for accommodating the difference in register configurations used by the extended ABI and the legacy ABI. The first compiled code and second compiled code are intermixed to generate intermixed code, the second compiled code being compiled code that uses the legacy ABI. The intermixed code comprises at least one call instruction that is one of a call from the first compiled code to the second compiled code or a call from the second compiled code to the first compiled code. The code for accommodating the difference in register configurations is associated with the at least one call instruction.

  18. Industrial Computer Codes

    NASA Technical Reports Server (NTRS)

    Shapiro, Wilbur

    1996-01-01

    This is an overview of new and updated industrial codes for seal design and testing. GCYLT (gas cylindrical seals -- turbulent), SPIRALI (spiral-groove seals -- incompressible), KTK (knife to knife) Labyrinth Seal Code, and DYSEAL (dynamic seal analysis) are covered. GCYLT uses G-factors for Poiseuille and Couette turbulence coefficients. SPIRALI is updated to include turbulence and inertia, but maintains the narrow groove theory. The KTK labyrinth seal code handles straight or stepped seals. DYSEAL provides dynamics for the seal geometry.

  19. Phonological coding during reading

    PubMed Central

    Leinenger, Mallorie

    2014-01-01

    The exact role that phonological coding (the recoding of written, orthographic information into a sound based code) plays during silent reading has been extensively studied for more than a century. Despite the large body of research surrounding the topic, varying theories as to the time course and function of this recoding still exist. The present review synthesizes this body of research, addressing the topics of time course and function in tandem. The varying theories surrounding the function of phonological coding (e.g., that phonological codes aid lexical access, that phonological codes aid comprehension and bolster short-term memory, or that phonological codes are largely epiphenomenal in skilled readers) are first outlined, and the time courses that each maps onto (e.g., that phonological codes come online early (pre-lexical) or that phonological codes come online late (post-lexical)) are discussed. Next the research relevant to each of these proposed functions is reviewed, discussing the varying methodologies that have been used to investigate phonological coding (e.g., response time methods, reading while eyetracking or recording EEG and MEG, concurrent articulation) and highlighting the advantages and limitations of each with respect to the study of phonological coding. In response to the view that phonological coding is largely epiphenomenal in skilled readers, research on the use of phonological codes in prelingually, profoundly deaf readers is reviewed. Finally, implications for current models of word identification (activation-verification model (Van Orden, 1987), dual-route model (e.g., Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001), parallel distributed processing model (Seidenberg & McClelland, 1989)) are discussed. PMID:25150679

  20. A Spatial Classifier for Multispectral Data Using Contextual Information

    NASA Technical Reports Server (NTRS)

    Hung, Chih-Cheng; Fahsi, Ahmed; Coleman, Tommy

    1998-01-01

    Connectivity describes the spatial relationship among pixels. A spatial classifier which employs the sigma probability concept of the Gaussian distribution and a type of contextual information, the connectivity of the pixels, is studied in this paper. This spatial classifier attempts to replicate the kind of spatial synthesis done by the human analyst during visual interpretation or to capture the spatial relationships inherent in an aerial photograph. Several classification results of the Landsat TM data using this classifier with different window sizes for capturing the contextual information are illustrated and compared.

  1. Automatic speech recognition using a predictive echo state network classifier.

    PubMed

    Skowronski, Mark D; Harris, John G

    2007-04-01

    We have combined an echo state network (ESN) with a competitive state machine framework to create a classification engine called the predictive ESN classifier. We derive the expressions for training the predictive ESN classifier and show that the model was significantly more noise robust compared to a hidden Markov model in noisy speech classification experiments by 8 ± 1 dB signal-to-noise ratio. The simple training algorithm and noise robustness of the predictive ESN classifier make it an attractive classification engine for automatic speech recognition.

  2. Automatically Classifying Question Types for Consumer Health Questions

    PubMed Central

    Roberts, Kirk; Kilicoglu, Halil; Fiszman, Marcelo; Demner-Fushman, Dina

    2014-01-01

    We present a method for automatically classifying consumer health questions. Our thirteen question types are designed to aid in the automatic retrieval of medical answers from consumer health resources. To our knowledge, this is the first machine learning-based method specifically for classifying consumer health questions. We demonstrate how previous approaches to medical question classification are insufficient to achieve high accuracy on this task. Additionally, we describe, manually annotate, and automatically classify three important question elements that improve question classification over previous techniques. Our results and analysis illustrate the difficulty of the task and the future directions that are necessary to achieve high-performing consumer health question classification. PMID:25954411

  3. The Relationship Between Diversity and Accuracy in Multiple Classifier Systems

    DTIC Science & Technology

    2012-03-22

    combination [32]. The classifier output combinations from the training set are used to estimate truth values and their relative frequencies. When a new exemplar...level equal to the relative frequency. Table 2 is an example BKS lookup table with a two class, two classifier problem: Table 2 shows both the strength...classifier gives a level of support that may shift the mean [32]. 2.4.3.3 Median Rule. The median rule is a statistical rule similar to the minimum or
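
    The BKS lookup table mentioned in this fragment can be sketched as follows. Because the snippet is truncated, this follows only the general idea visible in the text (count the true classes observed for each combination of classifier outputs, then answer with the most frequent one and its relative frequency). The function names and training data are hypothetical.

```python
from collections import Counter, defaultdict

def build_bks_table(classifier_outputs, true_labels):
    """Map each tuple of classifier outputs to counts of the true class."""
    table = defaultdict(Counter)
    for outputs, truth in zip(classifier_outputs, true_labels):
        table[tuple(outputs)][truth] += 1
    return table

def bks_predict(table, outputs, default=None):
    """Return (most frequent true class, its relative frequency).

    Output combinations never seen in training fall back to `default`
    with a support of 0.0.
    """
    counts = table.get(tuple(outputs))
    if not counts:
        return default, 0.0
    label, count = counts.most_common(1)[0]
    return label, count / sum(counts.values())

# Made-up two-class, two-classifier training data.
outputs = [(0, 0), (0, 0), (0, 1), (0, 1), (0, 1), (1, 1), (1, 0)]
truths = [0, 0, 0, 1, 1, 1, 1]
table = build_bks_table(outputs, truths)
label, support = bks_predict(table, (0, 1))
```

    The relative frequency returned alongside the label plays the role of the support level that the fragment associates with each table cell.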

  4. Faint spatial object classifier construction based on data mining technology

    NASA Astrophysics Data System (ADS)

    Lou, Xin; Zhao, Yang; Liao, Yurong; Nie, Yong-ming

    2016-11-01

    Data mining can effectively extract a faint spatial object's patterns and characteristics, universal relations, and other implicit data characteristics; the key step is classifier construction. Based on a detailed theoretical analysis of design procedures and guidelines, the construction of a faint spatial object classifier with spatial data mining technology is proposed for faint spatial target detection. To address the one-sidedness of this method when dealing with fuzziness and randomness, a cloud model classifier is proposed. Simulation results indicate that this method achieves fast classification through feature combination and effectively resolves the one-sidedness problem.

  5. Tokamak Systems Code

    SciTech Connect

    Reid, R.L.; Barrett, R.J.; Brown, T.G.; Gorker, G.E.; Hooper, R.J.; Kalsi, S.S.; Metzler, D.H.; Peng, Y.K.M.; Roth, K.E.; Spampinato, P.T.

    1985-03-01

    The FEDC Tokamak Systems Code calculates tokamak performance, cost, and configuration as a function of plasma engineering parameters. This version of the code models experimental tokamaks. It does not currently consider tokamak configurations that generate electrical power or incorporate breeding blankets. The code has a modular (or subroutine) structure to allow independent modeling for each major tokamak component or system. A primary benefit of modularization is that a component module may be updated without disturbing the remainder of the systems code as long as the input to or output from the module remains unchanged.

  6. Topological subsystem codes

    SciTech Connect

    Bombin, H.

    2010-03-15

    We introduce a family of two-dimensional (2D) topological subsystem quantum error-correcting codes. The gauge group is generated by two-local Pauli operators, so that two-local measurements are enough to recover the error syndrome. We study the computational power of code deformation in these codes and show that boundaries cannot be introduced in the usual way. In addition, we give a general mapping connecting suitable classical statistical mechanical models to optimal error correction in subsystem stabilizer codes that suffer from depolarizing noise.

  7. FAA Smoke Transport Code

    SciTech Connect

    Domino, Stefan; Luketa-Hanlin, Anay; Gallegos, Carlos

    2006-10-27

    FAA Smoke Transport Code, a physics-based Computational Fluid Dynamics tool, which couples heat, mass, and momentum transfer, has been developed to provide information on smoke transport in cargo compartments with various geometries and flight conditions. The software package contains a graphical user interface for specification of geometry and boundary conditions, analysis module for solving the governing equations, and a post-processing tool. The current code was produced by making substantial improvements and additions to a code obtained from a university. The original code was able to compute steady, uniform, isothermal turbulent pressurization. In addition, a preprocessor and postprocessor were added to arrive at the current software package.

  8. Transonic airfoil codes

    NASA Technical Reports Server (NTRS)

    Garabedian, P. R.

    1979-01-01

    Computer codes for the design and analysis of transonic airfoils are considered. The design code relies on the method of complex characteristics in the hodograph plane to construct shockless airfoils. The analysis code uses artificial viscosity to calculate flows with weak shock waves at off-design conditions. Comparisons with experiments show that an excellent simulation of two dimensional wind tunnel tests is obtained. The codes have been widely adopted by the aircraft industry as a tool for the development of supercritical wing technology.

  9. Differences in the Kinesic Codes of Americans and Japanese.

    ERIC Educational Resources Information Center

    Kitao, S. Kathleen; Kitao, Kenji

    Differences between American and Japanese society and culture that contribute to differences in the use of body language are examined. These include historical and social factors. Examples of the differences and of the misunderstandings that can result are analyzed and illustrated, using a system of classifying kinesic codes into categories…

  10. Classifier for evaluating the effects of image processing on character recognition

    NASA Astrophysics Data System (ADS)

    McNamara, James F.; Casey, David W.; Smith, Robert W.; Bradburn, David S.

    1993-06-01

    This paper presents an automated methodology for selecting morphological filters from a given set that will most improve a text image for character recognition. Toward this end, a classifier is described which generates an internal representation of the image qualities affecting readability, and which uses those properties to identify images that will benefit by application of a particular filter. In the study, handprint and machineprint character bitmaps are taken from binarized document images and enhanced using a set of non-recursive neighborhood operators. Features related to the connected components and their morphology are extracted prior to the filtering step. Character recognition results are obtained from commercially available recognition engines, which together with the measured morphological features, form a training set for statistical classifiers. The classifiers derive a partitioning of the input based on the morphological features, and the output yields an indication of the specific filter most appropriate to apply to improve character recognition. Results are presented for handprinted ZIP code digit images, and for average to poor quality and dot matrix alphanumeric machineprint obtained from postal application images. Performance for each case is statistically analyzed.

  11. Numerical investigation of the grinding process in a Beater Wheel mill with classifier

    SciTech Connect

    Anagnostopoulos, J.; Bergeles, G.

    1997-07-01

    A numerical investigation is presented for a two-dimensional simulation of the gas flow field and of the dynamic behavior of lignite particles inside Beater Wheel mills with classifier, installed in large coal-fired plants. A large number of representative particles are tracked using Lagrangian equations of motion, in combination with a stochastic model for particle turbulent dispersion. All the important mechanisms associated with the particle motion through the mill (particle-surface collisions and rebounding phenomena, fuel moisture evaporation and erosion wear of internal surfaces) are modeled. A special model is constructed to simulate the fragmentation of impacting particles and to calculate the size distribution of the final mill product. The models are calibrated on the basis of available data from grinding mills of the Greek lignite power stations. The numerical code is capable of predicting the locations of significant erosion and of estimating the amount of particle mass that circulates through the mill via the classifying chamber. Mean impact velocity and impingement angle distributions along all the internal surfaces are also provided. The results indicate remarkable differences in the extent of the erosion caused at different locations of the mill. Also, the significant role of the arrangement of the leading blades inside the classifier in its classification performance and efficiency is elucidated.

  12. Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology

    PubMed Central

    Heinson, Ashley I.; Gunawardana, Yawwani; Moesker, Bastiaan; Denman Hume, Carmen C.; Vataga, Elena; Hall, Yper; Stylianou, Elena; McShane, Helen; Williams, Ann; Niranjan, Mahesan; Woelk, Christopher H.

    2017-01-01

    Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future. PMID:28157153

  13. Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology.

    PubMed

    Heinson, Ashley I; Gunawardana, Yawwani; Moesker, Bastiaan; Hume, Carmen C Denman; Vataga, Elena; Hall, Yper; Stylianou, Elena; McShane, Helen; Williams, Ann; Niranjan, Mahesan; Woelk, Christopher H

    2017-02-01

    Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.

  14. Genes and Pathways Involved in Adult Onset Disorders Featuring Muscle Mitochondrial DNA Instability

    PubMed Central

    Ahmed, Naghia; Ronchi, Dario; Comi, Giacomo Pietro

    2015-01-01

    Replication and maintenance of mtDNA rely entirely on a set of proteins encoded by the nuclear genome, which include members of the core replicative machinery, proteins involved in the homeostasis of mitochondrial dNTPs pools or deputed to the control of mitochondrial dynamics and morphology. Mutations in their coding genes have been observed in familial and sporadic forms of pediatric and adult-onset clinical phenotypes featuring mtDNA instability. The list of defects involved in these disorders has recently expanded, including mutations in the exo-/endo-nuclease flap-processing proteins MGME1 and DNA2, supporting the notion that an enzymatic DNA repair system actively takes place in mitochondria. The results obtained in the last few years acknowledge the contribution of next-generation sequencing methods in the identification of new disease loci in small groups of patients and even single probands. Although heterogeneous, these genes can be conveniently classified according to the pathway to which they belong. The definition of the molecular and biochemical features of these pathways might be helpful for fundamental knowledge of these disorders, to accelerate genetic diagnosis of patients and the development of rational therapies. In this review, we discuss the molecular findings disclosed in adult patients with muscle pathology hallmarked by mtDNA instability. PMID:26251896

  15. Statistical approaches to account for false-positive errors in environmental DNA samples.

    PubMed

    Lahoz-Monfort, José J; Guillera-Arroita, Gurutzeta; Tingley, Reid

    2016-05-01

    Environmental DNA (eDNA) sampling is prone to both false-positive and false-negative errors. We review statistical methods to account for such errors in the analysis of eDNA data and use simulations to compare the performance of different modelling approaches. Our simulations illustrate that even low false-positive rates can produce biased estimates of occupancy and detectability. We further show that removing or classifying single PCR detections in an ad hoc manner under the suspicion that such records represent false positives, as sometimes advocated in the eDNA literature, also results in biased estimation of occupancy, detectability and false-positive rates. We advocate alternative approaches to account for false-positive errors that rely on prior information, or the collection of ancillary detection data at a subset of sites using a sampling method that is not prone to false-positive errors. We illustrate the advantages of these approaches over ad hoc classifications of detections and provide practical advice and code for fitting these models in maximum likelihood and Bayesian frameworks. Given the severe bias induced by false-negative and false-positive errors, the methods presented here should be more routinely adopted in eDNA studies.
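    The bias that false positives introduce into a naive occupancy estimate can be illustrated with a short calculation. The sketch below is not the authors' code. It assumes a simple site-level model in which psi is the true occupancy probability, p11 is the per-replicate detection probability at occupied sites, p10 is the per-replicate false-positive probability at unoccupied sites, and a site is naively scored as occupied if any of k PCR replicates is positive. All names and numbers are illustrative assumptions.

```python
def naive_occupancy(psi, p11, p10, k):
    """Expected fraction of sites with at least one positive replicate.

    psi: true occupancy probability (assumed)
    p11: per-replicate detection probability at occupied sites (assumed)
    p10: per-replicate false-positive probability at unoccupied sites (assumed)
    k:   number of PCR replicates per site
    """
    occupied_hit = 1.0 - (1.0 - p11) ** k      # occupied site yields >= 1 positive
    unoccupied_hit = 1.0 - (1.0 - p10) ** k    # unoccupied site yields >= 1 false positive
    return psi * occupied_hit + (1.0 - psi) * unoccupied_hit


# Illustrative values only; they are not taken from the paper.
TRUE_PSI = 0.3
with_fp = naive_occupancy(TRUE_PSI, p11=0.8, p10=0.02, k=6)
without_fp = naive_occupancy(TRUE_PSI, p11=0.8, p10=0.0, k=6)
```

    With these illustrative values, a 2% per-replicate false-positive rate is enough to push the naive estimate above the true occupancy, whereas with no false positives the naive estimate cannot exceed it.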

  16. Using sequence-specific chemical and structural properties of DNA to predict transcription factor binding sites.

    PubMed

    Bauer, Amy L; Hlavacek, William S; Unkefer, Pat J; Mu, Fangping

    2010-11-18

    An important step in understanding gene regulation is to identify the DNA binding sites recognized by each transcription factor (TF). Conventional approaches to prediction of TF binding sites involve the definition of consensus sequences or position-specific weight matrices and rely on statistical analysis of DNA sequences of known binding sites. Here, we present a method called SiteSleuth in which DNA structure prediction, computational chemistry, and machine learning are applied to develop models for TF binding sites. In this approach, binary classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. These features are determined via molecular dynamics calculations in which we consider each base in different local neighborhoods. For each of 54 TFs in Escherichia coli, for which at least five DNA binding sites are documented in RegulonDB, the TF binding sites and portions of the non-coding genome sequence are mapped to feature vectors and used in training. According to cross-validation analysis and a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth outperforms three conventional approaches: Match, MATRIX SEARCH, and the method of Berg and von Hippel. SiteSleuth also outperforms QPMEME, a method similar to SiteSleuth in that it involves a learning algorithm. The main advantage of SiteSleuth is a lower false positive rate.
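    For context, the conventional position-specific weight matrix approach mentioned above can be sketched in a few lines. This is a generic log-odds matrix with pseudocounts, not SiteSleuth and not any of the named tools; the binding sites, the sequence, the uniform background and the function names are made-up assumptions for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length known sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # log2 ratio of observed base frequency to the background frequency
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds scores for one candidate window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical training sites and genome fragment.
SITES = ["TATAAT", "TATAAT", "TACAAT", "TATGAT", "TATAAT"]
SEQUENCE = "GGCGTATAATGCC"
top_score, top_start = best_hit(build_pwm(SITES), SEQUENCE)
```

    A matrix like this scores each position independently of its neighbors, which is one reason a method that uses local structural context could plausibly reduce false positives.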

  17. Genes and Pathways Involved in Adult Onset Disorders Featuring Muscle Mitochondrial DNA Instability.

    PubMed

    Ahmed, Naghia; Ronchi, Dario; Comi, Giacomo Pietro

    2015-08-05

    Replication and maintenance of mtDNA rely entirely on a set of proteins encoded by the nuclear genome, which include members of the core replicative machinery, proteins involved in the homeostasis of mitochondrial dNTPs pools or deputed to the control of mitochondrial dynamics and morphology. Mutations in their coding genes have been observed in familial and sporadic forms of pediatric and adult-onset clinical phenotypes featuring mtDNA instability. The list of defects involved in these disorders has recently expanded, including mutations in the exo-/endo-nuclease flap-processing proteins MGME1 and DNA2, supporting the notion that an enzymatic DNA repair system actively takes place in mitochondria. The results obtained in the last few years acknowledge the contribution of next-generation sequencing methods in the identification of new disease loci in small groups of patients and even single probands. Although heterogeneous, these genes can be conveniently classified according to the pathway to which they belong. The definition of the molecular and biochemical features of these pathways might be helpful for fundamental knowledge of these disorders, to accelerate genetic diagnosis of patients and the development of rational therapies. In this review, we discuss the molecular findings disclosed in adult patients with muscle pathology hallmarked by mtDNA instability.

  18. 43 CFR 3809.10 - How does BLM classify operations?

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) MINING CLAIMS UNDER THE GENERAL MINING LAWS Surface Management General Information § 3809.10 How does BLM classify operations? BLM...

  19. 72. SECONDARY MILL AND CLASSIFIER FROM NORTHWEST. WOOD FEED BOX ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    72. SECONDARY MILL AND CLASSIFIER FROM NORTHWEST. WOOD FEED BOX BEHIND MILL, BARREN SOLUTION TANK BEHIND TRAVELING CRANE TRUSS AND ABOVE MILL. - Bald Mountain Gold Mill, Nevada Gulch at head of False Bottom Creek, Lead, Lawrence County, SD

  20. Classified Component Disposal at the Nevada National Security Site

    SciTech Connect

    Poling, J.; Arnold, P.; Saad, M.; DiSanza, F.; Cabble, K.

    2012-11-05

    The Nevada National Security Site (NNSS) has added the capability needed for the safe, secure disposal of non-nuclear classified components that have been declared excess to national security requirements. The NNSS has worked with U.S. Department of Energy, National Nuclear Security Administration senior leadership to gain formal approval for permanent burial of classified matter at the NNSS in the Area 5 Radioactive Waste Management Complex owned by the U.S. Department of Energy. Additionally, by working with state regulators, the NNSS added the capability to dispose non-radioactive hazardous and non-hazardous classified components. The NNSS successfully piloted the new disposal pathway with the receipt of classified materials from the Kansas City Plant in March 2012.

  1. The effect of abnormal cell proportion on specimen classifier performance

    NASA Technical Reports Server (NTRS)

    Castleman, K. R.; White, B. S.

    1981-01-01

    An analysis is presented of the results obtained from a cell classifier which is confronted with an abnormal/normal cell ratio which is different from the ratio assumed in the calibration of the classifier. False negative and false positive error rates are determined in advance for classifier operation, along with the necessary sample size in order to validate the predicted distributions. Changes are demonstrated to happen only regarding the false negative rate, where reductions in the abnormal cell rate below the expected rates would cause totally unreliable data. Substantial overproduction of abnormal cells would be quickly noticeable, while production rates beyond, but close to, the expected rates would only require more extensive sampling. Classifier systems for 10% proportions of abnormal cells are concluded to be possible, but difficulties are present with much lower rates.
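    The dependence of the specimen-level false negative rate on the abnormal cell proportion can be illustrated with a binomial calculation. The sketch below is an assumption-laden toy model, not the report's analysis: each cell is classified independently with fixed per-cell error rates, and a specimen is called abnormal when at least `threshold` of `n_cells` cells are flagged. All parameter names and values are hypothetical.

```python
from math import comb


def flagged_rate(p_abnormal, cell_fn, cell_fp):
    """Probability that a random cell from an abnormal specimen is flagged."""
    return p_abnormal * (1.0 - cell_fn) + (1.0 - p_abnormal) * cell_fp


def prob_at_least(n, k, q):
    """P(X >= k) for X ~ Binomial(n, q)."""
    return sum(comb(n, j) * q**j * (1.0 - q) ** (n - j) for j in range(k, n + 1))


def specimen_error_rates(n_cells, threshold, p_abnormal, cell_fn, cell_fp):
    """Return (false_negative, false_positive) rates at the specimen level."""
    # A normal specimen contains no abnormal cells, so only false alarms count.
    false_positive = prob_at_least(n_cells, threshold, cell_fp)
    q = flagged_rate(p_abnormal, cell_fn, cell_fp)
    false_negative = 1.0 - prob_at_least(n_cells, threshold, q)
    return false_negative, false_positive


# Classifier calibrated for 10% abnormal cells, then shown a 2% specimen.
fn_expected, fp_expected = specimen_error_rates(200, 10, 0.10, 0.1, 0.02)
fn_sparse, fp_sparse = specimen_error_rates(200, 10, 0.02, 0.1, 0.02)
```

    In this toy model the specimen-level false positive rate does not depend on the abnormal cell proportion at all, while the false negative rate rises sharply once the proportion falls below the calibrated value, which is consistent with the behavior the abstract describes.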

  2. 28 CFR 17.41 - Access to classified information.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... sexual orientation in granting access to classified information. However, the Department may investigate... raised solely on the basis of the sexual orientation of the employee or mental health counseling. (d)...

  3. 32 CFR 2001.55 - Foreign disclosure of classified information.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... containing the classified information. Markings used to implement this section shall be approved in... Intelligence may issue policy directives or guidelines pursuant to section 6.2(b) of the Order that modify...

  4. 46 CFR 108.177 - Electrical equipment in classified locations.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Classified Locations § 108.177 Electrical... by the methods indicated in § 108.175 must only be essential equipment. Ventilation...

  5. 46 CFR 108.177 - Electrical equipment in classified locations.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Classified Locations § 108.177 Electrical... by the methods indicated in § 108.175 must only be essential equipment. Ventilation...

  6. 46 CFR 108.177 - Electrical equipment in classified locations.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... DRILLING UNITS DESIGN AND EQUIPMENT Construction and Arrangement Classified Locations § 108.177 Electrical... by the methods indicated in § 108.175 must only be essential equipment. Ventilation...

  7. 48 CFR 2814.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... of classified contracts. In accordance with FAR 14.409-2, the contracting officer shall advise the unsuccessful bidders, including any who did not bid, to take disposition action in accordance with...

  8. 48 CFR 2814.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... of classified contracts. In accordance with FAR 14.409-2, the contracting officer shall advise the unsuccessful bidders, including any who did not bid, to take disposition action in accordance with...

  9. 48 CFR 914.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 914.409-2 Award of classified contracts. DOE regulations regarding the safeguarding of restricted data and procedures for...

  10. 48 CFR 914.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 914.409-2 Award of classified contracts. DOE regulations regarding the safeguarding of restricted data and procedures for...

  11. 48 CFR 914.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 914.409-2 Award of classified contracts. DOE regulations regarding the safeguarding of restricted data and procedures for...

  12. 48 CFR 914.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 914.409-2 Award of classified contracts. DOE regulations regarding the safeguarding of restricted data and procedures for...

  13. 48 CFR 914.409-2 - Award of classified contracts.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 914.409-2 Award of classified contracts. DOE regulations regarding the safeguarding of restricted data and procedures for...

  14. Robust Combining of Disparate Classifiers Through Order Statistics

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2001-01-01

    Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum and in general, the ith order statistic, are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.
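    The combiners named above can be sketched as follows. This is a minimal illustration rather than the authors' formulation: the trim combiner is assumed here to be a trimmed mean of the ordered outputs and the spread combiner the average of the smallest and largest output, and the function names and scores are hypothetical.

```python
from statistics import median


def order_statistic(outputs, i):
    """Return the i-th smallest classifier output (1-indexed)."""
    return sorted(outputs)[i - 1]


def trim_combiner(outputs, trim=1):
    """Mean of the ordered outputs after dropping `trim` values at each end."""
    if 2 * trim >= len(outputs):
        raise ValueError("trim removes every output")
    kept = sorted(outputs)[trim:len(outputs) - trim]
    return sum(kept) / len(kept)


def spread_combiner(outputs):
    """Average of the smallest and largest outputs."""
    return (min(outputs) + max(outputs)) / 2


def combine(scores, combiner):
    """Combine per-class scores across classifiers and pick the best class.

    scores: one list of per-class scores per classifier.
    Returns (winning class index, combined per-class scores).
    """
    n_classes = len(scores[0])
    combined = [combiner([s[c] for s in scores]) for c in range(n_classes)]
    return max(range(n_classes), key=combined.__getitem__), combined


# Five hypothetical classifiers scoring two classes; the third one disagrees.
SCORES = [[0.7, 0.3], [0.6, 0.4], [0.1, 0.9], [0.65, 0.35], [0.8, 0.2]]
median_label, median_scores = combine(SCORES, median)
trim_label, trim_scores = combine(SCORES, trim_combiner)
max_label, _ = combine(SCORES, max)
```

    On these made-up scores the median and trim combiners ignore the single dissenting classifier, while the maximum combiner follows it, which shows how the choice of order statistic changes sensitivity to one outlying output.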

  15. 29. DETAIL OF CLASSIFIER, LOOKING NORTH NORTHWEST. THIS MACHINE WAS ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    29. DETAIL OF CLASSIFIER, LOOKING NORTH NORTHWEST. THIS MACHINE WAS USED TO SEPARATE SLIMES FROM SANDS TO PREPARE THE WET ORE PULP FOR CYANIDE PROCESSING. - Skidoo Mine, Park Route 38 (Skidoo Road), Death Valley Junction, Inyo County, CA

  16. A hierarchical approach to coding chemical, biological and pharmaceutical substances.

    PubMed

    Keefe, Anya R; Bert, Joel L; Grace, John R; Makaroff, Sylvia J; Lang, Barbara J; Band, Pierre R

    2005-01-01

    This hierarchical coding system is designed to classify substances into successively subordinate categories on the basis of chemical, physical and biological properties. Although initially developed for occupational cancer epidemiological studies, it is general in nature and can be used for other purposes where a systematic approach is needed to catalogue or analyze large numbers of substances and/or physical properties. The coding system incorporates a multilevel approach, where substances can be coded both on the basis of function and composition. On the first level, a three-digit code is assigned to each substance to indicate its primary use in the occupational environment (e.g. pesticide, catalyst, adhesive). Substances can then be coded using a ten-digit code to indicate structure and composition (e.g. organic molecule, biomolecule, pharmaceutical). Depending on the complexity required, analysis can incorporate the three-digit code, the ten-digit code, or a combination of both. The approach to coding both chemical and biological agents is modeled in part after conventional approaches used by the International Union of Pure and Applied Chemistry (IUPAC) and the International Union of Biochemistry (IUB). Development of the coding system was initiated in the 1980s in response to a need for a system allowing analysis of individual agents as well as classes or groups of substances. The project was undertaken as a collaborative venture between the BC Cancer Agency, Cancer Control Research program (then Division of Epidemiology) and the Department of Chemical and Biological Engineering at the University of British Columbia.
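    One way to represent such a two-part code in software is sketched below. The abstract does not publish the code tables, so every code value and substance name here is invented, and the idea that leading digits of the ten-digit code identify broader categories is an assumption made only for illustration.

```python
from dataclasses import dataclass


def _is_digit_code(value, length):
    """True if value is an ASCII digit string of exactly `length` characters."""
    return len(value) == length and value.isascii() and value.isdigit()


@dataclass(frozen=True)
class SubstanceCode:
    name: str
    use_code: str        # three digits: primary occupational use
    structure_code: str  # ten digits: structure and composition

    def __post_init__(self):
        if not _is_digit_code(self.use_code, 3):
            raise ValueError("use_code must be exactly three digits")
        if not _is_digit_code(self.structure_code, 10):
            raise ValueError("structure_code must be exactly ten digits")


def group_by_prefix(substances, depth):
    """Group substance names by the first `depth` digits of the structure code."""
    groups = {}
    for substance in substances:
        groups.setdefault(substance.structure_code[:depth], []).append(substance.name)
    return groups


# Hypothetical entries; the digits carry no real meaning.
CATALOGUE = [
    SubstanceCode("substance A", "120", "3010000000"),
    SubstanceCode("substance B", "120", "3010200000"),
    SubstanceCode("substance C", "450", "5000000000"),
]
by_class = group_by_prefix(CATALOGUE, depth=2)
```

    Grouping on a shorter or longer prefix gives the coarser or finer categories that an analysis of classes versus individual agents would need, under the prefix assumption stated above.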

  17. Dealing with contaminated datasets: An approach to classifier training

    NASA Astrophysics Data System (ADS)

    Homenda, Wladyslaw; Jastrzebska, Agnieszka; Rybnik, Mariusz

    2016-06-01

    The paper presents a novel approach to classification reinforced with a rejection mechanism. The method is based on a two-tier set of classifiers: the first layer classifies elements, and the second layer separates native elements from foreign ones in each distinguished class. The key novelty presented here is a training scheme for the rejection mechanism that follows the "one-against-all-other-classes" philosophy. The proposed method was tested in an empirical study of handwritten digit recognition.
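    The two-tier idea can be sketched with a deliberately simple stand-in. The paper's actual classifiers are not described in the abstract, so the code below swaps in a nearest-centroid classifier for the first tier and a per-class distance threshold for the second tier; each threshold is set using that class against all other classes, which is only one possible reading of the "one-against-all-other-classes" scheme. Data and names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def train(data):
    """data: dict mapping class label -> list of feature tuples (native classes)."""
    centroids = {label: centroid(points) for label, points in data.items()}
    radii = {}
    for label in data:
        center = centroids[label]
        # Farthest member of this class from its own centroid.
        own = max(math.dist(p, center) for p in data[label])
        # Nearest member of any other class: the stand-in for foreign elements.
        others = min(
            math.dist(p, center)
            for other, points in data.items() if other != label
            for p in points
        )
        radii[label] = (own + others) / 2
    return centroids, radii


def predict(x, centroids, radii):
    """Tier 1 picks the nearest class; tier 2 rejects by returning None."""
    label = min(centroids, key=lambda k: math.dist(x, centroids[k]))
    return label if math.dist(x, centroids[label]) <= radii[label] else None


# Toy two-dimensional data for two native classes.
DATA = {
    "0": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "1": [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
}
CENTROIDS, RADII = train(DATA)
```

    Calling `predict((0.5, 0.5), CENTROIDS, RADII)` keeps the element in class "0", whereas a point far from both classes, such as `(30.0, 30.0)`, is rejected as foreign.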

  18. DETAIL VIEW OF CLASSIFIER, TAILINGS LAUNDER TROUGH, LINE SHAFTS, AND ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL VIEW OF CLASSIFIER, TAILINGS LAUNDER TROUGH, LINE SHAFTS, AND CONCENTRATION TABLES, LOOKING SOUTHWEST. SLURRY EXITING THE BALL MILL WAS COLLECTED IN AN AMALGAMATION BOX (MISSING) FROM THE END OF THE MILL, AND INTRODUCED INTO THE CLASSIFIER. THE TAILINGS LAUNDER IS ON THE GROUND AT LOWER RIGHT. THE LINE SHAFTING ABOVE PROVIDED POWER TO THE CONCENTRATION TABLES BELOW AT CENTER RIGHT. - Gold Hill Mill, Warm Spring Canyon Road, Death Valley Junction, Inyo County, CA

  19. Evolving fuzzy rules in a learning classifier system

    NASA Technical Reports Server (NTRS)

    Valenzuela-Rendon, Manuel

    1993-01-01

    The fuzzy classifier system (FCS) combines the ideas of fuzzy logic controllers (FLC's) and learning classifier systems (LCS's). It brings together the expressive powers of fuzzy logic as it has been applied in fuzzy controllers to express relations between continuous variables, and the ability of LCS's to evolve co-adapted sets of rules. The goal of the FCS is to develop a rule-based system capable of learning in a reinforcement regime, and that can potentially be used for process control.

  20. One pass learning for generalized classifier neural network.

    PubMed

    Ozyildirim, Buse Melis; Avci, Mutlu

    2016-01-01

    Generalized classifier neural network, introduced as a kind of radial basis function neural network, uses a gradient-descent-based optimized smoothing parameter value to provide efficient classification. However, the optimization takes a long time, which is a drawback. In this work, one pass learning for generalized classifier neural network is proposed to overcome this disadvantage. The proposed method utilizes the standard deviation of each class to calculate the corresponding smoothing parameter. Since different datasets may have different standard deviations and data distributions, the proposed method tries to handle these differences by defining two functions for smoothing parameter calculation. Thresholding is applied to determine which function will be used. One of these functions is defined for datasets having different ranges of values. It provides balanced smoothing parameters for these datasets through a logarithmic function and by changing the operation range to the lower boundary. The other function calculates the smoothing parameter value for classes having a standard deviation smaller than the threshold value. The proposed method is tested on 14 datasets, and the performance of one pass learning generalized classifier neural network is compared with that of probabilistic neural network, radial basis function neural network, extreme learning machines, and standard and logarithmic learning generalized classifier neural network in the MATLAB environment. One pass learning generalized classifier neural network provides more than a thousand times faster classification than standard and logarithmic generalized classifier neural network. Due to its classification accuracy and speed, one pass generalized classifier neural network can be considered an efficient alternative to probabilistic neural network. Test results show that the proposed method overcomes the computational drawback of generalized classifier neural network and may increase the classification performance.