Functional interrogation of non-coding DNA through CRISPR genome editing
Canver, Matthew C.; Bauer, Daniel E.; Orkin, Stuart H.
2017-01-01
Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. PMID:28288828
Ferlaino, Michael; Rogers, Mark F.; Shihab, Hashem A.; Mort, Matthew; Cooper, David N.; Gaunt, Tom R.; Campbell, Colin
2018-01-01
Background: Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. Results: We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state-of-the-art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. Conclusions: FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome. PMID:28985712
Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin
2017-10-06
Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.
Functional interrogation of non-coding DNA through CRISPR genome editing.
Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H
2017-05-15
Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.
Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.
2014-01-01
Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue, half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. 
Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628
Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O
2014-01-01
Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue, half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. 
Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes.
Non-coding functions of alternative pre-mRNA splicing in development
Mockenhaupt, Stefan; Makeyev, Eugene V.
2015-01-01
A majority of messenger RNA precursors (pre-mRNAs) in the higher eukaryotes undergo alternative splicing to generate more than one mature product. By targeting the open reading frame region this process increases diversity of protein isoforms beyond the nominal coding capacity of the genome. However, alternative splicing also frequently controls output levels and spatiotemporal features of cellular and organismal gene expression programs. Here we discuss how these non-coding functions of alternative splicing contribute to development through regulation of mRNA stability, translational efficiency and cellular localization. PMID:26493705
Kapranov, Philipp; St Laurent, Georges; Raz, Tal; Ozsolak, Fatih; Reynolds, C Patrick; Sorensen, Poul H B; Reaman, Gregory; Milos, Patrice; Arceci, Robert J; Thompson, John F; Triche, Timothy J
2010-12-21
Discovery that the transcriptional output of the human genome is far more complex than predicted by the current set of protein-coding annotations and that most RNAs produced do not appear to encode proteins has transformed our understanding of genome complexity and suggests new paradigms of genome regulation. However, the fraction of all cellular RNA whose function we do not understand and the fraction of the genome that is utilized to produce that RNA remain controversial. This is not simply a bookkeeping issue because the degree to which this un-annotated transcription is present has important implications with respect to its biologic function and to the general architecture of genome regulation. For example, efforts to elucidate how non-coding RNAs (ncRNAs) regulate genome function will be compromised if that class of RNAs is dismissed as simply 'transcriptional noise'. We show that the relative mass of RNA whose function and/or structure we do not understand (the so called 'dark matter' RNAs), as a proportion of all non-ribosomal, non-mitochondrial human RNA (mt-RNA), can be greater than that of protein-encoding transcripts. This observation is obscured in studies that focus only on polyA-selected RNA, a method that enriches for protein coding RNAs and at the same time discards the vast majority of RNA prior to analysis. We further show the presence of a large number of very long, abundantly-transcribed regions (100's of kb) in intergenic space and further show that expression of these regions is associated with neoplastic transformation. These overlap some regions found previously in normal human embryonic tissues and raises an interesting hypothesis as to the function of these ncRNAs in both early development and neoplastic transformation. 
We conclude that 'dark matter' RNA can constitute the majority of non-ribosomal, non-mitochondrial-RNA and a significant fraction arises from numerous very long, intergenic transcribed regions that could be involved in neoplastic transformation.
Non-coding functions of alternative pre-mRNA splicing in development.
Mockenhaupt, Stefan; Makeyev, Eugene V
2015-12-01
A majority of messenger RNA precursors (pre-mRNAs) in the higher eukaryotes undergo alternative splicing to generate more than one mature product. By targeting the open reading frame region this process increases diversity of protein isoforms beyond the nominal coding capacity of the genome. However, alternative splicing also frequently controls output levels and spatiotemporal features of cellular and organismal gene expression programs. Here we discuss how these non-coding functions of alternative splicing contribute to development through regulation of mRNA stability, translational efficiency and cellular localization. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
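The detrended fluctuation analysis mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the implementation, so several choices below are assumptions: the purine/pyrimidine mapping to a ±1 walk, the linear (first-order) detrending in non-overlapping windows, the window sizes, and all function names. An exponent near 0.5 would be expected for an uncorrelated sequence, while long-range correlation would show up as a larger exponent.

```python
import math
import random

def dna_walk(seq):
    """Cumulative 'DNA walk': pyrimidines (C, T) step +1, purines (A, G) step -1 (assumed mapping)."""
    steps = [1 if base in "CT" else -1 for base in seq.upper() if base in "ACGT"]
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:
        total += step - mean          # subtract the mean so the walk has no global drift
        profile.append(total)
    return profile

def fluctuation(profile, n):
    """RMS residual around a least-squares line fitted in each non-overlapping window of length n."""
    x_mean = (n - 1) / 2
    x_var = sum((x - x_mean) ** 2 for x in range(n))
    sq_sum, count = 0.0, 0
    for start in range(0, len(profile) - n + 1, n):
        window = profile[start:start + n]
        y_mean = sum(window) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window)) / x_var
        for x, y in enumerate(window):
            sq_sum += (y - (y_mean + slope * (x - x_mean))) ** 2
        count += n
    return math.sqrt(sq_sum / count)

def dfa_exponent(seq, window_sizes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) against log n, estimated by ordinary least squares."""
    profile = dna_walk(seq)
    xs = [math.log(n) for n in window_sizes]
    ys = [math.log(fluctuation(profile, n)) for n in window_sizes]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Synthetic, uncorrelated test sequence (not GenBank data).
random.seed(0)
uncorrelated = "".join(random.choice("ACGT") for _ in range(20000))
alpha = dfa_exponent(uncorrelated)
print(f"alpha = {alpha:.2f}")  # expected to lie close to 0.5 for uncorrelated input
```

Fitting the trend locally in each window is what lets the method tolerate the "non-stationarity" the abstract refers to: slow drifts in base composition are removed before the fluctuation is measured.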
Sanges, Remo; Hadzhiev, Yavor; Gueroult-Bellone, Marion; Roure, Agnes; Ferg, Marco; Meola, Nicola; Amore, Gabriele; Basu, Swaraj; Brown, Euan R.; De Simone, Marco; Petrera, Francesca; Licastro, Danilo; Strähle, Uwe; Banfi, Sandro; Lemaire, Patrick; Birney, Ewan; Müller, Ferenc; Stupka, Elia
2013-01-01
Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remains elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis-regulatory modules affirmed during chordate evolution, we have searched for conserved elements in tunicate and vertebrate (Olfactores) genomes. We identified, for the first time, 183 non-coding sequences that are highly conserved between the two groups. Our results show that all but one element are conserved in non-syntenic regions between vertebrate and tunicate genomes, while being syntenic among vertebrates. Nevertheless, in all the groups, they are significantly associated with transcription factors showing specific functions fundamental to animal development, such as multicellular organism development and sequence-specific DNA binding. The majority of these regions map onto ultraconserved elements and we demonstrate that they can act as functional enhancers within the organism of origin, as well as in cross-transgenesis experiments, and that they are transcribed in extant species of Olfactores. We refer to the elements as ‘Olfactores conserved non-coding elements’. PMID:23393190
Javierre, Biola M; Burren, Oliver S; Wilder, Steven P; Kreuzhuber, Roman; Hill, Steven M; Sewitz, Sven; Cairns, Jonathan; Wingett, Steven W; Várnai, Csilla; Thiecke, Michiel J; Burden, Frances; Farrow, Samantha; Cutler, Antony J; Rehnström, Karola; Downes, Kate; Grassi, Luigi; Kostadima, Myrto; Freire-Pritchett, Paula; Wang, Fan; Stunnenberg, Hendrik G; Todd, John A; Zerbino, Daniel R; Stegle, Oliver; Ouwehand, Willem H; Frontini, Mattia; Wallace, Chris; Spivakov, Mikhail; Fraser, Peter
2016-11-17
Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Duan, Aiying; Jiang, Chaowei; Hu, Qiang; Zhang, Huai; Gary, G. Allen; Wu, S. T.; Cao, Jinbin
2017-06-01
Magnetic field extrapolation is an important tool to study the three-dimensional (3D) solar coronal magnetic field, which is difficult to directly measure. Various analytic models and numerical codes exist, but their results often drastically differ. Thus, a critical comparison of the modeled magnetic field lines with the observed coronal loops is strongly required to establish the credibility of the model. Here we compare two different non-potential extrapolation codes, a nonlinear force-free field code (CESE-MHD-NLFFF) and a non-force-free field (NFFF) code, in modeling a solar active region (AR) that has a sigmoidal configuration just before a major flare erupted from the region. A 2D coronal-loop tracing and fitting method is employed to study the 3D misalignment angles between the extrapolated magnetic field lines and the EUV loops as imaged by SDO/AIA. It is found that the CESE-MHD-NLFFF code with preprocessed magnetogram performs the best, outputting a field that matches the coronal loops in the AR core imaged in AIA 94 Å with a misalignment angle of ~10°. This suggests that the CESE-MHD-NLFFF code, even without using the information of the coronal loops in constraining the magnetic field, performs as well as some coronal-loop forward-fitting models. For the loops as imaged by AIA 171 Å in the outskirts of the AR, all the codes including the potential field give comparable results of the mean misalignment angle (~30°). Thus, further improvement of the codes is needed for a better reconstruction of the long loops enveloping the core region.
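The misalignment angle used as the figure of merit here can be sketched as the angle between a modeled field vector and the local tangent of a traced loop. The abstract does not state the formula, so this is an assumption: the angle is folded into the range 0° to 90° with an absolute value, because a loop tangent has no preferred direction. The function names and sample vectors are hypothetical.

```python
import math

def misalignment_angle(b_field, loop_tangent):
    """Angle in degrees between a field vector and a loop tangent, folded into [0, 90]."""
    dot = sum(b * t for b, t in zip(b_field, loop_tangent))
    norm_b = math.sqrt(sum(b * b for b in b_field))
    norm_t = math.sqrt(sum(t * t for t in loop_tangent))
    # Clamp to 1.0 to guard against rounding slightly above the valid acos domain.
    cos_theta = min(1.0, abs(dot) / (norm_b * norm_t))
    return math.degrees(math.acos(cos_theta))

def mean_misalignment(field_vectors, tangents):
    """Average misalignment over paired samples along one or more loops."""
    angles = [misalignment_angle(b, t) for b, t in zip(field_vectors, tangents)]
    return sum(angles) / len(angles)

# Hypothetical sample vectors, not values from the study.
sample_fields = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
sample_tangents = [(1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
mean_angle = mean_misalignment(sample_fields, sample_tangents)
```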
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duan, Aiying; Zhang, Huai; Jiang, Chaowei
Magnetic field extrapolation is an important tool to study the three-dimensional (3D) solar coronal magnetic field, which is difficult to directly measure. Various analytic models and numerical codes exist, but their results often drastically differ. Thus, a critical comparison of the modeled magnetic field lines with the observed coronal loops is strongly required to establish the credibility of the model. Here we compare two different non-potential extrapolation codes, a nonlinear force-free field code (CESE–MHD–NLFFF) and a non-force-free field (NFFF) code, in modeling a solar active region (AR) that has a sigmoidal configuration just before a major flare erupted from the region. A 2D coronal-loop tracing and fitting method is employed to study the 3D misalignment angles between the extrapolated magnetic field lines and the EUV loops as imaged by SDO/AIA. It is found that the CESE–MHD–NLFFF code with preprocessed magnetogram performs the best, outputting a field that matches the coronal loops in the AR core imaged in AIA 94 Å with a misalignment angle of ∼10°. This suggests that the CESE–MHD–NLFFF code, even without using the information of the coronal loops in constraining the magnetic field, performs as well as some coronal-loop forward-fitting models. For the loops as imaged by AIA 171 Å in the outskirts of the AR, all the codes including the potential field give comparable results of the mean misalignment angle (∼30°). Thus, further improvement of the codes is needed for a better reconstruction of the long loops enveloping the core region.
Insights into HLA-G Genetics Provided by Worldwide Haplotype Diversity
Castelli, Erick C.; Ramalho, Jaqueline; Porto, Iane O. P.; Lima, Thálitta H. A.; Felício, Leandro P.; Sabbagh, Audrey; Donadi, Eduardo A.; Mendes-Junior, Celso T.
2014-01-01
Human leukocyte antigen G (HLA-G) belongs to the family of non-classical HLA class I genes, located within the major histocompatibility complex (MHC). HLA-G has been the target of most recent research regarding the function of class I non-classical genes. The main features that distinguish HLA-G from classical class I genes are (a) limited protein variability, (b) alternative splicing generating several membrane bound and soluble isoforms, (c) short cytoplasmic tail, (d) modulation of immune response (immune tolerance), and (e) restricted expression to certain tissues. In the present work, we describe the HLA-G gene structure and address the HLA-G variability and haplotype diversity among several populations around the world, considering each of its major segments [promoter, coding, and 3′ untranslated region (UTR)]. For this purpose, we developed a pipeline to reevaluate the 1000Genomes data and recover miscalled or missing genotypes and haplotypes. It became clear that the overall structure of the HLA-G molecule has been maintained during the evolutionary process and that most of the variation sites found in the HLA-G coding region are either coding synonymous or intronic mutations. In addition, only a few frequent and divergent extended haplotypes are found when the promoter, coding, and 3′UTRs are evaluated together. The divergence is particularly evident for the regulatory regions. The population comparisons confirmed that most of the HLA-G variability has originated before human dispersion from Africa and that the allele and haplotype frequencies have probably been shaped by strong selective pressures. PMID:25339953
Lazzarato, F; Franceschinis, G; Botta, M; Cordero, F; Calogero, R A
2004-11-01
RRE allows the extraction of non-coding regions surrounding a coding sequence [i.e. gene upstream region, 5'-untranslated region (5'-UTR), introns, 3'-UTR, downstream region] from annotated genomic datasets available at NCBI. RRE parser and web-based interface are accessible at http://www.bioinformatica.unito.it/bioinformatics/rre/rre.html
Complete mitochondrial genome of Eagle Owl (Bubo bubo, Strigiformes; Strigidae) from China.
Hengjiu, Tian; Jianwei, Ji; Shi, Yang; Zhiming, Zhang; Laghari, Muhammad Younis; Narejo, Naeem Tariq; Lashari, Punhal
2016-01-01
In the present study, the complete mitochondrial genome sequence of Bubo bubo using PCR amplification, sequencing and assembling has been obtained for the first time. The total length of the mitochondrial genome was 16,250 bp, with the base composition of 29.88% A, 34.16% C, 14.35% G, and 21.58% T. It contained 37 genes (2 ribosomal RNA genes, 13 protein-coding genes and 22 transfer RNA genes) and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of Bubo bubo provides an important data set for further investigation on the phylogenetic relationships within Strigiformes.
Identification of coding and non-coding mutational hotspots in cancer genomes.
Piraino, Scott W; Furney, Simon J
2017-01-05
The identification of mutations that play a causal role in tumour development, so called "driver" mutations, is of critical importance for understanding how cancers form and how they might be treated. Several large cancer sequencing projects have identified genes that are recurrently mutated in cancer patients, suggesting a role in tumourigenesis. While the landscape of coding drivers has been extensively studied and many of the most prominent driver genes are well characterised, comparatively less is known about the role of mutations in the non-coding regions of the genome in cancer development. The continuing fall in genome sequencing costs has resulted in a concomitant increase in the number of cancer whole genome sequences being produced, facilitating systematic interrogation of both the coding and non-coding regions of cancer genomes. To examine the mutational landscapes of tumour genomes we have developed a novel method to identify mutational hotspots in tumour genomes using both mutational data and information on evolutionary conservation. We have applied our methodology to over 1300 whole cancer genomes and show that it identifies prominent coding and non-coding regions that are known or highly suspected to play a role in cancer. Importantly, we applied our method to the entire genome, rather than relying on predefined annotations (e.g. promoter regions) and we highlight recurrently mutated regions that may have resulted from increased exposure to mutational processes rather than selection, some of which have been identified previously as targets of selection. Finally, we implicate several pan-cancer and cancer-specific candidate non-coding regions, which could be involved in tumourigenesis. We have developed a framework to identify mutational hotspots in cancer genomes, which is applicable to the entire genome. 
This framework identifies known and novel coding and non-coding mutational hotspots and can be used to differentiate candidate driver regions from likely passenger regions susceptible to somatic mutation.
Hall, L; Laird, J E; Craig, R K
1984-01-01
Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
Simard, Frédéric; Licht, Monica; Besansky, Nora J.; Lehmann, Tovi
2007-01-01
Genetic variation in defensin, a gene encoding a major effector molecule of the insect immune response, was analyzed within and between populations of three members of the Anopheles gambiae complex. The species selected included the two anthropophilic species, An. gambiae and An. arabiensis, and the most zoophilic species of the complex, An. quadriannulatus. The first species was represented by four populations spanning its extreme genetic and geographical ranges, whereas each of the other two species was represented by a single population. We found (i) reduced overall polymorphism in the mature peptide region and in the total coding region, together with specific reductions in rare and moderately frequent mutations (sites) in the coding region compared with non-coding regions, (ii) markedly reduced rate of nonsynonymous diversity compared with synonymous variation in the mature peptide and virtually identical mature peptide across the three species, and (iii) increased divergence between species in the mature peptide together with reduced differentiation between populations of An. gambiae in the same DNA region. These patterns suggest a strong purifying selection on the mature peptide and probably the whole coding region. Because An. quadriannulatus is not exposed to human pathogens, identical mature peptide and similar pattern of polymorphism across species implies that human pathogens played no role as selective agents on this peptide. PMID:17161659
Ma, Yuanyuan; Zheng, Xiaodong; Cheng, Rubin; Li, Qi
2016-01-01
In this paper, we determined the complete mitochondrial genome of Octopus conispadiceus (Cephalopoda: Octopodidae). The whole mitogenome of O. conispadiceus is 16,027 basepairs (bp) in length with a base composition of 41.4% A, 34.8% T, 16.1% C, 7.7% G and contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes, and a major non-coding region (MNR). The gene arrangements of O. conispadiceus showed remarkable similarity to that of O. vulgaris, Amphioctopus fangsiao, Cistopus chinensis and C. taiwanicus.
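As a small illustration of how a base composition such as the one reported here is computed, the sketch below counts nucleotides in a sequence string and converts the counts to percentages. The function name and the toy sequence are hypothetical; the reported figures themselves (41.4% A, 34.8% T, 16.1% C, 7.7% G) sum to 100.0% and imply an A+T content of 76.2%.

```python
from collections import Counter

def base_composition(seq):
    """Return the percentage of each nucleotide (A, C, G, T) in a DNA string."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())  # ambiguous symbols such as N are ignored
    return {base: 100.0 * counts[base] / total for base in "ACGT"}

# Hypothetical toy sequence, not taken from the O. conispadiceus mitogenome.
composition = base_composition("AATTAATTCG")
at_content = composition["A"] + composition["T"]
```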
USDA-ARS's Scientific Manuscript database
The Persian walnut (Juglans regia L.), a diploid species native to the mountainous regions of Central Asia, is the major walnut species cultivated for nut production and is one of the most widespread tree nut species in the world. The high nutritional value of J. regia nuts is associated with a rich...
Raghavan, Sathees C.; Hsieh, Chih-Lin; Lieber, Michael R.
2005-01-01
The t(14;18) chromosomal translocation is the most common translocation in human cancer, and it occurs in all follicular lymphomas. The 150-bp bcl-2 major breakpoint region (Mbr) on chromosome 18 is a fragile site, because it adopts a non-B DNA conformation that can be cleaved by the RAG complex. The non-B DNA structure and the chromosomal translocation can be recapitulated on intracellular human minichromosomes where immunoglobulin 12- and 23-signals are positioned downstream of the bcl-2 Mbr. Here we show that either of the two coding ends in these V(D)J recombination reactions can recombine with either of the two broken ends of the bcl-2 Mbr but that neither signal end can recombine with the Mbr. Moreover, we show that the rejoining is fully dependent on DNA ligase IV, indicating that the rejoining phase relies on the nonhomologous DNA end-joining pathway. These results permit us to formulate a complete model for the order and types of cleavage and rejoining events in the t(14;18) translocation. PMID:16024785
2012-01-01
Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in a genome sequence. However, the accuracy of previous border-finding methods based on entropic segmentation still needs to be improved. Methods: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to get preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. Results: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences. Conclusions: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For three bacterial genomes, compared with the A12_JR method, our method raised the accuracy of finding the borders between protein-coding and non-coding regions in DNA sequences. PMID:23282225
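The core step of an entropic segmentation of this kind can be sketched as a search for the cut point that maximizes the Jensen-Rényi divergence between the two resulting segments. This is a simplified illustration, not the published method: it assumes a plain 4-letter nucleotide alphabet instead of the 22-symbol alphabet, a Rényi order of 0.5, a single cut with no recursion and no significance test, and input containing only A, C, G and T. All names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: plain nucleotides, not the 22-symbol alphabet

def renyi_entropy(probs, alpha=0.5):
    """Rényi entropy of order alpha (alpha != 1), in nats."""
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)

def frequencies(counts, total):
    return [counts[symbol] / total for symbol in ALPHABET]

def best_cut(seq, alpha=0.5, min_len=5):
    """Return (position, divergence) of the single cut maximizing the Jensen-Rényi divergence."""
    n = len(seq)
    whole = Counter(seq)
    h_whole = renyi_entropy(frequencies(whole, n), alpha)
    left = Counter(seq[:min_len])            # counts for seq[:cut], updated incrementally
    best_pos, best_div = None, -1.0
    for cut in range(min_len, n - min_len + 1):
        right = whole - left                 # counts for seq[cut:]
        w_left, w_right = cut / n, (n - cut) / n
        # Entropy of the length-weighted mixture (the whole sequence) minus
        # the weighted entropies of the two segments.
        div = h_whole - (
            w_left * renyi_entropy(frequencies(left, cut), alpha)
            + w_right * renyi_entropy(frequencies(right, n - cut), alpha)
        )
        if div > best_div:
            best_pos, best_div = cut, div
        left[seq[cut]] += 1                  # move one symbol from the right segment to the left
    return best_pos, best_div

# Synthetic example: two compositionally distinct halves.
position, divergence = best_cut("A" * 50 + "C" * 50)
```

A recursive version would apply the same search to each resulting segment and stop when the best divergence is no longer significant; that stopping rule is not specified in the abstract and is left out here.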
Classification of breast tissue in mammograms using efficient coding.
Costa, Daniel D; Campos, Lúcio F; Barros, Allan K
2011-06-24
Female breast cancer is the major cause of death by cancer in western countries. Efforts in computer vision have been made to improve the diagnostic accuracy of radiologists. Some methods of lesion diagnosis in mammogram images were developed based on the technique of principal component analysis, which has been used in efficient coding of signals, and on 2D Gabor wavelets, used for computer vision applications and modeling biological vision. In this work, we present a methodology that uses efficient coding along with linear discriminant analysis to distinguish between mass and non-mass in 5090 regions of interest from mammograms. The results show that the best rates of success reached with Gabor wavelets and principal component analysis were 85.28% and 87.28%, respectively. In comparison, the model of efficient coding presented here reached up to 90.07%. Altogether, the results presented demonstrate that independent component analysis successfully performed the efficient coding needed to discriminate mass from non-mass tissues. In addition, we have observed that LDA with ICA bases showed high predictive performance for some datasets and thus provides significant support for a more detailed clinical investigation.
Sikorav, J L; Duval, N; Anselmet, A; Bon, S; Krejci, E; Legay, C; Osterlund, M; Reimund, B; Massoulié, J
1988-01-01
In this paper, we show the existence of alternative splicing in the 3' region of the coding sequence of Torpedo acetylcholinesterase (AChE). We describe two cDNA structures which both diverge from the previously described coding sequence of the catalytic subunit of asymmetric (A) forms (Schumacher et al., 1986; Sikorav et al., 1987). They both contain a coding sequence followed by a non-coding sequence and a poly(A) stretch. Both of these structures were shown to exist in poly(A)+ RNAs, by S1 mapping experiments. The divergent region encoded by the first sequence corresponds to the precursor of the globular dimeric form (G2a), since it contains the expected C-terminal amino acids, Ala-Cys. These amino acids are followed by a 29 amino acid extension which contains a hydrophobic segment and must be replaced by a glycolipid in the mature protein. Analyses of intact G2a AChE showed that the common domain of the protein contains intersubunit disulphide bonds. The divergent region of the second type of cDNA consists of an adjacent genomic sequence, which is removed as an intron in A and Ga mRNAs, but may encode a distinct, less abundant catalytic subunit. The structures of the cDNA clones indicate that they are derived from minor mRNAs, shorter than the three major transcripts which have been described previously (14.5, 10.5 and 5.5 kb). Oligonucleotide probes specific for the asymmetric and globular terminal regions hybridize with the three major transcripts, indicating that their size is determined by 3'-untranslated regions which are not related to the differential splicing leading to A and Ga forms. PMID:3181125
NASA Technical Reports Server (NTRS)
Chang, Dong Kyung; Metzgar, David; Wills, Christopher; Boland, C. Richard
2003-01-01
All "minor" components of the human DNA mismatch repair (MMR) system (MSH3, MSH6, PMS2, and the recently discovered MLH3) contain mononucleotide microsatellites in their coding sequences. This intriguing finding contrasts with the situation found in the major components of the DNA MMR system (MSH2 and MLH1) and, in fact, most human genes. Although eukaryotic genomes are rich in microsatellites, non-triplet microsatellites are rare in coding regions. The recurring presence of exonal mononucleotide repeat sequences within a single family of human genes would therefore be considered exceptional.
Huang, Chen; Morlighem, Jean-Étienne R L; Cai, Jing; Liao, Qiwen; Perez, Carlos Daniel; Gomes, Paula Braga; Guo, Min; Rádis-Baptista, Gandhi; Lee, Simon Ming-Yuen
2017-07-13
Long non-coding RNAs (lncRNAs) have been shown to play regulatory roles in a diverse range of biological processes and are associated with the outcomes of various diseases. The majority of studies about lncRNAs focus on model organisms, with less investigation in non-model organisms to date. Herein, we have undertaken an investigation of lncRNAs in two zoanthids (cnidarians): Protopalythoa variabilis and Palythoa caribaeorum. A total of 11,206 and 13,240 lncRNAs were detected in the P. variabilis and P. caribaeorum transcriptomes, respectively. Comparison using the NONCODE database indicated that the majority of these lncRNAs are taxonomically species-restricted with no identifiable orthologs. Even so, we found cases in which short regions of P. caribaeorum's lncRNAs were similar to vertebrate species' lncRNAs, and could be associated with conserved lncRNA regulatory functions. Consequently, some high-confidence lncRNA-mRNA interactions were predicted based on such conserved regions, revealing possible involvement of lncRNAs in posttranscriptional processing and regulation in anthozoans. Moreover, investigation of differentially expressed lncRNAs, in healthy colonies and colonial individuals undergoing natural bleaching, indicated that some up-regulated lncRNAs in P. caribaeorum could posttranscriptionally regulate the mRNAs encoding proteins of the Ras-mediated signal transduction pathway and components of the innate immune system, which could contribute to the molecular response of coral bleaching.
Identification of G-quadruplex forming sequences in three manatee papillomaviruses
Zahin, Maryam; Dean, William L.; Ghim, Shin-je; Joh, Joongho; Gray, Robert D.; Khanal, Sujita; Bossart, Gregory D.; Mignucci-Giannoni, Antonio A.; Rouchka, Eric C.; Jenson, Alfred B.; Trent, John O.; Chaires, Jonathan B.
2018-01-01
The Florida manatee (Trichechus manatus latirostris) is a threatened aquatic mammal in United States coastal waters. Over the past decade, the appearance of papillomavirus-induced lesions and viral papillomatosis in manatees has been a concern for those involved in the management and rehabilitation of this species. To date, three manatee papillomaviruses (TmPVs) have been identified in Florida manatees, one forming cutaneous lesions (TmPV1) and two forming genital lesions (TmPV3 and TmPV4). We identified DNA sequences with the potential to form G-quadruplex structures (G4) across the three genomes. G4 were located on both DNA strands and across coding and non-coding regions on all TmPVs, offering multiple targets for viral control. Although G4 have been identified in several viral genomes, including human PVs, most research has focused on canonical structures comprised of three G-tetrads. In contrast, the vast majority of sequences we identified would allow the formation of non-canonical structures with only two G-tetrads. Our biophysical analysis confirmed the formation of G4 with parallel topology in three such sequences from the E2 region. Two of the structures appear comprised of multiple stacked two G-tetrad structures, perhaps serving to increase structural stability. Computational analysis demonstrated enrichment of G4 sequences on all TmPVs on the reverse strand in the E2/E4 region and on both strands in the L2 region. Several G4 sequences occurred at similar regional locations on all PVs, most notably on the reverse strand in the E2 region. In other cases, G4 were identified at similar regional locations only on PVs forming genital lesions. On all TmPVs, G4 sequences were located in the non-coding region near putative E2 binding sites. Together, these findings suggest that G4 are possible regulatory elements in TmPVs. PMID:29630682
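A sequence scan for putative G4-forming motifs on both strands can be sketched with a regular expression. This is a minimal illustration, not the pipeline used in the study: the motif definition (four G-runs separated by loops of 1 to 7 nucleotides) is a commonly used convention that the abstract does not state, and the function names and default parameters are hypothetical. Setting `min_g=2` allows two-tetrad candidates, while `min_g=3` restricts the scan to canonical three-tetrad candidates.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_g4(seq, min_g=2, max_loop=7):
    """Return (strand, start, end, motif) for putative G4 motifs.

    Coordinates are 0-based, end-exclusive, and always given on the
    forward strand; the motif is reported as read on its own strand.
    """
    seq = seq.upper()
    n = len(seq)
    # Assumed motif: G-run, loop, G-run, loop, G-run, loop, G-run.
    pattern = re.compile(
        r"(?:G{%d,}[ACGT]{1,%d}){3}G{%d,}" % (min_g, max_loop, min_g)
    )
    hits = [("+", m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Map reverse-strand positions back to forward-strand coordinates.
        hits.append(("-", n - m.end(), n - m.start(), m.group()))
    return hits
```

Because `finditer` reports non-overlapping matches only, overlapping candidates are not enumerated; a regex hit also says nothing about whether the structure actually folds, which is why the study confirmed selected sequences biophysically.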
Current Research on Non-Coding Ribonucleic Acid (RNA).
Wang, Jing; Samuels, David C; Zhao, Shilin; Xiang, Yu; Zhao, Ying-Yong; Guo, Yan
2017-12-05
Non-coding ribonucleic acid (RNA) has without a doubt captured the interest of biomedical researchers. The ability to screen the entire human genome with high-throughput sequencing technology has greatly enhanced the identification, annotation and prediction of the functionality of non-coding RNAs. In this review, we discuss the current landscape of non-coding RNA research and quantitative analysis. Non-coding RNA will be categorized into two major groups by size: long non-coding RNAs and small RNAs. In long non-coding RNA, we discuss regular long non-coding RNA, pseudogenes and circular RNA. In small RNA, we discuss miRNA, transfer RNA, piwi-interacting RNA, small nucleolar RNA, small nuclear RNA, Y RNA, signal recognition particle RNA, and 7SK RNA. We elaborate on the origin, detection method, potential association with disease, putative functional mechanisms, and public resources for these non-coding RNAs. We aim to provide readers with a complete overview of non-coding RNAs and incite additional interest in non-coding RNA research.
Shabalina, Svetlana A.; Ogurtsov, Aleksey Y.; Spiridonov, Nikolay A.; Koonin, Eugene V.
2014-01-01
Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns. PMID:24792168
Juul, Malene; Bertl, Johanna; Guo, Qianyun; Nielsen, Morten Muhlig; Świtnicki, Michał; Hornshøj, Henrik; Madsen, Tobias; Hobolth, Asger; Pedersen, Jakob Skou
2017-01-01
Non-coding mutations may drive cancer development. Statistical detection of non-coding driver regions is challenged by a varying mutation rate and uncertainty of functional impact. Here, we develop a statistically founded non-coding driver-detection method, ncdDetect, which includes sample-specific mutational signatures, long-range mutation rate variation, and position-specific impact measures. Using ncdDetect, we screened non-coding regulatory regions of protein-coding genes across a pan-cancer set of whole-genomes (n = 505), which top-ranked known drivers and identified new candidates. For individual candidates, presence of non-coding mutations associates with altered expression or decreased patient survival across an independent pan-cancer sample set (n = 5454). This includes an antigen-presenting gene (CD1A), where 5’UTR mutations correlate significantly with decreased survival in melanoma. Additionally, mutations in a base-excision-repair gene (SMUG1) correlate with a C-to-T mutational-signature. Overall, we find that a rich model of mutational heterogeneity facilitates non-coding driver identification and integrative analysis points to candidates of potential clinical relevance. DOI: http://dx.doi.org/10.7554/eLife.21778.001 PMID:28362259
Evaluation of Multi-Vessel Ship Motion Prediction Codes
2008-09-01
each other, and accounting for the hydrodynamic effects between the hulls. The major differences in the capabilities of the codes were in the non...Figure 28. Effects of irregular frequency smoothing has on the resultant pitch transfer function for three meter separation, 135 degree heading, and...and accounting for the hydrodynamic effects between the hulls. The major differences in the capabilities of the codes were in the non-hydrodynamic
Zhuo, Chuanjun; Hou, Weihong; Hu, Lirong; Lin, Chongguang; Chen, Ce; Lin, Xiaodong
2017-01-01
Schizophrenia is a genetically related mental illness, in which the majority of genetic alterations occur in the non-coding regions of the human genome. In the past decade, a growing number of regulatory non-coding RNAs (ncRNAs) including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) have been identified to be strongly associated with schizophrenia. However, the studies of these ncRNAs in the pathophysiology of schizophrenia and the reverting of their genetic defects in restoration of the normal phenotype have been hampered by insufficient technology to manipulate these ncRNA genes effectively as well as a lack of appropriate animal models. Most recently, a revolutionary gene editing technology known as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated nuclease 9 (Cas9; CRISPR/Cas9) has been developed that enables researchers to overcome these challenges. In this review article, we mainly focus on the schizophrenia-related ncRNAs and the use of CRISPR/Cas9-mediated editing of the non-coding regions of the genomic DNA in proving a causal relationship between the genetic defects and the pathophysiology of schizophrenia. We subsequently discuss the potential of translating this advanced technology into a clinical therapy for schizophrenia, although the CRISPR/Cas9 technology is currently still in its infancy and too immature to be used in the treatment of diseases. Furthermore, we suggest strategies to accelerate the pace from the bench to the bedside. This review describes the application of the powerful and feasible CRISPR/Cas9 technology to manipulate schizophrenia-associated ncRNA genes. This technology could help researchers tackle this complex health problem and perhaps other genetically related mental disorders due to the overlapping genetic alterations of schizophrenia with other mental illnesses. PMID:28217082
Quantifying the mechanisms of domain gain in animal proteins.
Buljan, Marija; Frankish, Adam; Bateman, Alex
2010-01-01
Protein domains are protein regions that are shared among different proteins and are frequently functionally and structurally independent from the rest of the protein. Novel domain combinations have a major role in evolutionary innovation. However, the relative contributions of the different molecular mechanisms that underlie domain gains in animals are still unknown. By using animal gene phylogenies we were able to identify a set of high confidence domain gain events and by looking at their coding DNA investigate the causative mechanisms. Here we show that the major mechanism for gains of new domains in metazoan proteins is likely to be gene fusion through joining of exons from adjacent genes, possibly mediated by non-allelic homologous recombination. Retroposition and insertion of exons into ancestral introns through intronic recombination are, in contrast to previous expectations, only minor contributors to domain gains and have accounted for less than 1% and 10% of high confidence domain gain events, respectively. Additionally, exonization of previously non-coding regions appears to be an important mechanism for addition of disordered segments to proteins. We observe that gene duplication has preceded domain gain in at least 80% of the gain events. The interplay of gene duplication and domain gain demonstrates an important mechanism for fast neofunctionalization of genes.
Complete mitochondrial genome of the larch hawk moth, Sphinx morio (Lepidoptera: Sphingidae).
Kim, Min Jee; Choi, Sei-Woong; Kim, Iksoo
2013-12-01
The larch hawk moth, Sphinx morio, belongs to the lepidopteran family Sphingidae that has long been studied as a family of model insects in diverse fields. In this study, we describe the complete mitochondrial genome (mitogenome) sequences of the species in terms of general genomic features and characteristic short repetitive sequences found in the A + T-rich region. The 15,299-bp-long genome consisted of a typical set of genes (13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes) and one major non-coding A + T-rich region, with the typical arrangement found in Lepidoptera. The 316-bp-long A + T-rich region located between srRNA and tRNA(Met) harbored the conserved sequence blocks that are typically found in lepidopteran insects. Additionally, the A + T-rich region of S. morio contained three characteristic repeat sequences that are rarely found in Lepidoptera: two identical 12-bp repeats, three identical 5-bp-long tandem repeats, and six nearly identical 5-6 bp long repeat sequences.
Analysis and recognition of 5′ UTR intron splice sites in human pre-mRNA
Eden, E.; Brunak, S.
2004-01-01
Prediction of splice sites in non-coding regions of genes is one of the most challenging aspects of gene structure recognition. We perform a rigorous analysis of such splice sites embedded in human 5′ untranslated regions (UTRs), and investigate correlations between this class of splice sites and other features found in the adjacent exons and introns. By restricting the training of neural network algorithms to ‘pure’ UTRs (not extending partially into protein coding regions), we for the first time investigate the predictive power of the splicing signal proper, in contrast to conventional splice site prediction, which typically relies on the change in sequence at the transition from protein coding to non-coding. By doing so, the algorithms were able to pick up subtler splicing signals that were otherwise masked by ‘coding’ noise, thus enhancing significantly the prediction of 5′ UTR splice sites. For example, the non-coding splice site predicting networks pick up compositional and positional bias in the 3′ ends of non-coding exons and 5′ non-coding intron ends, where cytosine and guanine are over-represented. This compositional bias at the true UTR donor sites is also visible in the synaptic weights of the neural networks trained to identify UTR donor sites. Conventional splice site prediction methods perform poorly in UTRs because the reading frame pattern is absent. The NetUTR method presented here performs 2–3-fold better compared with NetGene2 and GenScan in 5′ UTRs. We also tested the 5′ UTR trained method on protein coding regions, and discovered, surprisingly, that it works quite well (although it cannot compete with NetGene2). This indicates that the local splicing pattern in UTRs and coding regions is largely the same. The NetUTR method is made publicly available at www.cbs.dtu.dk/services/NetUTR. PMID:14960723
Molecular characterization of Banana streak virus isolate from Musa Acuminata in China.
Zhuang, Jun; Wang, Jian-Hua; Zhang, Xin; Liu, Zhi-Xin
2011-12-01
Banana streak virus (BSV), a member of genus Badnavirus, is a causal agent of banana streak disease throughout the world. The genetic diversity of BSVs from different regions of banana plantations has previously been investigated, but there are relatively few reports of the genetic characteristics of episomal (non-integrated) BSV genomes isolated from China. Here, the complete genome, a total of 7722 bp (GenBank accession number DQ092436), of an isolate of Banana streak virus (BSV) on cultivar Cavendish (BSAcYNV) in Yunnan, China was determined. The genome is organised in the typical manner of badnaviruses. The intergenic region of genomic DNA contains a large stem-loop, which may contribute to the ribosome shift into the following open reading frames (ORFs). The coding region of BSAcYNV consists of three overlapping ORFs: ORF1, with a non-AUG start codon, and ORF2 encode two small proteins that are individually involved in viral movement, and ORF3 encodes a polyprotein. Besides the complete genome, a defective genome of 6525 bp, lacking the whole RNA leader region and a majority of ORF1, was also isolated and sequenced from this BSV DNA reservoir in infected banana plants. Sequence analyses showed that BSAcYNV has closest similarity in terms of genome organization and coding assignments with a BSV isolate from Vietnam (BSAcVNV). The corresponding coding regions shared identities of 88% and approximately 95% at nucleotide and amino acid levels, respectively. Phylogenetic analysis also indicated BSAcYNV shared the closest geographical evolutionary relationship to BSAcVNV among sequenced banana streak badnaviruses.
Seim, Inge; Carter, Shea L; Herington, Adrian C; Chopin, Lisa K
2008-01-01
Background The peptide hormone ghrelin has many important physiological and pathophysiological roles, including the stimulation of growth hormone (GH) release, appetite regulation, gut motility and proliferation of cancer cells. We previously identified a gene on the opposite strand of the ghrelin gene, ghrelinOS (GHRLOS), which spans the promoter and untranslated regions of the ghrelin gene (GHRL). Here we further characterise GHRLOS. Results We have described GHRLOS mRNA isoforms that extend over 1.4 kb of the promoter region and 106 nucleotides of exon 4 of the ghrelin gene, GHRL. These GHRLOS transcripts initiate 4.8 kb downstream of the terminal exon 4 of GHRL and are present in the 3' untranslated exon of the adjacent gene TATDN2 (TatD DNase domain containing 2). Interestingly, we have also identified a putative non-coding TATDN2-GHRLOS chimaeric transcript, indicating that GHRLOS RNA biogenesis is extremely complex. Moreover, we have discovered that the 3' region of GHRLOS is also antisense, in a tail-to-tail fashion to a novel terminal exon of the neighbouring SEC13 gene, which is important in protein transport. Sequence analyses revealed that GHRLOS is riddled with stop codons, and that there is little nucleotide and amino-acid sequence conservation of the GHRLOS gene between vertebrates. The gene spans 44 kb on 3p25.3, is extensively spliced and harbours multiple variable exons. We have also investigated the expression of GHRLOS and found evidence of differential tissue expression. It is highly expressed in tissues which are emerging as major sites of non-coding RNA expression (the thymus, brain, and testis), as well as in the ovary and uterus. In contrast, very low levels were found in the stomach where sense, GHRL derived RNAs are highly expressed. Conclusion GHRLOS RNA transcripts display several distinctive features of non-coding (ncRNA) genes, including 5' capping, polyadenylation, extensive splicing and short open reading frames. 
The gene is also non-conserved, with differential and tissue-restricted expression. The overlapping genomic arrangement of GHRLOS with the ghrelin gene indicates that it is likely to have interesting regulatory and functional roles in the ghrelin axis. PMID:18954468
Seim, Inge; Carter, Shea L; Herington, Adrian C; Chopin, Lisa K
2008-10-28
The peptide hormone ghrelin has many important physiological and pathophysiological roles, including the stimulation of growth hormone (GH) release, appetite regulation, gut motility and proliferation of cancer cells. We previously identified a gene on the opposite strand of the ghrelin gene, ghrelinOS (GHRLOS), which spans the promoter and untranslated regions of the ghrelin gene (GHRL). Here we further characterise GHRLOS. We have described GHRLOS mRNA isoforms that extend over 1.4 kb of the promoter region and 106 nucleotides of exon 4 of the ghrelin gene, GHRL. These GHRLOS transcripts initiate 4.8 kb downstream of the terminal exon 4 of GHRL and are present in the 3' untranslated exon of the adjacent gene TATDN2 (TatD DNase domain containing 2). Interestingly, we have also identified a putative non-coding TATDN2-GHRLOS chimaeric transcript, indicating that GHRLOS RNA biogenesis is extremely complex. Moreover, we have discovered that the 3' region of GHRLOS is also antisense, in a tail-to-tail fashion to a novel terminal exon of the neighbouring SEC13 gene, which is important in protein transport. Sequence analyses revealed that GHRLOS is riddled with stop codons, and that there is little nucleotide and amino-acid sequence conservation of the GHRLOS gene between vertebrates. The gene spans 44 kb on 3p25.3, is extensively spliced and harbours multiple variable exons. We have also investigated the expression of GHRLOS and found evidence of differential tissue expression. It is highly expressed in tissues which are emerging as major sites of non-coding RNA expression (the thymus, brain, and testis), as well as in the ovary and uterus. In contrast, very low levels were found in the stomach where sense, GHRL derived RNAs are highly expressed. GHRLOS RNA transcripts display several distinctive features of non-coding (ncRNA) genes, including 5' capping, polyadenylation, extensive splicing and short open reading frames. 
The gene is also non-conserved, with differential and tissue-restricted expression. The overlapping genomic arrangement of GHRLOS with the ghrelin gene indicates that it is likely to have interesting regulatory and functional roles in the ghrelin axis.
Unfiltered Talk--A Challenge to Categories.
ERIC Educational Resources Information Center
McCormick, Kay
A study investigated how and why code switching and mixing occurs between English and Afrikaans in a region of South Africa. In District Six, non-standard Afrikaans seems to be a mixed code, and it is unclear whether non-standard English is a mixed code. Consequently, it is unclear when codes are being switched or mixed. The analysis looks at…
Identification of common, unique and polymorphic microsatellites among 73 cyanobacterial genomes.
Kabra, Ritika; Kapil, Aditi; Attarwala, Kherunnisa; Rai, Piyush Kant; Shanker, Asheesh
2016-04-01
Microsatellites, also known as Simple Sequence Repeats, are short tandem repeats of 1-6 nucleotides. These repeats are found in coding as well as non-coding regions of both prokaryotic and eukaryotic genomes and play a significant role in the study of gene regulation, genetic mapping, DNA fingerprinting and evolutionary studies. The availability of 73 complete genome sequences of cyanobacteria enabled us to mine and statistically analyze microsatellites in these genomes. The cyanobacterial microsatellites identified through bioinformatics analysis were stored in a user-friendly database named CyanoSat, which is an efficient data representation and query system designed using ASP.net. The information in CyanoSat comprises perfect, imperfect and compound microsatellites found in coding, non-coding and coding-non-coding regions. Moreover, it contains PCR primers with 200-nucleotide-long flanking regions. The mined cyanobacterial microsatellites can be freely accessed at www.compubio.in/CyanoSat/home.aspx. In addition, 82 polymorphic, 13,866 unique and 2390 common microsatellites were detected. These microsatellites will be useful in strain identification and genetic diversity studies of cyanobacteria.
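Mining perfect microsatellites as defined above (tandem repeats of a 1-6 nucleotide motif) can be sketched with a back-referencing regular expression. This is a hedged illustration only: the function name, the single `min_repeats` threshold applied to every motif length, and the output format are assumptions, and imperfect or compound microsatellites are not handled. It is not the tool used to build CyanoSat.

```python
import re


def find_microsatellites(seq, min_repeats=4, max_motif=6):
    """Return (start, end, motif, repeat_count) for perfect tandem repeats.

    Coordinates are 0-based and end-exclusive. The lazy quantifier makes
    the regex prefer the shortest motif, so "AAAA" is reported as the
    motif "A" repeated four times rather than "AA" repeated twice.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        repeats = (m.end() - m.start()) // len(motif)
        hits.append((m.start(), m.end(), motif, repeats))
    return hits
```

In practice, mining tools usually apply a different minimum repeat count to each motif length (for example, a higher threshold for mononucleotide runs); a per-length threshold could be added by compiling one pattern per motif size.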
The complete mitochondrial genome of the sandbar shark Carcharhinus plumbeus.
Blower, Dean C; Ovenden, Jennifer R
2016-01-01
The sandbar shark, Carcharhinus plumbeus, a major representative species in shark fisheries worldwide, is now considered vulnerable to overfishing. A pool of 774,234 Roche 454 shotgun sequences from one individual was assembled into a 16,706 bp mitogenome with 33× average coverage depth. It comprised 13 protein-coding genes, 22 transfer RNAs, 2 ribosomal genes and 2 non-coding regions, typical of a vertebrate mitogenome. As expected for sharks, an A-T nucleotide bias was evident. This adds to the rapidly growing number of mitogenome assemblies for the economically important Carcharhinidae family. The C. plumbeus mitogenome will assist researchers, fisheries and conservation managers interested in shark molecular systematics, phylogeography, conservation genetics, population and stock structure.
Simonen, Marja-Leena; Roivainen, Merja; Iber, Jane; Burns, Cara; Hovi, Tapani
2010-01-01
In 1984, a wild type 3 poliovirus (PV3/FIN84) spread all over Finland causing nine cases of paralytic poliomyelitis and one case of aseptic meningitis. The outbreak was ended in 1985 with an intensive vaccination campaign. By limited sequence comparison with previously isolated PV3 strains, closest relatives of PV3/FIN84 were found among strains circulating in the Mediterranean region. Now we wanted to reanalyse the relationships using approaches currently exploited in poliovirus surveillance. Cell lysates of 22 strains isolated during the outbreak and stored frozen were subjected to RT-PCR amplification in three genomic regions without prior subculture. Sequences of the entire VP1 coding region, 150 nucleotides in the VP1-2A junction, most of the 5' non-coding region, partial sequences of the 3D RNA polymerase coding region and partial 3' non-coding region were compared within the outbreak and with sequences available in data banks. In addition, complete nucleotide sequences were obtained for 2 strains isolated from two different cases of disease during the outbreak. The results confirmed the previously described wide intraepidemic variation of the strains, including amino acid substitutions in antigenic sites, as well as the likely Mediterranean region origin of the strains. Simplot and bootscanning analyses of the complete genomes indicated complicated evolutionary history of the non-capsid coding regions of the genome suggesting several recombinations with different HEV-C viruses in the past.
[Relevance of long non-coding RNAs in tumour biology].
Nagy, Zoltán; Szabó, Diána Rita; Zsippai, Adrienn; Falus, András; Rácz, Károly; Igaz, Péter
2012-09-23
The discovery of the biological relevance of non-coding RNA molecules represents one of the most significant advances in contemporary molecular biology. It has turned out that a major fraction of the non-coding part of the genome is transcribed. Beside small RNAs (including microRNAs) more and more data are disclosed concerning long non-coding RNAs of 200 nucleotides to 100 kb length that are implicated in the regulation of several basic molecular processes (cell proliferation, chromatin functioning, microRNA-mediated effects, etc.). Some of these long non-coding RNAs have been associated with human tumours, including H19, HOTAIR, MALAT1, etc., the different expression of which has been noted in various neoplasms relative to healthy tissues. Long non-coding RNAs may represent novel markers of molecular diagnostics and they might even turn out to be targets of therapeutic intervention.
A-to-I RNA editing independent of ADARs in filamentous fungi
Wang, Chenfang; Xu, Jin-Rong; Liu, Huiquan
2016-01-01
ABSTRACT ADAR-mediated A-to-I RNA editing is thought to be unique to animals and occurs mainly in the non-coding regions. Recently, filamentous fungi such as Fusarium graminearum were found to lack orthologs of animal ADARs but to have stage-specific A-to-I editing during sexual reproduction. Unlike in animals, the majority of editing sites in fungi are in coding regions and often result in missense and stop-loss changes. Furthermore, whereas As in RNA stems are targeted by animal ADARs, RNA editing in fungi preferentially targets As in hairpin loops, implying that fungal RNA editing involves mechanisms related to editing of the anticodon loop by ADATs. Identification and characterization of fungal adenosine deaminases and their stage-specific co-factors may help in understanding the evolution of human ADARs. Fungi can also be used to study the biological functions of missense and stop-loss RNA editing events in eukaryotic organisms. PMID:27533598
Codes, Ciphers, and Cryptography--An Honors Colloquium
ERIC Educational Resources Information Center
Karls, Michael A.
2010-01-01
At the suggestion of a colleague, I read "The Code Book", [32], by Simon Singh to get a basic introduction to the RSA encryption scheme. Inspired by Singh's book, I designed a Ball State University Honors Colloquium in Mathematics for both majors and non-majors, with material coming from "The Code Book" and many other sources. This course became…
Analysis of 16S-23S rRNA intergenic spacer regions of Vibrio cholerae and Vibrio mimicus.
Chun, J; Huq, A; Colwell, R R
1999-05-01
Vibrio cholerae identification based on molecular sequence data has been hampered by a lack of sequence variation from the closely related Vibrio mimicus. The two species share many genes coding for proteins, such as ctxAB, and show almost identical 16S DNA coding for rRNA (rDNA) sequences. Primers targeting conserved sequences flanking the 3' end of the 16S and the 5' end of the 23S rDNAs were used to amplify the 16S-23S rRNA intergenic spacer regions of V. cholerae and V. mimicus. Two major (ca. 580 and 500 bp) and one minor (ca. 750 bp) amplicons were consistently generated for both species, and their sequences were determined. The largest fragment contains three tRNA genes (tDNAs) coding for tRNAGlu, tRNALys, and tRNAVal, a combination that has not previously been found in the bacteria examined to date. The 580-bp amplicon contained tDNAIle and tDNAAla, whereas the 500-bp fragment had a single tDNA coding for either tRNAGlu or tRNAAla. Little variation, i.e., 0 to 0.4%, was found among V. cholerae O1 classical, O1 El Tor, and O139 epidemic strains. Slightly more variation was found against the non-O1/non-O139 serotypes (ca. 1% difference) and V. mimicus (2 to 3% difference). A pair of oligonucleotide primers was designed, based on the region differentiating all V. cholerae strains from V. mimicus. The PCR system developed was subsequently evaluated by using representatives of V. cholerae from environmental and clinical sources, and of other taxa, including V. mimicus. This study provides the first molecular tool for identifying the species V. cholerae.
Pietan, Lucas L.; Spradling, Theresa A.
2016-01-01
In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589
Shao, Renfu; Barker, Stephen C
2011-02-15
The mitochondrial (mt) genome of the human body louse, Pediculus humanus, consists of 18 minichromosomes. Each minichromosome is 3 to 4 kb long and has 1 to 3 genes. There is unequivocal evidence for recombination between different mt minichromosomes in P. humanus. It is not known, however, how these minichromosomes recombine. Here, we report the discovery of eight chimeric mt minichromosomes in P. humanus. We classify these chimeric mt minichromosomes into two groups: Group I and Group II. Group I chimeric minichromosomes contain parts of two different protein-coding genes that are from different minichromosomes. The two parts of protein-coding genes in each Group I chimeric minichromosome are joined at a microhomologous nucleotide sequence; microhomologous nucleotide sequences are hallmarks of non-homologous recombination. Group II chimeric minichromosomes contain all of the genes and the non-coding regions of two different minichromosomes. The conserved sequence blocks in the non-coding regions of Group II chimeric minichromosomes resemble the "recombination repeats" in the non-coding regions of the mt genomes of higher plants. These repeats are essential to homologous recombination in higher plants. Our analyses of the nucleotide sequences of chimeric mt minichromosomes indicate both homologous and non-homologous recombination between minichromosomes in the mitochondria of the human body louse. Copyright © 2010 Elsevier B.V. All rights reserved.
Mitochondrial genome evolution in the Saccharomyces sensu stricto complex.
Ruan, Jiangxing; Cheng, Jian; Zhang, Tongcun; Jiang, Huifeng
2017-01-01
Exploring the evolutionary patterns of mitochondrial genomes is important for our understanding of the Saccharomyces sensu stricto (SSS) group, which is a model system for genomic evolution and ecological analysis. In this study, we first obtained the complete mitochondrial sequences of two important species, Saccharomyces mikatae and Saccharomyces kudriavzevii. We then compared the mitochondrial genomes in the SSS group with those of close relatives, and found that the non-coding regions evolved rapidly, including dramatic expansion of intergenic regions, fast evolution of introns and almost 20-fold higher rearrangement rates than those of the nuclear genomes. However, the coding regions, and especially the protein-coding genes, are more conserved than those in the nuclear genomes of the SSS group. The different evolutionary patterns of coding and non-coding regions in the mitochondrial and nuclear genomes may be related to the origin of the aerobic fermentation lifestyle in this group. Our analysis thus provides novel insights into the evolution of mitochondrial genomes.
Cloutier, Sara C; Wang, Siwen; Ma, Wai Kit; Al Husini, Nadra; Dhoondia, Zuzer; Ansari, Athar; Pascuzzi, Pete E; Tran, Elizabeth J
2016-02-04
Long non-coding (lnc)RNAs, once thought to merely represent noise from imprecise transcription initiation, have now emerged as major regulatory entities in all eukaryotes. In contrast to the rapidly expanding identification of individual lncRNAs, mechanistic characterization has lagged behind. Here we provide evidence that the GAL lncRNAs in the budding yeast S. cerevisiae promote transcriptional induction in trans by formation of lncRNA-DNA hybrids or R-loops. The evolutionarily conserved RNA helicase Dbp2 regulates formation of these R-loops as genomic deletion or nuclear depletion results in accumulation of these structures across the GAL cluster gene promoters and coding regions. Enhanced transcriptional induction is manifested by lncRNA-dependent displacement of the Cyc8 co-repressor and subsequent gene looping, suggesting that these lncRNAs promote induction by altering chromatin architecture. Moreover, the GAL lncRNAs confer a competitive fitness advantage to yeast cells because expression of these non-coding molecules correlates with faster adaptation in response to an environmental switch. Copyright © 2016 Elsevier Inc. All rights reserved.
Crosstalk between the Notch signaling pathway and non-coding RNAs in gastrointestinal cancers
Pan, Yangyang; Mao, Yuyan; Jin, Rong; Jiang, Lei
2018-01-01
The Notch signaling pathway is one of the main signaling pathways that mediates direct contact between cells, and is essential for normal development. It regulates various cellular processes, including cell proliferation, apoptosis, migration, invasion, angiogenesis and metastasis. It additionally serves an important function in tumor progression. Non-coding RNAs mainly include small microRNAs, long non-coding RNAs and circular RNAs. At present, a large body of literature supports the biological significance of non-coding RNAs in tumor progression. It is also becoming increasingly evident that cross-talk exists between Notch signaling and non-coding RNAs. The present review summarizes the current knowledge of Notch-mediated gastrointestinal cancer cell processes, and the effect of the crosstalk between the three major types of non-coding RNAs and the Notch signaling pathway on the fate of gastrointestinal cancer cells. PMID:29285185
A-to-I editing of coding and non-coding RNAs by ADARs
Nishikura, Kazuko
2016-01-01
Adenosine deaminases acting on RNA (ADARs) convert adenosine to inosine in double-stranded RNA. This A-to-I editing occurs not only in protein-coding regions of mRNAs, but also frequently in non-coding regions that contain inverted Alu repeats. Editing of coding sequences can result in the expression of functionally altered proteins that are not encoded in the genome, whereas the significance of Alu editing remains largely unknown. Certain microRNA (miRNA) precursors are also edited, leading to reduced expression or altered function of mature miRNAs. Conversely, recent studies indicate that ADAR1 forms a complex with Dicer to promote miRNA processing, revealing a new function of ADAR1 in the regulation of RNA interference. PMID:26648264
Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong
2012-01-01
Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.
Recurrent and functional regulatory mutations in breast cancer.
Rheinbay, Esther; Parasuraman, Prasanna; Grimsby, Jonna; Tiao, Grace; Engreitz, Jesse M; Kim, Jaegil; Lawrence, Michael S; Taylor-Weiner, Amaro; Rodriguez-Cuevas, Sergio; Rosenberg, Mara; Hess, Julian; Stewart, Chip; Maruvka, Yosef E; Stojanov, Petar; Cortes, Maria L; Seepo, Sara; Cibulskis, Carrie; Tracy, Adam; Pugh, Trevor J; Lee, Jesse; Zheng, Zongli; Ellisen, Leif W; Iafrate, A John; Boehm, Jesse S; Gabriel, Stacey B; Meyerson, Matthew; Golub, Todd R; Baselga, Jose; Hidalgo-Miranda, Alfredo; Shioda, Toshi; Bernards, Andre; Lander, Eric S; Getz, Gad
2017-07-06
Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at similar frequencies as in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.
cncRNAs: Bi-functional RNAs with protein coding and non-coding functions
Kumari, Pooja; Sampath, Karuna
2015-01-01
For many decades, the major function of mRNA was thought to be to provide protein-coding information embedded in the genome. The advent of high-throughput sequencing has led to the discovery of pervasive transcription of eukaryotic genomes and opened the world of RNA-mediated gene regulation. Many regulatory RNAs have been found to be incapable of protein coding and are hence termed non-coding RNAs (ncRNAs). However, studies in recent years have shown that several previously annotated non-coding RNAs have the potential to encode proteins, and conversely, some coding RNAs have regulatory functions independent of the protein they encode. Such bi-functional RNAs, with both protein coding and non-coding functions, which we term ‘cncRNAs’, have emerged as new players in cellular systems. Here, we describe the functions of some cncRNAs identified from bacteria to humans. Because the functions of many RNAs across genomes remain unclear, we propose that RNAs be classified as coding, non-coding or both only after careful analysis of their functions. PMID:26498036
Kress, W John; Erickson, David L
2007-06-06
A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination.
DNA methylation of miRNA coding sequences putatively associated with childhood obesity.
Mansego, M L; Garcia-Lacarte, M; Milagro, F I; Marti, A; Martinez, J A
2017-02-01
Epigenetic mechanisms may be involved in obesity onset and its consequences. The aim of the present study was to evaluate whether DNA methylation status in microRNA (miRNA) coding regions is associated with childhood obesity. DNA isolated from white blood cells of 24 children (identification sample: 12 obese and 12 non-obese) from the Grupo Navarro de Obesidad Infantil study was hybridized in a 450 K methylation microarray. Several CpGs whose DNA methylation levels were statistically different between obese and non-obese were validated by MassArray® in 95 children (validation sample) from the same study. Microarray analysis identified 16 differentially methylated CpGs between both groups (6 hypermethylated and 10 hypomethylated). DNA methylation levels in miR-1203, miR-412 and miR-216A coding regions significantly correlated with body mass index standard deviation score (BMI-SDS) and explained up to 40% of the variation of BMI-SDS. The network analysis identified 19 well-defined obesity-relevant biological pathways from the KEGG database. MassArray® validation identified three regions located in or near miR-1203, miR-412 and miR-216A coding regions differentially methylated between obese and non-obese children. The current work identified three CpG sites located in coding regions of three miRNAs (miR-1203, miR-412 and miR-216A) that were differentially methylated between obese and non-obese children, suggesting a role of miRNA epigenetic regulation in childhood obesity. © 2016 World Obesity Federation.
Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh
2014-01-01
Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitous and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types of SSRs (perfect, imperfect and compound). At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, for each species, statistical analyses were performed to calculate the relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource for developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
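The idea of mining a sequence for perfect SSRs with motifs of 1 to 6 nucleotides can be illustrated with a short Python sketch. This is not the ChloroSSRdb pipeline: the function name `find_perfect_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, since the abstract does not state the thresholds that were used.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 nt).
# These thresholds are illustrative assumptions, not those of ChloroSSRdb.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect SSRs.

    `start` is a 0-based offset into `seq`.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-nt motif followed by at least n-1 further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); the shorter motif length reports them.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T"
    for start, motif, count in sorted(find_perfect_ssrs(demo)):
        print(start, motif, count)
```

Imperfect and compound SSRs, which the database also reports, would need additional logic (mismatch tolerance and merging of adjacent repeats) that this sketch does not attempt.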
Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin
ERIC Educational Resources Information Center
Offner, Susan
2010-01-01
The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.
Design of ACM system based on non-greedy punctured LDPC codes
NASA Astrophysics Data System (ADS)
Lu, Zijun; Jiang, Zihong; Zhou, Lin; He, Yucheng
2017-08-01
In this paper, an adaptive coded modulation (ACM) scheme based on rate-compatible LDPC (RC-LDPC) codes is designed. The RC-LDPC codes are constructed by a non-greedy puncturing method that shows good performance in the high code rate region. Moreover, an incremental redundancy scheme for the LDPC-based ACM system over the AWGN channel is proposed. With this scheme, code rates vary from 2/3 to 5/6 and the complexity of the ACM system is lowered. Simulations show that the proposed ACM system achieves an increasingly evident coding gain along with higher throughput.
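The relation between puncturing and code rate can be made concrete with a small Python sketch. It assumes, purely for illustration, a mother code of rate 2/3 with block length n = 1800 and k = 1200 information bits; the abstract gives neither the block length nor which rate serves as the mother code, and the helper names below are hypothetical. Only the arithmetic R = k / (n - p) for p punctured bits is shown, not the non-greedy choice of which bits to puncture.

```python
from fractions import Fraction


def punctured_rate(k, n, punctured):
    """Code rate of an (n, k) mother code after `punctured` coded bits are withheld."""
    if not 0 <= punctured <= n - k:
        raise ValueError("punctured must lie between 0 and n - k")
    return Fraction(k, n - punctured)


def bits_to_puncture(k, n, target_rate):
    """Number of coded bits to puncture so that k / (n - p) equals target_rate."""
    sent = Fraction(k) / Fraction(target_rate)
    if sent.denominator != 1:
        raise ValueError("target rate is not reachable with an integer bit count")
    p = n - sent.numerator
    if not 0 <= p <= n - k:
        raise ValueError("target rate is outside the range of this mother code")
    return p


if __name__ == "__main__":
    k, n = 1200, 1800  # hypothetical rate-2/3 mother code
    for target in (Fraction(2, 3), Fraction(3, 4), Fraction(5, 6)):
        p = bits_to_puncture(k, n, target)
        print(target, p, punctured_rate(k, n, p))
```

In an incremental redundancy setting the same relation is read in reverse: transmission starts at the highest rate and previously punctured bits are sent on request, lowering the effective rate step by step.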
Zhu, Shiyou; Li, Wei; Liu, Jingze; Chen, Chen-Hao; Liao, Qi; Xu, Ping; Xu, Han; Xiao, Tengfei; Cao, Zhongzheng; Peng, Jingyu; Yuan, Pengfei; Brown, Myles; Liu, Xiaole Shirley; Wei, Wensheng
2017-01-01
CRISPR/Cas9 screens have been widely adopted to analyse coding gene functions, but high throughput screening of non-coding elements using this method is more challenging, because indels caused by a single cut in non-coding regions are unlikely to produce a functional knockout. A high-throughput method to produce deletions of non-coding DNA is needed. Herein, we report a high throughput genomic deletion strategy to screen for functional long non-coding RNAs (lncRNAs) that is based on a lentiviral paired-guide RNA (pgRNA) library. Applying our screening method, we identified 51 lncRNAs that can positively or negatively regulate human cancer cell growth. We individually validated 9 lncRNAs using CRISPR/Cas9-mediated genomic deletion and functional rescue, CRISPR activation or inhibition, and gene expression profiling. Our high-throughput pgRNA genome deletion method should enable rapid identification of functional mammalian non-coding elements. PMID:27798563
Effects of GWAS-Associated Genetic Variants on lncRNAs within IBD and T1D Candidate Loci
Brorsson, Caroline A.; Pociot, Flemming
2014-01-01
Long non-coding RNAs are a new class of non-coding RNAs that are in the crosshairs in many human diseases, such as cancers, cardiovascular disorders, and inflammatory and autoimmune diseases like Inflammatory Bowel Disease (IBD) and Type 1 Diabetes (T1D). Nearly 90% of the phenotype-associated single-nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) lie outside the protein-coding regions and map to non-coding intervals. However, the relationship between phenotype-associated loci and the non-coding regions, including the long non-coding RNAs (lncRNAs), is poorly understood. Here, we systematically identified all annotated IBD and T1D loci-associated lncRNAs, and mapped nominally significant GWAS/ImmunoChip SNPs for IBD and T1D within these lncRNAs. Additionally, we identified tissue-specific cis-eQTLs and strong linkage disequilibrium (LD) signals associated with these SNPs. We explored sequence- and structure-based attributes of these lncRNAs, and also predicted the structural effects of mapped SNPs within them. We also identified lncRNAs in IBD and T1D that are under recent positive selection. Our analysis identified putative lncRNA secondary structure-disruptive SNPs within and in close proximity (+/−5 kb flanking regions) of IBD and T1D loci-associated candidate genes, suggesting that these RNA conformation-altering polymorphisms might be associated with the disease phenotype. Disruption of lncRNA secondary structure due to the presence of GWAS SNPs provides valuable information that could be useful for future structure-function studies on lncRNAs. PMID:25144376
Eisenberger, Tobias; Neuhaus, Christine; Khan, Arif O.; Decker, Christian; Preising, Markus N.; Friedburg, Christoph; Bieg, Anika; Gliem, Martin; Issa, Peter Charbel; Holz, Frank G.; Baig, Shahid M.; Hellenbroich, Yorck; Galvez, Alberto; Platzer, Konrad; Wollnik, Bernd; Laddach, Nadja; Ghaffari, Saeed Reza; Rafati, Maryam; Botzenhart, Elke; Tinschert, Sigrid; Börger, Doris; Bohring, Axel; Schreml, Julia; Körtge-Jung, Stefani; Schell-Apacik, Chayim; Bakur, Khadijah; Al-Aama, Jumana Y.; Neuhann, Teresa; Herkenrath, Peter; Nürnberg, Gudrun; Nürnberg, Peter; Davis, John S.; Gal, Andreas; Bergmann, Carsten; Lorenz, Birgit; Bolz, Hanno J.
2013-01-01
Retinitis pigmentosa (RP) and Leber congenital amaurosis (LCA) are major causes of blindness. They result from mutations in many genes which has long hampered comprehensive genetic analysis. Recently, targeted next-generation sequencing (NGS) has proven useful to overcome this limitation. To uncover “hidden mutations” such as copy number variations (CNVs) and mutations in non-coding regions, we extended the use of NGS data by quantitative readout for the exons of 55 RP and LCA genes in 126 patients, and by including non-coding 5′ exons. We detected several causative CNVs which were key to the diagnosis in hitherto unsolved constellations, e.g. hemizygous point mutations in consanguineous families, and CNVs complemented apparently monoallelic recessive alleles. Mutations of non-coding exon 1 of EYS revealed its contribution to disease. In view of the high carrier frequency for retinal disease gene mutations in the general population, we considered the overall variant load in each patient to assess if a mutation was causative or reflected accidental carriership in patients with mutations in several genes or with single recessive alleles. For example, truncating mutations in RP1, a gene implicated in both recessive and dominant RP, were causative in biallelic constellations, unrelated to disease when heterozygous on a biallelic mutation background of another gene, or even non-pathogenic if close to the C-terminus. Patients with mutations in several loci were common, but without evidence for di- or oligogenic inheritance. Although the number of targeted genes was low compared to previous studies, the mutation detection rate was highest (70%) which likely results from completeness and depth of coverage, and quantitative data analysis. CNV analysis should routinely be applied in targeted NGS, and mutations in non-coding exons give reason to systematically include 5′-UTRs in disease gene or exome panels. 
Consideration of all variants is indispensable because even truncating mutations may be misleading. PMID:24265693
New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.
Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja
2017-02-01
Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. The selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcription factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50 % in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for the protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60 % of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, present in the protein complex binding to VNTR3. Our study highlighted two novel motifs in the PAH gene, a promoter KLF1 motif and a 3'-region C/EBPalpha motif, which decrease transcription in vitro and thus could be considered PAH expression modifiers. New transcription motifs in non-coding regions will contribute to a better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.
Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R; Voß, Björn
2015-04-22
In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5'UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5'UTR. Such an sRNA/mRNA structure, which we name 'actuaton', represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation.
Detection of non-coding RNA in bacteria and archaea using the DETR'PROK Galaxy pipeline.
Toffano-Nioche, Claire; Luo, Yufei; Kuchly, Claire; Wallon, Claire; Steinbach, Delphine; Zytnicki, Matthias; Jacq, Annick; Gautheret, Daniel
2013-09-01
RNA-seq experiments are now routinely used for the large scale sequencing of transcripts. In bacteria or archaea, such deep sequencing experiments typically produce 10-50 million fragments that cover most of the genome, including intergenic regions. In this context, the precise delineation of the non-coding elements is challenging. Non-coding elements include untranslated regions (UTRs) of mRNAs, independent small RNA genes (sRNAs) and transcripts produced from the antisense strand of genes (asRNA). Here we present a computational pipeline (DETR'PROK: detection of ncRNAs in prokaryotes) based on the Galaxy framework that takes as input a mapping of deep sequencing reads and performs successive steps of clustering, comparison with existing annotation and identification of transcribed non-coding fragments classified into putative 5' UTRs, sRNAs and asRNAs. We provide a step-by-step description of the protocol using real-life example data sets from Vibrio splendidus and Escherichia coli. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Kress, W. John; Erickson, David L.
2007-01-01
Background A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Methodology/Principal Findings Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. Conclusions/Significance A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination. PMID:17551588
2014-01-01
Background Small insertion and deletion polymorphisms (Indels) are the second most common mutations in the human genome, after Single Nucleotide Polymorphisms (SNPs). Recent studies have shown that they have significant influence on genetic variation by altering human traits and can cause multiple human diseases. In particular, many Indels that occur in protein coding regions are known to impact the structure or function of the protein. A major challenge is to predict the effects of these Indels and to distinguish between deleterious and neutral variants. When an Indel occurs within a coding region, it can be either frameshifting (FS) or non-frameshifting (NFS). FS-Indels either modify the complete C-terminal region of the protein or result in premature termination of translation. NFS-Indels insert/delete multiples of three nucleotides leading to the insertion/deletion of one or more amino acids. Results In order to study the relationships between NFS-Indels and Mendelian diseases, we characterized NFS-Indels according to numerous structural, functional and evolutionary parameters. We then used these parameters to identify specific characteristics of disease-causing and neutral NFS-Indels. Finally, we developed a new machine learning approach, KD4i, that can be used to predict the phenotypic effects of NFS-Indels. Conclusions We demonstrate in a large-scale evaluation that the accuracy of KD4i is comparable to existing state-of-the-art methods. However, a major advantage of our approach is that we also provide the reasons for the predictions, in the form of a set of rules. The rules are interpretable by non-expert humans and they thus represent new knowledge about the relationships between the genotype and phenotypes of NFS-Indels and the causative molecular perturbations that result in the disease. PMID:24742296
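The frameshifting versus non-frameshifting distinction described in this abstract comes down to whether the length change is a multiple of three. A minimal Python sketch of that rule follows; the function name and the allele-string representation are illustrative assumptions and are not part of KD4i.

```python
def classify_coding_indel(ref: str, alt: str) -> str:
    """Classify a coding-region indel from its reference and alternate alleles.

    Hypothetical helper for illustration only: an indel whose length change
    is a multiple of three keeps the reading frame (NFS); any other length
    change shifts it (FS).
    """
    length_change = abs(len(alt) - len(ref))
    if length_change == 0:
        raise ValueError("alleles have equal length; this is not an indel")
    return "non-frameshifting" if length_change % 3 == 0 else "frameshifting"


# Inserting three nucleotides adds one amino acid without shifting the frame.
three_base_insertion = classify_coding_indel("A", "ACTG")
# Deleting two nucleotides shifts the reading frame downstream of the site.
two_base_deletion = classify_coding_indel("ACT", "A")
```

The sketch looks only at allele lengths; it does not check whether the variant actually lies in a coding region, which would require gene annotation.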
Origin and evolution of the long non-coding genes in the X-inactivation center.
Romito, Antonio; Rougeulle, Claire
2011-11-01
Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst it provided in eutherian lineage the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The rising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.
Kim, Min Jee; Im, Hyun Hwak; Lee, Kwang Youll; Han, Yeon Soo; Kim, Iksoo
2014-06-01
The complete nucleotide sequence of the mitochondrial genome of the white-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae), was determined. The 20,319-bp long circular genome is the longest among completely sequenced Coleoptera. As is typical in animals, the P. brevitarsis genome consisted of two ribosomal RNAs, 22 transfer RNAs, 13 protein-coding genes and one A + T-rich region. Although the size of the coding genes was typical, the non-coding A + T-rich region was 5654 bp, which is the longest in insects. The extraordinary length of this region was due to 28 tandem repeats of 117 bp and seven tandem repeats of 82 bp. These repeat sequences were encompassed by three non-repeat sequences constituting 1804 bp.
A benchmark study of scoring methods for non-coding mutations.
Drubay, Damien; Gautheret, Daniel; Michiels, Stefan
2018-05-15
Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison has been realized to date to assess their performance. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 genomes project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. Contact: damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.
Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R.; Voß, Björn
2015-01-01
In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5′UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5′UTR. Such an sRNA/mRNA structure, which we name ‘actuaton’, represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation. PMID:25902393
On fuzzy semantic similarity measure for DNA coding.
Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin
2016-02-01
A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Araya, Carlos L.; Cenik, Can; Reuter, Jason A.; Kiss, Gert; Pande, Vijay S.; Snyder, Michael P.; Greenleaf, William J.
2015-01-01
Cancer sequencing studies have primarily identified cancer-driver genes by the accumulation of protein-altering mutations. An improved method would be annotation-independent, sensitive to unknown distributions of functions within proteins, and inclusive of non-coding drivers. We employed density-based clustering methods in 21 tumor types to detect variably-sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and non-coding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs reveal spatial clustering of mutations at molecular domains and interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated among tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally-agnostic driver identification. PMID:26691984
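The idea of finding variably sized, densely mutated regions without reference to annotation can be sketched as a one-dimensional, DBSCAN-style clustering of mutation coordinates. The Python below is a simplified sketch under stated assumptions: the function name, the `eps` and `min_pts` parameters and the toy coordinates are all hypothetical, and the abstract does not specify the authors' actual algorithm or thresholds.

```python
from bisect import bisect_left, bisect_right


def dense_regions(positions, eps, min_pts):
    """Return (start, end, n_mutations) for each dense run of positions.

    Simplified 1D density-based clustering (illustrative assumption):
    a position is a 'core' point if at least min_pts positions, itself
    included, lie within eps of it; core points no more than eps apart
    are chained into one region, which is then extended to any
    non-core positions within eps of its outermost core points.
    """
    pts = sorted(positions)
    cores = [
        p for p in pts
        if bisect_right(pts, p + eps) - bisect_left(pts, p - eps) >= min_pts
    ]
    # Chain neighbouring core points into clusters.
    clusters = []
    for p in cores:
        if clusters and p - clusters[-1][-1] <= eps:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    # Extend each cluster to border points and summarise it.
    regions = []
    for cluster in clusters:
        lo = bisect_left(pts, cluster[0] - eps)
        hi = bisect_right(pts, cluster[-1] + eps)
        members = pts[lo:hi]
        regions.append((members[0], members[-1], len(members)))
    return regions


# Toy coordinates (not real data): two tight groups plus isolated mutations.
toy_positions = [100, 102, 105, 107, 110, 500, 900, 903, 904, 906, 2000]
toy_regions = dense_regions(toy_positions, eps=5, min_pts=3)
```

Unlike standard DBSCAN, this sketch may count a border position in two adjacent regions, and it performs no significance testing, which a real analysis of significantly mutated regions would need.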
Python Radiative Transfer Emission code (PyRaTE): non-LTE spectral lines simulations
NASA Astrophysics Data System (ADS)
Tritsis, A.; Yorke, H.; Tassis, K.
2018-05-01
We describe PyRaTE, a new, non-local thermodynamic equilibrium (non-LTE) line radiative transfer code developed specifically for post-processing astrochemical simulations. Population densities are estimated using the escape probability method. When computing the escape probability, the optical depth is calculated towards all directions with density, molecular abundance, temperature and velocity variations all taken into account. A very easy-to-use interface, capable of importing data from the outputs of simulations performed with all major astrophysical codes, is also developed. The code is written in PYTHON using an "embarrassingly parallel" strategy and can handle all geometries and projection angles. We benchmark the code by comparing our results with those from RADEX (van der Tak et al. 2007) and against analytical solutions and present case studies using hydrochemical simulations. The code will be released for public use.
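For readers unfamiliar with the escape probability method, the Python sketch below evaluates one widely used closed form, the large-velocity-gradient (Sobolev) expression beta(tau) = (1 - exp(-tau)) / tau. This particular expression is an assumption chosen for illustration; the abstract does not state which form PyRaTE uses, and the function name is hypothetical.

```python
import math


def escape_probability_lvg(tau: float) -> float:
    """Probability that a line photon escapes at optical depth tau.

    Uses the large-velocity-gradient form (1 - exp(-tau)) / tau, an
    assumed illustrative choice. math.expm1 keeps the result accurate
    for very small tau, and tau == 0 returns the optically thin limit.
    """
    if tau == 0.0:
        return 1.0
    return -math.expm1(-tau) / tau


# Optically thin gas lets nearly every photon escape; optically thick
# gas traps most of them, with beta approaching 1 / tau.
beta_thin = escape_probability_lvg(1e-6)
beta_thick = escape_probability_lvg(100.0)
```

In a full non-LTE calculation this factor would enter the statistical equilibrium equations for the level populations, which are iterated until the populations and optical depths are mutually consistent.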
Bain, Peter A; Papanicolaou, Alexie; Kumar, Anupama
2015-01-01
Murray-Darling rainbowfish (Melanotaenia fluviatilis [Castelnau, 1878]; Atheriniformes: Melanotaeniidae) is a small-bodied teleost currently under development in Australasia as a test species for aquatic toxicological studies. To date, efforts towards the development of molecular biomarkers of contaminant exposure have been hindered by the lack of available sequence data. To address this, we sequenced messenger RNA from brain, liver and gonads of mature male and female fish and generated a high-quality draft transcriptome using a de novo assembly approach. 149,742 clusters of putative transcripts were obtained, encompassing 43,841 non-redundant protein-coding regions. Deduced amino acid sequences were annotated by functional inference based on similarity with sequences from manually curated protein sequence databases. The draft assembly contained protein-coding regions homologous to 95.7% of the complete cohort of predicted proteins from the taxonomically related species, Oryzias latipes (Japanese medaka). The mean length of rainbowfish protein-coding sequences relative to their medaka homologues was 92.1%, indicating that despite the limited number of tissues sampled a large proportion of the total expected number of protein-coding genes was captured in the study. Because of our interest in the effects of environmental contaminants on endocrine pathways, we manually curated subsets of coding regions for putative nuclear receptors and steroidogenic enzymes in the rainbowfish transcriptome, revealing 61 candidate nuclear receptors encompassing all known subfamilies, and 41 putative steroidogenic enzymes representing all major steroidogenic enzymes occurring in teleosts. The transcriptome presented here will be a valuable resource for researchers interested in biomarker development, protein structure and function, and contaminant-response genomics in Murray-Darling rainbowfish.
Dual CRISPR-Cas9 Cleavage Mediated Gene Excision and Targeted Integration in Yarrowia lipolytica.
Gao, Difeng; Smith, Spencer; Spagnuolo, Michael; Rodriguez, Gabriel; Blenner, Mark
2018-05-29
CRISPR-Cas9 technology has been successfully applied in Yarrowia lipolytica for targeted genomic editing including gene disruption and integration; however, disruptions by existing methods typically result from small frameshift mutations caused by indels within the coding region, which usually result in an unnatural protein. In this study, a dual cleavage strategy directed by paired sgRNAs is developed for gene knockout. This method allows fast and robust gene excision, demonstrated on six genes of interest. The targeted regions for excision vary in length from 0.3 kb up to 3.5 kb and contain both non-coding and coding regions. The majority of the gene excisions are repaired by perfect nonhomologous end-joining without indels. Based on this dual cleavage system, two targeted markerless integration methods are developed by providing repair templates. While both strategies are effective, the homology-mediated end joining (HMEJ)-based method is twice as efficient as the homologous recombination (HR)-based method. In both cases, dual cleavage leads to similar or improved gene integration efficiencies compared to gene excision without integration. This dual cleavage strategy will be useful not only for generating more predictable and robust gene knockouts, but also for efficient targeted markerless integration, and simultaneous knockout and integration in Y. lipolytica. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Novotny, Peter; Tang, Xiaojia; Kalari, Krishna R.; Gorodkin, Jan
2014-01-01
Traditional mutation assessment methods generally focus on predicting disruptive changes in protein-coding regions rather than non-coding regulatory regions like untranslated regions (UTRs) of mRNAs. The UTRs, however, are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and stability of mRNAs through interaction with RNA-binding proteins and other non-coding RNAs like microRNAs (miRNAs). In a recent study, transcriptomes of tumor cells harboring mutant and wild-type KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) genes in patients with non-small cell lung cancer (NSCLC) have been sequenced to identify single nucleotide variations (SNVs). About 40% of the total SNVs (73,717) identified were mapped to UTRs, but omitted in the previous analysis. To meet this obvious demand for analysis of the UTRs, we designed a comprehensive pipeline to predict the effect of SNVs on two major regulatory elements, secondary structure and miRNA target sites. Out of 29,290 SNVs in 6462 genes, we predict 472 SNVs (in 408 genes) affecting local RNA secondary structure, 490 SNVs (in 447 genes) affecting miRNA target sites and 48 that do both. Together these disruptive SNVs were present in 803 different genes, out of which 188 (23.4%) were previously known to be cancer-associated. Notably, this ratio is significantly higher (one-sided Fisher's exact test p-value = 0.032) than the ratio (20.8%) of known cancer-associated genes (n = 1347) in our initial data set (n = 6462). Network analysis shows that the genes harboring disruptive SNVs were involved in molecular mechanisms of cancer, and the signaling pathways of LPS-stimulated MAPK, IL-6, iNOS, EIF2 and mTOR. In conclusion, we have found hundreds of SNVs which are highly disruptive with respect to changes in the secondary structure and miRNA target sites within UTRs. 
These changes hold the potential to alter the expression of known cancer genes or genes linked to cancer-associated pathways. PMID:24416147
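The enrichment comparison reported above (188 of 803 genes versus 1347 of 6462) can be illustrated with a one-sided hypergeometric tail, which is the calculation underlying a one-sided Fisher's exact test. The Python sketch below uses only the counts quoted in the abstract. Treating the 803 genes as a draw from the 6462 is an assumption about how the contingency table was built, so the result need not reproduce the published p-value exactly.

```python
from math import comb


def hypergeom_upper_tail(k, n_draw, n_success, n_total):
    """P(X >= k) for a draw of n_draw items, without replacement, from
    n_total items of which n_success are 'successes'."""
    denom = comb(n_total, n_draw)
    upper = min(n_draw, n_success)
    numer = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draw - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


# Counts quoted in the abstract.
genes_total = 6462          # genes in the initial data set
cancer_total = 1347         # of which known cancer-associated
disrupted_total = 803       # genes carrying a predicted disruptive SNV
disrupted_cancer = 188      # of which known cancer-associated

frac_disrupted = disrupted_cancer / disrupted_total   # about 23.4 %
frac_background = cancer_total / genes_total          # about 20.8 %

# One-sided tail under the assumed table layout (see lead-in).
p_enrichment = hypergeom_upper_tail(
    disrupted_cancer, disrupted_total, cancer_total, genes_total
)
```

Exact integer arithmetic via `math.comb` avoids overflow in the large binomial coefficients; a statistics library would normally be used instead for routine work.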
Sabarinathan, Radhakrishnan; Wenzel, Anne; Novotny, Peter; Tang, Xiaojia; Kalari, Krishna R; Gorodkin, Jan
2014-01-01
Traditional mutation assessment methods generally focus on predicting disruptive changes in protein-coding regions rather than non-coding regulatory regions like untranslated regions (UTRs) of mRNAs. The UTRs, however, are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and stability of mRNAs through interaction with RNA-binding proteins and other non-coding RNAs like microRNAs (miRNAs). In a recent study, transcriptomes of tumor cells harboring mutant and wild-type KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) genes in patients with non-small cell lung cancer (NSCLC) have been sequenced to identify single nucleotide variations (SNVs). About 40% of the total SNVs (73,717) identified were mapped to UTRs, but omitted in the previous analysis. To meet this obvious demand for analysis of the UTRs, we designed a comprehensive pipeline to predict the effect of SNVs on two major regulatory elements, secondary structure and miRNA target sites. Out of 29,290 SNVs in 6462 genes, we predict 472 SNVs (in 408 genes) affecting local RNA secondary structure, 490 SNVs (in 447 genes) affecting miRNA target sites and 48 that do both. Together these disruptive SNVs were present in 803 different genes, out of which 188 (23.4%) were previously known to be cancer-associated. Notably, this ratio is significantly higher (one-sided Fisher's exact test p-value = 0.032) than the ratio (20.8%) of known cancer-associated genes (n = 1347) in our initial data set (n = 6462). Network analysis shows that the genes harboring disruptive SNVs were involved in molecular mechanisms of cancer, and the signaling pathways of LPS-stimulated MAPK, IL-6, iNOS, EIF2 and mTOR. In conclusion, we have found hundreds of SNVs which are highly disruptive with respect to changes in the secondary structure and miRNA target sites within UTRs. 
These changes hold the potential to alter the expression of known cancer genes or genes linked to cancer-associated pathways.
Crescenzo-Chaigne, Bernadette; Barbezange, Cyril; Frigard, Vianney; Poulain, Damien; van der Werf, Sylvie
2014-01-01
Exchange of the non-coding regions of the NP segment between type A and C influenza viruses was used to demonstrate the importance not only of the proximal panhandle, but also of the initial distal panhandle strength in type specificity. Both elements were found to be compulsory to rescue infectious virus by reverse genetics systems. Interestingly, in the infectious context of type A influenza virus, the length of the NP segment 5′ NC region once transcribed into mRNA was found to impact its translation, and the level of produced NP protein consequently affected the level of viral genome replication. PMID:25268971
2014-01-01
Background The genome is pervasively transcribed but most transcripts do not code for proteins, constituting non-protein-coding RNAs. Despite increasing numbers of functional reports of individual long non-coding RNAs (lncRNAs), assessing the extent of functionality among the non-coding transcriptional output of mammalian cells remains intricate. In the protein-coding world, transcripts differentially expressed in the context of processes essential for the survival of multicellular organisms have been instrumental in the discovery of functionally relevant proteins and their deregulation is frequently associated with diseases. We therefore systematically identified lncRNAs expressed differentially in response to oncologically relevant processes and cell-cycle, p53 and STAT3 pathways, using tiling arrays. Results We found that up to 80% of the pathway-triggered transcriptional responses are non-coding. Among these we identified very large macroRNAs with pathway-specific expression patterns and demonstrated that these are likely continuous transcripts. MacroRNAs contain elements conserved in mammals and sauropsids, which in part exhibit conserved RNA secondary structure. Comparing evolutionary rates of a macroRNA to adjacent protein-coding genes suggests a local action of the transcript. Finally, in different grades of astrocytoma, a tumor disease unrelated to the initially used cell lines, macroRNAs are differentially expressed. Conclusions It has been shown previously that the majority of expressed non-ribosomal transcripts are non-coding. We now conclude that differential expression triggered by signaling pathways gives rise to a similar abundance of non-coding content. It is thus unlikely that the prevalence of non-coding transcripts in the cell is a trivial consequence of leaky or random transcription events. PMID:24594072
RPS8—a New Informative DNA Marker for Phylogeny of Babesia and Theileria Parasites in China
Tian, Zhan-Cheng; Liu, Guang-Yuan; Yin, Hong; Luo, Jian-Xun; Guan, Gui-Quan; Luo, Jin; Xie, Jun-Ren; Shen, Hui; Tian, Mei-Yuan; Zheng, Jin-feng; Yuan, Xiao-song; Wang, Fang-fang
2013-01-01
Piroplasmosis is a serious debilitating and sometimes fatal disease. Phylogenetic relationships within piroplasmida are complex and remain unclear. We compared the intron–exon structure and DNA sequences of the RPS8 gene from Babesia and Theileria spp. isolates in China. Similar to 18S rDNA, the 40S ribosomal protein S8 gene, RPS8, including both coding and non-coding regions is a useful and novel genetic marker for defining species boundaries and for inferring phylogenies because it tends to have little intra-specific variation but considerable inter-specific difference. However, more samples are needed to verify the usefulness of the RPS8 (coding and non-coding regions) gene as a marker for the phylogenetic position and detection of most Babesia and Theileria species, particularly for some closely related species. PMID:24244571
Hill, Katherine E; Kelly, Andrew D; Kuijjer, Marieke L; Barry, William; Rattani, Ahmed; Garbutt, Cassandra C; Kissick, Haydn; Janeway, Katherine; Perez-Atayde, Antonio; Goldsmith, Jeffrey; Gebhardt, Mark C; Arredouani, Mohamed S; Cote, Greg; Hornicek, Francis; Choy, Edwin; Duan, Zhenfeng; Quackenbush, John; Haibe-Kains, Benjamin; Spentzos, Dimitrios
2017-05-15
A microRNA (miRNA) collection on the imprinted 14q32 MEG3 region has been associated with outcome in osteosarcoma. We assessed the clinical utility of this miRNA set and their association with methylation status. We integrated coding and non-coding RNA data from three independent annotated clinical osteosarcoma cohorts (n = 65, n = 27, and n = 25) and miRNA and methylation data from one in vitro (19 cell lines) and one clinical (NCI Therapeutically Applicable Research to Generate Effective Treatments (TARGET) osteosarcoma dataset, n = 80) dataset. We used time-dependent receiver operating characteristic (tdROC) analysis to evaluate the clinical value of candidate miRNA profiles and machine learning approaches to compare the coding and non-coding transcriptional programs of high- and low-risk osteosarcoma tumors and high- versus low-aggressiveness cell lines. In the cell line and TARGET datasets, we also studied the methylation patterns of the MEG3 imprinting control region on 14q32 and their association with miRNA expression and tumor aggressiveness. In the tdROC analysis, miRNA sets on 14q32 showed strong discriminatory power for recurrence and survival in the three clinical datasets. High- or low-risk tumor classification was robust to using different microRNA sets or classification methods. Machine learning approaches showed that genome-wide miRNA profiles and miRNA regulatory networks were quite different between the two outcome groups and mRNA profiles categorized the samples in a manner concordant with the miRNAs, suggesting potential molecular subtypes. Further, miRNA expression patterns were reproducible in comparing high-aggressiveness versus low-aggressiveness cell lines. Methylation patterns in the MEG3 differentially methylated region (DMR) also distinguished high-aggressiveness from low-aggressiveness cell lines and were associated with expression of several 14q32 miRNAs in both the cell lines and the large TARGET clinical dataset. 
Within the limits of available CpG array coverage, we observed a potential methylation-sensitive regulation of the non-coding RNA cluster by CTCF, a known enhancer-blocking factor. Loss of imprinting/methylation changes in the 14q32 non-coding region defines reproducible previously unrecognized osteosarcoma subtypes with distinct transcriptional programs and biologic and clinical behavior. Future studies will define the precise relationship between 14q32 imprinting, non-coding RNA expression, genomic enhancer binding, and tumor aggressiveness, with possible therapeutic implications for both early- and advanced-stage patients.
Alvarado, David M; Yang, Ping; Druley, Todd E; Lovett, Michael; Gurnett, Christina A
2014-06-01
Despite declining sequencing costs, few methods are available for cost-effective single-nucleotide polymorphism (SNP), insertion/deletion (INDEL) and copy number variation (CNV) discovery in a single assay. Commercially available methods require a high investment to a specific region and are only cost-effective for large samples. Here, we introduce a novel, flexible approach for multiplexed targeted sequencing and CNV analysis of large genomic regions called multiplexed direct genomic selection (MDiGS). MDiGS combines biotinylated bacterial artificial chromosome (BAC) capture and multiplexed pooled capture for SNP/INDEL and CNV detection of 96 multiplexed samples on a single MiSeq run. MDiGS is advantageous over other methods for CNV detection because pooled sample capture and hybridization to large contiguous BAC baits reduces sample and probe hybridization variability inherent in other methods. We performed MDiGS capture for three chromosomal regions consisting of ∼ 550 kb of coding and non-coding sequence with DNA from 253 patients with congenital lower limb disorders. PITX1 nonsense and HOXC11 S191F missense mutations were identified that segregate in clubfoot families. Using a novel pooled-capture reference strategy, we identified recurrent chromosome chr17q23.1q23.2 duplications and small HOXC 5' cluster deletions (51 kb and 12 kb). Given the current interest in coding and non-coding variants in human disease, MDiGS fulfills a niche for comprehensive and low-cost evaluation of CNVs, coding, and non-coding variants across candidate regions of interest. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
The role of sustained observations and data co-management in Arctic Ocean governance
NASA Astrophysics Data System (ADS)
Eicken, H.; Lee, O. A.; Rupp, S. T.; Trainor, S.; Walsh, J. E.
2015-12-01
Rapid environmental change, a rise in maritime activities and resource development, and increasing engagement by non-Arctic nations are key to major shifts underway in Arctic social-environmental systems (SES). These shifts are triggering responses by policy makers, regulators and a range of other actors in the Arctic Ocean region. Arctic science can play an important role in informing such responses, in particular by (i) providing data from sustained observations to serve as indicators of change and major transitions and to inform regulatory and policy response; (ii) identifying linkages across subsystems of Arctic SES and across regions; (iii) providing predictions or scenarios of future states of Arctic SES; and (iv) informing adaptation action in response to rapid change. Policy responses to a changing Arctic are taking a multi-faceted approach by advancing international agreements through the Arctic Council (e.g., Search and Rescue Agreement), global forums (e.g., IMO Polar Code) or private sector instruments (e.g., ISO code for offshore structures). At the regional level, co-management of marine living resources involving local, indigenous stakeholders has proven effective. All of these approaches rely on scientific data and information for planning and decision-making. Examples from the Pacific Arctic sector illustrate how such relevant data is currently collected through a multitude of different government agencies, universities, and private entities. Its effective use in informing policy, planning and emergency response requires coordinated, sustained acquisition, common standards or best practices, and data sharing agreements - best achieved through data co-management approaches. For projections and scenarios of future states of Arctic SES, knowledge co-production that involves all relevant stakeholders and specifically addresses major sources of uncertainty is of particular relevance in an international context.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boore, Jeffrey L.; Medina, Monica; Rosenberg, Lewis A.
2004-01-31
We have determined the complete sequence of the mitochondrial genome of the scaphopod mollusk Graptacme eborea (Conrad, 1846) (14,492 nts) and completed the sequence of the mitochondrial genome of the bivalve mollusk Mytilus edulis Linnaeus, 1758 (16,740 nts). (The name Graptacme eborea is a revision of the species formerly known as Dentalium eboreum.) G. eborea mtDNA contains the 37 genes that are typically found and has the genes divided about evenly between the two strands, but M. edulis contains an extra trnM and is missing atp8, and has all genes on the same strand. Each has a highly rearranged gene order relative to each other and to all other studied mtDNAs. G. eborea mtDNA has almost no strand skew, but the coding strand of M. edulis mtDNA is very rich in G and T. This is reflected in differential codon usage patterns and even in amino acid compositions. G. eborea mtDNA has fewer non-coding nucleotides than any other mtDNA studied to date, with the largest non-coding region being only 24 nt long. Phylogenetic analysis using 2,420 aligned amino acid positions of concatenated proteins weakly supports an association of the scaphopod with gastropods to the exclusion of Bivalvia, Cephalopoda, and Polyplacophora, but is generally unable to convincingly resolve the relationships among major groups of the Lophotrochozoa, in contrast to the good resolution seen for several other major metazoan groups.
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis
Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia
2011-01-01
Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Natural variation in non-coding regions underlying phenotypic diversity in budding yeast
Salinas, Francisco; de Boer, Carl G.; Abarca, Valentina; García, Verónica; Cuevas, Mara; Araos, Sebastian; Larrondo, Luis F.; Martínez, Claudio; Cubillos, Francisco A.
2016-01-01
Linkage mapping studies in model organisms have typically focused on polymorphisms within coding regions, ignoring those within regulatory regions that may contribute to gene expression variation. In this context, differences in transcript abundance are frequently proposed as a source of phenotypic diversity between individuals; however, until now, little molecular evidence has been provided. Here, we examined Allele Specific Expression (ASE) in six F1 hybrids from Saccharomyces cerevisiae derived from crosses between representative strains of the four main lineages described in yeast. ASE varied between crosses with levels ranging between 28% and 60%. Part of the variation in expression levels could be explained by differences in transcription factors binding to polymorphic cis-regulations and by differences in trans-activation depending on the allelic form of the TF. Analysis of highly expressed alleles on each background suggested ASN1 as a candidate transcript underlying nitrogen consumption differences between two strains. Further promoter allele swap analysis under fermentation conditions confirmed that coding and non-coding regions explained aspartic and glutamic acid consumption differences, likely due to a polymorphism affecting Uga3 binding. Together, we provide a new catalogue of variants to bridge the gap between genotype and phenotype. PMID:26898953
Practice Location Characteristics of Non-Traditional Dental Practices.
Solomon, Eric S; Jones, Daniel L
2016-04-01
Current and future dental school graduates are increasingly likely to choose a non-traditional dental practice-a group practice managed by a dental service organization or a corporate practice with employed dentists-for their initial practice experience. In addition, the growth of non-traditional practices, which are located primarily in major urban areas, could accelerate the movement of dentists to those areas and contribute to geographic disparities in the distribution of dental services. To help the profession understand the implications of these developments, the aim of this study was to compare the location characteristics of non-traditional practices and traditional dental practices. After identifying non-traditional practices across the United States, the authors located those practices and traditional dental practices geographically by zip code. Non-traditional dental practices were found to represent about 3.1% of all dental practices, but they had a greater impact on the marketplace with almost twice the average number of staff and annual revenue. Virtually all non-traditional dental practices were located in zip codes that also had a traditional dental practice. Zip codes with non-traditional practices had significant differences from zip codes with only a traditional dental practice: the populations in areas with non-traditional practices had higher income levels and higher education and were slightly younger and proportionally more Hispanic; those practices also had a much higher likelihood of being located in a major metropolitan area. Dental educators and leaders need to understand the impact of these trends in the practice environment in order to both prepare graduates for practice and make decisions about planning for the workforce of the future.
Hybrid 3D model for the interaction of plasma thruster plumes with nearby objects
NASA Astrophysics Data System (ADS)
Cichocki, Filippo; Domínguez-Vázquez, Adrián; Merino, Mario; Ahedo, Eduardo
2017-12-01
This paper presents a hybrid particle-in-cell (PIC) fluid approach to model the interaction of a plasma plume with a spacecraft and/or any nearby object. Ions and neutrals are modeled with a PIC approach, while electrons are treated as a fluid. After a first iteration of the code, the domain is split into quasineutral and non-neutral regions, based on non-neutrality criteria, such as the relative charge density and the Debye length-to-cell size ratio. At the material boundaries of the former quasineutral region, a dedicated algorithm ensures that the Bohm condition is met. In the latter non-neutral regions, the electron density and electric potential are obtained by solving the coupled electron momentum balance and Poisson equations. Boundary conditions for both the electric current and potential are finally obtained with a plasma sheath sub-code and an equivalent circuit model. The hybrid code is validated by applying it to a typical plasma plume-spacecraft interaction scenario, and the physics and capabilities of the model are finally discussed.
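The two non-neutrality criteria named in the abstract above (relative charge density and Debye length-to-cell size ratio) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names, the singly charged ion assumption, the definition of relative charge density as |n_i - n_e| / n_e, and both threshold values are assumptions chosen for the example.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C


def debye_length(n_e, t_e_ev):
    """Electron Debye length (m) for density n_e (m^-3) and temperature t_e_ev (eV).

    With the temperature in eV, k*T = t_e_ev * e, so
    lambda_D = sqrt(eps0 * t_e_ev / (n_e * e)).
    """
    return math.sqrt(EPS0 * t_e_ev / (n_e * E_CHARGE))


def is_non_neutral(n_i, n_e, t_e_ev, cell_size,
                   max_rel_charge=0.05, max_debye_ratio=1.0):
    """Flag a cell as non-neutral if either criterion exceeds its threshold.

    Both thresholds are hypothetical; singly charged ions are assumed.
    """
    rel_charge = abs(n_i - n_e) / n_e
    debye_ratio = debye_length(n_e, t_e_ev) / cell_size
    return rel_charge > max_rel_charge or debye_ratio > max_debye_ratio


# A dense plume cell stays quasineutral; a rarefied one does not.
print(is_non_neutral(1e16, 1e16, 2.0, 0.01))   # dense, balanced charge
print(is_non_neutral(1e12, 1e12, 2.0, 0.005))  # Debye length exceeds cell size
```

In a real hybrid code this test would be applied cell by cell after the first iteration, and the thresholds would be tuned to the mesh and plume conditions.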
The complete mitochondrial genome of Chrysopa pallens (Insecta, Neuroptera, Chrysopidae).
He, Kun; Chen, Zhe; Yu, Dan-Na; Zhang, Jia-Yong
2012-10-01
The complete mitochondrial genome of Chrysopa pallens (Neuroptera, Chrysopidae) was sequenced. It consists of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA (rRNA) genes, and a control region (AT-rich region). The total length of C. pallens mitogenome is 16,723 bp with 79.5% AT content, and the length of control region is 1905 bp with 89.1% AT content. The non-coding regions of C. pallens include control region between 12S rRNA and trnI genes, and a 75-bp space region between trnI and trnQ genes.
Pang, Erli; Wu, Xiaomei; Lin, Kui
2016-06-01
Protein evolution plays an important role in the evolution of each genome. Because of their functional nature, in general, most of their parts or sites are differently constrained selectively, particularly by purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within each protein of a given genome. To this end, based on PfamA annotation of all human proteins, each protein sequence can be split into two parts: domains or unassigned regions. Using this rationale, single nucleotide polymorphisms (SNPs) in protein-coding sequences from the 1000 Genomes Project were mapped according to two classifications: SNPs occurring within protein domains and those within unassigned regions. With these classifications, we found: the density of synonymous SNPs within domains is significantly greater than that of synonymous SNPs within unassigned regions; however, the density of non-synonymous SNPs shows the opposite pattern. We also found there are signatures of purifying selection on both the domain and unassigned regions. Furthermore, the selective strength on domains is significantly greater than that on unassigned regions. In addition, among all of the human protein sequences, there are 117 PfamA domains in which no SNPs are found. Our results highlight an important aspect of protein domains and may contribute to our understanding of protein evolution.
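The domain versus unassigned-region comparison described above comes down to counting SNPs inside and outside annotated intervals and normalising by length. The sketch below shows that bookkeeping for a single protein. The function name, the coordinate convention (1-based, inclusive, non-overlapping domains) and the example numbers are assumptions for illustration, not values from the study.

```python
def split_snp_density(protein_length, domains, snp_positions):
    """Return SNP density (SNPs per residue) inside domains and in unassigned regions.

    domains: list of (start, end) tuples, 1-based inclusive, assumed non-overlapping.
    snp_positions: list of 1-based residue positions carrying a SNP.
    """
    domain_len = sum(end - start + 1 for start, end in domains)
    unassigned_len = protein_length - domain_len
    in_domain = sum(
        any(start <= pos <= end for start, end in domains)
        for pos in snp_positions
    )
    outside = len(snp_positions) - in_domain
    return {
        "domain": in_domain / domain_len if domain_len else 0.0,
        "unassigned": outside / unassigned_len if unassigned_len else 0.0,
    }


# Hypothetical 100-residue protein with two 20-residue domains.
densities = split_snp_density(100, [(11, 30), (61, 80)], [12, 25, 50, 70, 95, 99])
print(densities)
```

Running the same function separately on synonymous and non-synonymous SNP lists would give the two contrasts the abstract reports.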
Chung, H Y; Choi, Y C; Park, H N
2015-05-18
We investigated the phylogenetic relationships between pig breeds, compared the genetic similarity between humans and pigs, and provided basic genetic information on Korean native pigs (KNPs), using genetic variants of the swine leukocyte antigen 3 (SLA-3) gene. Primers were based on sequences from GenBank (accession Nos. AF464010 and AF464009). Polymerase chain reaction analysis amplified approximately 1727 bp of segments, which contained 1086 bp of coding regions and 641 bp of the 3'- and 5'-untranslated regions. Bacterial artificial chromosome clones of miniature pigs were used for sequencing the SLA-3 genomic region, which was 3114 bp in total length, including the coding (1086 bp) and non-coding (2028 bp) regions. Sequence analysis detected 53 single nucleotide polymorphisms (SNPs), based on a minor allele frequency greater than 0.01, which is low compared with other pig breeds, and the results suggest that there is low genetic variability in KNPs. Comparative analysis revealed that humans possess approximately three times more genetic variation than do pigs. Approximately 71% of SNPs in exons 2 and 3 were detected in KNPs, and exon 5 in humans is a highly polymorphic region. Newly identified sequences of SLA-3 using KNPs were submitted to GenBank (accession No. DQ992512-18). Cluster analysis revealed that KNPs were grouped according to three major alleles: SLA-3*0502 (DQ992518), SLA-3*0302 (DQ992513 and DQ992516), and SLA-3*0303 (DQ992512, DQ992514, DQ992515, and DQ992517). Alignments revealed that humans have a relatively close genetic relationship with pigs and chimpanzees. The information provided by this study may be useful in KNP management.
NASA Astrophysics Data System (ADS)
Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.
2018-09-01
This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.
Masingue, Marion; Perrot, Jimmy; Carlier, Robert-Yves; Piguet-Lacroix, Guenaelle; Latour, Philippe; Stojkovic, Tanya
2018-05-01
Charcot-Marie-Tooth disease (CMT) refers to a group of clinically and genetically heterogeneous inherited neuropathies. Ganglioside-induced differentiation-associated protein 1 GDAP1-related CMT has been reported in an autosomal dominant or recessive form in patients presenting either axonal or demyelinating neuropathy. We report two Sri Lankan sisters born to consanguineous parents and presenting with a severe axonal sensorimotor neuropathy. The early onset of the disease, the distal and proximal weakness and atrophy leading to major disability, along with areflexia, and, most notably, vocal cord and diaphragm paralysis were highly evocative of a GDAP1-related CMT. However, sequencing of the coding regions of the gene was normal. Whole-exome sequencing (WES) was performed and revealed that the largest region of homozygosity was around GDAP1 with several variants, mostly in non-coding regions. In view of the high clinical suspicion of GDAP1 gene involvement, we examined the variants in this gene and this, along with functional studies, allowed us to identify an alternative splicing site revealing a cryptic in-frame stop codon in intron 4 responsible for a severe loss of wild-type GDAP1. This work is the first to describe a deleterious mutation in GDAP1 gene outside of coding sequences or intronic junctions and emphasizes the importance of interpreting molecular analysis, and in particular WES results, in light of the clinical and electrophysiological phenotype.
The evolution and expression of the snaR family of small non-coding RNAs
Parrott, Andrew M.; Tsai, Michael; Batchu, Priyanka; Ryan, Karen; Ozer, Harvey L.; Tian, Bin; Mathews, Michael B.
2011-01-01
We recently identified the snaR family of small non-coding RNAs that associate in vivo with the nuclear factor 90 (NF90/ILF3) protein. The major human species, snaR-A, is an RNA polymerase III transcript with restricted tissue distribution and orthologs in chimpanzee but not rhesus macaque or mouse. We report their expression in human tissues and their evolution in primates. snaR genes are exclusively in African Great Apes and some are unique to humans. Two novel families of snaR-related genetic elements were found in primates: CAS (catarrhine ancestor of snaR), limited to Old World Monkeys and apes; and ASR (Alu/snaR-related), present in all monkeys and apes. ASR and CAS appear to have spread by retrotransposition, whereas most snaR genes have spread by segmental duplication. snaR-A and snaR-G2 are differentially expressed in discrete regions of the human brain and other tissues, notably including testis. snaR-A is up-regulated in transformed and immortalized human cells, and is stably bound to ribosomes in HeLa cells. We infer that snaR evolved from the left monomer of the primate-specific Alu SINE family via ASR and CAS in conjunction with major primate speciation events, and suggest that snaRs participate in tissue- and species-specific regulation of cell growth and translation. PMID:20935053
Erturk, Elif; Cecener, Gulsah; Polatkan, Volkan; Gokgoz, Sehsuvar; Egeli, Unal; Tunca, Berrin; Tezcan, Gulcin; Demirdogen, Elif; Ak, Secil; Tasdelen, Ismet
2014-01-01
Although genetic markers identifying women at an increased risk of developing breast cancer exist, the majority of inherited risk factors remain elusive. Mutations in the BRCA1/BRCA2 gene confer a substantial increase in breast cancer risk, yet routine clinical genetic screening is limited to the coding regions and intron-exon boundaries, precluding the identification of mutations in noncoding and untranslated regions. Because 3' untranslated region (3'UTR) polymorphisms disrupting microRNA (miRNA) binding can be functional and can act as genetic markers of cancer risk, we aimed to determine genetic variation in the 3'UTR of BRCA1/BRCA2 in familial and early-onset breast cancer patients with and without mutations in the coding regions of BRCA1/BRCA2 and to identify specific 3'UTR variants that may be risk factors for cancer development. The 3'UTRs of the BRCA1 and BRCA2 genes were screened by heteroduplex analysis and DNA sequencing in 100 patients from 46 BRCA1/2 families, 54 non-BRCA1/2 families, and 47 geographically matched controls. Two polymorphisms were identified. SNPs c.*1287C>T (rs12516) (BRCA1) and c.*105A>C (rs15869) (BRCA2) were identified in 27% and 24% of patients, respectively. These 2 variants were also identified in controls with no family history of cancer (23.4% and 23.4%, respectively). In comparison to variations in the 3'UTR region of the BRCA1/2 genes and the BRCA1/2 mutational status in patients, there was a statistically significant relationship between the BRCA1 gene polymorphism c.*1287C>T (rs12516) and BRCA1 mutations (p=0.035) by Fisher's Exact Test. SNP c.*1287C>T (rs12516) of the BRCA1 gene may have potential use as a genetic marker of an increased risk of developing breast cancer and likely represents a non-coding sequence variation in BRCA1 that impacts BRCA1 function and leads to increased early-onset and/or familial breast cancer risk in the Turkish population.
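The association reported above was assessed with Fisher's exact test on a 2x2 table (polymorphism carrier status against BRCA1 mutation status). The abstract does not give the underlying counts, so the sketch below uses small hypothetical tables and makes no attempt to reproduce p=0.035. It implements the common two-sided definition, summing the hypergeometric probabilities of all tables no more likely than the observed one, using only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows = SNP carrier / non-carrier, columns = mutation / no mutation.
print(fisher_exact_two_sided(3, 1, 1, 3))
print(fisher_exact_two_sided(5, 0, 0, 5))
```

For real analyses a vetted statistics library is preferable; the point here is only to show what the test computes.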
Tau mRNA 3'UTR-to-CDS ratio is increased in Alzheimer disease.
García-Escudero, Vega; Gargini, Ricardo; Martín-Maestro, Patricia; García, Esther; García-Escudero, Ramón; Avila, Jesús
2017-08-10
Neurons frequently show an imbalance in expression of the 3' untranslated region (3'UTR) relative to the coding DNA sequence (CDS) region of mature messenger RNAs (mRNA). The ratio varies among different cells or parts of the brain. The Map2 protein levels per cell depend on the 3'UTR-to-CDS ratio rather than the total mRNA amount, which suggests powerful regulation of protein expression by 3'UTR sequences. Here we found that MAPT (the microtubule-associated protein tau gene) 3'UTR levels are particularly high with respect to other genes; indeed, the 3'UTR-to-CDS ratio of MAPT is balanced in healthy brain in mouse and human. The tau protein accumulates in Alzheimer diseased brain. We nonetheless observed that the levels of RNA encoding MAPT/tau were diminished in these patients' brains. To explain this apparently contradictory result, we studied MAPT mRNA stoichiometry in coding and non-coding regions, and found that the 3'UTR-to-CDS ratio was higher in the hippocampus of Alzheimer disease patients, with higher tau protein but lower total mRNA levels. Our data indicate that changes in the 3'UTR-to-CDS ratio have a regulatory role in the disease. Future research should thus consider not only mRNA levels, but also the ratios between coding and non-coding regions. Copyright © 2017 Elsevier B.V. All rights reserved.
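The central quantity in the abstract above is the ratio of 3'UTR signal to CDS signal for one transcript, considered alongside the overall mRNA level. The following sketch shows how the ratio can rise while the summed signal falls. The function name and all numbers are hypothetical and are not measurements from the study.

```python
def utr_to_cds_ratio(utr_signal, cds_signal):
    """Ratio of 3'UTR to CDS expression signal for one transcript."""
    if cds_signal <= 0:
        raise ValueError("CDS signal must be positive")
    return utr_signal / cds_signal


# Hypothetical expression signals in arbitrary units.
samples = {
    "control": {"utr": 100.0, "cds": 100.0},
    "disease": {"utr": 90.0, "cds": 60.0},
}

for label, s in samples.items():
    summed = s["utr"] + s["cds"]
    ratio = utr_to_cds_ratio(s["utr"], s["cds"])
    print(f"{label}: summed signal = {summed}, 3'UTR-to-CDS ratio = {ratio}")
```

In this made-up case the disease sample has a lower summed signal (150 against 200) but a higher ratio (1.5 against 1.0), which is the pattern the abstract describes qualitatively.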
Non-coding RNAs in cancer brain metastasis
Wu, Kerui; Sharma, Sambad; Venkat, Suresh; Liu, Keqin; Zhou, Xiaobo; Watabe, Kounosuke
2017-01-01
More than 90% of cancer deaths are attributed to metastatic disease, and the brain is one of the major metastatic sites of melanoma, colon, renal, lung and breast cancers. Despite the recent advancement of targeted therapy for cancer, the incidence of brain metastasis is increasing. One reason is that most therapeutic drugs cannot penetrate the blood-brain barrier, so tumor cells find the brain to be a sanctuary site. In this review, we describe the pathophysiology of brain metastases to introduce the latest understandings of metastatic brain malignancies. This review also particularly focuses on non-coding RNAs and their roles in cancer brain metastasis. Furthermore, we discuss the roles of the extracellular vesicles as they are known to transport information between cells to initiate cancer cell-microenvironment communication. The potential clinical translation of non-coding RNAs as a tool for diagnosis and for treatment is also discussed in this review. At the end, the computational aspects of non-coding RNA detection, the sequence and structure calculation and epigenetic regulation of non-coding RNA in brain metastasis are discussed. PMID:26709907
Zorc, Minja; Kunej, Tanja
2016-05-01
MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5%) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region.
Identified miRNA-related genomic hotspots in chicken can serve researchers as a starting point for further functional studies and association studies with poultry production and health traits and the basis for systematic screening of exonic miRNAs and missense/miRNA seed polymorphisms in other genomes.
2017-12-01
poses a threat to regional security and economic stability—major U.S. national interests. Distributed maritime capability is demonstrated by applying...regional security, economic stability, fisheries enforcement...a dominant aggressor in the South China Sea that poses a threat to regional security and economic stability—major U.S. national interests
Standing your Ground to Exoribonucleases: Function of Flavivirus Long Non-coding RNAs
Charley, Phillida A.; Wilusz, Jeffrey
2015-01-01
Members of the Flaviviridae (e.g. Dengue virus, West Nile virus, and Hepatitis C virus) contain a positive-sense RNA genome that encodes a large polyprotein. It is now also clear that most if not all of these viruses also produce an abundant subgenomic long non-coding RNA. These non-coding RNAs, which are called subgenomic flavivirus RNAs (sfRNAs) or Xrn1-resistant RNAs (xrRNAs), are stable decay intermediates generated from the viral genomic RNA through the stalling of the cellular exoribonuclease Xrn1 at highly structured regions. Several functions of these flavivirus long non-coding RNAs have been revealed in recent years. The generation of these sfRNAs/xrRNAs from viral transcripts results in the repression of Xrn1 and the dysregulation of cellular mRNA stability. The abundant sfRNAs also serve directly as a decoy for important cellular protein regulators of the interferon and RNA interference antiviral pathways. Thus the generation of long non-coding RNAs from flaviviruses, hepaciviruses and pestiviruses likely disrupts aspects of innate immunity and may directly contribute to viral replication, cytopathology and pathogenesis. PMID:26368052
Confinement properties of tokamak plasmas with extended regions of low magnetic shear
NASA Astrophysics Data System (ADS)
Graves, J. P.; Cooper, W. A.; Kleiner, A.; Raghunathan, M.; Neto, E.; Nicolas, T.; Lanthaler, S.; Patten, H.; Pfefferle, D.; Brunetti, D.; Lutjens, H.
2017-10-01
Extended regions of low magnetic shear can be advantageous to tokamak plasmas. But the core and edge can be susceptible to non-resonant ideal fluctuations due to the weakened restoring force associated with magnetic field line bending. This contribution shows how saturated non-linear phenomenology, such as 1/1 Long Lived Modes, and Edge Harmonic Oscillations associated with QH-modes, can be modelled accurately using the non-linear stability code XTOR, the free boundary 3D equilibrium code VMEC, and non-linear analytic theory. That the equilibrium approach is valid is particularly valuable because it enables advanced particle confinement studies to be undertaken in the ordinarily difficult environment of strongly 3D magnetic fields. The VENUS-LEVIS code exploits the Fourier description of the VMEC equilibrium fields, such that full Lorentzian and guiding centre approximated differential operators in curvilinear angular coordinates can be evaluated analytically. Consequently, the confinement properties of minority ions such as energetic particles and high Z impurities can be calculated accurately over slowing down timescales in experimentally relevant 3D plasmas.
Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu
2015-05-27
Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.
Adaptive evolution of the matrix extracellular phosphoglycoprotein in mammals
2011-01-01
Background Matrix extracellular phosphoglycoprotein (MEPE) belongs to a family of small integrin-binding ligand N-linked glycoproteins (SIBLINGs) that play a key role in skeleton development, particularly in mineralization, phosphate regulation and osteogenesis. MEPE associated disorders cause various physiological effects, such as loss of bone mass, tumors and disruption of renal function (hypophosphatemia). The study of this developmental gene from an evolutionary perspective could provide valuable insights on the adaptive diversification of morphological phenotypes in vertebrates. Results Here we studied the adaptive evolution of the MEPE gene in 26 Eutherian mammals and three birds. The comparative genomic analyses revealed a high degree of evolutionary conservation of some coding and non-coding regions of the MEPE gene across mammals indicating a possible regulatory or functional role likely related with mineralization and/or phosphate regulation. However, the majority of the coding region had a fast evolutionary rate, particularly within the largest exon (1467 bp). Rodentia and Scandentia had distinct substitution rates with an increased accumulation of both synonymous and non-synonymous mutations compared with other mammalian lineages. Characteristics of the gene (e.g. biochemical, evolutionary rate, and intronic conservation) differed greatly among lineages of the eight mammalian orders. We identified 20 sites with significant positive selection signatures (codon and protein level) outside the main regulatory motifs (dentonin and ASARM) suggestive of an adaptive role. Conversely, we find three sites under selection in the signal peptide and one in the ASARM motif that were supported by at least one selection model. The MEPE protein tends to accumulate amino acids promoting disorder and potential phosphorylation targets. 
Conclusion MEPE shows a high number of selection signatures, revealing the crucial role of positive selection in the evolution of this SIBLING member. The selection signatures were found mainly outside the functional motifs, reinforcing the idea that other regions outside the dentonin and the ASARM might be crucial for the function of the protein and future studies should be undertaken to understand its importance. PMID:22103247
The complete mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae).
Zhou, Xuming; Chen, Yu; Zhu, Shanliang; Xu, Haigen; Liu, Yan; Chen, Lian
2016-01-01
The mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae) is the first complete mtDNA sequence reported in the genus Pomacea. The total length of the mtDNA is 15,707 bp, containing 13 protein-coding genes, 2 ribosomal RNAs, 22 transfer RNAs, and a 359 bp non-coding region. The overall A + T content of the H-strand is 71.7% (T: 41%, C: 12.7%, A: 30.7%, G: 15.6%). The ATP6, ATP8, CO1, CO2, ND1-3, ND5, ND6, ND4L and Cyt b genes begin with ATG as the start codon, whereas CO3 and ND4 begin with ATA. The ATP8, CO2-3, ND4L, ND2-6 and Cyt b genes terminate with TAA as the stop codon, whereas ATP6, ND1, and CO1 end with TAG. A long non-coding region is found, within which a 23 bp repeat unit repeats 11 times.
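The composition figures quoted in this abstract can be checked with simple arithmetic. The following is a minimal sketch, not code from the study: the helper names `base_composition` and `at_content` are hypothetical, and the toy sequence in the usage note is made up rather than taken from the P. canaliculata genome.

```python
from collections import Counter

def base_composition(seq):
    """Return the percentage of each base (A, C, G, T) in a DNA string."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}

def at_content(seq):
    """A + T content of a DNA string, in percent."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]

# Percentages as reported in the abstract (H-strand).
reported = {"T": 41.0, "C": 12.7, "A": 30.7, "G": 15.6}
reported_at = reported["A"] + reported["T"]   # A + T, matches the stated 71.7%

# 23 bp repeat unit x 11 copies, inside the 359 bp non-coding region.
repeat_span = 23 * 11
```

For a toy input such as `at_content("AATTGC")`, four of six bases are A or T, so the function returns roughly 66.7. The repeat array (253 bp) accounts for most of the 359 bp non-coding region.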
Long non-coding RNA produced by RNA polymerase V determines boundaries of heterochromatin
Böhmdorfer, Gudrun; Sethuraman, Shriya; Rowley, M Jordan; Krzyszton, Michal; Rothi, M Hafiz; Bouzit, Lilia; Wierzbicki, Andrzej T
2016-01-01
RNA-mediated transcriptional gene silencing is a conserved process where small RNAs target transposons and other sequences for repression by establishing chromatin modifications. A central element of this process are long non-coding RNAs (lncRNA), which in Arabidopsis thaliana are produced by a specialized RNA polymerase known as Pol V. Here we show that non-coding transcription by Pol V is controlled by preexisting chromatin modifications located within the transcribed regions. Most Pol V transcripts are associated with AGO4 but are not sliced by AGO4. Pol V-dependent DNA methylation is established on both strands of DNA and is tightly restricted to Pol V-transcribed regions. This indicates that chromatin modifications are established in close proximity to Pol V. Finally, Pol V transcription is preferentially enriched on edges of silenced transposable elements, where Pol V transcribes into TEs. We propose that Pol V may play an important role in the determination of heterochromatin boundaries. DOI: http://dx.doi.org/10.7554/eLife.19092.001 PMID:27779094
Junk DNA and the long non-coding RNA twist in cancer genetics
Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A
2015-01-01
The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839
Chromatin accessibility prediction via a hybrid deep convolutional neural network.
Liu, Qiao; Xia, Fei; Yin, Qijin; Jiang, Rui
2018-03-01
A majority of known genetic variants associated with human-inherited diseases lie in non-coding regions that lack adequate interpretation, making it indispensable to systematically discover functional sites at the whole genome level and precisely decipher their implications in a comprehensive manner. Although computational approaches have been complementing high-throughput biological experiments towards the annotation of the human genome, it still remains a big challenge to accurately annotate regulatory elements in the context of a specific cell type via automatic learning of the DNA sequence code from large-scale sequencing data. Indeed, the development of an accurate and interpretable model to learn the DNA sequence signature and further enable the identification of causative genetic variants has become essential in both genomic and genetic studies. We proposed Deopen, a hybrid framework mainly based on a deep convolutional neural network, to automatically learn the regulatory code of DNA sequences and predict chromatin accessibility. In a series of comparison with existing methods, we show the superior performance of our model in not only the classification of accessible regions against background sequences sampled at random, but also the regression of DNase-seq signals. Besides, we further visualize the convolutional kernels and show the match of identified sequence signatures and known motifs. We finally demonstrate the sensitivity of our model in finding causative noncoding variants in the analysis of a breast cancer dataset. We expect to see wide applications of Deopen with either public or in-house chromatin accessibility data in the annotation of the human genome and the identification of non-coding variants associated with diseases. Deopen is freely available at https://github.com/kimmo1019/Deopen. ruijiang@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. 
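As a rough illustration of what a convolutional kernel does when it scans a DNA sequence, the sketch below one-hot encodes a sequence and slides a single kernel along it. This is not the Deopen implementation: the function names, the A/C/G/T channel order, and the toy "TATA" kernel are all assumptions chosen for illustration, and a trained network would learn its kernel weights rather than have them set by hand.

```python
def one_hot(seq):
    """Encode a DNA string as rows of 4 values in A, C, G, T order."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        if base in index:          # unknown bases (e.g. N) stay all-zero
            row[index[base]] = 1.0
        rows.append(row)
    return rows

def conv1d_scan(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence, 'valid' positions only."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        s = 0.0
        for k in range(width):
            for c in range(4):
                s += encoded[start + k][c] * kernel[k][c]
        scores.append(s)
    return scores

# Hand-made kernel that responds most strongly to "TATA" (illustrative only).
tata_kernel = one_hot("TATA")
scores = conv1d_scan(one_hot("GGTATAGC"), tata_kernel)
best = max(range(len(scores)), key=scores.__getitem__)  # offset of the best match
```

Here the highest score falls at offset 2, where the sequence matches the kernel exactly. Comparing learned kernels with known motifs, as the abstract describes, amounts to asking which short sequences maximize this kind of score.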
Ikegami, Kohta; Ohgane, Jun; Tanaka, Satoshi; Yagi, Shintaro; Shiota, Kunio
2009-01-01
Genes constitute only a small proportion of the mammalian genome, the majority of which is composed of non-genic repetitive elements including interspersed repeats and satellites. A unique feature of the mammalian genome is that there are numerous tissue-dependent, differentially methylated regions (T-DMRs) in the non-repetitive sequences, which include genes and their regulatory elements. The epigenetic status of T-DMRs varies from that of repetitive elements and constitutes the DNA methylation profile genome-wide. Since the DNA methylation profile is specific to each cell and tissue type, much like a fingerprint, it can be used as a means of identification. The formation of DNA methylation profiles is the basis for cell differentiation and development in mammals. The epigenetic status of each T-DMR is regulated by the interplay between DNA methyltransferases, histone modification enzymes, histone subtypes, non-histone nuclear proteins and non-coding RNAs. In this review, we will discuss how these epigenetic factors cooperate to establish cell- and tissue-specific DNA methylation profiles.
Kühne, Annett; Kaiser, Rolf; Schirmer, Markus; Heider, Ulrike; Muhlke, Sabine; Niere, Wiebke; Overbeck, Tobias; Hohloch, Karin; Trümper, Lorenz; Sezer, Orhan; Brockmöller, Jürgen
2007-07-01
Melphalan is widely used in the treatment of multiple myeloma. Pharmacokinetics of this alkylating drug shows high inter-individual variability. As melphalan is a phenylalanine derivative, the pharmacokinetic variability may be determined by genetic polymorphisms in the L-type amino acid transporters LAT1 (SLC7A5) and LAT2 (SLC7A8). Pharmacokinetics were analysed in 64 patients after first administration of intravenous melphalan. Severity of side effects was documented according to WHO criteria. Genomic DNA was analysed for polymorphisms in LAT1 and LAT2 by sequencing of the entire coding region, intron-exon boundaries and 2 kb upstream promoter region. Selected polymorphisms in the common heavy chain of both transporters, the protein 4F2hc (SLC3A2), were analysed by single nucleotide primer extension. Melphalan pharmacokinetics was highly variable with up to 6.2-fold differences in total clearance. A total of 44 polymorphisms were identified in LAT1 and 21 polymorphisms in LAT2. From all variants, only five were in the coding region and only one heterozygous non-synonymous polymorphism (Ala94Thr) was found in LAT2. Numerous polymorphisms were found in the LAT1 and LAT2 5'-flanking regions but did not correlate with expression of the respective genes. No significant correlations could be observed between the polymorphisms in 4F2hc, LAT1, and LAT2 with melphalan pharmacokinetics or with melphalan side effects. The study confirmed that these transporter genes are highly conserved, particularly in the coding sequences. Genetic variation in 4F2hc, LAT1, and LAT2 does not appear to be a major cause of inter-individual variability in pharmacokinetics and of adverse reactions to melphalan.
Genetic characterisation of the recent foot-and-mouth disease virus subtype A/IRN/2005
Klein, Joern; Hussain, Manzoor; Ahmad, Munir; Normann, Preben; Afzal, Muhammad; Alexandersen, Soren
2007-01-01
Background According to the World Reference Laboratory for FMD, a new subtype of FMDV serotype A was detected in Iran in 2005. This subtype was designated A/IRN/2005, and rapidly spread throughout Iran and moved westwards into Saudi Arabia and Turkey where it was initially detected from August 2005 and subsequently caused major disease problems in the spring of 2006. The same subtype reached Jordan in 2007. As part of an ongoing project we have also detected this subtype in Pakistan with the first positive samples detected in April 2006. To characterise this subtype in detail, we have sequenced and analysed the complete coding sequence of three subtype A/IRN/2005 isolates collected in Pakistan in 2006, the complete coding sequence of one subtype A/IRN/2005 isolate collected during the first outbreak in Turkey in 2005 and, in addition, the partial 1D coding sequence derived from 4 epithelium samples and 34 swab-samples from Asian buffaloes or cattle subsequently found to be infected with the A/IRN/2005 subtype. Results The phylogenies of the genome regions encoding for the structural proteins, displayed, with the exception of 1A, distinct, serotype-specific clustering and an evolutionary relationship of the A/IRN/2005 sublineage with the A22 sublineage. Potential recombination events have been detected in parts of the genome region coding for the non-structural proteins of FMDV. In addition, amino acid substitutions have been detected in the deduced VP1 protein sequence, potentially related to clinical or subclinical outcome of FMD. Indications of differential susceptibility for developing a subclinical course of disease between Asian buffaloes and cattle have been detected. Furthermore, hitherto unknown insertions of 2 amino acids before the second start codon, as well as sublineage specific amino acids have been detected in the genome region encoding for the leader proteinase of A/IRN/2005 sublineage. 
Conclusion Our findings indicate that the A/IRN/2005 sublineage has undergone two different paths of evolution for the structural and non-structural genome regions. The structural genome regions have had their evolutionary starting point in the A22 sublineage. It can be assumed that, due to the quasispecies structure of FMDV populations and the error-prone replication process, advantageous mutations in a changed environment have been fixed and led to the occurrence of the new A/IRN/2005 sublineage. Together with this mechanism, recombination within the non-structural genome regions, potentially modifying the virulence of the virus, may be involved in the success of this new sublineage. The possible origin of this recombinant virus may be a co-infection with Asia1 and a serotype A precursor of the A/IRN/2005 sublineage, potentially within Asian buffaloes, as these appear to become infected relatively easily, but usually without developing clinical disease and consequently without showing a strong acute inflammatory immune response against a second FMDV infection. PMID:18001482
Ming-Xing, Lu; Zhi-Teng, Chen; Wei-Wei, Yu; Yu-Zhou, Du
2017-03-01
We report the complete mitochondrial genome (mitogenome) of a spiraling whitefly, Aleurodicus dispersus (Hemiptera: Aleyrodidae). The 16 170 bp long genome consists of 13 protein-coding genes, 20 transfer RNAs, 2 ribosomal RNAs, and a control region. The A. dispersus mitogenome also includes a cytb-like non-coding region and shows several variations relative to the typical insect mitogenome. A phylogenetic tree has been constructed using the 13 protein-coding genes of 12 related species from Hemiptera. Our results would contribute to further study of phylogeny in Aleyrodidae and Hemiptera.
2004-09-01
to provide access to and protect are the NG Game Code, Employee Files, E-MAIL, Marketing Plans and Legacy Code. The NG Game Code is the MOST...major hit. E-MAIL is classified NON-SENSITIVE. The Marketing Plans are for the NG Game Code. They contain information concerning what the new...simulation games currently on the market except that, rather than allowing players to choose rides, refreshments and facilities, CyberCIEGE will
New genetic variants of LATS1 detected in urinary bladder and colon cancer.
Saadeldin, Mona K; Shawer, Heba; Mostafa, Ahmed; Kassem, Neemat M; Amleh, Asma; Siam, Rania
2014-01-01
LATS1, the large tumor suppressor 1 gene, encodes a serine/threonine kinase protein and is implicated in cell cycle progression. LATS1 is down-regulated in various human cancers, such as breast cancer and astrocytoma. Point mutations in LATS1 were reported in human sarcomas. Additionally, loss of heterozygosity of the LATS1 chromosomal region predisposes to breast, ovarian, and cervical tumors. In the current study, we investigated LATS1 genetic variations, including single nucleotide polymorphisms (SNPs), in 28 Egyptian patients with either urinary bladder or colon cancers. The LATS1 gene was amplified and sequenced, and the expression of LATS1 at the RNA level was assessed in 12 urinary bladder cancer samples. We report the identification of a total of 29 variants, including previously identified SNPs, within LATS1 coding and non-coding sequences. A total of 18 variants were novel. The majority of the novel variants, 13, mapped to intronic sequences and untranslated regions of the gene. Four of the five novel variants located in the coding region of the gene represented missense mutations within the serine/threonine kinase catalytic domain. Interestingly, LATS1 RNA steady-state expression was lost in urinary bladder cancerous tissue harboring four specific SNPs (16045 + 41736 + 34614 + 56177) positioned in the 5'UTR, intron 6, and two silent mutations within exon 4 and exon 8, respectively. This study identifies novel single-base-sequence alterations in the LATS1 gene. These newly identified variants could potentially be used as novel diagnostic or prognostic tools in cancer.
Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species.
Chen, Zhiwen; Feng, Kun; Grover, Corrinne E; Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F; Wang, Kunbo; Hua, Jinping
2016-01-01
The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support for the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium.
Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species
Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F.; Wang, Kunbo
2016-01-01
The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support for the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium. PMID:27309527
Chen, L P; E, G X; Zhao, Y J; Na, R S; Zhao, Z Q; Zhang, J H; Ma, Y H; Sun, Y W; Zhong, T; Zhang, H P; Huang, Y F
2015-06-18
DRA encodes the alpha chain of the DR heterodimer, is closely linked to DRB, and is considered almost monomorphic within the major histocompatibility complex region. In this study, we examined exon 2 of DRA to evaluate the immunogenetic diversity of indigenous goats from southern China. Two single nucleotide polymorphisms in an untranslated region and one synonymous substitution in the coding region were identified. These data suggest high immunodiversity in the native Chinese population.
Ahmad, Muneer; Jung, Low Tan; Bhuiyan, Al-Amin
2017-10-01
Digital signal processing techniques commonly employ fixed length window filters to process the signal contents. DNA signals differ in characteristics from common digital signals since they carry nucleotides as contents. The nucleotides carry genetic code context and exhibit fuzzy behaviors due to their special structure and order in the DNA strand. Employing conventional fixed length window filters for DNA signal processing produces spectral leakage and hence results in signal noise. A biological context aware adaptive window filter is required to process the DNA signals. This paper introduces a biologically inspired fuzzy adaptive window median filter (FAWMF) which computes the fuzzy membership strength of nucleotides in each slide of the window and filters nucleotides based on median filtering with a combination of s-shaped and z-shaped filters. Since coding regions cause 3-base periodicity through an unbalanced nucleotide distribution that produces a relatively high bias in nucleotide usage, this fundamental characteristic of nucleotides has been exploited in FAWMF to suppress the signal noise. Along with the adaptive response of FAWMF, a strong correlation between median nucleotides and the Π shaped filter was observed, which produced enhanced discrimination between coding and non-coding regions, in contrast to fixed length conventional window filters. The proposed FAWMF attains a significant enhancement in coding region identification, i.e. 40% to 125%, as compared to other conventional window filters tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. This study proves that conventional fixed length window filters applied to DNA signals do not achieve significant results since the nucleotides carry genetic code context. The proposed FAWMF algorithm is adaptive and performs significantly better in processing DNA signal contents. 
The algorithm, applied to a variety of DNA datasets, produced noteworthy discrimination between coding and non-coding regions, in contrast to fixed window length conventional filters. Copyright © 2017 Elsevier B.V. All rights reserved.
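The 3-base periodicity that this abstract relies on can be illustrated with the conventional fixed-length window measure that FAWMF is contrasted against. The sketch below is not the FAWMF algorithm: it computes only the classic period-3 spectral power of base indicator sequences, and the function names and default window and step sizes are illustrative assumptions.

```python
import cmath

def period3_power(seq, base):
    """Squared DFT magnitude of one base's indicator sequence at frequency 1/3."""
    total = 0j
    for n, b in enumerate(seq.upper()):
        if b == base:
            total += cmath.exp(-2j * cmath.pi * n / 3)
    return abs(total) ** 2

def period3_profile(seq, window=351, step=3):
    """Summed period-3 power of A, C, G, T in each fixed-length window."""
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(sum(period3_power(chunk, b) for b in "ACGT"))
    return profile

# A perfectly codon-periodic toy sequence versus a homopolymer of equal length.
periodic = period3_profile("ATG" * 20, window=60)
flat = period3_profile("A" * 60, window=60)
```

The periodic toy sequence gives a large value (each of A, T and G contributes 20 squared, so about 1200 in total), while the homopolymer gives essentially zero. Real coding regions show a weaker but still elevated peak, and the choice of a fixed window length is exactly the limitation the abstract sets out to address.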
Genetics of Inflammatory Bowel Diseases
McGovern, Dermot; Kugathasan, Subra; Cho, Judy H.
2015-01-01
In this Review, we provide an update on genome-wide association studies (GWAS) in inflammatory bowel disease (IBD). In addition, we summarize progress in defining the functional consequences of associated alleles for coding and non-coding genetic variation. In the small minority of loci where major association signals correspond to non-synonymous variation, we summarize studies defining their functional effects and implications for therapeutic targeting. Importantly, the large majority of GWAS-associated loci involve non-coding variation, many of which modulate levels of gene expression. Recent expression quantitative trait loci (eQTL) studies have established that expression of the large majority of human genes is regulated by non-coding genetic variation. Significant advances in defining the epigenetic landscape have demonstrated that IBD GWAS signals are highly enriched within cell-specific active enhancer marks. Studies in European ancestry populations have dominated the landscape of IBD genetics studies, but increasingly, studies in Asian and African-American populations are being reported. Common variation accounts for only a modest fraction of the predicted heritability and the role of rare genetic variation of higher effects (i.e. odds ratios markedly deviating from one) is increasingly being identified through sequencing efforts. These sequencing studies have been particularly productive in very-early onset, more severe cases. A major challenge in IBD genetics will be harnessing the vast array of genetic discovery for clinical utility, through emerging precision medicine initiatives. We discuss the rapidly evolving area of direct to consumer genetic testing, as well as the current utility of clinical exome sequencing, especially in very early onset, severe IBD cases. We summarize recent progress in the pharmacogenetics of IBD with respect of partitioning patient responses to anti-TNF and thiopurine therapies. 
Highly collaborative studies across research centers and across subspecialties and disciplines will be required to fully realize the promise of genetic discovery in IBD. PMID:26255561
Dong, Lu; Zhao, Xin; Ong, Stacie L; Harvey, Allison G
2017-10-01
The current study examined whether and which specific contents of patients' memory for cognitive therapy (CT) were associated with treatment adherence and outcome. Data were drawn from a pilot RCT of forty-eight depressed adults, who received either CT plus Memory Support Intervention (CT + Memory Support) or CT-as-usual. Patients' memory for treatment was measured using the Patient Recall Task and responses were coded into cognitive behavioral therapy (CBT) codes, such as CBT Model and Cognitive Restructuring, and non-CBT codes, such as individual coping strategies and no code. Treatment adherence was measured using therapist and patient ratings during treatment. Depression outcomes included treatment response, remission, and recurrence. Total number of CBT codes recalled was not significantly different comparing CT + Memory Support to CT-as-usual. Total CBT codes recalled were positively associated with adherence, while non-CBT codes recalled were negatively associated with adherence. Treatment responders (vs. non-responders) exhibited a significant increase in their recall of Cognitive Restructuring from session 7 to posttreatment. Greater recall of Cognitive Restructuring was marginally significantly associated with remission. Greater total number of CBT codes recalled (particularly CBT Model) was associated with non-recurrence of depression. Results highlight the important relationships between patients' memory for treatment and treatment adherence and outcome. Copyright © 2017 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramsdell, J.V. Jr.; Simonen, C.A.; Burk, K.W.
1994-02-01
The purpose of the Hanford Environmental Dose Reconstruction (HEDR) Project is to estimate radiation doses that individuals may have received from operations at the Hanford Site since 1944. This report deals specifically with the atmospheric transport model, Regional Atmospheric Transport Code for Hanford Emission Tracking (RATCHET). RATCHET is a major rework of the MESOILT2 model used in the first phase of the HEDR Project; only the bookkeeping framework escaped major changes. Changes to the code include (1) significant changes in the representation of atmospheric processes and (2) incorporation of Monte Carlo methods for representing uncertainty in input data, model parameters, and coefficients. To a large extent, the revisions to the model are based on recommendations of a peer working group that met in March 1991. Technical bases for other portions of the atmospheric transport model are addressed in two other documents. This report has three major sections: a description of the model, a user's guide, and a programmer's guide. These sections discuss RATCHET from three different perspectives. The first provides a technical description of the code with emphasis on details such as the representation of the model domain, the data required by the model, and the equations used to make the model calculations. The technical description is followed by a user's guide to the model with emphasis on running the code. The user's guide contains information about the model input and output. The third section is a programmer's guide to the code. It discusses the hardware and software required to run the code. The programmer's guide also discusses program structure and each of the program elements.
Biochemical and genetic analysis of the role of the viral polymerase in enterovirus recombination.
Woodman, Andrew; Arnold, Jamie J; Cameron, Craig E; Evans, David J
2016-08-19
Genetic recombination in single-strand, positive-sense RNA viruses is a poorly understood mechanism responsible for generating extensive genetic change and novel phenotypes. By moving a critical cis-acting replication element (CRE) from the polyprotein coding region to the 3' non-coding region we have further developed a cell-based assay (the 3'CRE-REP assay) to yield recombinants throughout the non-structural coding region of poliovirus from dually transfected cells. We have additionally developed a defined biochemical assay in which the only protein present is the poliovirus RNA dependent RNA polymerase (RdRp), which recapitulates the strand transfer events of the recombination process. We have used both assays to investigate the role of the polymerase fidelity and nucleotide turnover rates in recombination. Our results, of both poliovirus intertypic and intratypic recombination in the CRE-REP assay and using a range of polymerase variants in the biochemical assay, demonstrate that RdRp fidelity is a fundamental determinant of recombination frequency. High fidelity polymerases exhibit reduced recombination and low fidelity polymerases exhibit increased recombination in both assays. These studies provide the basis for the analysis of poliovirus recombination throughout the non-structural region of the virus genome and provide a defined biochemical assay to further dissect this important evolutionary process. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The complete mitochondrial genome of the Giant Manta ray, Manta birostris.
Hinojosa-Alvarez, Silvia; Díaz-Jaimes, Pindaro; Marcet-Houben, Marina; Gabaldón, Toni
2015-01-01
The complete mitochondrial genome of the giant manta ray (Manta birostris) consists of 18,075 bp with rich A + T and low G content. Gene organization and length are similar to those of other ray species. It comprises 13 protein-coding genes, 2 rRNA genes, 23 tRNA genes, 1 non-coding sequence, and the control region. We identified an AT tandem repeat region, similar to that reported in Mobula japanica.
Cenik, Can; Chua, Hon Nian; Singh, Guramrit; Akef, Abdalla; Snyder, Michael P; Palazzo, Alexander F; Moore, Melissa J; Roth, Frederick P
2017-03-01
Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions ("5IM" transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5' proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC. © 2017 Cenik et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Drakos, Nicole E; Wahl, Lindi M
2015-12-01
Theoretical approaches are essential to our understanding of the complex dynamics of mobile genetic elements (MGEs) within genomes. Recently, the birth-death-diversification model was developed to describe the dynamics of mobile promoters (MPs), a particular class of MGEs in prokaryotes. A unique feature of this model is that genetic diversification of elements was included. To explore the implications of diversification on the longterm fate of MGE lineages, in this contribution we analyze the extinction probabilities, extinction times and equilibrium solutions of the birth-death-diversification model. We find that diversification increases both the survival and growth rate of MGE families, but the strength of this effect depends on the rate of horizontal gene transfer (HGT). We also find that the distribution of MGE families per genome is not necessarily monotonically decreasing, as observed for MPs, but may have a peak in the distribution that is related to the HGT rate. For MPs specifically, we find that new families have a high extinction probability, and predict that the number of MPs is increasing, albeit at a very slow rate. Additionally, we develop an extension of the birth-death-diversification model which allows MGEs in different regions of the genome, for example coding and non-coding, to be described by different rates. This extension may offer a potential explanation as to why the majority of MPs are located in non-promoter regions of the genome. Copyright © 2015 Elsevier Inc. All rights reserved.
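For orientation, the extinction probability discussed in this abstract has a well-known closed form in the simplest case. The sketch below covers only a plain linear birth-death process started from a single element, with no diversification and no horizontal gene transfer, so it is a simplified stand-in and not the birth-death-diversification model itself. The function names, the population cap, and the rates used in the example are illustrative assumptions.

```python
import random

def extinction_probability(birth, death):
    """Extinction probability of a linear birth-death lineage started from one copy."""
    if birth <= 0 or death <= 0:
        raise ValueError("rates must be positive")
    return min(1.0, death / birth)

def simulate_extinction(birth, death, trials=2000, cap=50, seed=0):
    """Monte Carlo estimate using the embedded jump chain of the process.

    A lineage that reaches `cap` copies is counted as surviving, which is a
    very good approximation when birth > death.
    """
    if birth <= 0 or death <= 0:
        raise ValueError("rates must be positive")
    rng = random.Random(seed)
    p_birth = birth / (birth + death)   # probability the next event is a birth
    extinct = 0
    for _ in range(trials):
        n = 1
        while 0 < n < cap:
            n += 1 if rng.random() < p_birth else -1
        if n == 0:
            extinct += 1
    return extinct / trials

analytic = extinction_probability(2.0, 1.0)
estimate = simulate_extinction(2.0, 1.0)
```

With a birth rate twice the death rate, the closed form gives an extinction probability of one half, and the simulated estimate should land close to that value. Even a growing family is therefore frequently lost while it is still small, which is consistent with the abstract's remark that new families have a high extinction probability.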
Newtonian CAFE: a new ideal MHD code to study the solar atmosphere
NASA Astrophysics Data System (ADS)
González, J. J.; Guzmán, F.
2015-12-01
In this work we present a new independent code designed to solve the equations of classical ideal magnetohydrodynamics (MHD) in three dimensions, subject to a constant gravitational field. The code is intended for the analysis of solar phenomena within the photosphere-corona region. In particular, the code can simulate the propagation of impulsively generated linear and non-linear MHD waves in the non-isothermal solar atmosphere. We present 1D and 2D standard tests to demonstrate the quality of the numerical results obtained with our code. As 3D tests we present the propagation of MHD-gravity waves and vortices in the solar atmosphere. The code is based on high-resolution shock-capturing methods and uses the HLLE flux formula combined with Minmod, MC and WENO5 reconstructors. The divergence-free magnetic field constraint is controlled using the Flux Constrained Transport method.
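Of the reconstructors named above, Minmod is the simplest to write down. The sketch below is a generic one-dimensional illustration, not code taken from Newtonian CAFE; it assumes unit grid spacing and a plain list of cell averages, and the function names are hypothetical.

```python
def minmod(a: float, b: float) -> float:
    """Return the argument of smaller magnitude if a and b share a sign, else 0."""
    if a * b <= 0.0:
        return 0.0  # opposite signs or a zero slope: local extremum, flatten
    return a if abs(a) < abs(b) else b


def limited_slopes(u):
    """Minmod-limited slope in each interior cell of a 1D list of cell averages.

    Assumes unit grid spacing; the two boundary cells keep a zero slope.
    """
    slopes = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        backward = u[i] - u[i - 1]
        forward = u[i + 1] - u[i]
        slopes[i] = minmod(backward, forward)
    return slopes


print(limited_slopes([0.0, 1.0, 3.0, 2.0]))
```

The limiter keeps the smaller one-sided difference in smooth monotone regions and drops to zero at extrema, which is what suppresses spurious oscillations near shocks in a shock-capturing scheme.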
Quantitative Profiling of Peptides from RNAs classified as non-coding
Prabakaran, Sudhakaran; Hemberg, Martin; Chauhan, Ruchi; Winter, Dominic; Tweedie-Cullen, Ry Y.; Dittrich, Christian; Hong, Elizabeth; Gunawardena, Jeremy; Steen, Hanno; Kreiman, Gabriel; Steen, Judith A.
2014-01-01
Only a small fraction of the mammalian genome codes for messenger RNAs destined to be translated into proteins, and it is generally assumed that a large portion of transcribed sequences, including introns and several classes of non-coding RNAs (ncRNAs), do not give rise to peptide products. A systematic examination of translation and physiological regulation of ncRNAs has not been conducted. Here, we use computational methods to identify the products of non-canonical translation in mouse neurons by analyzing unannotated transcripts in combination with proteomic data. This study supports the existence of non-canonical translation products from both intragenic and extragenic genomic regions, including peptides derived from anti-sense transcripts and introns. Moreover, the studied novel translation products exhibit temporal regulation similar to that of proteins known to be involved in neuronal activity processes. These observations highlight a potentially large and complex set of biologically regulated translational events from transcripts formerly thought to lack coding potential. PMID:25403355
RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”
Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu
2012-01-01
Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single-nucleotide-resolution transcriptome map of H. somni strain 2336 obtained using the RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome, of which 82 had not been predicted or reported in earlier studies. We also identified 38 novel potential protein-coding open reading frames that were absent from the current genome annotation. The transcriptome map allowed the identification of 278 operon structures (730 genes in total) in the genome. When compared with the genome sequence of the non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic regions unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
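The word "disproportionate" above can be made concrete with a one-sided binomial test. The sketch below is an illustration only and is not an analysis reported in the abstract: it assumes sRNAs would otherwise fall uniformly along the genome, and it approximates the count of sRNAs in the strain-specific regions from the rounded percentages (about 30% of 94).

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_srnas = 94                          # sRNAs identified (from the abstract)
k_in_unique = round(0.30 * n_srnas)   # ~30% of sRNAs; approximated count (assumption)
p_region = 0.18                       # strain-specific share of the genome (~18%)

# Probability of seeing at least this many sRNAs in the strain-specific
# regions if sRNAs were placed uniformly at random along the genome.
p_value = binomial_upper_tail(k_in_unique, n_srnas, p_region)
print(f"{k_in_unique} of {n_srnas} sRNAs in unique regions, one-sided p = {p_value:.4g}")
```

Under these assumptions the expected count is about 17 (0.18 × 94), so an observed count near 28 lies well into the upper tail.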
Küpper, Clemens; Burke, Terry; Lank, David B.
2015-01-01
Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs are likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species. PMID:25534935
Wang, Hongbo; Zhao, Yingchao; Chen, Mingyue; Cui, Jie
2017-01-01
Cervical cancer is the third most common cancer worldwide and the fourth leading cause of cancer-associated mortality in women. Accumulating evidence indicates that long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) may play key roles in the carcinogenesis of different cancers; however, little is known about the mechanisms of lncRNAs and circRNAs in the progression and metastasis of cervical cancer. In this study, we explored the expression profiles of lncRNAs, circRNAs, miRNAs, and mRNAs in HPV16 (human papillomavirus genotype 16) mediated cervical squamous cell carcinoma and matched adjacent non-tumor (ATN) tissues from three patients with high-throughput RNA sequencing (RNA-seq). In total, we identified 19 lncRNAs, 99 circRNAs, 28 miRNAs, and 304 mRNAs that were commonly differentially expressed (DE) in different patients. Among the non-coding RNAs, 3 lncRNAs and 44 circRNAs are novel to our knowledge. Functional enrichment analysis showed that DE lncRNAs, miRNAs, and mRNAs were enriched in pathways crucial to cancer as well as other gene ontology (GO) terms. Furthermore, the co-expression network and function prediction suggested that all 19 DE lncRNAs could play different roles in the carcinogenesis and development of cervical cancer. The competing endogenous RNA (ceRNA) network based on DE coding and non-coding RNAs showed that each miRNA targeted a number of lncRNAs and circRNAs. The link between part of the miRNAs in the network and cervical cancer has been validated in previous studies, and these miRNAs targeted the majority of the novel non-coding RNAs, thus suggesting that these novel non-coding RNAs may be involved in cervical cancer. Taken together, our study shows that DE non-coding RNAs could be further developed as diagnostic and therapeutic biomarkers of cervical cancer. The complex ceRNA network also lays the foundation for future research of the roles of coding and non-coding RNAs in cervical cancer. PMID:28970820
Diehl, William E.; Johnson, Welkin E.; Hunter, Eric
2013-01-01
All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. 
This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions. PMID:23516500
2013-01-01
Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type of melanin (eumelanin or pheomelanin) and in the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them was associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680
Carr, Michael J; McCormack, Grace P; Mutton, Ken J; Crowley, Brendan
2006-04-01
Hematopoietic stem cell transplant recipients frequently develop BK virus (BKV)-associated hemorrhagic cystitis, which coincides with BK viruria. However, the precise role of BKV in the etiology of hemorrhagic cystitis in hematopoietic stem cell transplant recipients remains unclear, since approximately 50% of all such adult transplant recipients excrete BKV, yet do not develop this clinical condition. In the present study, BKV were analyzed to determine if mutations in the non-coding control region (NCCR), and specific BKV sub-types defined by sequence analysis of major capsid protein VP1, were associated with development of hemorrhagic cystitis in hematopoietic stem cell transplant recipients. The regions encoding VP1 and NCCRs of BKV in urine samples collected from 15 hematopoietic stem cell transplant recipients with hemorrhagic cystitis and 20 without this illness were amplified and sequenced. Sequence variations in the NCCRs of BKV were identified in urine samples from those with and without hemorrhagic cystitis. Furthermore, five unique sequence variations within transcription factor binding sites in the canonical NCCR, O-P-Q-R-S, were identified, representing new BKV variants from a population of cloned quasi-species obtained from patients with and without hemorrhagic cystitis. Thirty-five BKV VP1 sequences were analyzed by phylogenetic analysis but no specific BKV sub-type was associated with hemorrhagic cystitis. Five previously unrecognized naturally occurring variants of the BKV are described which involve amplifications, deletions, and rearrangements of the archetypal BKV NCCRs in individuals with and without hemorrhagic cystitis. Architectural rearrangements in the NCCRs of BKV did not appear to be a prerequisite for development of hemorrhagic cystitis in hematopoietic stem cell transplant recipients. Copyright 2006 Wiley-Liss, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Helfenbein, Kevin G.; Brown, Wesley M.; Boore, Jeffrey L.
We have sequenced the complete mitochondrial DNA (mtDNA) of the articulate brachiopod Terebratalia transversa. The circular genome is 14,291 bp in size, relatively small compared to other published metazoan mtDNAs. The 37 genes commonly found in animal mtDNA are present; the size decrease is due to the truncation of several tRNA, rRNA, and protein genes, to some nucleotide overlaps, and to a paucity of non-coding nucleotides. Although the gene arrangement differs radically from those reported for other metazoans, some gene junctions are shared with two other articulate brachiopods, Laqueus rubellus and Terebratulina retusa. All genes in the T. transversa mtDNA, unlike those in most metazoan mtDNAs reported, are encoded by the same strand. The A+T content (59.1 percent) is low for a metazoan mtDNA, and there is a high propensity for homopolymer runs and a strong base-compositional strand bias. The coding strand is quite G+T-rich, a skew that is shared by the confamilial (laqueid) species L. rubellus, but opposite to that found in T. retusa, a cancellothyridid. These compositional skews are strongly reflected in the codon usage patterns and the amino acid compositions of the mitochondrial proteins, with markedly different usage observed between T. retusa and the two laqueids. This observation, plus the similarity of the laqueid non-coding regions to the reverse complement of the non-coding region of the cancellothyridid, suggests that an inversion that resulted in a reversal in the direction of first-strand replication has occurred in one of the two lineages. In addition to the presence of one non-coding region in T. transversa that is comparable to those in the other brachiopod mtDNAs, there are two others with the potential to form secondary structures; one or both of these may be involved in the process of transcript cleavage.
Redwan, R M; Saidin, A; Kumar, S V
2015-08-12
Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast sequence of the MD-2 pineapple, sequenced using PacBio sequencing technology. In this study, the high error rate of the long PacBio reads of A. comosus total genomic DNA was reduced by using highly accurate but short Illumina reads for error correction via the latest error-correction module from Novocraft. Error-corrected long PacBio reads were assembled using a single tool to produce a contig representing the pineapple chloroplast genome. The genome, 159,636 bp in length, has the conserved quadripartite chloroplast structure, containing a large single-copy region (LSC) of 87,482 bp, a small single-copy region (SSC) of 18,622 bp and two inverted repeat regions (IRA and IRB) of 26,766 bp each. Overall, the genome contained 117 unique coding regions, and 30 were repeated in the IR regions, with gene content, structure and arrangement similar to those of its sister taxon, Typha latifolia. A total of 35 repeat structures were detected in the coding and non-coding regions, the majority being tandem repeats. In addition, 205 SSRs were detected in the genome, with six protein-coding genes containing more than two SSRs. Comparison of chloroplast genomes from the subclass Commelinidae revealed a conserved protein-coding gene, albeit located in a highly divergent region. Analysis of selection pressure on protein-coding genes using the Ka/Ks ratio showed significant positive selection on the rps7 gene of the pineapple chloroplast (P < 0.05).
Phylogenetic analysis confirmed the recent taxonomic relationships among the members of the commelinids, supporting a monophyletic relationship between Arecales and Dasypogonaceae and between Zingiberales and Poales, which includes A. comosus. The complete sequence of the pineapple chloroplast provides insights into the divergence of genic chloroplast sequences among members of the subclass Commelinidae. The complete pineapple chloroplast genome will serve as a reference for in-depth taxonomic studies in the Bromeliaceae family when more species in the family are sequenced in the future. The genetic sequence information will also make other molecular applications of the pineapple chloroplast feasible for plant genetic improvement.
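The region sizes reported above can be checked against the quadripartite layout, in which the total length is the LSC plus the SSC plus two copies of the inverted repeat. The short sketch below uses only the figures given in the abstract; the function and variable names are illustrative.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length for a quadripartite layout: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


# Region sizes in bp, as reported for the MD-2 pineapple chloroplast genome.
LSC, SSC, IR, REPORTED_TOTAL = 87_482, 18_622, 26_766, 159_636

total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)

# Share of the genome occupied by the two inverted repeat copies.
ir_fraction = 2 * IR / total
print(f"inverted repeats cover {ir_fraction:.1%} of the genome")
```

The reported sizes are internally consistent: 87,482 + 18,622 + 2 × 26,766 = 159,636 bp.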
The complete mitochondrial genome of Rapana venosa (Gastropoda, Muricidae).
Sun, Xiujun; Yang, Aiguo
2016-01-01
The complete mitochondrial (mt) genome of the veined rapa whelk, Rapana venosa, was determined using genome walking techniques in this study. The total length of the mt genome sequence of R. venosa was 15,271 bp, which is comparable to the reported Muricidae mitogenomes to date. It contained 13 protein-coding genes, 21 transfer RNA genes, and two ribosomal RNA genes. A bias towards a higher representation of nucleotides A and T (69%) was detected in the mt genome of R. venosa. A small number of non-coding nucleotides (302 bp) was detected, and the largest non-coding region was 74 bp in length.
The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101
NASA Astrophysics Data System (ADS)
Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.
2014-08-01
Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.
A lncRNA Perspective into (Re)Building the Heart.
Frank, Stefan; Aguirre, Aitor; Hescheler, Juergen; Kurian, Leo
2016-01-01
Our conception of the human genome, long focused on the 2% that codes for proteins, has profoundly changed since its first draft assembly in 2001. Since then, an unanticipatedly expansive functionality and convolution has been attributed to the majority of the genome that is transcribed in a cell-type/context-specific manner into transcripts with no apparent protein coding ability. While the majority of these transcripts, currently annotated as long non-coding RNAs (lncRNAs), are functionally uncharacterized, their prominent role in embryonic development and tissue homeostasis, especially in the context of the heart, is emerging. In this review, we summarize and discuss the latest advances in understanding the relevance of lncRNAs in (re)building the heart.
Carapelli, Antonio; Comandi, Sara; Convey, Peter; Nardi, Francesco; Frati, Francesco
2008-01-01
Background Mitogenomics data, i.e. complete mitochondrial genome sequences, are popular molecular markers used for phylogenetic, phylogeographic and ecological studies in different animal lineages. Their comparative analysis has been used to shed light on the evolutionary history of given taxa and on the molecular processes that regulate the evolution of the mitochondrial genome. A considerable literature is available in the fields of invertebrate biochemical and ecophysiological adaptation to extreme environmental conditions, exemplified by those of the Antarctic. Nevertheless, limited molecular data are available from terrestrial Antarctic species, and this study represents the first attempt towards the description of a mitochondrial genome from one of the most widespread and common collembolan species of Antarctica. Results In this study we describe the mitochondrial genome of the Antarctic collembolan Cryptopygus antarcticus Willem, 1901. The genome contains the standard set of 37 genes usually present in animal mtDNAs and a large non-coding fragment putatively corresponding to the region (A+T-rich) responsible for the control of replication and transcription. All genes are arranged in the gene order typical of Pancrustacea. Three additional short non-coding regions are present at gene junctions. Two of these are located in positions of abrupt shift of the coding polarity of genes oriented on opposite strands suggesting a role in the attenuation of the polycistronic mRNA transcription(s). In addition, remnants of an additional copy of trnL(uag) are present between trnS(uga) and nad1. Nucleotide composition is biased towards a high A% and T% (A+T = 70.9%), as typically found in hexapod mtDNAs. There is also a significant strand asymmetry, with the J-strand being more abundant in A and C. 
Within the A+T-rich region, some short sequence fragments appear to be similar (in position and primary sequence) to those involved in the origin of the N-strand replication of the Drosophila mtDNA. Conclusion The mitochondrial genome of C. antarcticus shares several features with other pancrustacean genomes, although the presence of unusual non-coding regions is also suggestive of molecular rearrangements that probably occurred before the differentiation of major collembolan families. Closer examination of gene boundaries also confirms previous observations on the presence of unusual start and stop codons, and suggests a role for tRNA secondary structures as potential cleavage signals involved in the maturation of the primary transcript. Sequences potentially involved in the regulation of replication/transcription are present both in the A+T-rich region and in other areas of the genome. Their position is similar to that observed in a limited number of insect species, suggesting unique replication/transcription mechanisms for basal and derived hexapod lineages. This initial description and characterization of the mitochondrial genome of C. antarcticus will constitute the essential foundation prerequisite for investigations of the evolutionary history of one of the most speciose collembolan genera present in Antarctica and other localities of the Southern Hemisphere. PMID:18593463
Impacts of Bt crops on non-target organisms and insecticide use patterns
USDA-ARS's Scientific Manuscript database
Bacillus thuringiensis (Bt), a bacterium capable of producing insecticidal proteins is ubiquitous in the environment, and the genes coding for these proteins are now becoming ubiquitous in major crop plants via recombinant DNA technology where they provide host plant resistance to major lepidopteran...
RNAcode: Robust discrimination of coding and noncoding regions in comparative sequence data
Washietl, Stefan; Findeiß, Sven; Müller, Stephan A.; Kalkhof, Stefan; von Bergen, Martin; Hofacker, Ivo L.; Stadler, Peter F.; Goldman, Nick
2011-01-01
With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied “out of the box,” without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as “noncoding.” RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode. PMID:21357752
Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity
Shabalina, Svetlana A.; Spiridonov, Nikolay A.; Kashina, Anna
2013-01-01
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. A wealth of structural and regulatory information, which mRNA carries in addition to the encoded amino acid sequence, raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. RNA-level selection acts on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on the coding gene regions follows a three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions. PMID:23293005
Peng, Huizhen; Liu, Qiaolin; Xiao, Tiaoyi
2016-09-01
In this study, 15 sets of primers were used to amplify contiguous, overlapping segments of the complete mitochondrial DNA (mtDNA) of C. carpio furong(♀) × C. carpio var. singguonensis(♂) in order to characterize and compare their mitochondrial genomes. The total length of the mitochondrial genome was 16,581 bp, and the sequence was deposited in GenBank under accession number KP210473. The mitochondrial genome contained 37 genes (13 protein-coding genes, 2 ribosomal RNAs and 22 transfer RNAs) and a major non-coding control region, an organization similar to that of previously reported mitochondrial genomes. Most genes were encoded on the H-strand, except for ND6 and 8 tRNA genes, which were encoded on the L-strand. The nucleotide skewness of the coding strand of C. carpio furong(♀) × C. carpio var. singguonensis(♂) (AT-skew = 0.12, GC-skew = -0.27) was biased toward A and C. The complete mitogenome may provide important data for the study of the genetic mechanisms of C. carpio furong(♀) × C. carpio var. singguonensis(♂).
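The skew values quoted in this abstract can be reproduced from a sequence with a few lines of code. The sketch below assumes the commonly used definitions AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C); the abstract does not state which formula the authors applied, so this is an assumption, and the function name is hypothetical. Under these definitions a positive value means A (or G) exceeds T (or C).

```python
from collections import Counter


def nucleotide_skews(seq: str) -> tuple[float, float]:
    """Return (AT-skew, GC-skew) for one strand of a DNA sequence.

    Assumed definitions: AT-skew = (A - T) / (A + T),
    GC-skew = (G - C) / (G + C). Ambiguous bases are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew
```

Applied to a toy sequence such as "AAATGCCC", the function reports an excess of A over T and of C over G.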
Martin, Guillaume E.; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; de Carvalho, Julie Ferreira; Aïnouche, Malika; Salmon, Armel; Aïnouche, Abdelkader
2014-01-01
Background and Aims To date chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the ‘inverted repeat-lacking clade’, IRLC). It is thus very important to sequence plastomes from other lineages in order to better understand the unusual evolution observed in this model flowering plant family. To this end, the plastome of a lupine species, Lupinus luteus, was sequenced to represent the Genistoid lineage, a noteworthy but poorly studied legume group. Methods The plastome of L. luteus was reconstructed using Roche-454 and Illumina next-generation sequencing. Its structure, repetitive sequences, gene content and sequence divergence were compared with those of other Fabaceae plastomes. PCR screening and sequencing were performed in other allied legumes in order to determine the origin of a large inversion identified in L. luteus. Key Results The first sequenced Genistoid plastome (L. luteus: 155 894 bp) resulted in the discovery of a 36-kb inversion, embedded within the already known 50-kb inversion in the large single-copy (LSC) region of the Papilionoideae. This inversion occurs at the base or soon after the Genistoid emergence, and most probably resulted from a flip–flop recombination between identical 29-bp inverted repeats within two trnS genes. Comparative analyses of the chloroplast gene content of L. luteus vs. Fabaceae and extra-Fabales plastomes revealed the loss of the plastid rpl22 gene, and its functional relocation to the nucleus was verified using lupine transcriptomic data. An investigation into the evolutionary rate of coding and non-coding sequences among legume plastomes resulted in the identification of remarkably variable regions. Conclusions This study resulted in the discovery of a novel, major 36-kb inversion, specific to the Genistoids. 
Chloroplast mutational hotspots were also identified, which contain novel and potentially informative regions for molecular evolutionary studies at various taxonomic levels in the legumes. Taken together, the results provide new insights into the evolutionary landscape of the legume plastome. PMID:24769537
Network perturbation by recurrent regulatory variants in cancer
Cho, Ara; Lee, Insuk; Choi, Jung Kyoon
2017-01-01
Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928
Thomson, P A; Parla, J S; McRae, A F; Kramer, M; Ramakrishnan, K; Yao, J; Soares, D C; McCarthy, S; Morris, S W; Cardone, L; Cass, S; Ghiban, E; Hennah, W; Evans, K L; Rebolini, D; Millar, J K; Harris, S E; Starr, J M; MacIntyre, D J; McIntosh, A M; Watson, J D; Deary, I J; Visscher, P M; Blackwood, D H; McCombie, W R; Porteous, D J
2014-06-01
A balanced t(1;11) translocation that transects the Disrupted in schizophrenia 1 (DISC1) gene shows genome-wide significant linkage for schizophrenia and recurrent major depressive disorder (rMDD) in a single large Scottish family, but genome-wide and exome sequencing-based association studies have not supported a role for DISC1 in psychiatric illness. To explore DISC1 in more detail, we sequenced 528 kb of the DISC1 locus in 653 cases and 889 controls. We report 2718 validated single-nucleotide polymorphisms (SNPs) of which 2010 have a minor allele frequency of <1%. Only 38% of these variants are reported in the 1000 Genomes Project European subset. This suggests that many DISC1 SNPs remain undiscovered and are essentially private. Rare coding variants identified exclusively in patients were found in likely functional protein domains. Significant region-wide association was observed between rs16856199 and rMDD (P=0.026, unadjusted P=6.3 × 10^-5, OR=3.48). This was not replicated in additional recurrent major depression samples (replication P=0.11). Combined analysis of both the original and replication set supported the original association (P=0.0058, OR=1.46). Evidence for segregation of this variant with disease in families was limited to those of rMDD individuals referred from primary care. Burden analysis for coding and non-coding variants gave nominal associations with diagnosis and measures of mood and cognition. Together, these observations are likely to generalise to other candidate genes for major mental illness and may thus provide guidelines for the design of future studies.
Finch, Caroline F; Orchard, John W; Twomey, Dara M; Saad Saleem, Muhammad; Ekegren, Christina L; Lloyd, David G; Elliott, Bruce C
2014-04-01
To compare Orchard Sports Injury Classification System (OSICS-10) sports medicine diagnoses assigned by a clinical and non-clinical coder. Assessment of intercoder agreement. Community Australian football. 1082 standardised injury surveillance records. Direct comparison of the four-character hierarchical OSICS-10 codes assigned by two independent coders (a sports physician and an epidemiologist). Adjudication by a third coder (biomechanist). The coders agreed on the first character 95% of the time and on the first two characters 86% of the time. They assigned the same four-digit OSICS-10 code for only 46% of the 1082 injuries. The majority of disagreements occurred for the third character; 85% were because one coder assigned a non-specific 'X' code. The sports physician code was deemed correct in 53% of cases and the epidemiologist in 44%. Reasons for disagreement included the physician not using all of the collected information and the epidemiologist lacking specific anatomical knowledge. Sports injury research requires accurate identification and classification of specific injuries and this study found an overall high level of agreement in coding according to OSICS-10. The fact that the majority of the disagreements occurred for the third OSICS character highlights the fact that increasing complexity and diagnostic specificity in injury coding can result in a loss of reliability and demands a high level of anatomical knowledge. Injury report form details need to reflect this level of complexity and data management teams need to include a broad range of expertise.
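The character-by-character agreement figures reported in this abstract correspond to comparing prefixes of the hierarchical codes assigned by the two coders. A minimal sketch of that comparison follows; the function name is hypothetical, the example codes are invented four-character strings rather than real study records, and the study's actual analysis may have differed.

```python
def prefix_agreement(codes_a: list[str], codes_b: list[str], n_chars: int) -> float:
    """Fraction of records whose codes match on the first n_chars characters."""
    if not codes_a or len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same non-empty set of records")
    matches = sum(a[:n_chars] == b[:n_chars] for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Invented example codes for four injury records (illustration only).
coder_1 = ["KJAX", "TMHX", "AJLX", "SDXX"]
coder_2 = ["KJAX", "TMXX", "AJMX", "KDXX"]

for n in (1, 2, 4):
    print(f"agreement on first {n} character(s): {prefix_agreement(coder_1, coder_2, n):.0%}")
```

Agreement can only stay the same or fall as more characters are compared, which matches the pattern in the abstract of high agreement on the first character and lower agreement on the full code.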
Walker, Joseph F; Zanis, Michael J; Emery, Nancy C
2014-04-01
Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.
Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong
2017-01-01
Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. 
Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1% and no less than five parsimony-informative sites were identified and may be useful for phylogenetic analysis of the genus Fragaria. PMID:29038765
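The region lengths reported in this abstract can be checked against the stated genome size, because a quadripartite plastome consists of one LSC, one SSC and two IR copies. The sketch below performs that check and applies the marker criterion described in the abstract (more than 1% variable sites and at least five parsimony-informative sites). The function names and the counts passed to the marker filter are hypothetical; only the region lengths come from the abstract.

```python
def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def is_candidate_marker(variable_sites: int, aligned_length: int,
                        parsimony_informative: int) -> bool:
    """Apply the abstract's criterion: >1% variable sites and >=5 informative sites."""
    percent_variable = 100.0 * variable_sites / aligned_length
    return percent_variable > 1.0 and parsimony_informative >= 5


# Region lengths reported for F. x ananassa 'Benihoppe' (bp).
total = plastome_length(lsc=85_531, ssc=18_146, ir=25_936)
print(total)  # 155549, matching the reported genome length
```

The filter can then be applied to each intergenic alignment in turn, for instance `is_candidate_marker(12, 900, 6)` for a hypothetical 900-bp alignment with 12 variable sites.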
A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.
Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor
2017-08-30
Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.
Thuan, Nguyen Huy; Dhakal, Dipesh; Pokhrel, Anaya Raj; Chu, Luan Luong; Van Pham, Thi Thuy; Shrestha, Anil; Sohng, Jae Kyung
2018-05-01
Streptomyces peucetius ATCC 27952 produces two major anthracyclines, doxorubicin (DXR) and daunorubicin (DNR), which are potent chemotherapeutic agents for the treatment of several cancers. To gain detailed insight into the genetics and biochemistry of the strain, the complete genome was determined and analyzed. The complete sequence contains 7187 protein-coding genes in a total of 8,023,114 bp, with protein-coding regions accounting for 87% of the genome. The genomic sequence included 18 rRNAs, 66 tRNAs, and 3 non-coding RNAs. In silico studies predicted ~ 68 biosynthetic gene clusters (BGCs) encoding diverse classes of secondary metabolites, including non-ribosomal peptide synthetase (NRPS), polyketide synthase (PKS I, II, and III), terpenes, and others. Detailed analysis of the genome sequence revealed versatile biocatalytic enzymes such as cytochrome P450 (CYP), electron transfer system (ETS) genes, methyltransferase (MT), and glycosyltransferase (GT). In addition, numerous functional genes (transporter gene, SOD, etc.) and regulatory genes (afsR-sp, metK-sp, etc.) involved in the regulation of secondary metabolites were found. This minireview summarizes the genome mining (GM) of diverse BGCs and the genome exploration (GE) of versatile biocatalytic enzymes and other enzymes involved in the maintenance and regulation of metabolism in S. peucetius. The detailed analysis of the genome sequence provides critically important knowledge useful for bioengineering the strain or for harnessing its catalytically efficient enzymes for biotechnological applications.
2014-01-01
Background Protein coding genes account for only about 2% of the human genome, whereas the vast majority of transcripts are non-coding RNAs including long non-coding RNAs. A growing volume of literature has proposed that lncRNAs are important players in cancer. HOTAIR was previously shown to be an oncogene and negative prognostic factor in a variety of cancers. However, the factors that contribute to its upregulation and the interaction between HOTAIR and miRNAs are largely unknown. Methods A computational screen of the HOTAIR promoter was conducted to search for transcription-factor-binding sites. HOTAIR promoter activities were examined by luciferase reporter assay. The function of the c-Myc binding site in the HOTAIR promoter region was tested by a promoter assay with nucleotide substitutions in the putative E-box. The association of c-Myc with the HOTAIR promoter in vivo was confirmed by chromatin immunoprecipitation assay and electrophoretic mobility shift assay. A search for miRNAs with complementary base pairing with HOTAIR was performed using an online software program. Gain and loss of function approaches were employed to investigate the expression changes of HOTAIR or miRNA-130a. The expression levels of HOTAIR, c-Myc and miRNA-130a were examined in 65 matched pairs of gallbladder cancer tissues. The effects of HOTAIR and miRNA-130a on gallbladder cancer cell invasion and proliferation were tested using in vitro cell invasion and flow cytometric assays. Results We demonstrate that HOTAIR is a direct target of c-Myc through interaction with a putative c-Myc target response element (RE) in the upstream region of HOTAIR in gallbladder cancer cells. A positive correlation between c-Myc and HOTAIR mRNA levels was observed in gallbladder cancer tissues. We predicted that HOTAIR harbors a miRNA-130a binding site. Our data showed that this binding site is vital for the regulation of miRNA-130a by HOTAIR. 
Moreover, a negative correlation between HOTAIR and miRNA-130a was observed in gallbladder cancer tissues. Finally, we demonstrate that the oncogenic activity of HOTAIR is in part through its negative regulation of miRNA-130a. Conclusion Together, these results suggest that HOTAIR is a c-Myc-activated driver of malignancy, which acts in part through repression of miRNA-130a. PMID:24953832
Many human accelerated regions are developmental enhancers
Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.
2013-01-01
The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637
Exploring the read-write genome: mobile DNA and mammalian adaptation.
Shapiro, James A
2017-02-01
The read-write genome idea predicts that mobile DNA elements will act in evolution to generate adaptive changes in organismal DNA. This prediction was examined in the context of mammalian adaptations involving regulatory non-coding RNAs, viviparous reproduction, early embryonic and stem cell development, the nervous system, and innate immunity. The evidence shows that mobile elements have played specific and sometimes major roles in mammalian adaptive evolution by generating regulatory sites in the DNA and providing interaction motifs in non-coding RNA. Endogenous retroviruses and retrotransposons have been the predominant mobile elements in mammalian adaptive evolution, with the notable exception of bats, where DNA transposons are the major agents of RW genome inscriptions. A few examples of independent but convergent exaptation of mobile DNA elements for similar regulatory rewiring functions are noted.
Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction.
Do, Ron; Stitziel, Nathan O; Won, Hong-Hee; Jørgensen, Anders Berg; Duga, Stefano; Angelica Merlini, Pier; Kiezun, Adam; Farrall, Martin; Goel, Anuj; Zuk, Or; Guella, Illaria; Asselta, Rosanna; Lange, Leslie A; Peloso, Gina M; Auer, Paul L; Girelli, Domenico; Martinelli, Nicola; Farlow, Deborah N; DePristo, Mark A; Roberts, Robert; Stewart, Alexander F R; Saleheen, Danish; Danesh, John; Epstein, Stephen E; Sivapalaratnam, Suthesh; Hovingh, G Kees; Kastelein, John J; Samani, Nilesh J; Schunkert, Heribert; Erdmann, Jeanette; Shah, Svati H; Kraus, William E; Davies, Robert; Nikpay, Majid; Johansen, Christopher T; Wang, Jian; Hegele, Robert A; Hechter, Eliana; Marz, Winfried; Kleber, Marcus E; Huang, Jie; Johnson, Andrew D; Li, Mingyao; Burke, Greg L; Gross, Myron; Liu, Yongmei; Assimes, Themistocles L; Heiss, Gerardo; Lange, Ethan M; Folsom, Aaron R; Taylor, Herman A; Olivieri, Oliviero; Hamsten, Anders; Clarke, Robert; Reilly, Dermot F; Yin, Wu; Rivas, Manuel A; Donnelly, Peter; Rossouw, Jacques E; Psaty, Bruce M; Herrington, David M; Wilson, James G; Rich, Stephen S; Bamshad, Michael J; Tracy, Russell P; Cupples, L Adrienne; Rader, Daniel J; Reilly, Muredach P; Spertus, John A; Cresci, Sharon; Hartiala, Jaana; Tang, W H Wilson; Hazen, Stanley L; Allayee, Hooman; Reiner, Alex P; Carlson, Christopher S; Kooperberg, Charles; Jackson, Rebecca D; Boerwinkle, Eric; Lander, Eric S; Schwartz, Stephen M; Siscovick, David S; McPherson, Ruth; Tybjaerg-Hansen, Anne; Abecasis, Goncalo R; Watkins, Hugh; Nickerson, Deborah A; Ardissino, Diego; Sunyaev, Shamil R; O'Donnell, Christopher J; Altshuler, David; Gabriel, Stacey; Kathiresan, Sekar
2015-02-05
Myocardial infarction (MI), a leading cause of death around the world, displays a complex pattern of inheritance. When MI occurs early in life, genetic inheritance is a major component to risk. Previously, rare mutations in low-density lipoprotein (LDL) genes have been shown to contribute to MI risk in individual families, whereas common variants at more than 45 loci have been associated with MI risk in the population. Here we evaluate how rare mutations contribute to early-onset MI risk in the population. We sequenced the protein-coding regions of 9,793 genomes from patients with MI at an early age (≤50 years in males and ≤60 years in females) along with MI-free controls. We identified two genes in which rare coding-sequence mutations were more frequent in MI cases versus controls at exome-wide significance. At low-density lipoprotein receptor (LDLR), carriers of rare non-synonymous mutations were at 4.2-fold increased risk for MI; carriers of null alleles at LDLR were at even higher risk (13-fold difference). Approximately 2% of early MI cases harbour a rare, damaging mutation in LDLR; this estimate is similar to one made more than 40 years ago using an analysis of total cholesterol. Among controls, about 1 in 217 carried an LDLR coding-sequence mutation and had plasma LDL cholesterol > 190 mg dl^-1. At apolipoprotein A-V (APOA5), carriers of rare non-synonymous mutations were at 2.2-fold increased risk for MI. When compared with non-carriers, LDLR mutation carriers had higher plasma LDL cholesterol, whereas APOA5 mutation carriers had higher plasma triglycerides. Recent evidence has connected MI risk with coding-sequence mutations at two genes functionally related to APOA5, namely lipoprotein lipase and apolipoprotein C-III (refs 18, 19). Combined, these observations suggest that, as well as LDL cholesterol, disordered metabolism of triglyceride-rich lipoproteins contributes to MI risk.
Grossen, Christine; Keller, Lukas; Biebach, Iris; Croll, Daniel
2014-01-01
The major histocompatibility complex (MHC) is a crucial component of the vertebrate immune system and shows extremely high levels of genetic polymorphism. The extraordinary genetic variation is thought to be ancient polymorphisms maintained by balancing selection. However, introgression from related species was recently proposed as an additional mechanism. Here we provide evidence for introgression at the MHC in Alpine ibex (Capra ibex ibex). At a usually very polymorphic MHC exon involved in pathogen recognition (DRB exon 2), Alpine ibex carried only two alleles. We found that one of these DRB alleles is identical to a DRB allele of domestic goats (Capra aegagrus hircus). We sequenced 2489 bp of the coding and non-coding regions of the DRB gene and found that Alpine ibex homozygous for the goat-type DRB exon 2 allele showed nearly identical sequences (99.8%) to a breed of domestic goats. Using Sanger and RAD sequencing, microsatellite and SNP chip data, we show that the chromosomal region containing the goat-type DRB allele has a signature of recent introgression in Alpine ibex. A region of approximately 750 kb including the DRB locus showed high rates of heterozygosity in individuals carrying one copy of the goat-type DRB allele. These individuals shared SNP alleles both with domestic goats and other Alpine ibex. In a survey of four Alpine ibex populations, we found that the region surrounding the DRB allele shows strong linkage disequilibria, strong sequence clustering and low diversity among haplotypes carrying the goat-type allele. Introgression at the MHC is likely adaptive and introgression critically increased MHC DRB diversity in the genetically impoverished Alpine ibex. Our finding contradicts the long-standing view that genetic variability at the MHC is solely a consequence of ancient trans-species polymorphism. Introgression is likely an underappreciated source of genetic diversity at the MHC and other loci under balancing selection. PMID:24945814
Evaluation of non-coding variation in GLUT1 deficiency.
Liu, Yu-Chi; Lee, Jia Wei Audrey; Bellows, Susannah T; Damiano, John A; Mullen, Saul A; Berkovic, Samuel F; Bahlo, Melanie; Scheffer, Ingrid E; Hildebrand, Michael S
2016-12-01
Loss-of-function mutations in SLC2A1, encoding glucose transporter-1 (GLUT-1), lead to dysfunction of glucose transport across the blood-brain barrier. Ten percent of cases with hypoglycorrhachia (fasting cerebrospinal fluid [CSF] glucose <2.2 mmol/L) do not have mutations. We hypothesized that GLUT1 deficiency could be due to non-coding SLC2A1 variants. We performed whole exome sequencing of one proband with a GLUT1 phenotype and hypoglycorrhachia negative for SLC2A1 sequencing and copy number variants. We studied a further 55 patients with different epilepsies and low CSF glucose who did not have exonic mutations or copy number variants. We sequenced non-coding promoter and intronic regions. We performed mRNA studies for the recurrent intronic variant. The proband had a de novo splice site mutation five base pairs from the intron-exon boundary. Three of 55 patients had deep intronic SLC2A1 variants, including a recurrent variant in two. The recurrent variant produced less SLC2A1 mRNA transcript. Fasting CSF glucose levels show an age-dependent correlation, which makes the definition of hypoglycorrhachia challenging. Low CSF glucose levels may be associated with pathogenic SLC2A1 mutations including deep intronic SLC2A1 variants. Extending genetic screening to non-coding regions will enable diagnosis of more patients with GLUT1 deficiency, allowing implementation of the ketogenic diet to improve outcomes. © 2016 Mac Keith Press.
Discovery of stimulation-responsive immune enhancers with CRISPR activation
Simeonov, Dimitre R.; Gowen, Benjamin G.; Boontanrart, Mandy; Roth, Theodore L.; Gagnon, John D.; Mumbach, Maxwell R.; Satpathy, Ansuman T.; Lee, Youjin; Bray, Nicolas L.; Chan, Alice Y.; Lituiev, Dmytro S.; Nguyen, Michelle L.; Gate, Rachel E.; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M.; Mitros, Therese; Ray, Graham J.; Curie, Gemma L.; Naddaf, Nicki; Chu, Julia S.; Ma, Hong; Boyer, Eric; Van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R.; Schumann, Kathrin; Daly, Mark J.; Farh, Kyle K; Ansel, K. Mark; Ye, Chun J.; Greenleaf, William J.; Anderson, Mark S.; Bluestone, Jeffrey A.; Chang, Howard Y.; Corn, Jacob E.; Marson, Alexander
2017-01-01
The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs. PMID:28854172
Discovery of stimulation-responsive immune enhancers with CRISPR activation.
Simeonov, Dimitre R; Gowen, Benjamin G; Boontanrart, Mandy; Roth, Theodore L; Gagnon, John D; Mumbach, Maxwell R; Satpathy, Ansuman T; Lee, Youjin; Bray, Nicolas L; Chan, Alice Y; Lituiev, Dmytro S; Nguyen, Michelle L; Gate, Rachel E; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M; Mitros, Therese; Ray, Graham J; Curie, Gemma L; Naddaf, Nicki; Chu, Julia S; Ma, Hong; Boyer, Eric; Van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R; Schumann, Kathrin; Daly, Mark J; Farh, Kyle K; Ansel, K Mark; Ye, Chun J; Greenleaf, William J; Anderson, Mark S; Bluestone, Jeffrey A; Chang, Howard Y; Corn, Jacob E; Marson, Alexander
2017-09-07
The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs.
Zhang, Xun; Gejman, Roger; Mahta, Ali; Zhong, Ying; Rice, Kimberley A.; Zhou, Yunli; Cheunsuchon, Pornsuk; Louis, David N.; Klibanski, Anne
2010-01-01
Meningiomas are common tumors, representing 15-25% of all central nervous system tumors. NF2 gene inactivation on chromosome 22 has been shown as an early event in tumorigenesis; however, few factors underlying tumor growth and progression have been identified. Chromosomal abnormalities of 14q32 are often associated with meningioma pathogenesis and progression; therefore it has been proposed that an as yet unidentified tumor suppressor is present at this locus. MEG3 is an imprinted gene located at 14q32 that encodes a non-coding RNA with an anti-proliferative function. We found that MEG3 mRNA is highly expressed in normal arachnoidal cells. However, MEG3 is not expressed in the majority of human meningiomas or the human meningioma cell lines IOMM-Lee and CH157-MN. There is a strong association between loss of MEG3 expression and tumor grade. Allelic loss at the MEG3 locus is also observed in meningiomas, with increasing prevalence in higher grade tumors. In addition, there is an increase in CpG methylation within the promoter and the imprinting control region of MEG3 gene in meningiomas. Functionally, MEG3 suppresses DNA synthesis in both IOMM-Lee and CH157-MN cells by approximately 60% in BrdU incorporation assays. Colony-forming efficiency assays show that MEG3 inhibits colony formation in CH157-MN cells by approximately 80%. Furthermore, MEG3 stimulates p53-mediated transactivation in these cell lines. Therefore, these data are consistent with the hypothesis that MEG3, which encodes a non-coding RNA, may be a tumor suppressor gene at chromosome 14q32 involved in meningioma progression via a novel mechanism. PMID:20179190
Discovery of stimulation-responsive immune enhancers with CRISPR activation
NASA Astrophysics Data System (ADS)
Simeonov, Dimitre R.; Gowen, Benjamin G.; Boontanrart, Mandy; Roth, Theodore L.; Gagnon, John D.; Mumbach, Maxwell R.; Satpathy, Ansuman T.; Lee, Youjin; Bray, Nicolas L.; Chan, Alice Y.; Lituiev, Dmytro S.; Nguyen, Michelle L.; Gate, Rachel E.; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M.; Mitros, Therese; Ray, Graham J.; Curie, Gemma L.; Naddaf, Nicki; Chu, Julia S.; Ma, Hong; Boyer, Eric; van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R.; Schumann, Kathrin; Daly, Mark J.; Farh, Kyle K.; Ansel, K. Mark; Ye, Chun J.; Greenleaf, William J.; Anderson, Mark S.; Bluestone, Jeffrey A.; Chang, Howard Y.; Corn, Jacob E.; Marson, Alexander
2017-09-01
The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs.
CVD-associated non-coding RNA, ANRIL, modulates expression of atherogenic pathways in VSMC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Congrains, Ada; Kamide, Kei; Katsuya, Tomohiro
Highlights: ► ANRIL maps in the strongest susceptibility locus for cardiovascular disease. ► Silencing of ANRIL leads to altered expression of tissue remodeling-related genes. ► The effects of ANRIL on gene expression are splicing variant specific. ► ANRIL affects progression of cardiovascular disease by regulating proliferation and apoptosis pathways. -- Abstract: ANRIL is a newly discovered non-coding RNA lying on the strongest genetic susceptibility locus for cardiovascular disease (CVD) in the chromosome 9p21 region. Genome-wide association studies have been linking polymorphisms in this locus with CVD and several other major diseases such as diabetes and cancer. The role of this non-coding RNA in atherosclerosis progression is still poorly understood. In this study, we investigated the implication of ANRIL in the modulation of gene sets directly involved in atherosclerosis. We designed and tested siRNA sequences to selectively target two exons (exon 1 and exon 19) of the transcript and successfully knocked down expression of ANRIL in human aortic vascular smooth muscle cells (HuAoVSMC). We used a pathway-focused RT-PCR array to profile gene expression changes caused by ANRIL knock down. Notably, the genes affected by each of the siRNAs were different, suggesting that different splicing variants of ANRIL might have distinct roles in cell physiology. Our results suggest that ANRIL splicing variants play a role in coordinating tissue remodeling, by modulating the expression of genes involved in cell proliferation, apoptosis, extra-cellular matrix remodeling and inflammatory response to finally impact in the risk of cardiovascular disease and other pathologies.
Complete mitochondrial genome of a Asian lion (Panthera leo goojratensis).
Li, Yu-Fei; Wang, Qiang; Zhao, Jian-ning
2016-01-01
The entire mitochondrial genome of this Asian lion (Panthera leo goojratensis) was 17,183 bp in length. Its gene composition and arrangement conformed to those of other lions, with the typical structure of 22 tRNAs, 2 rRNAs, 13 protein-coding genes and a non-coding region. The characteristics of the mitochondrial genome were analyzed in detail.
USDA-ARS's Scientific Manuscript database
Single-nucleotide Polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean comparing sequences from coding and non-coding regions obtained from Genbank and genomic DNA and to compare sequencing resu...
Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.
Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi
2016-03-01
Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine- and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.
Arthur-Farraj, Peter J; Morgan, Claire C; Adamowicz, Martyna; Gomez-Sanchez, Jose A; Fazal, Shaline V; Beucher, Anthony; Razzaghi, Bonnie; Mirsky, Rhona; Jessen, Kristjan R; Aitman, Timothy J
2017-09-12
Repair Schwann cells play a critical role in orchestrating nerve repair after injury, but the cellular and molecular processes that generate them are poorly understood. Here, we perform a combined whole-genome, coding and non-coding RNA and CpG methylation study following nerve injury. We show that genes involved in the epithelial-mesenchymal transition are enriched in repair cells, and we identify several long non-coding RNAs in Schwann cells. We demonstrate that the AP-1 transcription factor C-JUN regulates the expression of certain microRNAs in repair Schwann cells, in particular miR-21 and miR-34. Surprisingly, unlike during development, changes in CpG methylation are limited in injury, restricted to specific locations, such as enhancer regions of Schwann cell-specific genes (e.g., Nedd4l), and close to local enrichment of AP-1 motifs. These genetic and epigenomic changes broaden our mechanistic understanding of the formation of repair Schwann cells during peripheral nervous system tissue repair. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Delineating slowly and rapidly evolving fractions of the Drosophila genome.
Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S
2008-05-01
Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/-keithj/. Genomic segments comprising the conservation classes available in BED format.
Chen, Zihao; Ju, Hongping; Yu, Shan; Zhao, Ting; Jing, Xiaojie; Li, Ping; Jia, Jing; Li, Nan; Tan, Bibo; Li, Yong
2018-05-23
Gastric cancer (GC) is one of the major global health problems, especially in Asia. Long non-coding RNAs (lncRNAs) have recently gained significant attention in research areas such as carcinogenesis. This study aimed to explore the mechanism by which Prader-Willi region non-protein coding RNA 1 (PWRN1) regulates GC progression. Differentially expressed lncRNAs in GC tissues were screened out through microarray analysis. RNA and protein expression levels were detected by quantitative real-time PCR (qRT-PCR) and Western blot. Cell proliferation, apoptosis rate, and metastatic ability were determined by cell counting kit 8 (CCK8), flow cytometry, and wound healing and transwell assays, respectively. The luciferase reporter system was used to verify the targeting relationships between PWRN1, miR-425-5p, and phosphatase and tensin homolog (PTEN). An RNA-binding protein immunoprecipitation (RIP) assay was performed to test whether PWRN1 acted as a competitive endogenous RNA (ceRNA) of miR-425-5p. A tumor xenograft model and immunohistochemistry (IHC) were used to study the influence of PWRN1 on tumor growth in vivo. Microarray analysis determined that PWRN1 was differentially expressed between GC tissues and adjacent tissues. qRT-PCR revealed low PWRN1 expression in GC tissues and cells. Up-regulated PWRN1 could reduce proliferation and metastasis and increase apoptosis in GC cells, while miR-425-5p had the opposite effects. The RIP assay indicated that PWRN1 may target an oncogene, miR-425-5p. The tumor xenograft assay found that up-regulated PWRN1 suppressed tumor growth. The bioinformatics analysis, luciferase assay, and Western blot indicated that PWRN1 affected the PTEN/Akt/MDM2/p53 axis by suppressing miR-425-5p. Our findings suggested that PWRN1 functioned as a ceRNA targeting miR-425-5p and suppressed GC development via the p53 signaling pathway. © 2018 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
Zhang, Yue; Feng, Shiqian; Zeng, Yiying; Ning, Hong; Liu, Lijun; Zhao, Zihua; Jiang, Fan; Li, Zhihong
2018-06-23
Bactrocera tsuneonis (Miyake), generally known as the Japanese orange fly, is considered to be a major pest of commercial citrus crops. It has a limited distribution in China, Japan and Vietnam, but it has the potential to invade areas outside of Asia. More genetic information of B. tsuneonis should be obtained in order to develop effective methodologies for rapid and accurate molecular identification due to the difficulty of distinguishing it from Bactrocera minax based on morphological features. We report here the whole mitochondrial genome of B. tsuneonis sequenced by next-generation sequencing. This mitogenome sequence had a total length of 15,865 bp, a typical circular molecule comprising 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The structure and organization of the molecule were typical and similar compared with the published homologous sequences of other fruit flies in Tephritidae. The phylogenetic analyses based on the mitochondrial genome data presented a close genetic relationship between B. tsuneonis and B. minax. This is the first report of the complete mitochondrial genome of B. tsuneonis, and it can be used in further studies of species diagnosis, evolutionary biology, prevention and control. Copyright © 2018. Published by Elsevier B.V.
2004-12-09
We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.
Bohlin, Jon; Eldholm, Vegard; Pettersson, John H O; Brynildsrud, Ola; Snipen, Lars
2017-02-10
The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
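The relative entropy measure described in the abstract above can be illustrated with a minimal sketch. It assumes a Kullback-Leibler form, in which observed trinucleotide frequencies are compared with the frequencies expected if the three positions were independent draws from the mononucleotide distribution. The function name `relative_entropy`, the use of overlapping trinucleotides (rather than in-frame codons) and the base-2 logarithm are illustrative assumptions, not details taken from the study.

```python
from collections import Counter
from math import log2


def relative_entropy(seq: str) -> float:
    """Kullback-Leibler divergence (in bits) between observed trinucleotide
    frequencies and those expected from mononucleotide frequencies.

    Assumption: overlapping trinucleotides are counted; windows containing
    characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    bases = set("ACGT")

    # Mononucleotide frequencies over the unambiguous bases.
    mono = Counter(b for b in seq if b in bases)
    total = sum(mono.values())
    if total == 0:
        return 0.0
    p = {b: mono[b] / total for b in "ACGT"}

    # Observed counts of overlapping trinucleotides.
    tri = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= bases
    )
    n = sum(tri.values())
    if n == 0:
        return 0.0

    divergence = 0.0
    for word, count in tri.items():
        observed = count / n
        expected = p[word[0]] * p[word[1]] * p[word[2]]
        divergence += observed * log2(observed / expected)
    return divergence
```

Under this definition a sequence whose trinucleotide usage is fully explained by its base composition scores near zero, while strongly structured sequence scores higher.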
Biosynthesis and expression of ependymin homologous sequences in zebrafish brain.
Sterrer, S; Königstorfer, A; Hoffmann, W
1990-01-01
Ependymins are unique, brain specific glycoproteins, which are major constituents of the cerebrospinal fluid. Originally, they were discovered in goldfish and are thought to be involved in synaptic plasticity. In the present study two transcripts were characterized in Brachydanio rerio originating from a single gene possibly by alternative splicing. These transcripts differ only in the length of their 3'-non-coding-regions and the encoded protein shares 90 and 88% homology with the two corresponding goldfish proteins, respectively. In situ hybridization revealed the expression of ependymins exclusively in the leptomeninx including its invaginations but not at all in the ependymal layer surrounding the ventricles. An initial developmental profile showed that ependymins first appear before hatching, i.e. between 48 and 72 h postfertilization.
Comparative genomics reveals insights into avian genome evolution and adaptation
Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun
2015-01-01
Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712
Maduz, Roman; Kugelmeier, Patrick; Meili, Severin; Döring, Robert; Meier, Christoph; Wahl, Peter
2017-04-01
The Abbreviated Injury Scale (AIS) and the Injury Severity Score (ISS) find increasingly widespread use to assess trauma burden and to perform interhospital benchmarking through trauma registries. Since 2015, public resource allocation in Switzerland shall even be derived from such data. As every trauma centre is responsible for its own coding and data input, this study aims at evaluating interobserver reliability of AIS and ISS coding. Interobserver reliability of the AIS and ISS is analysed from a cohort of 50 consecutive severely injured patients treated in 2012 at our institution, coded retrospectively by 3 independent and specifically trained observers. Considering a cutoff ISS≥16, only 38/50 patients (76%) were uniformly identified as polytraumatised or not. Increasing the cut off to ≥20, this increased to 41/50 patients (82%). A difference in the AIS of ≥ 1 was present in 261 (16%) of possible codes. Excluding the vast majority of uninjured body regions, uniformly identical AIS severity values were attributed in 67/193 (35%) body regions, or 318/579 (55%) possible observer pairings. Injury severity all too often is neither identified correctly nor consistently when using the AIS. This leads to wrong identification of severely injured patients using the ISS. Improving consistency of coding through centralisation is recommended before scores based on the AIS are to be used for interhospital benchmarking and resource allocation in the treatment of severely injured patients. Copyright © 2017. Published by Elsevier Ltd.
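To see why a one-point AIS disagreement can change whether a patient is classified as polytraumatised, it helps to recall how the ISS is conventionally derived: the highest AIS severity in each ISS body region is taken, the three largest values are squared and summed, and any AIS of 6 sets the ISS to 75. The sketch below follows that convention; the function name, the region labels and the two example codings are hypothetical and are not data from the study.

```python
def injury_severity_score(region_ais: dict[str, int]) -> int:
    """Compute the ISS from the highest AIS severity per body region.

    Conventional rule: sum of squares of the three highest regional AIS
    values; an AIS of 6 in any region gives the maximum ISS of 75.
    """
    scores = list(region_ais.values())
    if any(s < 0 or s > 6 for s in scores):
        raise ValueError("AIS severity must be between 0 and 6")
    if 6 in scores:
        return 75
    top_three = sorted(scores, reverse=True)[:3]
    return sum(s * s for s in top_three)


# Hypothetical patient coded by two observers who differ by one AIS point
# in a single body region.
coder_a = {"head_neck": 3, "chest": 2, "extremities": 1}
coder_b = {"head_neck": 3, "chest": 3, "extremities": 1}

iss_a = injury_severity_score(coder_a)  # 9 + 4 + 1
iss_b = injury_severity_score(coder_b)  # 9 + 9 + 1
```

Because the regional scores are squared, the single-point difference moves this hypothetical patient from below to above the ISS≥16 cutoff used in the abstract.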
Wu, Yueh-Lung; Wu, Carol-P; Huang, Yu-Hui; Huang, Sheng-Ping; Lo, Huei-Ru; Chang, Hao-Shuo; Lin, Pi-Hsiu; Wu, Ming-Cheng; Chang, Chia-Jung; Chao, Yu-Chan
2014-11-01
The p143 gene from Autographa californica multinucleocapsid nucleopolyhedrovirus (AcMNPV) has been found to increase the expression of luciferase, which is driven by the polyhedrin gene promoter, in a plasmid with virus coinfection. Further study indicated that this is due to the presence of a replication origin (ori) in the coding region of this gene. Transient DNA replication assays showed that a specific fragment of the p143 coding sequence, p143-3, underwent virus-dependent DNA replication in Spodoptera frugiperda IPLB-Sf-21 (Sf-21) cells. Deletion analysis of the p143-3 fragment showed that subfragment p143-3.2a contained the essential sequence of this putative ori. Sequence analysis of this region revealed a unique distribution of imperfect palindromes with high AT contents. No sequence homology or similarity between p143-3.2a and any other known ori was detected, suggesting that it is a novel baculovirus ori. Further study showed that the p143-3.2a ori can replicate more efficiently in infected Sf-21 cells than baculovirus homologous regions (hrs), the major baculovirus ori, or non-hr oris during virus replication. Previously, hr on its own was unable to replicate in mammalian cells, and for mammalian viral oris, viral proteins are generally required for their proper replication in host cells. However, the p143-3.2a ori was, surprisingly, found to function as an efficient ori in mammalian cells without the need for any viral proteins. We conclude that p143 contains a unique sequence that can function as an ori to enhance gene expression in not only insect cells but also mammalian cells. Baculovirus DNA replication relies on both hr and non-hr oris; however, so far very little is known about the latter oris. Here we have identified a new non-hr ori, the p143 ori, which resides in the coding region of p143. By developing a novel DNA replication-enhanced reporter system, we have identified and located the core region required for the p143 ori. This ori contains a large number of imperfect inverted repeats and is the most active ori in the viral genome during virus infection in insect cells. We also found that it is a unique ori that can replicate in mammalian cells without the assistance of baculovirus gene products. The identification of this ori should contribute to a better understanding of baculovirus DNA replication. Also, this ori is very useful in assisting with gene expression in mammalian cells. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Herrnstadt, Corinna; Elson, Joanna L; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M; Anderson, Christen; Ghosh, Soumitra S; Olefsky, Jerrold M; Beal, M Flint; Davis, Robert E; Howell, Neil
2002-05-01
The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here.
Technological Developments in lncRNA Biology.
Jathar, Sonali; Kumar, Vikram; Srivastava, Juhi; Tripathi, Vidisha
2017-01-01
It is estimated that more than 90% of the mammalian genome is transcribed as non-coding RNAs. Recent evidence has established that these non-coding transcripts are not junk or mere transcriptional noise, but serve important biological purposes. One rapidly expanding area within this class of transcripts is the regulatory lncRNAs, whose molecular functions and mechanisms of action have been a major challenge to determine. The emergence of high-throughput technologies and developments in various conventional approaches have led to the expansion of the lncRNA world. The combination of multidisciplinary approaches has proven essential for unravelling the complexity of their regulatory networks and has helped establish the importance of their existence. Here, we review the current methodologies available for discovering and investigating functions of long non-coding RNAs (lncRNAs) and focus on the powerful technological advances available to specifically address their functional importance.
Long non-coding RNAs in B-cell malignancies: a comprehensive overview
Taiana, Elisa; Neri, Antonino
2017-01-01
B-cell malignancies constitute a large part of hematological neoplasias. They represent a heterogeneous group of diseases, including Hodgkin's lymphoma, most non-Hodgkin's lymphomas (NHL), some leukemias and myelomas. B-cell malignancies reflect defined stages of normal B-cell differentiation and this represents the major basis for their classification. Long non-coding RNAs (lncRNAs) are non-protein-coding transcripts longer than 200 nucleotides, for which many recent studies have demonstrated a function in regulating gene expression, cell biology and carcinogenesis. Deregulated expression levels of lncRNAs have been observed in various types of cancers including hematological malignancies. The involvement of lncRNAs in cancer initiation and progression and their attractive features both as biomarkers and for therapeutic research are becoming increasingly evident. In this review, we summarize the recent literature to highlight the current knowledge of the role of lncRNAs in normal B-cell development and in the pathogenesis of B-cell tumors. PMID:28947998
Columbo, Jesse A; Stone, David H; Goodney, Philip P; Nolan, Brian W; Stableford, Jennifer A; Brooke, Benjamin S; Powell, Richard J; Finn, Christine T
2016-05-01
Current evidence suggests an association between coronary artery disease and major depressive disorder (MDD). Data to support a similar association between peripheral arterial disease (PAD) and MDD are more limited. This study examines the prevalence and regional variation of both PAD and MDD in a large contemporary patient sample. All Medicare claims, part A and B, from January 2009 until December 2011 were queried using diagnosis codes specific for a previously validated clinical algorithm for PAD and major depression. Codes for PAD included those specific to cerebrovascular disease, abdominal aortic aneurysm, and peripheral vascular disease. Peripheral arterial disease prevalence, major depression prevalence, and coprevalence rates were determined, respectively. Regional variation of both conditions was determined using zip code data to identify potential endemic areas of disease intensity for both diagnoses. Over the study interval, the percentage of Medicare beneficiaries with a diagnosis of PAD remained relatively constant (3.0%-3.7%, n = 0.85-1.06 million in part A and 17.4%-17.5%, n = 4.82-4.93 million in part B), and MDD showed a similar trend (1.6%-2.7%, n = 0.46-0.79 million in part A and 6.1%-6.7%, n = 1.69-1.90 million in part B). The observed rate of MDD in those with an established diagnosis of PAD was 5-fold higher than those without PAD in part A claims (1.8-fold in part B claims). Moreover, there was a significant linear geographic correlation among patients with PAD and MDD (r = .54, P ≤ .01). This study documents a correlation between PAD and MDD and may, therefore, identify an at-risk population susceptible to inferior clinical outcomes. Significant regional variation exists in the prevalence of PAD and MDD, though there appear to be specific endemic regions notable for both disorders. Accordingly, health-care resource allocation toward endemic regions may help improve population health among this at-risk cohort. © The Author(s) 2016.
Finch, Caroline F; Orchard, John W; Twomey, Dara M; Saad Saleem, Muhammad; Ekegren, Christina L; Lloyd, David G; Elliott, Bruce C
2014-01-01
Objective To compare Orchard Sports Injury Classification System (OSICS-10) sports medicine diagnoses assigned by a clinical and non-clinical coder. Design Assessment of intercoder agreement. Setting Community Australian football. Participants 1082 standardised injury surveillance records. Main outcome measurements Direct comparison of the four-character hierarchical OSICS-10 codes assigned by two independent coders (a sports physician and an epidemiologist). Adjudication by a third coder (biomechanist). Results The coders agreed on the first character 95% of the time and on the first two characters 86% of the time. They assigned the same four-digit OSICS-10 code for only 46% of the 1082 injuries. The majority of disagreements occurred for the third character; 85% were because one coder assigned a non-specific ‘X’ code. The sports physician code was deemed correct in 53% of cases and the epidemiologist in 44%. Reasons for disagreement included the physician not using all of the collected information and the epidemiologist lacking specific anatomical knowledge. Conclusions Sports injury research requires accurate identification and classification of specific injuries and this study found an overall high level of agreement in coding according to OSICS-10. The fact that the majority of the disagreements occurred for the third OSICS character highlights the fact that increasing complexity and diagnostic specificity in injury coding can result in a loss of reliability and demands a high level of anatomical knowledge. Injury report form details need to reflect this level of complexity and data management teams need to include a broad range of expertise. PMID:22919021
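The agreement figures above (first character, first two characters, full four-character code) are prefix-match rates over paired codes. A minimal sketch of that calculation follows, assuming each coder's output is a list of code strings in the same record order; the function name and the example codes are hypothetical and are not taken from the study or from the OSICS-10 code list.

```python
def prefix_agreement(codes_a, codes_b, n_chars):
    """Fraction of records where both coders agree on the first n_chars characters."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same records")
    if not codes_a:
        raise ValueError("no records to compare")
    matches = sum(a[:n_chars] == b[:n_chars] for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Made-up four-character codes, for illustration only (not study data).
physician = ["KJXX", "AGMX", "TMHB", "SDXX"]
epidemiologist = ["KJXX", "AGXX", "TMHX", "SFXX"]

for n in (1, 2, 4):
    print(n, prefix_agreement(physician, epidemiologist, n))
```

Comparing prefixes rather than whole codes shows at which level of the hierarchy agreement breaks down, which is the pattern the abstract reports for the third character.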
Hu, Bo; Liu, Dong-Xing; Zhang, Yu-Qing; Song, Jian-Tao; Ji, Xian-Fei; Hou, Zhi-Qiang; Zhang, Zhen-Hai
2016-05-01
In this study we sequenced, for the first time, the complete mitochondrial genome of the cardiomyopathic Syrian hamster (Mesocricetus auratus), a model of heart failure. The total length of the mitogenome was 16,267 bp. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region.
NASA Astrophysics Data System (ADS)
Durmaz, Murat; Karslioglu, Mahmut Onur
2015-04-01
There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv
2010-01-01
RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462
Schmouth, Jean-François; Castellarin, Mauro; Laprise, Stéphanie; Banks, Kathleen G; Bonaguro, Russell J; McInerny, Simone C; Borretta, Lisa; Amirabbasi, Mahsa; Korecki, Andrea J; Portales-Casamar, Elodie; Wilson, Gary; Dreolini, Lisa; Jones, Steven J M; Wasserman, Wyeth W; Goldowitz, Daniel; Holt, Robert A; Simpson, Elizabeth M
2013-10-14
The next big challenge in human genetics is understanding the 98% of the genome that comprises non-coding DNA. Hidden in this DNA are sequences critical for gene regulation, and new experimental strategies are needed to understand the functional role of gene-regulation sequences in health and disease. In this study, we build upon our HuGX ('high-throughput human genes on the X chromosome') strategy to expand our understanding of human gene regulation in vivo. In all, ten human genes known to express in therapeutically important brain regions were chosen for study. For eight of these genes, human bacterial artificial chromosome clones were identified, retrofitted with a reporter, knocked single-copy into the Hprt locus in mouse embryonic stem cells, and mouse strains derived. Five of these human genes expressed in mouse, and all expressed in the adult brain region for which they were chosen. This defined the boundaries of the genomic DNA sufficient for brain expression, and refined our knowledge regarding the complexity of gene regulation. We also characterized for the first time the expression of human MAOA and NR2F2, two genes for which the mouse homologs have been extensively studied in the central nervous system (CNS), and AMOTL1 and NOV, for which roles in CNS have been unclear. We have demonstrated the use of the HuGX strategy to functionally delineate non-coding-regulatory regions of therapeutically important human brain genes. Our results also show that a careful investigation, using publicly available resources and bioinformatics, can lead to accurate predictions of gene expression.
The origins and evolutionary history of human non-coding RNA regulatory networks.
Sherafatian, Masih; Mowla, Seyed Javad
2017-04-01
The evolutionary history and origin of the regulatory function of animal non-coding RNAs are not well understood. Lack of conservation of long non-coding RNAs and the small size of microRNAs have been major obstacles in their phylogenetic analysis. In this study, we tried to shed more light on the evolution of ncRNA regulatory networks by changing our phylogenetic strategy to focus on the evolutionary pattern of their protein-coding targets. We used available target databases of miRNAs and lncRNAs to find their protein-coding targets in human. We were able to recognize evolutionary hallmarks of ncRNA targets by phylostratigraphic analysis. We found the conventional 3'-UTR and lesser-known 5'-UTR targets of miRNAs to be enriched at three consecutive phylostrata. Firstly, in the eukaryota phylostratum, corresponding to the emergence of miRNAs, our study revealed that miRNA targets function primarily in cell cycle processes. Moreover, the same overrepresentation of the targets observed in the next two consecutive phylostrata, opisthokonta and eumetazoa, corresponded to the expansion periods of miRNAs in animal evolution. Coding sequence targets of miRNAs showed a delayed rise at the opisthokonta phylostratum, compared to the 3' and 5' UTR targets of miRNAs. The lncRNA regulatory network was the latest to evolve, at eumetazoa.
INTRODUCING CAFein, A NEW COMPUTATIONAL TOOL FOR STELLAR PULSATIONS AND DYNAMIC TIDES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Valsecchi, F.; Farr, W. M.; Willems, B.
2013-08-10
Here we present CAFein, a new computational tool for investigating radiative dissipation of dynamic tides in close binaries and of non-adiabatic, non-radial stellar oscillations in isolated stars in the linear regime. For the latter, CAFein computes the non-adiabatic eigenfrequencies and eigenfunctions of detailed stellar models. The code is based on the so-called Riccati method, a numerical algorithm that has been successfully applied to a variety of stellar pulsators, and which does not suffer from the major drawbacks of commonly used shooting and relaxation schemes. Here we present an extension of the Riccati method to investigate dynamic tides in close binaries. We demonstrate CAFein's capabilities as a stellar pulsation code both in the adiabatic and non-adiabatic regimes, by reproducing previously published eigenfrequencies of a polytrope, and by successfully identifying the unstable modes of a stellar model in the β Cephei/SPB region of the Hertzsprung-Russell diagram. Finally, we verify CAFein's behavior in the dynamic tides regime by investigating the effects of dynamic tides on the eigenfunctions and orbital and spin evolution of massive main sequence stars in eccentric binaries, and of hot Jupiter host stars. The plethora of asteroseismic data provided by NASA's Kepler satellite, some of which include the direct detection of tidally excited stellar oscillations, makes CAFein quite timely. Furthermore, the increasing number of observed short-period detached double white dwarfs (WDs) and the observed orbital decay in the tightest of such binaries open up a new possibility of investigating WD interiors through the effects of tides on their orbital evolution.
Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael
2013-01-01
Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343
The expanding regulatory universe of p53 in gastrointestinal cancer.
Fesler, Andrew; Zhang, Ning; Ju, Jingfang
2016-01-01
The tumor suppressor gene TP53 is one of the most frequently deleted or mutated genes in gastrointestinal cancers. As a transcription factor, p53 regulates a number of important protein-coding genes to control cell cycle, cell death, DNA damage/repair, stemness, differentiation and other key cellular functions. In addition, p53 is able to activate the expression of a number of small non-coding microRNAs (miRNAs) through direct binding to the promoter region of these miRNAs. Many miRNAs have been identified as potential tumor suppressors that act by regulating key effector target mRNAs. Our understanding of the regulatory network of p53 has recently expanded to include long non-coding RNAs (lncRNAs). Like miRNAs, lncRNAs have been found to play important roles in cancer biology. With our increased understanding of the important functions of these non-coding RNAs and their relationship with p53, we are gaining exciting new insights into the biology and function of cells in response to various growth environment changes. In this review we summarize the current understanding of the ever-expanding involvement of non-coding RNAs in the p53 regulatory network and its implications for our understanding of gastrointestinal cancer.
Long non-coding RNA and Polycomb: an intricate partnership in cancer biology.
Achour, Cyrinne; Aguilo, Francesca
2018-06-01
High-throughput analyses have revealed that the vast majority of the transcriptome does not code for proteins. These non-translated transcripts, when larger than 200 nucleotides, are termed long non-coding RNAs (lncRNAs), and play fundamental roles in diverse cellular processes. LncRNAs are subject to dynamic chemical modification, adding another layer of complexity to our understanding of the potential roles that lncRNAs play in health and disease. Many lncRNAs regulate transcriptional programs by influencing the epigenetic state through direct interactions with chromatin-modifying proteins. Among these proteins, Polycomb repressive complexes 1 and 2 (PRC1 and PRC2) have been shown to be recruited by lncRNAs to silence target genes. Aberrant expression, deficiency or mutation of both lncRNA and Polycomb have been associated with numerous human diseases, including cancer. In this review, we have highlighted recent findings regarding the concerted mechanism of action of Polycomb group proteins (PcG), acting together with some classically defined lncRNAs including X-inactive specific transcript (XIST), antisense non-coding RNA in the INK4 locus (ANRIL), metastasis associated lung adenocarcinoma transcript 1 (MALAT1), and HOX transcript antisense RNA (HOTAIR).
An expanding universe of the non-coding genome in cancer biology.
Xue, Bin; He, Lin
2014-06-01
Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of the cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genomes consist of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Structural Relationships Between Minor and Major Proteins of Hepatitis B Surface Antigen
Stibbe, Werner; Gerlich, Wolfram H.
1983-01-01
The minor glycoproteins from hepatitis B surface antigen, GP33 and GP36, contain at their carboxy-terminal part the sequence of the major protein P24. They have 55 additional amino acids at the amino-terminal part which are coded by the pre-S region of the viral DNA. PMID:6842680
DOE Office of Scientific and Technical Information (OSTI.GOV)
Massimo, F., E-mail: francesco.massimo@ensta-paristech.fr; Dipartimento SBAI, Università di Roma “La Sapienza“, Via A. Scarpa 14, 00161 Roma; Atzeni, S.
Architect, a time-explicit hybrid code designed to perform quick simulations for electron-driven plasma wakefield acceleration, is described. In order to obtain beam quality acceptable for applications, control of the beam-plasma dynamics is necessary. Particle in Cell (PIC) codes represent the state-of-the-art technique to investigate the underlying physics and possible experimental scenarios; however, PIC codes demand heavy computational resources. The Architect code substantially reduces the need for computational resources by using a hybrid approach: relativistic electron bunches are treated kinetically as in a PIC code and the background plasma as a fluid. Cylindrical symmetry is assumed for the solution of the electromagnetic fields and fluid equations. In this paper both the underlying algorithms and a comparison with a fully three-dimensional particle in cell code are reported. The comparison highlights the good agreement between the two models up to the weakly non-linear regimes. In highly non-linear regimes the two models only disagree in a localized region, where the plasma electrons expelled by the bunch close up at the end of the first plasma oscillation.
Martinez-Laguna, Daniel; Soria-Castro, Alberto; Carbonell-Abella, Cristina; Orozco-López, Pilar; Estrada-Laza, Pilar; Nogues, Xavier; Díez-Perez, Adolfo; Prieto-Alhambra, Daniel
2017-11-28
Electronic medical records databases use pre-specified lists of diagnostic codes to identify fractures. These codes, however, are not specific enough to disentangle traumatic from fragility-related fractures. We report on the proportion of fragility fractures identified in a random sample of coded fractures in SIDIAP. Patients≥50 years old with any fracture recorded in 2012 (as per pre-specified ICD-10 codes) and alive at the time of recruitment were eligible for this retrospective observational study in 6 primary care centres contributing to the SIDIAP database (www.sidiap.org). Those with previous fracture/s, non-responders, and those with dementia or a serious psychiatric disease were excluded. Data on fracture type (traumatic vs fragility), skeletal site, and basic patient characteristics were collected. Of 491/616 (79.7%) patients with a registered fracture in 2012 who were contacted, 331 (349 fractures) were included. The most common fractures were forearm (82), ribs (38), and humerus (32), and 225/349 (64.5%) were fragility fractures, with higher proportions for classic osteoporotic sites: hip, 91.7%; spine, 87.7%; and major fractures, 80.5%. This proportion was higher in women, the elderly, and patients with a previously coded diagnosis of osteoporosis. More than 4 in 5 major fractures recorded in SIDIAP are due to fragility (non-traumatic), with higher proportions for hip (92%) and vertebral (88%) fracture, and a lower proportion for fractures other than major ones. Our data support the validity of SIDIAP for the study of the epidemiology of osteoporotic fractures. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Reumatología y Colegio Mexicano de Reumatología. All rights reserved.
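The percentages quoted above follow directly from the reported counts (491 of 616 patients contacted; 225 of 349 fractures classified as fragility fractures). The sketch below reproduces them and, as an addition that is not part of the study, attaches a 95% Wilson score interval to the fragility proportion; the function names are hypothetical.

```python
import math


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


print(percent(491, 616))  # 79.7, share of registered patients who were contacted
print(percent(225, 349))  # 64.5, share of included fractures that were fragility fractures
print(wilson_interval(225, 349))  # illustrative only; not reported in the abstract
```

The Wilson interval is used here only because it behaves well for proportions near 0 or 1; the abstract does not state which interval method, if any, the authors applied.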
The complete mitochondrial genome of Pholis nebulosus (Perciformes: Pholidae).
Wang, Zhongquan; Qin, Kaili; Liu, Jingxi; Song, Na; Han, Zhiqiang; Gao, Tianxiang
2016-11-01
In this study, the complete mitochondrial genome (mitogenome) sequence of Pholis nebulosus has been determined by long polymerase chain reaction and primer-walking methods. The mitogenome is a circular molecule of 16 524 bp in length, including the typical structure of 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 2 non-coding regions (L-strand replication origin and control region), the gene contents of which are identical to those observed in most bony fishes. Within the control region, we identified the termination-associated sequence domain (TAS), and the conserved sequence block domain (CSB-F, CSB-E, CSB-D, CSB-C, CSB-B, CSB-A, CSB-1, CSB-2, CSB-3).
Complete mitochondrial genome of the Tyto longimembris (Strigiformes: Tytonidae).
Xu, Peng; Li, Yankuo; Miao, Lujun; Xie, Guangyong; Huang, Yan
2016-07-01
The complete mitochondrial genome of Tyto longimembris has been determined in this study. It is 18,466 bp in length and consists of 13 protein-coding genes, 22 transfer RNA (tRNA) genes, 2 ribosomal RNA (rRNA) genes and a non-coding control region (D-loop). The overall base composition of the heavy strand of the T. longimembris mitochondrial genome is A: 30.1%, T: 23.5%, C: 31.8% and G: 14.6%. The control region is characterized by tandem repeats, which were found as two clearly separated clusters. This study provides an important data set for phylogenetic and taxonomic analyses of Tyto species.
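Base composition of the kind reported above is a per-nucleotide percentage over one strand. A minimal sketch, assuming the strand is available as a plain string; the function name and the toy sequence are hypothetical, and the toy sequence is not T. longimembris data.

```python
from collections import Counter


def base_composition(sequence):
    """Percentage of A, T, C and G in a sequence, ignoring any other symbols (e.g. N)."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: round(100 * counts[base] / total, 1) for base in "ATCG"}


# Toy 10-base sequence, for illustration only.
toy_sequence = "AACCGTTACC"
print(base_composition(toy_sequence))
# {'A': 30.0, 'T': 20.0, 'C': 40.0, 'G': 10.0}
```

Ambiguous symbols are excluded from the denominator here so that the four percentages sum to 100; a different convention would be needed if ambiguous calls should count toward the total.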
The Big Entity of New RNA World: Long Non-Coding RNAs in Microvascular Complications of Diabetes.
Raut, Satish K; Khullar, Madhu
2018-01-01
A major part of the genome is known to be transcribed into non-protein coding RNAs (ncRNAs), such as microRNA and long non-coding RNA (lncRNA). The importance of ncRNAs is being increasingly recognized in physiological and pathological processes. lncRNAs are a novel class of ncRNAs that do not code for proteins and are important regulators of gene expression. In the past, these molecules were thought to be transcriptional "noise" with low levels of evolutionary conservation. However, recent studies provide strong evidence indicating that lncRNAs (i) are regulated during various cellular processes, (ii) exhibit cell type-specific expression, (iii) localize to specific organelles, and (iv) are associated with human diseases. Emerging evidence indicates an aberrant expression of lncRNAs in diabetes and diabetes-related microvascular complications. In the present review, we discuss the current state of knowledge of lncRNAs, their genesis from the genome, and the mechanism of action of individual lncRNAs in the pathogenesis of microvascular complications of diabetes and therapeutic approaches.
Ieva, Antonio Di; Audigé, Laurent; Kellman, Robert M.; Shumrick, Kevin A.; Ringl, Helmut; Prein, Joachim; Matula, Christian
2014-01-01
The AOCMF Classification Group developed a hierarchical three-level craniomaxillofacial classification system with increasing levels of complexity and detail. The highest level 1 system distinguishes four major anatomical units, including the mandible (code 91), midface (code 92), skull base (code 93), and cranial vault (code 94). This tutorial presents the level 2 and more detailed level 3 systems for the skull base and cranial vault units. The level 2 system describes fracture location outlining the topographic boundaries of the anatomic regions, considering in particular the endocranial and exocranial skull base surfaces. The endocranial skull base is divided into nine regions; a central skull base adjoining a left and right side are divided into the anterior, middle, and posterior skull base. The exocranial skull base surface and cranial vault are divided into regions defined by the names of the bones involved: frontal, parietal, temporal, sphenoid, and occipital bones. The level 3 system allows assessing fracture morphology described by the presence of fracture fragmentation, displacement, and bone loss. A documentation of associated intracranial diagnostic features is proposed. This tutorial is organized in a sequence of sections dealing with the description of the classification system with illustrations of the topographical skull base and cranial vault regions along with rules for fracture location and coding, a series of case examples with clinical imaging and a general discussion on the design of this classification. PMID:25489394
Bausher, Michael G; Singh, Nameirakpam D; Lee, Seung-Bum; Jansen, Robert K; Daniell, Henry
2006-01-01
Background The production of Citrus, the largest fruit crop of international economic value, has recently been imperiled due to the introduction of the bacterial disease Citrus canker. No significant improvements have been made to combat this disease by plant breeding and nuclear transgenic approaches. Chloroplast genetic engineering has a number of advantages over nuclear transformation; it not only increases transgene expression but also facilitates transgene containment, which is one of the major impediments for development of transgenic trees. We have sequenced the Citrus chloroplast genome to facilitate genetic improvement of this crop and to assess phylogenetic relationships among major lineages of angiosperms. Results The complete chloroplast genome sequence of Citrus sinensis is 160,129 bp in length, and contains 133 genes (89 protein-coding, 4 rRNAs and 30 distinct tRNAs). Genome organization is very similar to the inferred ancestral angiosperm chloroplast genome. However, in Citrus the infA gene is absent. The inverted repeat region has expanded to duplicate rps19 and the first 84 amino acids of rpl22. The rpl22 gene in the IRb region has a nonsense mutation resulting in 9 stop codons. This was confirmed by PCR amplification and sequencing using primers that flank the IR/LSC boundaries. Repeat analysis identified 29 direct and inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Comparison of protein-coding sequences with expressed sequence tags revealed six putative RNA edits, five of which resulted in non-synonymous modifications in petL, psbH, ycf2 and ndhA. Phylogenetic analyses using maximum parsimony (MP) and maximum likelihood (ML) methods of a dataset composed of 61 protein-coding genes for 30 taxa provide strong support for the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids and asterids. 
The MP and ML trees are incongruent in three areas: the position of Amborella and Nymphaeales, relationship of the magnoliid genus Calycanthus, and the monophyly of the eurosid I clade. Both MP and ML trees provide strong support for the monophyly of eurosids II and for the placement of Citrus (Sapindales) sister to a clade including the Malvales/Brassicales. Conclusion This is the first complete chloroplast genome sequence for a member of the Rutaceae and Sapindales. Expansion of the inverted repeat region to include rps19 and part of rpl22 and presence of two truncated copies of rpl22 is unusual among sequenced chloroplast genomes. Availability of a complete Citrus chloroplast genome sequence provides valuable information on intergenic spacer regions and endogenous regulatory sequences for chloroplast genetic engineering. Phylogenetic analyses resolve relationships among several major clades of angiosperms and provide strong support for the monophyly of the eurosid II clade and the position of the Sapindales sister to the Brassicales/Malvales. PMID:17010212
Sorimachi, Kenji; Okayasu, Teiji
2015-01-01
The complete vertebrate mitochondrial genome consists of 13 coding genes. We used this genome to investigate the existence of natural selection in vertebrate evolution. From the complete mitochondrial genomes, we predicted nucleotide contents and then separated these values into coding and non-coding regions. When nucleotide contents of a coding or non-coding region were plotted against the nucleotide content of the complete mitochondrial genomes, we obtained linear regression lines only between homonucleotides and their analogs. On every plot using G or A content purine, G content in aquatic vertebrates was higher than that in terrestrial vertebrates, while A content in aquatic vertebrates was lower than that in terrestrial vertebrates. Based on these relationships, vertebrates were separated into two groups, terrestrial and aquatic. However, using C or T content pyrimidine, clear separation between these two groups was not obtained. The hagfish (Eptatretus burgeri) was further separated from both terrestrial and aquatic vertebrates. Based on these results, nucleotide content relationships predicted from the complete vertebrate mitochondrial genomes reveal the existence of natural selection based on evolutionary separation between terrestrial and aquatic vertebrate groups. In addition, we propose that separation of the two groups might be linked to ammonia detoxification based on high G and low A contents, which encode Glu rich and Lys poor proteins.
Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J
2007-06-01
As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
HLA-F polymorphisms in a Euro-Brazilian population from Southern Brazil.
Manvailer, L F S; Wowk, P F; Mattar, S B; da Siva, J S; da Graça Bicalho, M; Roxo, V M M S
2014-12-01
HLA-F is a non-classical major histocompatibility complex (MHC) gene. It codes for class Ib MHC molecules with restricted distribution and less nucleotide variation than MHC class Ia genes. Of the 22 alleles registered in the IMGT database, only four encode proteins that differ in their primary structure. To estimate genotype and allele frequencies, this study targeted known protein-coding regions of the HLA-F gene. Genotyping was performed by Sequence-Based Typing (SBT). The sample was composed of 199 unrelated bone marrow donors from the Brazilian Bone Marrow Donor Registry (REDOME), Euro-Brazilians from Southern Brazil. About 1673 bp were analyzed. The most frequent allele was HLA-F*01:01 (87.19%), followed by HLA-F*01:03 (12.31%), HLA-F*01:02 (0.25%) and HLA-F*01:04 (0.25%). Significant linkage disequilibrium (LD) was verified between HLA-F and HLA classes I and II alleles. This is the first study regarding HLA-F polymorphisms in a Euro-Brazilian population, contributing to the genetic characterization of Southern Brazil. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
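Allele frequencies like those above are obtained by counting allele copies across diploid genotypes (two copies per donor). The sketch below shows the counting step on toy genotypes, then back-calculates the allele counts implied by the reported percentages under the assumption that all 199 donors contributed two typed copies (398 in total). The function name, the toy genotypes and the implied counts are illustrative; the abstract itself reports only percentages.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Percent frequency of each allele among all allele copies (two per diploid donor)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: 100 * n / total for allele, n in counts.items()}


# Toy genotypes, for illustration only (not study data).
toy_genotypes = [("F*01:01", "F*01:01"), ("F*01:01", "F*01:03")]
toy_freqs = allele_frequencies(toy_genotypes)
print(toy_freqs)  # {'F*01:01': 75.0, 'F*01:03': 25.0}

# Back-calculation from the reported percentages, assuming 2 x 199 = 398 allele copies.
reported = {"F*01:01": 87.19, "F*01:03": 12.31, "F*01:02": 0.25, "F*01:04": 0.25}
n_copies = 2 * 199
implied_counts = {a: round(p / 100 * n_copies) for a, p in reported.items()}
print(implied_counts, sum(implied_counts.values()))
# {'F*01:01': 347, 'F*01:03': 49, 'F*01:02': 1, 'F*01:04': 1} 398
```

Under that assumption the two rare alleles would each have been observed once, which is consistent with their identical reported frequency of 0.25%.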
Kadakkuzha, Beena M.; Liu, Xin-An; McCrate, Jennifer; Shankar, Gautam; Rizzo, Valerio; Afinogenova, Alina; Young, Brandon; Fallahi, Mohammad; Carvalloza, Anthony C.; Raveendra, Bindu; Puthanveettil, Sathyanarayanan V.
2015-01-01
Despite the importance of the long non-coding RNAs (lncRNAs) in regulating biological functions, the expression profiles of lncRNAs in the sub-regions of the mammalian brain and neuronal populations remain largely uncharacterized. By analyzing RNASeq datasets, we demonstrate region specific enrichment of populations of lncRNAs and mRNAs in the mouse hippocampus and pre-frontal cortex (PFC), the two major regions of the brain involved in memory storage and neuropsychiatric disorders. We identified 2759 lncRNAs and 17,859 mRNAs in the hippocampus and 2561 lncRNAs and 17,464 mRNAs expressed in the PFC. The lncRNAs identified correspond to ~14% of the transcriptome of the hippocampus and PFC and ~70% of the lncRNAs annotated in the mouse genome (NCBIM37) and are localized along the chromosomes as varying numbers of clusters. Importantly, we also found that a few of the tested lncRNA-mRNA pairs that share a genomic locus display specific co-expression in a region-specific manner. Furthermore, we find that sub-regions of the brain and specific neuronal populations have characteristic lncRNA expression signatures. These results reveal an unexpected complexity of the lncRNA expression in the mouse brain. PMID:25798087
O’Doherty, John P.
2015-01-01
Neural correlates of value have been extensively reported in a diverse set of brain regions. However, in many cases it is difficult to determine whether a particular neural response pattern corresponds to a value-signal per se as opposed to an array of alternative non-value related processes, such as outcome-identity coding, informational coding, encoding of autonomic and skeletomotor consequences, alongside previously described “salience” or “attentional” effects. Here, I review a number of experimental manipulations that can be used to test for value, and I identify the challenges in ascertaining whether a particular neural response is or is not a value signal. Finally, I emphasize that some non-value related signals may be especially informative as a means of providing insight into the nature of the decision-making related computations that are being implemented in a particular brain region. PMID:24726573
Van, K; Onoda, S; Kim, M Y; Kim, K D; Lee, S-H
2008-03-01
The Waxy (Wx) gene product controls the formation of a straight chain polymer of amylose in the starch pathway. Dominance/recessiveness of the Wx allele is associated with amylose content, leading to non-waxy/waxy phenotypes. For a total of 113 foxtail millet accessions, agronomic traits and the molecular differences of the Wx gene were surveyed to evaluate genetic diversity. Molecular types were associated with phenotypes determined by four specific primer sets (non-waxy, Type I; low amylose, Type VI; waxy, Type IV or V). Additionally, the insertion of a transposable element in waxy accessions was confirmed by ex1/TSI2R, TSI2F/ex2, ex2int2/TSI7R and TSI7F/ex4r. Seventeen single nucleotide polymorphisms (SNPs) were observed in non-coding regions, while three SNPs in coding regions were non-synonymous. Interestingly, the phenotype of accession No. 88 was still non-waxy, although a seven-nucleotide (AATTGGT) insertion at 2,993 bp shortened the protein by 78 amino acids. The rapid decline of r² in the sequenced region (exon 1-intron 1-exon 2) suggested a low level of linkage disequilibrium and limited haplotype structure. Ks values and estimation of evolutionary events indicate early divergence of S. italica among cereal crops. This study suggested that the Wx gene was one of the targets in the selection process during domestication.
Kocher, Arthur; Gantier, Jean-Charles; Holota, Hélène; Jeziorski, Céline; Coissac, Eric; Bañuls, Anne-Laure; Girod, Romain; Gaborit, Pascal; Murienne, Jérôme
2016-11-01
The nearly complete mitochondrial genome of Lutzomyia umbratilis Ward & Fraiha, 1977 (Psychodidae: Phlebotominae), considered the main vector of Leishmania guyanensis, is presented. Sequencing was performed on an Illumina HiSeq 2500 platform with a genome skimming strategy. The full nuclear ribosomal RNA segment was also assembled. The mitogenome of L. umbratilis was determined to be at least 15,717 bp long and presents an architecture found in many insect mitogenomes (13 protein-coding genes, 22 transfer RNAs, two ribosomal RNAs, and one non-coding region, also referred to as the control region). The control region contains a large repeated element of c. 370 bp and a poly-AT region of unknown length. This is the first mitogenome of Psychodidae to be described.
Altruistic functions for selfish DNA.
Faulkner, Geoffrey J; Carninci, Piero
2009-09-15
Mammalian genomes are comprised of 30-50% transposed elements (TEs). The vast majority of these TEs are truncated and mutated fragments of retrotransposons that are no longer capable of transposition. Although initially regarded as important factors in the evolution of gene regulatory networks, TEs are now commonly perceived as neutrally evolving and non-functional genomic elements. In a major development, recent works have strongly contradicted this "selfish DNA" or "junk DNA" dogma by demonstrating that TEs use a host of novel promoters to generate RNA on a massive scale across most eukaryotic cells. This transcription frequently functions to control the expression of protein-coding genes via alternative promoters, cis regulatory non protein-coding RNAs and the formation of double stranded short RNAs. If considered in sum, these findings challenge the designation of TEs as selfish and neutrally evolving genomic elements. Here, we will expand upon these themes and discuss challenges in establishing novel TE functions in vivo.
Prader-Willi Syndrome: Obesity due to Genomic Imprinting
Butler, Merlin G
2011-01-01
Prader-Willi syndrome (PWS) is a complex neurodevelopmental disorder due to errors in genomic imprinting with loss of imprinted genes that are paternally expressed from the chromosome 15q11-q13 region. Approximately 70% of individuals with PWS have a de novo deletion of the paternally derived 15q11-q13 region, of which there are two subtypes (i.e., larger Type I or smaller Type II); about 25% of cases have maternal disomy 15 (both 15s from the mother); and the remaining subjects have either defects in the imprinting center controlling the activity of imprinted genes or other chromosome 15 rearrangements. PWS is characterized by a particular facial appearance, infantile hypotonia, a poor suck and feeding difficulties, hypogonadism and hypogenitalism in both sexes, short stature and small hands and feet due to growth hormone deficiency, mild learning and behavioral problems (e.g., skin picking, temper tantrums) and hyperphagia leading to early childhood obesity. Obesity is a significant health problem, if uncontrolled. PWS is considered the most common known genetic cause of morbid obesity in children. The chromosome 15q11-q13 region contains approximately 100 genes and transcripts, of which about 10 are imprinted and paternally expressed. This region can be divided into four groups: 1) a proximal non-imprinted region; 2) a PWS paternal-only expressed region containing protein-coding and non-coding genes; 3) an Angelman syndrome region containing maternally expressed genes and 4) a distal non-imprinted region. This review summarizes the current understanding of the genetic causes, the natural history and clinical presentation of individuals with PWS. PMID:22043168
Statistical and linguistic features of DNA sequences
NASA Technical Reports Server (NTRS)
Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
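As a rough illustration of the Detrended Fluctuation Analysis named above, the Python sketch below integrates a numeric "DNA walk", removes a least-squares line from each non-overlapping window, and fits the scaling exponent from log F(n) against log n. The purine/pyrimidine mapping, the linear (first-order) detrending, the window sizes and all function names are assumptions made for this sketch; the abstract does not specify them.

```python
import math
import random


def dna_walk(seq):
    """Map purines (A, G) to +1 and every other symbol to -1 (assumed rule)."""
    return [1.0 if base in "AG" else -1.0 for base in seq.upper()]


def _linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(steps, window):
    """Root-mean-square residual F(n) after linear detrending in each window."""
    if window < 4 or window > len(steps):
        raise ValueError("window must be between 4 and len(steps)")
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:  # integrated, mean-subtracted walk
        total += step - mean
        profile.append(total)
    xs = list(range(window))
    squared, count = 0.0, 0
    for start in range(0, len(profile) - window + 1, window):
        segment = profile[start:start + window]
        slope, intercept = _linear_fit(xs, segment)
        squared += sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, segment))
        count += window
    return math.sqrt(squared / count)


def dfa_exponent(steps, windows):
    """Slope of log F(n) versus log n over the given window sizes."""
    log_n = [math.log(w) for w in windows]
    log_f = [math.log(dfa_fluctuation(steps, w)) for w in windows]
    slope, _ = _linear_fit(log_n, log_f)
    return slope


# Demo on a synthetic, uncorrelated sequence (not real GenBank data).
rng = random.Random(0)
demo_seq = "".join(rng.choice("ACGT") for _ in range(20000))
print("alpha =", round(dfa_exponent(dna_walk(demo_seq), [8, 16, 32, 64, 128]), 3))
```

For an uncorrelated sequence the exponent should sit near 0.5; values clearly above 0.5 over a wide range of window sizes are what would indicate long-range correlation.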
Comparative genomics reveals insights into avian genome evolution and adaptation.
Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun
2014-12-12
Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.
Borodovsky, M; Rudd, K E; Koonin, E V
1994-01-01
The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
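The idea of scoring a sequence with separate Markov chain models for coding and non-coding DNA can be sketched in Python as a log-odds ratio. This is a deliberately simplified, homogeneous first-order stand-in and not the GeneMark implementation; the toy training strings, the pseudocount and the function names are all assumptions for illustration.

```python
import math

ALPHABET = "ACGT"


def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | prev)."""
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in seqs:
        seq = seq.upper()
        for prev, nxt in zip(seq, seq[1:]):
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    model = {}
    for a in ALPHABET:
        row_total = sum(counts[a].values())
        model[a] = {b: counts[a][b] / row_total for b in ALPHABET}
    return model


def log_odds(seq, coding, noncoding):
    """Sum of log P_coding / P_noncoding over transitions; > 0 favors coding."""
    seq = seq.upper()
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        if prev in coding and nxt in coding[prev]:  # skip ambiguous symbols
            score += math.log(coding[prev][nxt] / noncoding[prev][nxt])
    return score


# Toy training data, chosen only to make the two models differ clearly.
coding_model = train_markov(["GCGCGCGCGCGC"])
noncoding_model = train_markov(["ATATATATATAT"])
print(log_odds("GCGCGC", coding_model, noncoding_model))
print(log_odds("ATATAT", coding_model, noncoding_model))
```

The pseudocount keeps every transition probability non-zero, so the logarithm is always defined even for transitions never seen in training.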
Sequence variations of the bovine prion protein gene (PRNP) in native Korean Hanwoo cattle
Choi, Sangho
2012-01-01
Bovine spongiform encephalopathy (BSE) is one of the fatal neurodegenerative diseases known as transmissible spongiform encephalopathies (TSEs) caused by infectious prion proteins. Genetic variations correlated with susceptibility or resistance to TSE in humans and sheep have not been reported for bovine strains including those from Holstein, Jersey, and Japanese Black cattle. Here, we investigated bovine prion protein gene (PRNP) variations in Hanwoo cattle [Bos (B.) taurus coreanae], a native breed in Korea. We identified mutations and polymorphisms in the coding region of PRNP, determined their frequency, and evaluated their significance. We identified four synonymous polymorphisms and two non-synonymous mutations in PRNP, but found no novel polymorphisms. The sequence and number of octapeptide repeats were completely conserved, and the haplotype frequency of the coding region was similar to that of other B. taurus strains. When we examined the 23-bp and 12-bp insertion/deletion (indel) polymorphisms in the non-coding region of PRNP, Hanwoo cattle had a lower deletion allele and 23-bp del/12-bp del haplotype frequency than healthy and BSE-affected animals of other strains. Thus, Hanwoo are seemingly less susceptible to BSE than other strains due to the 23-bp and 12-bp indel polymorphisms. PMID:22705734
Chen, Zhi-Teng; Du, Yu-Zhou
2015-03-01
The complete mitochondrial genome of the stonefly, Sweltsa longistyla Wu (Plecoptera: Chloroperlidae), was sequenced in this study. The mitogenome of S. longistyla is 16,151 bp and contains 37 genes including 13 protein-coding genes (PCGs), 22 tRNA genes, two rRNA genes, and a large non-coding region. S. longistyla, Pteronarcys princeps Banks, Kamimuria wangi Du and Cryptoperla stilifera Sivec belong to the Plecoptera, and the gene order and orientation of their mitogenomes were similar. The overall AT content for the four stoneflies was below 72%, and the AT content of tRNA genes was above 69%. The four genomes were compact and contained only 65-127 bp of non-coding intergenic DNA. Overlapping nucleotides existed in all four genomes and ranged from 24 bp (P. princeps) to 178 bp (K. wangi). There was a 7-bp motif (ATGATAA) of overlapping DNA and an 8-bp motif (AAGCCTTA) conserved in three stonefly species (P. princeps, K. wangi and C. stilifera). The control regions of the four stoneflies contained a stem-loop structure. Four conserved sequence blocks (CSBs) were present in the A+T-rich regions of all four stoneflies. Copyright © 2014 Elsevier B.V. All rights reserved.
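AT content, as quoted above, is the fraction of A and T among the bases of a sequence. The minimal Python sketch below applies it to the two motifs given in the abstract. The function name and the choice to ignore ambiguous symbols such as N are assumptions of this sketch, not details from the study.

```python
def at_content(seq):
    """Fraction of A and T among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("A") + seq.count("T")) / called


# The two motifs reported in the abstract.
for motif in ("ATGATAA", "AAGCCTTA"):
    print(motif, round(at_content(motif), 3))
```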
Full-f version of GENE for turbulence in open-field-line systems
NASA Astrophysics Data System (ADS)
Pan, Q.; Told, D.; Shi, E. L.; Hammett, G. W.; Jenko, F.
2018-06-01
Unique properties of plasmas in the tokamak edge, such as large amplitude fluctuations and plasma-wall interactions in the open-field-line regions, require major modifications of existing gyrokinetic codes originally designed for simulating core turbulence. To this end, the global version of the 3D2V gyrokinetic code GENE, so far employing a δf-splitting technique, is extended to simulate electrostatic turbulence in straight open-field-line systems. The major extensions are the inclusion of the velocity-space nonlinearity, the development of a conducting-sheath boundary, and the implementation of the Lenard-Bernstein collision operator. With these developments, the code can be run as a full-f code and can handle particle loss to and reflection from the wall. The extended code is applied to modeling turbulence in the Large Plasma Device (LAPD), with a reduced mass ratio and a much lower collisionality. Similar to turbulence in a tokamak scrape-off layer, LAPD turbulence involves collisions, parallel streaming, cross-field turbulent transport with steep profiles, and particle loss at the parallel boundary.
2011-01-01
Background Most of the non-B HIV-1 subtypes are predominant in Sub-Saharan Africa and India although they have been found worldwide. In the last decade, immigration from these areas has increased considerably in Spain. The objective of this study was to evaluate the prevalence of non-B subtypes circulating in a cohort of HIV-1-infected immigrants in Seville, Southern Spain and to identify drug resistance-associated mutations. Methods The complete protease and the first 220 codons of the reverse transcriptase coding regions were amplified and sequenced by population sequencing. HIV-1 subtypes were determined using the Stanford University Drug Resistance Database, and phylogenetic analysis was performed comparing multiple reported sequences. Drug resistance mutations were defined according to the International AIDS Society-USA. Results From 2000 to 2010 a total of 1,089 newly diagnosed HIV-1-infected patients were enrolled in our cohort. Of these, 121 were immigrants, of whom 98 had ethical approval and gave informed consent for inclusion in our study. Twenty-nine immigrants (29/98, 29.6%) were infected with non-B subtypes, of which 15/29 (51.7%) were CRF02_AG, mostly from Sub-Saharan Africa, and 2/29 (6.9%) were CRF01_AE from Eastern Europe. A, C, F, J and G subtypes from Eastern Europe, Central-South America and Sub-Saharan Africa were also present. Some others harboured recombinant forms CRF02_AG/CRF01_AE, CRF02_AG/G and F/B, B/C, and K/G, in PR and RT-coding regions. Patients infected with non-B subtypes showed a high frequency of minor protease inhibitor resistance mutations, M36I, L63P, and K20R/I. Only one patient, with CRF02_AG, showed the major resistance mutation L90M. Major RT inhibitor resistance mutations K70R and A98G were present in one patient with subtype G, L100I in one patient with CRF01_AE, and K103N in another patient with CRF01_AE. Three patients had other mutations such as V118I, E138A and V90I.
Conclusions The circulation of non-B subtypes has significantly increased in Southern Spain during the last decade, with 29.6% prevalence, in association with demographic changes among immigrants. This could be an issue in the treatment and management of these patients. Resistance mutations have been detected in these patients with a prevalence of 7% among treatment-naïve patients compared with the 21% detected among patients under HAART or during treatment interruption. PMID:21871090
Fayaz, Shima; Fard-Esfahani, Pezhman; Fard-Esfahani, Armaghan; Mostafavi, Ehsan; Meshkani, Reza; Mirmiranpour, Hossein; Khaghani, Shahnaz
2012-01-01
Homologous recombination (HR) is the major pathway for repairing double strand breaks (DSBs) in eukaryotes and XRCC2 is an essential component of the HR repair machinery. To evaluate the potential role of mutations in gene repair by HR in individuals susceptible to differentiated thyroid carcinoma (DTC) we used high resolution melting (HRM) analysis, a recently introduced method for detecting mutations, to examine the entire XRCC2 coding region in an Iranian population. HRM analysis was used to screen for mutations in three XRCC2 coding regions in 50 patients and 50 controls. There was no variation in the HRM curves obtained from the analysis of exons 1 and 2 in the case and control groups. In exon 3, an Arg188His polymorphism (rs3218536) was detected as a new melting curve group (OR: 1.46; 95%CI: 0.432–4.969; p = 0.38) compared with the normal melting curve. We also found a new Ser150Arg polymorphism in exon 3 of the control group. These findings suggest that genetic variations in the XRCC2 coding region have no potential effects on susceptibility to DTC. However, further studies with larger populations are required to confirm this conclusion. PMID:22481871
Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng
2005-09-10
Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influences asthma and atopic phenotypes in diverse populations.
Martin, Guillaume E; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; de Carvalho, Julie Ferreira; Aïnouche, Malika; Salmon, Armel; Aïnouche, Abdelkader
2014-06-01
To date chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the 'inverted repeat-lacking clade', IRLC). It is thus very important to sequence plastomes from other lineages in order to better understand the unusual evolution observed in this model flowering plant family. To this end, the plastome of a lupine species, Lupinus luteus, was sequenced to represent the Genistoid lineage, a noteworthy but poorly studied legume group. The plastome of L. luteus was reconstructed using Roche-454 and Illumina next-generation sequencing. Its structure, repetitive sequences, gene content and sequence divergence were compared with those of other Fabaceae plastomes. PCR screening and sequencing were performed in other allied legumes in order to determine the origin of a large inversion identified in L. luteus. The first sequenced Genistoid plastome (L. luteus: 155 894 bp) resulted in the discovery of a 36-kb inversion, embedded within the already known 50-kb inversion in the large single-copy (LSC) region of the Papilionoideae. This inversion occurs at the base or soon after the Genistoid emergence, and most probably resulted from a flip-flop recombination between identical 29-bp inverted repeats within two trnS genes. Comparative analyses of the chloroplast gene content of L. luteus vs. Fabaceae and extra-Fabales plastomes revealed the loss of the plastid rpl22 gene, and its functional relocation to the nucleus was verified using lupine transcriptomic data. An investigation into the evolutionary rate of coding and non-coding sequences among legume plastomes resulted in the identification of remarkably variable regions. This study resulted in the discovery of a novel, major 36-kb inversion, specific to the Genistoids. 
Chloroplast mutational hotspots were also identified, which contain novel and potentially informative regions for molecular evolutionary studies at various taxonomic levels in the legumes. Taken together, the results provide new insights into the evolutionary landscape of the legume plastome. © The Author 2014. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Webb, Kristen M; Rosenthal, Benjamin M
2011-01-01
The mitochondrial genome's non-recombinant mode of inheritance and relatively rapid rate of evolution has promoted its use as a marker for studying the biogeographic history and evolutionary interrelationships among many metazoan species. A modest portion of the mitochondrial genome has been defined for 12 species and genotypes of parasites in the genus Trichinella, but its adequacy in representing the mitochondrial genome as a whole remains unclear, as the complete coding sequence has been characterized only for Trichinella spiralis. Here, we sought to comprehensively describe the extent and nature of divergence between the mitochondrial genomes of T. spiralis (which poses the most appreciable zoonotic risk owing to its capacity to establish persistent infections in domestic pigs) and Trichinella murrelli (which is the most prevalent species in North American wildlife hosts, but which poses relatively little risk to the safety of pork). Next generation sequencing methodologies and scaffold and de novo assembly strategies were employed. The entire protein-coding region was sequenced (13,917 bp), along with a portion of the highly repetitive non-coding region (1524 bp) of the mitochondrial genome of T. murrelli with a combined average read depth of 250 reads. The accuracy of base calling, estimated from coding region sequence was found to exceed 99.3%. Genome content and gene order was not found to be significantly different from that of T. spiralis. An overall inter-species sequence divergence of 9.5% was estimated. Significant variation was identified when the amount of variation between species at each gene is compared to the average amount of variation between species across the coding region. Next generation sequencing is a highly effective means to obtain previously unknown mitochondrial genome sequence. 
Particular to parasites, the extremely deep coverage achieved through this method allows for the detection of sequence heterogeneity between the multiple individuals that necessarily comprise such templates. Copyright © 2010 Elsevier B.V. All rights reserved.
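The abstract above does not say how the 9.5% inter-species divergence was estimated. One common and simple measure is the uncorrected p-distance, the proportion of differing sites between two aligned sequences; the Python sketch below shows that measure as an assumption, with hypothetical names and toy sequences rather than Trichinella data.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment: one mismatch in ten compared sites.
print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

Computing the same quantity gene by gene and comparing each value with the coding-region average is one way to look for the per-gene variation the abstract mentions.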
Toren, Dmitri; Barzilay, Thomer; Tacutu, Robi; Lehmann, Gilad; Muradian, Khachik K; Fraifeld, Vadim E
2016-01-04
Mitochondria are the only organelles in the animal cells that have their own genome. Due to a key role in energy production, generation of damaging factors (ROS, heat), and apoptosis, mitochondria and mtDNA in particular have long been considered one of the major players in the mechanisms of aging, longevity and age-related diseases. The rapidly increasing number of species with fully sequenced mtDNA, together with accumulated data on longevity records, provides a new fascinating basis for comparative analysis of the links between mtDNA features and animal longevity. To facilitate such analyses and to support the scientific community in carrying these out, we developed the MitoAge database containing calculated mtDNA compositional features of the entire mitochondrial genome, mtDNA coding (tRNA, rRNA, protein-coding genes) and non-coding (D-loop) regions, and codon usage/amino acids frequency for each protein-coding gene. MitoAge includes 922 species with fully sequenced mtDNA and maximum lifespan records. The database is available through the MitoAge website (www.mitoage.org or www.mitoage.info), which provides the necessary tools for searching, browsing, comparing and downloading the data sets of interest for selected taxonomic groups across the Kingdom Animalia. The MitoAge website assists in statistical analysis of different features of the mtDNA and their correlative links to longevity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The impact of major trauma network triage systems on patients with major burns.
Nizamoglu, Metin; O'Connor, Edmund Fitzgerald; Bache, Sarah; Theodorakopoulou, Evgenia; Sen, Sankhya; Sherren, Peter; Barnes, David; Dziewulski, Peter
2016-12-01
Trauma is a leading cause of death and disability worldwide. Patients presenting with severe trauma and burns benefit from specifically trained multidisciplinary teams. Regional trauma systems have shown improved outcomes for trauma patients. The aim of this study is to determine whether the development of major trauma systems have improved the management of patients with major burns. A retrospective study was performed over a four-year period reviewing all major burns in adults and children received at a regional burns centre in the UK before and after the implementation of the regional trauma systems and major trauma centres (MTC). Comparisons were drawn between three areas: (1) Patients presenting before the introduction of MTC and after the introduction of MTC. (2) Patients referred from MTC and non-MTC within the region, following the introduction of MTC. (3) Patients referred using the urban trauma protocol and the rural trauma protocol. Following the introduction of regional trauma systems and major trauma centres (MTC), isolated burn patients seen at our regional burns centre did not show any significant improvement in transfer times, admission resuscitation parameters, organ dysfunction or survival when referred from a MTC compared to a non-MTC emergency department. There was also no significant difference in survival when comparing referrals from all hospitals pre and post establishment of the major trauma network. No significant outcome benefit was demonstrated for burns patients referred via MTCs compared to non-MTCs. We suggest further research is needed to ascertain whether burns patients benefit from prolonged transfer times to a MTC compared to those seen at their local hospitals prior to transfer to a regional burns unit for further specialist care. Copyright © 2016 Elsevier Ltd and ISBI. All rights reserved.
Analysis of correlation structures in the Synechocystis PCC6803 genome.
Wu, Zuo-Bing
2014-12-01
Transfer of nucleotide strings in the Synechocystis sp. PCC6803 genome is investigated to exhibit periodic and non-periodic correlation structures by using the recurrence plot method and the phase space reconstruction technique. The periodic correlation structures are generated by periodic transfer of several substrings in long periodic or non-periodic nucleotide strings embedded in the coding regions of genes. The non-periodic correlation structures are generated by non-periodic transfer of several substrings covering or overlapping with the coding regions of genes. In the periodic and non-periodic transfer, some gaps divide the long nucleotide strings into the substrings and prevent their global transfer. Most of the gaps are either the replacement of one base or the insertion/reduction of one base. In the reconstructed phase space, the points generated from two or three steps for the continuous iterative transfer via the second maximal distance can be fitted by two lines. It partly reveals an intrinsic dynamics in the transfer of nucleotide strings. Due to the comparison of the relative positions and lengths, the substrings concerned with the non-periodic correlation structures are almost identical to the mobile elements annotated in the genome. The mobile elements are thus endowed with the basic results on the correlation structures. Copyright © 2014 Elsevier Ltd. All rights reserved.
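As a rough illustration of the recurrence plot idea mentioned in the abstract above, the sketch below marks every pair of positions in a nucleotide string whose length-m windows are identical. It is a generic textbook construction for symbolic sequences, not the author's procedure; the function name `recurrence_points`, the window length `m`, and the toy sequence are assumptions made for the example.

```python
def recurrence_points(seq, m=3):
    """Return sorted (i, j) pairs, i < j, whose length-m windows are identical.

    Each pair is one dot of a recurrence plot for the symbolic sequence.
    Runs of dots along a diagonal (constant j - i) indicate a repeated
    substring longer than m.
    """
    # Group window start positions by window content.
    starts_by_window = {}
    for i in range(len(seq) - m + 1):
        starts_by_window.setdefault(seq[i:i + m], []).append(i)

    # Every pair of starts sharing the same window is a recurrence point.
    points = []
    for starts in starts_by_window.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                points.append((starts[a], starts[b]))
    return sorted(points)


# Hypothetical toy sequence: "ACGT" occurs at positions 0 and 5.
toy = "ACGTTACGTA"
points = recurrence_points(toy, m=3)
```

In the toy sequence the two recurrence points share the same offset (5), which is the kind of diagonal structure a recurrence plot makes visible; a single-base substitution or insertion inside a repeat would break or shift that diagonal.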
Dong, Xiaomin; Chen, Kenian; Cuevas-Diaz Duran, Raquel; You, Yanan; Sloan, Steven A; Zhang, Ye; Zong, Shan; Cao, Qilin; Barres, Ben A; Wu, Jia Qian
2015-12-01
Long non-coding RNAs (lncRNAs) (> 200 bp) play crucial roles in transcriptional regulation during numerous biological processes. However, it is challenging to comprehensively identify lncRNAs, because they are often expressed at low levels and with more cell-type specificity than are protein-coding genes. In the present study, we performed ab initio transcriptome reconstruction using eight purified cell populations from mouse cortex and detected more than 5000 lncRNAs. Predicting the functions of lncRNAs using cell-type specific data revealed their potential functional roles in Central Nervous System (CNS) development. We performed motif searches in ENCODE DNase I digital footprint data and Mouse ENCODE promoters to infer transcription factor (TF) occupancy. By integrating TF binding and cell-type specific transcriptomic data, we constructed a novel framework that is useful for systematically identifying lncRNAs that are potentially essential for brain cell fate determination. Based on this integrative analysis, we identified lncRNAs that are regulated during Oligodendrocyte Precursor Cell (OPC) differentiation from Neural Stem Cells (NSCs) and that are likely to be involved in oligodendrogenesis. The top candidate, lnc-OPC, shows highly specific expression in OPCs and remarkable sequence conservation among placental mammals. Interestingly, lnc-OPC is significantly up-regulated in glial progenitors from experimental autoimmune encephalomyelitis (EAE) mouse models compared to wild-type mice. OLIG2-binding sites in the upstream regulatory region of lnc-OPC were identified by ChIP (chromatin immunoprecipitation)-Sequencing and validated by luciferase assays. Loss-of-function experiments confirmed that lnc-OPC plays a functional role in OPC genesis. Overall, our results substantiated the role of lncRNA in OPC fate determination and provided an unprecedented data source for future functional investigations in CNS cell types. 
We present our datasets and analysis results via the interactive genome browser at our laboratory website that is freely accessible to the research community. This is the first lncRNA expression database of collective populations of glia, vascular cells, and neurons. We anticipate that these studies will advance the knowledge of this major class of non-coding genes and their potential roles in neurological development and diseases.
Deep sequencing approaches for the analysis of prokaryotic transcriptional boundaries and dynamics.
James, Katherine; Cockell, Simon J; Zenkin, Nikolay
2017-05-01
The identification of the protein-coding regions of a genome is straightforward due to the universality of start and stop codons. However, the boundaries of the transcribed regions, conditional operon structures, non-coding RNAs and the dynamics of transcription, such as pausing of elongation, are non-trivial to identify, even in the comparatively simple genomes of prokaryotes. Traditional methods for the study of these areas, such as tiling arrays, are noisy, labour-intensive and lack the resolution required for densely-packed bacterial genomes. Recently, deep sequencing has become increasingly popular for the study of the transcriptome due to its lower costs, higher accuracy and single nucleotide resolution. These methods have revolutionised our understanding of prokaryotic transcriptional dynamics. Here, we review the deep sequencing and data analysis techniques that are available for the study of transcription in prokaryotes, and discuss the bioinformatic considerations of these analyses. Copyright © 2017 Elsevier Inc. All rights reserved.
Mitochondrial genomes of parasitic flatworms.
Le, Thanh H; Blair, David; McManus, Donald P
2002-05-01
Complete or near-complete mitochondrial genomes are now available for 11 species or strains of parasitic flatworms belonging to the Trematoda and the Cestoda. The organization of these genomes is not strikingly different from those of other eumetazoans, although one gene (atp8) commonly found in other phyla is absent from flatworms. The gene order in most flatworms has similarities to those seen in higher protostomes such as annelids. However, the gene order has been drastically altered in Schistosoma mansoni, which obscures this possible relationship. Among the sequenced taxa, base composition varies considerably, creating potential difficulties for phylogeny reconstruction. Long non-coding regions are present in all taxa, but these vary in length from only a few hundred to approximately 10000 nucleotides. Among Schistosoma spp., the long non-coding regions are rich in repeats and length variation among individuals is known. Data from mitochondrial genomes are valuable for studies on species identification, phylogenies and biogeography.
NASA Astrophysics Data System (ADS)
Marocchino, A.; Atzeni, S.; Schiavi, A.
2014-01-01
In some regions of a laser driven inertial fusion target, the electron mean-free path can become comparable to or even longer than the electron temperature gradient scale-length. This can be particularly important in shock-ignited (SI) targets, where the laser-spike heated corona reaches temperatures of several keV. In this case, thermal conduction cannot be described by a simple local conductivity model and Fick's law. Fluid codes usually employ flux-limited conduction models, which preserve causality, but lose important features of the thermal flow. A more accurate thermal flow modeling requires convolution-like non-local operators. In order to improve the simulation of SI targets, the non-local electron transport operator proposed by Schurtz-Nicolaï-Busquet [G. P. Schurtz et al., Phys. Plasmas 7, 4238 (2000)] has been implemented in the DUED fluid code. Both one-dimensional (1D) and two-dimensional (2D) simulations of SI targets have been performed. 1D simulations of the ablation phase highlight that, while the shock profile and timing might be mocked up with a flux-limiter, the electron temperature profiles exhibit a relatively different behavior with no major effects on the final gain. The spike, instead, can only roughly be reproduced with a fixed flux-limiter value. 1D target gain is however unaffected, provided some minor tuning of laser pulses. 2D simulations show that the use of a non-local thermal conduction model does not affect the robustness to mispositioning of targets driven by quasi-uniform laser irradiation. 2D simulations performed with only two final polar intense spikes yield encouraging results and support further studies.
Stimulus features coded by single neurons of a macaque body category selective patch.
Popivanov, Ivo D; Schyns, Philippe G; Vogels, Rufin
2016-04-26
Body category-selective regions of the primate temporal cortex respond to images of bodies, but it is unclear which fragments of such images drive single neurons' responses in these regions. Here we applied the Bubbles technique to the responses of single macaque middle superior temporal sulcus (midSTS) body patch neurons to reveal the image fragments the neurons respond to. We found that local image fragments such as extremities (limbs), curved boundaries, and parts of the torso drove the large majority of neurons. Bubbles revealed the whole body in only a few neurons. Neurons coded the features in a manner that was tolerant to translation and scale changes. Most image fragments were excitatory but for a few neurons both inhibitory and excitatory fragments (opponent coding) were present in the same image. The fragments we reveal here in the body patch with Bubbles differ from those suggested in previous studies of face-selective neurons in face patches. Together, our data indicate that the majority of body patch neurons respond to local image fragments that occur frequently, but not exclusively, in bodies, with a coding that is tolerant to translation and scale. Overall, the data suggest that the body category selectivity of the midSTS body patch depends more on the feature statistics of bodies (e.g., extensions occur more frequently in bodies) than on semantics (bodies as an abstract category).
Keeping abreast with long non-coding RNAs in mammary gland development and breast cancer
Hansji, Herah; Leung, Euphemia Y.; Baguley, Bruce C.; Finlay, Graeme J.; Askarian-Amiri, Marjan E.
2014-01-01
The majority of the human genome is transcribed, even though only 2% of transcripts encode proteins. Non-coding transcripts were originally dismissed as evolutionary junk or transcriptional noise, but with the development of whole genome technologies, these non-coding RNAs (ncRNAs) are emerging as molecules with vital roles in regulating gene expression. While shorter ncRNAs have been extensively studied, the functional roles of long ncRNAs (lncRNAs) are still being elucidated. Studies over the last decade show that lncRNAs are emerging as new players in a number of diseases including cancer. Potential roles in both oncogenic and tumor suppressive pathways in cancer have been elucidated, but the biological functions of the majority of lncRNAs remain to be identified. Accumulated data are identifying the molecular mechanisms by which lncRNA mediates both structural and functional roles. LncRNA can regulate gene expression at both transcriptional and post-transcriptional levels, including splicing and regulating mRNA processing, transport, and translation. Much current research is aimed at elucidating the function of lncRNAs in breast cancer and mammary gland development, and at identifying the cellular processes influenced by lncRNAs. In this paper we review current knowledge of lncRNAs contributing to these processes and present lncRNA as a new paradigm in breast cancer development. PMID:25400658
Molecular phylogeography of the Andean alpine plant, Gunnera magellanica
NASA Astrophysics Data System (ADS)
Shimizu, M.; Fujii, N.; Ito, M.; Asakawa, T.; Nishida, H.; Suyama, C.; Ueda, K.
2015-12-01
To clarify the evolutionary history of Gunnera magellanica (Gunneraceae), an alpine plant of the Andes mountains, we performed molecular phylogeographic analyses based on the sequences of an internal transcribed spacer (ITS) of nuclear ribosomal DNA and four non-coding regions (trnH-psbA, trnL-trnF, atpB-rbcL, rpl16 intron) of chloroplast DNA. We investigated 3, 4, 4 and 11 populations in Ecuador, Bolivia, Argentina, and Chile, respectively, and detected six ITS genotypes (Types A-F) in G. magellanica. Five genotypes (Types A-E) were observed in the northern Andes population (Ecuador and Bolivia); only one ITS genotype (Type F) was observed in the southern Andes population (Chile and Argentina). Phylogenetic analyses showed that the ITS genotypes of the northern and southern Andes populations form different clades with high bootstrap probability. Furthermore, network analysis, analysis of molecular variance, and spatial analysis of molecular variance showed that there were two major clusters (the northern and southern Andes populations) in this species. In the chloroplast DNA analysis, three major clades (northern Andes, Chillan, and southern Andes) were inferred from phylogenetic analyses using four non-coding regions, a finding that was supported by the above three types of analysis. The Chillan clade is the northernmost population in the southern Andes populations. With the exception of the Chillan clade (Chillan population), results of nuclear DNA and chloroplast DNA analyses were consistent. Both markers showed that the northern and southern Andes populations of G. magellanica were genetically different from each other. This type of clear phylogeographical structure was supported by PERMUT analysis according to Pons & Petit (1995, 1996). Moreover, based on our preliminary estimation from the ITS sequences, the northern and southern Andes clades diverged ~0.63-3 million years ago, during a period of upheaval in the Andes. 
This suggests that the populations of G. magellanica that were distributed along the Andes have been divided into the two local populations of the northern and southern Andes during the uplift of the Andes.
Al-Dmour, Hayat; Al-Ani, Ahmed
2016-04-01
The present work has the goal of developing a secure medical imaging information system based on a combined steganography and cryptography technique. It attempts to securely embed a patient's confidential information into his or her medical images. The proposed information security scheme conceals coded Electronic Patient Records (EPRs) in medical images in order to protect the EPRs' confidentiality without affecting the image quality and particularly the Region of Interest (ROI), which is essential for diagnosis. The secret EPR data is converted into ciphertext using a private symmetric encryption method. Since the Human Visual System (HVS) is less sensitive to alterations in sharp regions compared to uniform regions, a simple edge detection method has been introduced to identify and embed in edge pixels, which will lead to an improved stego image quality. In order to increase the embedding capacity, the algorithm embeds a variable number of bits (up to 3) in edge pixels based on the strength of edges. Moreover, to increase the efficiency, two message coding mechanisms have been utilized to enhance the ±1 steganography. The first one, which is based on Hamming code, is simple and fast, while the other, known as the Syndrome Trellis Code (STC), is more sophisticated as it attempts to find a stego image that is close to the cover image through minimizing the embedding impact. The proposed steganography algorithm embeds the secret data bits into the Region of Non Interest (RONI), so that the ROI, because of its importance, is preserved from modification. The experimental results demonstrate that the proposed method can embed a large amount of secret data without leaving a noticeable distortion in the output image. The effectiveness of the proposed algorithm is also proven using one of the efficient steganalysis techniques. 
The proposed medical imaging information system proved to be capable of concealing EPR data and producing imperceptible stego images with minimal embedding distortions compared to other existing methods. In order to refrain from introducing any modifications to the ROI, the proposed system only utilizes the Region of Non Interest (RONI) in embedding the EPR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
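To make the Hamming-code step in the abstract above more concrete, here is a minimal sketch of textbook matrix embedding with the [7,4] Hamming code combined with ±1 changes: 3 message bits are hidden in the least significant bits of 7 pixels while altering at most one pixel by one grey level. This is an assumption about how such a step could look, not the authors' implementation; it omits their encryption, edge detection, ROI/RONI handling and the Syndrome Trellis Code, and the names `syndrome`, `embed` and `extract` and the 8-bit pixel range are hypothetical.

```python
import random


def syndrome(bits):
    """Syndrome of 7 bits under the [7,4] Hamming parity-check matrix.

    Column i (1-based) of the matrix is the binary expansion of i, so the
    syndrome equals the XOR of the 1-based positions of all set bits.
    """
    s = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            s ^= position
    return s


def embed(pixels, message, rng=random):
    """Hide a 3-bit message (0..7) in 7 pixels, changing at most one by +/-1."""
    if len(pixels) != 7 or not 0 <= message <= 7:
        raise ValueError("need exactly 7 pixels and a message in 0..7")
    stego = list(pixels)
    # The XOR of current syndrome and message names the position to flip.
    diff = syndrome([p & 1 for p in pixels]) ^ message
    if diff:
        i = diff - 1
        if stego[i] == 0:        # stay inside the assumed 8-bit range
            step = 1
        elif stego[i] == 255:
            step = -1
        else:                    # +/-1 embedding: either direction flips the LSB
            step = rng.choice((-1, 1))
        stego[i] += step
    return stego


def extract(pixels):
    """Recover the 3-bit message from the LSBs of 7 stego pixels."""
    return syndrome([p & 1 for p in pixels])


cover = [120, 37, 255, 0, 64, 201, 18]   # hypothetical pixel values
stego = embed(cover, 5)
recovered = extract(stego)
```

Because only one of seven pixels changes for every three bits carried, the embedding impact is lower than writing the bits directly into LSBs, which is the efficiency gain the abstract attributes to message coding.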
The complete mitochondrial genome of the Border Collie dog.
Wu, An-Quan; Zhang, Yong-Liang; Li, Li-Li; Chen, Long; Yang, Tong-Wen
2016-01-01
The Border Collie is a famous dog breed. In the present work we report the complete mitochondrial genome sequence of the Border Collie for the first time. The total length of the mitogenome was 16,730 bp, with a base composition of 31.6% A, 28.7% T, 25.5% C, and 14.2% G; an A-T-rich feature (60.3%) was detected. It harbored 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and one non-coding control region (D-loop region). The arrangement of all genes was identical to that of the typical mitochondrial genomes of dogs.
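The base composition and A-T content reported above are simple counts over the sequence. The sketch below shows one way to compute them; the function names and the toy sequence are assumptions for illustration, and only the four reported percentages are taken from the abstract.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among the unambiguous bases."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq):
    """Return the combined percentage of A and T."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Hypothetical toy sequence, not the Border Collie mitogenome.
toy = "ATTAGCATTA"
toy_at = at_content(toy)

# Percentages as reported in the abstract; their A + T sum gives the
# stated A-T content.
reported = {"A": 31.6, "T": 28.7, "C": 25.5, "G": 14.2}
reported_at = round(reported["A"] + reported["T"], 1)
```

Applied to a full mitogenome read from a FASTA file, the same two functions would reproduce the kind of summary given in the abstract.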
Development and Validation of a Supersonic Helium-Air Coannular Jet Facility
NASA Technical Reports Server (NTRS)
Carty, Atherton A.; Cutler, Andrew D.
1999-01-01
Data are acquired in a simple coannular He/air supersonic jet suitable for validation of CFD (Computational Fluid Dynamics) codes for high speed propulsion. Helium is employed as a non-reacting hydrogen fuel simulant, constituting the core of the coannular flow while the coflow is composed of air. The mixing layer interface between the two flows in the near field and the plume region which develops further downstream constitute the primary regions of interest, similar to those present in all hypersonic air breathing propulsion systems. A computational code has been implemented from the experiment's inception, serving as a tool for model design during the development phase.
1989-11-01
from National Technical Information Service , 5285 Port Royal Road, Springfield, VA 22161. 17. COSATI CODES 16 SUBJECT TERMS (Continue on reverse if...National Marine Fisheries Service , NOAA Southwest Region Honolulu, HI 96822-2396 19. ABSTRACT (Continued). world. The major environmental impact facing...Nitta and John J. Naughton of the Southwest Region, National Marine Fisheries Service (NMFS), under support agreement WESCW88-241. Dr. C. Scott Baker
2011-09-01
tectonically active regions such as the Middle East. For example, we previously applied the code to determine the crust and upper mantle structure...Objective Optimization (MOO) for Multiple Datasets The primary goal of our current project is to develop a tool for estimating crustal structure that...be used to obtain crustal velocity structures by modeling broadband waveform, receiver function, and surface wave dispersion data. The code has been
Bio—Cryptography: A Possible Coding Role for RNA Redundancy
NASA Astrophysics Data System (ADS)
Regoli, M.
2009-03-01
The RNA-Crypto System (RCS for short) is a symmetric key algorithm to cipher data. The idea for this new algorithm starts from the observation of nature, in particular from the observation of RNA behavior and some of its properties. RNA sequences have some sections called introns. Introns, derived from the term "intragenic regions," are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in pre-mRNA are not clear and are the subject of extensive research by biologists but, in our case, we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behavior in the access to the secret key used to code the messages. In the RNA-Crypto System algorithm, the introns are sections of the ciphered message carrying non-coding information, just as in the precursor mRNA.
A Repository of Codes of Ethics and Technical Standards in Health Informatics
Zaïane, Osmar R.
2014-01-01
We present a searchable repository of codes of ethics and standards in health informatics. It is built using state-of-the-art search algorithms and technologies. The repository will be potentially beneficial for public health practitioners, researchers, and software developers in finding and comparing ethics topics of interest. Public health clinics, clinicians, and researchers can use the repository platform as a one-stop reference for various ethics codes and standards. In addition, the repository interface is built for easy navigation, fast search, and side-by-side comparative reading of documents. Our selection criteria for codes and standards are two-fold; firstly, to maintain intellectual property rights, we index only codes and standards freely available on the internet. Secondly, major international, regional, and national health informatics bodies across the globe are surveyed with the aim of understanding the landscape in this domain. We also look at prevalent technical standards in health informatics from major bodies such as the International Standards Organization (ISO) and the U. S. Food and Drug Administration (FDA). Our repository contains codes of ethics from the International Medical Informatics Association (IMIA), the iHealth Coalition (iHC), the American Health Information Management Association (AHIMA), the Australasian College of Health Informatics (ACHI), the British Computer Society (BCS), and the UK Council for Health Informatics Professions (UKCHIP), with room for adding more in the future. Our major contribution is enhancing the findability of codes and standards related to health informatics ethics by compilation and unified access through the health informatics ethics repository. PMID:25422725
Disruption of a -35kb enhancer impairs CTCF binding and MLH1 expression in colorectal cells.
Liu, Qing; Thoms, Julie A; Nunez, Andrea C; Huang, Yizhou; Knezevic, Kathy; Packham, Deborah; Poulos, Rebecca C; Williams, Rachel; Beck, Dominik; Hawkins, Nicholas J; Ward, Robyn L; Wong, Jason W H; Hesson, Luke B; Sloane, Mathew A; Pimanda, John
2018-06-13
MLH1 is a major tumour suppressor gene involved in the pathogenesis of Lynch syndrome and various sporadic cancers. Despite their potential pathogenic importance, genomic regions capable of regulating MLH1 expression over long distances have yet to be identified. Here we use chromosome conformation capture (3C) to screen a 650-kb region flanking the MLH1 locus to identify interactions between the MLH1 promoter and distal regions in MLH1 expressing and non-expressing cells. Putative enhancers were functionally validated using luciferase reporter assays, chromatin immunoprecipitation and CRISPR-Cas9 mediated deletion of endogenous regions. To evaluate whether germline variants in the enhancer might contribute to impaired MLH1 expression in patients with suspected Lynch syndrome, we also screened germline DNA from a cohort of 74 patients with no known coding mutations or epimutations at the MLH1 promoter. A 1.8kb DNA fragment, 35kb upstream of the MLH1 transcription start site enhances MLH1 gene expression in colorectal cells. The enhancer was bound by CTCF and CRISPR-Cas9 mediated deletion of a core binding region impairs endogenous MLH1 expression. 5.4% of suspected Lynch syndrome patients have a rare single nucleotide variant (G>A; rs143969848; 2.5% in gnomAD European, non-Finnish) within a highly conserved CTCF binding motif, which disrupts enhancer activity in SW620 colorectal carcinoma cells. A CTCF bound region within the MLH1 -35 enhancer regulates MLH1 expression in colorectal cells and is worthy of scrutiny in future genetic screening strategies for suspected Lynch syndrome associated with loss of MLH1 expression. Copyright ©2018, American Association for Cancer Research.
Sun, Liying; Andika, Ida Bagus; Shen, Jiangfeng; Yang, Di; Ratti, Claudio; Chen, Jianping
2013-10-01
Some viruses use alternative translation initiation at non-AUG codons as a strategy to produce multiple proteins during gene expression. Here we show that, using this strategy, Chinese wheat mosaic virus (CWMV; Furovirus) expresses a larger form of coat protein (N-ext/CP) in infected plants. Site-directed mutagenesis and transient expression analysis confirmed that CWMV N-ext/CP is initiated at an upstream in-frame CUG codon at nucleotide position 207-209 of RNA 2, which adds a 39 amino acid (aa) N-terminal extension to the major CP. Interestingly, in planta and in vitro analyses indicated that CWMV N-ext/CP but not CP interacts with the CWMV cysteine-rich protein (CRP), an RNA silencing suppressor. We further determined that the N-terminal 39 aa extension, particularly the 10 aa region immediately upstream of the major CP coding region is responsible for the interaction of N-ext/CP with CRP. In an Agrobacterium co-infiltration assay, co-expression with N-ext/CP did not affect CRP silencing suppression activity. Thus the alternative translation initiation at a CUG codon provides the CWMV N-ext/CP with the ability to bind to the viral silencing suppressor. Copyright © 2013 Elsevier B.V. All rights reserved.
Perret, Stéphanie; Maamar, Hédia; Bélaich, Jean-Pierre; Tardif, Chantal
2004-01-01
The enzymatic composition of the cellulosomes produced by Clostridium cellulolyticum was modified by inhibiting the synthesis of Cel48F, which is the major cellulase of the cellulosomes. The strain ATCC 35319 (pSOSasrF) was developed to over-produce a 469 nucleotide-long antisense-RNA (asRNA) directed against the ribosome-binding site region and the beginning of the coding region of the cel48F mRNAs. The cellulolytic system secreted by the asRNA-producing strain showed a markedly lower amount of Cel48F, compared to the control strain transformed with the empty plasmid (pSOSzero). This was correlated with a 30% decrease of the specific activity of the cellulolytic system on Avicel cellulose, indicating that Cel48F plays an important role in the degradation of recalcitrant cellulose. However, only minor effects were observed on the growth parameters on cellulose. In both transformant strains, cellulosome production was found to be reduced and two unknown proteins (P105 and P98) appeared as major components of their cellulolytic systems. These proteins did not contain any dockerin domain and were shown not to be incorporated into the cellulosomes; they are expected to participate in the non-cellulosomal cellulolytic system of C. cellulolyticum.
Identification and role of regulatory non-coding RNAs in Listeria monocytogenes.
Izar, Benjamin; Mraheil, Mobarak Abu; Hain, Torsten
2011-01-01
Bacterial regulatory non-coding RNAs control numerous mRNA targets that direct a plethora of biological processes, such as the adaptation to environmental changes, growth and virulence. Recently developed high-throughput techniques, such as genomic tiling arrays and RNA-Seq, have allowed the investigation of prokaryotic cis- and trans-acting regulatory RNAs, including sRNAs, asRNAs, untranslated regions (UTR) and riboswitches. As a result, we obtained a more comprehensive view of the complexity and plasticity of prokaryotic genome biology. Listeria monocytogenes was utilized as a model system for intracellular pathogenic bacteria in several studies, which revealed the presence of about 180 regulatory RNAs in the listerial genome. A regulatory role of non-coding RNAs in survival, virulence and adaptation mechanisms of L. monocytogenes was confirmed in subsequent experiments, thus providing insight into a multifaceted modulatory function of RNA/mRNA interference. In this review, we discuss the identification of regulatory RNAs by high-throughput techniques and their functional role in L. monocytogenes.
Sustaining the Civil Reserve Air Fleet (CRAF) Program
2003-05-01
means to build international networks and global marketing alliances. These arrangements have permitted alliances to exploit the regional competitive...next major phase occurred in the 1990s—the linking of networks in different countries through code sharing alliances and global marketing alliances
Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain
2012-01-01
Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826
Piga, M; Casula, L; Perra, D; Sanna, S; Floris, A; Antonelli, A; Cauli, A; Mathieu, A
2016-01-01
The objective of this paper is to evaluate hospital admissions in systemic lupus erythematosus (SLE) patients through a retrospective population-based study analyzing hospitalization data during 2001-2012 in Sardinia, an Italian region with universal health system coverage. Data on the hospital discharge records with the ICD-9-CM code for SLE (710.0) were obtained from the Department of Health and Hygiene and analyzed, mostly focusing on primary and non-primary diagnosis and Diagnosis-Related Group (DRG) code. In order to establish the significance of the annual trend for number and type of primary and non-primary discharge diagnosis, the two-tailed Cochran-Armitage test for trend was applied. In order to estimate SLE prevalence, data from administrative database and medical records were assembled. This study included 6222 hospitalizations in 1675 patients (87% women). Hospitalizations with SLE as primary diagnosis were 3782 (60.8%) and significantly decreased during the study period. The annual number of renal, hematologic and neuropsychiatric disorders as non-primary diagnosis associated with SLE remained constant; however, their percentage increased (p < 0.0001) because of a declining number of admissions for SLE without associated diagnosis and without complications. Hospitalizations with SLE as non-primary diagnosis showed a significant upward trend in number and percentage of cerebrovascular accident (p = 0.0004), acute coronary syndrome (p = 0.0004) and chronic renal failure (p = 0.0003) as underlying primary diagnosis, while complications of pregnancy, labor and childbirth (p = 0.3375), malignancies (p = 0.6608) and adverse drug reactions (p = 0.2456) did not show statistically significant changes. Infections showed an increasing trend between 2001 and 2012 but did not reach statistical significance (p = 0.0304). 
After correction for hospitalization (93.8%) and survival (91.1%) rates calculated over the study period, the 2012 SLE prevalence in Sardinia was estimated to be 99.3 per 100,000 inhabitants. While overall hospitalizations for SLE patients declined, those for cerebrovascular accident, acute coronary syndrome and chronic renal failure as underlying primary diagnosis increased during the study period. © The Author(s) 2015.
The Identification of Software Failure Regions
1990-06-01
be used to detect non-obviously redundant test cases. A preliminary examination of the manual analysis method is performed with a set of programs ...failure regions are defined and a method of failure region analysis is described in detail. The thesis describes how this analysis may be used to detect...is the termination of the ability of a functional unit to perform its required function. (Glossary, 1983) The presence of faults in program code
Screening of reproduction-related single-nucleotide variations from MeDIP-seq data in sheep.
Cao, Jiaxue; Wei, Caihong; Zhang, Shuzhen; Capellini, Terence D; Zhang, Li; Zhao, Fuping; Li, Li; Zhong, Tao; Wang, Linjie; Du, Lixin; Zhang, Hongping
2016-11-01
Extensive variation in reproduction has arisen in Chinese Mongolian sheep during recent domestication. Hu and Small-tailed Han sheep, for example, have become non-seasonal breeders and exhibit higher fecundity than Tan and Ujumqin breeds. We therefore scanned reproduction-related single-nucleotide variations from methylated DNA-immunoprecipitation sequencing data generated from each of those four breeds to uncover potential mechanisms underlying this breed variation. We generated a high-quality map of single nucleotide variations (SNVs) in DNA methylation enriched regions, and found that the majority of variants are located within non-coding regions. We identified 359 SNVs within the Sheep Quantitative Trait Locus (QTL) database. Nineteen of these SNVs associated with the Aseasonal Reproduction QTL, and 10 out of the 19 reside close to genes with known reproduction functions. We also identified the well-known FecB mutation in high-fecundity sheep (Hu and Small-tailed Han sheep). When we applied this FecB finding to our breeding system, we improved lambing rate by 175%. In summary, this study provided strong candidate SNVs associated with sheep fecundity that can serve as targets for functional testing and to enhance selective breeding strategies. Mol. Reprod. Dev. 83: 958-967, 2016 © 2016 Wiley Periodicals, Inc.
2010-01-01
Background The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome wide mRNA profiling as well as identification of novel transcripts at a high-resolution. Results Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome. Conclusions In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence. PMID:20525227
Sugai, Akihiro; Sato, Hiroki; Yoneda, Misako; Kai, Chieko
2017-08-01
The regulation of transcription during Nipah virus (NiV) replication is poorly understood. Using a bicistronic minigenome system, we investigated the involvement of non-coding regions (NCRs) in the transcriptional re-initiation efficiency of NiV RNA polymerase. Reporter assays revealed that attenuation of NiV gene expression was not constant at each gene junction, and that the attenuating property was controlled by the 3' NCR. However, this regulation was independent of the gene-end, gene-start and intergenic regions. Northern blot analysis indicated that regulation of viral gene expression by the phosphoprotein (P) and large protein (L) 3' NCRs occurred at the transcription level. We identified uridine-rich tracts within the L 3' NCR that are similar to gene-end signals. These gene-end-like sequences were recognized as weak transcription termination signals by the viral RNA polymerase, thereby reducing downstream gene transcription. Thus, we suggest that NiV has a unique mechanism of transcriptional regulation. Copyright © 2017 Elsevier Inc. All rights reserved.
Srivastava, Rishi; Singh, Mohar; Bajaj, Deepak; Parida, Swarup K.
2016-01-01
Development and large-scale genotyping of user-friendly informative genome/gene-derived InDel markers in natural and mapping populations is vital for accelerating genomics-assisted breeding applications of chickpea with minimal resource expenses. The present investigation employed a high-throughput whole genome next-generation resequencing strategy in low and high pod number parental accessions and homozygous individuals constituting the bulks from each of two inter-specific mapping populations [(Pusa 1103 × ILWC 46) and (Pusa 256 × ILWC 46)] to develop non-erroneous InDel markers at a genome-wide scale. Comparing these high-quality genomic sequences, 82,360 InDel markers with reference to kabuli genome and 13,891 InDel markers exhibiting differentiation between low and high pod number parental accessions and bulks of aforementioned mapping populations were developed. These informative markers were structurally and functionally annotated in diverse coding and non-coding sequence components of genome/genes of kabuli chickpea. The functional significance of regulatory and coding (frameshift and large-effect mutations) InDel markers for establishing marker-trait linkages through association/genetic mapping was apparent. The markers detected a greater amplification (97%) and intra-specific polymorphic potential (58–87%) among a diverse panel of cultivated desi, kabuli, and wild accessions even by using a simpler cost-efficient agarose gel-based assay implicating their utility in large-scale genetic analysis especially in domesticated chickpea with narrow genetic base. Two high-density inter-specific genetic linkage maps generated using aforesaid mapping populations were integrated to construct a consensus 1479 InDel markers-anchored high-resolution (inter-marker distance: 0.66 cM) genetic map for efficient molecular mapping of major QTLs governing pod number and seed yield per plant in chickpea. 
Utilizing these high-density genetic maps as anchors, three major genomic regions harboring robust QTLs for each of pod number and seed yield (15–28% phenotypic variation explained) were identified on chromosomes 2, 4, and 6. The integration of genetic and physical maps at these QTLs scaled down the long major QTL intervals into short, high-resolution physical intervals (0.89–2.94 Mb) for the robust pod number and seed yield QTLs, which were validated in multiple genetic backgrounds of the two chickpea mapping populations. The genome-wide InDel markers, including natural allelic variants and genomic loci/genes delineated at the six major QTLs, especially at one novel colocalized robust QTL for pod number and seed yield mapped on the high-density consensus genetic map, were found most promising in chickpea. These functionally relevant molecular tags can drive marker-assisted genetic enhancement to develop high-yielding cultivars with increased seed/pod number and yield in chickpea. PMID:27695461
Rozhdestvensky, Timofey S; Robeck, Thomas; Galiveti, Chenna R; Raabe, Carsten A; Seeger, Birte; Wolters, Anna; Gubar, Leonid V; Brosius, Jürgen; Skryabin, Boris V
2016-02-05
Prader-Willi syndrome (PWS) is a neurogenetic disorder caused by loss of paternally expressed genes on chromosome 15q11-q13. The PWS-critical region (PWScr) contains an array of non-protein coding IPW-A exons hosting intronic SNORD116 snoRNA genes. Deletion of PWScr is associated with PWS in humans and growth retardation in mice exhibiting ~15% postnatal lethality in C57BL/6 background. Here we analysed a knock-in mouse containing a 5'HPRT-LoxP-Neo(R) cassette (5'LoxP) inserted upstream of the PWScr. When the insertion was inherited maternally in a paternal PWScr-deletion mouse model (PWScr(p-/m5'LoxP)), we observed compensation of growth retardation and postnatal lethality. Genomic methylation pattern and expression of protein-coding genes remained unaltered at the PWS-locus of PWScr(p-/m5'LoxP) mice. Interestingly, ubiquitous Snord116 and IPW-A exon transcription from the originally silent maternal chromosome was detected. In situ hybridization indicated that PWScr(p-/m5'LoxP) mice expressed Snord116 in brain areas similar to wild type animals. Our results suggest that the lack of PWScr RNA expression in certain brain areas could be a primary cause of the growth retardation phenotype in mice. We propose that activation of disease-associated genes on imprinted regions could lead to general therapeutic strategies in associated diseases.
A candidate gene for choanal atresia in alpaca.
Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G
2010-03-01
Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. Sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the poly-A repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
Hrdlickova, Barbara; Kumar, Vinod; Kanduri, Kartiek; Zhernakova, Daria V; Tripathi, Subhash; Karjalainen, Juha; Lund, Riikka J; Li, Yang; Ullah, Ubaid; Modderman, Rutger; Abdulahad, Wayel; Lähdesmäki, Harri; Franke, Lude; Lahesmaa, Riitta; Wijmenga, Cisca; Withoff, Sebo
2014-01-01
Although genome-wide association studies (GWAS) have identified hundreds of variants associated with a risk for autoimmune and immune-related disorders (AID), our understanding of the disease mechanisms is still limited. In particular, more than 90% of the risk variants lie in non-coding regions, and almost 10% of these map to long non-coding RNA transcripts (lncRNAs). lncRNAs are known to show more cell-type specificity than protein-coding genes. We aimed to characterize lncRNAs and protein-coding genes located in loci associated with nine AIDs which have been well-defined by Immunochip analysis and by transcriptome analysis across seven populations of peripheral blood leukocytes (granulocytes, monocytes, natural killer (NK) cells, B cells, memory T cells, naive CD4(+) and naive CD8(+) T cells) and four populations of cord blood-derived T-helper cells (precursor, primary, and polarized (Th1, Th2) T-helper cells). We show that lncRNAs mapping to loci shared between AID are significantly enriched in immune cell types compared to lncRNAs from the whole genome (α <0.005). We were not able to prioritize single cell types relevant for specific diseases, but we observed five different cell types enriched (α <0.005) in five AID (NK cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, and psoriasis; memory T and CD8(+) T cells in juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis; Th0 and Th2 cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis). Furthermore, we show that co-expression analyses of lncRNAs and protein-coding genes can predict the signaling pathways in which these AID-associated lncRNAs are involved. 
The observed enrichment of lncRNA transcripts in AID loci implies lncRNAs play an important role in AID etiology and suggests that lncRNA genes should be studied in more detail to interpret GWAS findings correctly. The co-expression results strongly support a model in which the lncRNA and protein-coding genes function together in the same pathways.
Structural architecture of the human long non-coding RNA, steroid receptor RNA activator
Novikova, Irina V.; Hennelly, Scott P.; Sanbonmatsu, Karissa Y.
2012-01-01
While functional roles of several long non-coding RNAs (lncRNAs) have been determined, the molecular mechanisms are not well understood. Here, we report the first experimentally derived secondary structure of a human lncRNA, the steroid receptor RNA activator (SRA), 0.87 kB in size. The SRA RNA is a non-coding RNA that coactivates several human sex hormone receptors and is strongly associated with breast cancer. Coding isoforms of SRA are also expressed to produce proteins, making the SRA gene a unique bifunctional system. Our experimental findings (SHAPE, in-line, DMS and RNase V1 probing) reveal that this lncRNA has a complex structural organization, consisting of four domains, with a variety of secondary structure elements. We examine the coevolution of the SRA gene at the RNA structure and protein structure levels using comparative sequence analysis across vertebrates. Rapid evolutionary stabilization of RNA structure, combined with frame-disrupting mutations in conserved regions, suggests that evolutionary pressure preserves the RNA structural core rather than its translational product. We perform similar experiments on alternatively spliced SRA isoforms to assess their structural features. PMID:22362738
1983-10-01
SYSTEMS OBJECTIVES. This study was conducted as part of a continuing effort to obtain actual (historical) life cycle costs of major Army systems from...Procurement, AMS Code for RDTE, etc.). System life cycle costs cut across appropriation lines. A common architecture should be prerequisite to... life cycle costs of major Army systems have not been successful, but attention recently has been directed toward the possibility that a significant
Germ-line and somatic EPHA2 coding variants in lens aging and cataract.
Bennett, Thomas M; M'Hamdi, Oussama; Hejtmancik, J Fielding; Shiels, Alan
2017-01-01
Rare germ-line mutations in the coding regions of the human EPHA2 gene (EPHA2) have been associated with inherited forms of pediatric cataract, whereas, frequent, non-coding, single nucleotide variants (SNVs) have been associated with age-related cataract. Here we sought to determine if germ-line EPHA2 coding SNVs were associated with age-related cataract in a case-control DNA panel (> 50 years) and if somatic EPHA2 coding SNVs were associated with lens aging and/or cataract in a post-mortem lens DNA panel (> 48 years). Micro-fluidic PCR amplification followed by targeted amplicon (exon) next-generation (deep) sequencing of EPHA2 (17-exons) afforded high read-depth coverage (1000x) for > 82% of reads in the cataract case-control panel (161 cases, 64 controls) and > 70% of reads in the post-mortem lens panel (35 clear lens pairs, 22 cataract lens pairs). Novel and reference (known) missense SNVs in EPHA2 that were predicted in silico to be functionally damaging were found in both cases and controls from the age-related cataract panel at variant allele frequencies (VAFs) consistent with germ-line transmission (VAF > 20%). Similarly, both novel and reference missense SNVs in EPHA2 were found in the post-mortem lens panel at VAFs consistent with a somatic origin (VAF > 3%). The majority of SNVs found in the cataract case-control panel and post-mortem lens panel were transitions and many occurred at di-pyrimidine sites that are susceptible to ultraviolet (UV) radiation induced mutation. These data suggest that novel germ-line (blood) and somatic (lens) coding SNVs in EPHA2 that are predicted to be functionally deleterious occur in adults over 50 years of age. However, both types of EPHA2 coding variants were present at comparable levels in individuals with or without age-related cataract making simple genotype-phenotype correlations inconclusive. PMID:29267365
Lozano, Gloria; Trenado, Helena P.; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W.; Navas-Castillo, Jesús
2016-01-01
Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem–loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem–loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037
Cross-separatrix Coupling in Nonlinear Global Electrostatic Turbulent Transport in C-2U
NASA Astrophysics Data System (ADS)
Lau, Calvin; Fulton, Daniel; Bao, Jian; Lin, Zhihong; Binderbauer, Michl; Tajima, Toshiki; Schmitz, Lothar; TAE Team
2017-10-01
In recent years, the progress of the C-2/C-2U advanced beam-driven field-reversed configuration (FRC) experiments at Tri Alpha Energy, Inc. has pushed FRCs to transport limited regimes. Understanding particle and energy transport is a vital step towards an FRC reactor, and two particle-in-cell microturbulence codes, the Gyrokinetic Toroidal Code (GTC) and A New Code (ANC), are being developed and applied toward this goal. Previous local electrostatic GTC simulations find the core to be robustly stable with drift-wave instability only in the scrape-off layer (SOL) region. However, experimental measurements showed fluctuations in both regions; one possibility is that fluctuations in the core originate from the SOL, suggesting the need for non-local simulations with cross-separatrix coupling. Current global ANC simulations with gyrokinetic ions and adiabatic electrons find that non-local effects (1) modify linear growth-rates and frequencies of instabilities and (2) allow instability to move from the unstable SOL to the linearly stable core. Nonlinear spreading is also seen prior to mode saturation. We also report on the progress of the first turbulence simulations in the SOL. This work is supported by the Norman Rostoker Fellowship.
Park, Seong C; Finnell, John T
2012-01-01
In 2009, Indianapolis launched an electronic medical record system within its ambulances and started to exchange patient data with the Indiana Network for Patient Care (INPC). This unique system allows EMS personnel to get important information prior to the patient's arrival at the hospital. In this descriptive study, we found EMS personnel requested patient data on 14% of all transports, with a "success" match rate of 46% and a match "failure" rate of 17%. The three major factors causing match "failure" were ZIP code (55%), patient name (22%), and birth date (12%). We conclude that the ZIP code matching process needs to be improved by limiting the ZIP code to 5 digits instead of using the ZIP+4 code. Non-ZIP code identifiers may be a better choice due to inaccuracies and changes of the ZIP code in a patient's record.
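The 5-digit limitation recommended above can be sketched as follows. This is a minimal illustration only, not the matching logic used by the INPC or the ambulance system; the function names `normalize_zip` and `zip_matches` are hypothetical, and the sketch assumes ZIP codes arrive as free-text strings.

```python
import re


def normalize_zip(raw):
    """Return the first 5 digits of a ZIP or ZIP+4 string, or None if unusable.

    Hypothetical helper: strips every non-digit character, then keeps
    only the 5-digit prefix so that '46202-3082' and '46202' compare equal.
    """
    digits = re.sub(r"\D", "", raw or "")
    return digits[:5] if len(digits) >= 5 else None


def zip_matches(zip_a, zip_b):
    """Compare two ZIP strings on their 5-digit prefix only."""
    a, b = normalize_zip(zip_a), normalize_zip(zip_b)
    return a is not None and a == b


if __name__ == "__main__":
    print(zip_matches("46202-3082", "46202"))
    print(zip_matches("46202", "46203"))
```

Comparing only the 5-digit prefix trades some specificity for tolerance of missing or outdated +4 suffixes, which is why it would normally be combined with other identifiers such as name and birth date.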
NASA Astrophysics Data System (ADS)
Wang, Ke-Yan; Li, Yun-Song; Liu, Kai; Wu, Cheng-Ke
2008-08-01
A novel compression algorithm for interferential multispectral images based on adaptive classification and curve-fitting is proposed. The image is first partitioned adaptively into major-interference region and minor-interference region. Different approximating functions are then constructed for two kinds of regions respectively. For the major interference region, some typical interferential curves are selected to predict other curves. These typical curves are then processed by curve-fitting method. For the minor interference region, the data of each interferential curve are independently approximated. Finally the approximating errors of two regions are entropy coded. The experimental results show that, compared with JPEG2000, the proposed algorithm not only decreases the average output bit-rate by about 0.2 bit/pixel for lossless compression, but also improves the reconstructed images and reduces the spectral distortion greatly, especially at high bit-rate for lossy compression.
Gan, Han Ming; Tan, Mun Hua; Lee, Yin Peng; Austin, Christopher M
2016-05-01
The mitogenome of the Australian freshwater blackfish, Gadopsis marmoratus, was recovered by genome skimming using the MiSeq sequencer (GenBank Accession Number: NC_024436). The blackfish mitogenome has 16,407 base pairs made up of 13 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and an 819 bp non-coding AT-rich region. This is the 5th mitogenome sequence to be reported for the family Percichthyidae.
Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type of C-terminus.
Hansen, T S; Andreasen, P H; Dreisig, H; Højrup, P; Nielsen, H; Engberg, J; Kristiansen, K
1991-09-15
We have cloned and characterized a Tetrahymena thermophila macronuclear gene (L37) encoding the acidic ribosomal protein (A-protein) L37. The gene contains a single intron located in the 3'-part of the coding region. Two major and three minor transcription start points (tsp) were mapped 39 to 63 nucleotides upstream from the translational start codon. The uppermost tsp mapped to the first T in a putative T. thermophila RNA polymerase II initiator element, TATAA. The coding region of L37 predicts a protein of 109 amino acid (aa) residues. A substantial part of the deduced aa sequence was verified by protein sequencing. The T. thermophila L37 clearly belongs to the P1-type family of eukaryotic A-proteins, but the C-terminal region has the hallmarks of archaebacterial A-proteins.
Whole-genome landscapes of major melanoma subtypes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hayward, Nicholas K.; Wilmott, James S.; Waddell, Nicola
2017-05-03
Melanoma of the skin is a common cancer only in Europeans, whereas it arises in internal body surfaces (mucosal sites) and on the hands and feet (acral sites) in people throughout the world. We report analysis of whole-genome sequences from cutaneous, acral and mucosal subtypes of melanoma. The heavily mutated landscape of coding and non-coding mutations in cutaneous melanoma resolved novel signatures of mutagenesis attributable to ultraviolet radiation. But, acral and mucosal melanomas were dominated by structural changes and mutation signatures of unknown aetiology, not previously identified in melanoma. The number of genes affected by recurrent mutations disrupting non-coding sequences was similar to that affected by recurrent mutations to coding sequences. Significantly mutated genes included BRAF, CDKN2A, NRAS and TP53 in cutaneous melanoma, BRAF, NRAS and NF1 in acral melanoma and SF3B1 in mucosal melanoma. Mutations affecting the TERT promoter were the most frequent of all; however, neither they nor ATRX mutations, which correlate with alternative telomere lengthening, were associated with greater telomere length. In most cases, melanomas had potentially actionable mutations, most in components of the mitogen-activated protein kinase and phosphoinositol kinase pathways. The whole-genome mutation landscape of melanoma reveals diverse carcinogenic processes across its subtypes, some unrelated to sun exposure, and extends potential involvement of the non-coding genome in its pathogenesis.
The impact of rare variation on gene expression across tissues.
Li, Xin; Kim, Yungil; Tsang, Emily K; Davis, Joe R; Damani, Farhan N; Chiang, Colby; Hess, Gaelen T; Zappala, Zachary; Strober, Benjamin J; Scott, Alexandra J; Li, Amy; Ganna, Andrea; Bassik, Michael C; Merker, Jason D; Hall, Ira M; Battle, Alexis; Montgomery, Stephen B
2017-10-11
Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
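The notion of an expression outlier described above, an individual with an extreme expression level for a given gene, can be illustrated with a simple Z-score rule. This is a hedged sketch for intuition only: the abstract does not state the outlier-calling procedure, so the per-gene Z-score, the `threshold` of 2.0, and the function name `expression_outliers` are all assumptions and do not represent the GTEx or RIVER method.

```python
from statistics import mean, stdev


def expression_outliers(expr, threshold=2.0):
    """Flag individuals whose expression of one gene is extreme.

    expr: dict mapping individual ID -> expression value for a single gene.
    threshold: assumed |Z| cutoff (illustrative, not taken from the study).
    Returns a dict of individual ID -> 'over' or 'under'.
    """
    values = list(expr.values())
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return {}  # no variation, so nothing can be called an outlier
    flagged = {}
    for individual, value in expr.items():
        z = (value - mu) / sd
        if abs(z) >= threshold:
            flagged[individual] = "over" if z > 0 else "under"
    return flagged


if __name__ == "__main__":
    gene_expr = {f"ind{i}": v for i, v in
                 enumerate([10, 11, 9, 10, 11, 9, 10, 11, 9, 50], start=1)}
    print(expression_outliers(gene_expr))
```

In a multi-tissue setting one would presumably combine such per-tissue scores for each individual and gene before calling an outlier; that aggregation step is not shown here.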
Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J
2014-12-19
The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. 
Specifically, high levels of sequence divergence were found within one family of echolocating bats that utilise frequency-modulated echolocation calls varying widely in frequency and intensity. Levels of selective constraint acting on CNEs differed both across genomic locations and taxa, with observed variation in substitution rates of CNEs among bat species. More work is needed to determine whether this variation can be linked to echolocation, and wider taxonomic sampling is necessary to fully document levels of conservation in CNEs across diverse taxa.
Transterm—extended search facilities and improved integration with other databases
Jacobs, Grant H.; Stockwell, Peter A.; Tate, Warren P.; Brown, Chris M.
2006-01-01
Transterm has now been publicly available for >10 years. Major changes have been made since its last description in this database issue in 2002. The current database provides data for key regions of mRNA sequences, a curated database of mRNA motifs and tools to allow users to investigate their own motifs or mRNA sequences. The key mRNA regions database is derived computationally from Genbank. It contains 3′ and 5′ flanking regions, the initiation and termination signal context and coding sequence for annotated CDS features from Genbank and RefSeq. The database is non-redundant, enabling summary files and statistics to be prepared for each species. Advances include extended search facilities and the inclusion of RefSeq data: the database may now be searched by BLAST in addition to regular expressions (patterns), allowing users to search for motifs such as known miRNA sequences. The database contains >40 motifs or structural patterns important for translational control. In this release, patterns from UTRsite and Rfam are also incorporated with cross-referencing. Users may search their sequence data with Transterm or user-defined patterns. The system is accessible at . PMID:16381889
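As a rough illustration of the kind of regular-expression (pattern) search over mRNA sequence described above, the following Python sketch scans a sequence for a user-defined motif. It does not use Transterm or reproduce any of its curated patterns; the function name, the example sequence and the choice of motif (the canonical polyadenylation signal AAUAAA plus its common AUUAAA variant) are assumptions made only for this example.

```python
import re

# Illustrative motif only (not taken from Transterm): the canonical
# polyadenylation signal AAUAAA, with a character class so that the
# common AUUAAA variant also matches.
POLYA_SIGNAL = re.compile(r"A[AU]UAAA")


def find_motif(sequence, pattern=POLYA_SIGNAL):
    """Return (start, matched_text) for each non-overlapping match in an RNA sequence."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group()) for m in pattern.finditer(rna)]


# Made-up 3' flanking sequence, for illustration only.
example_utr = "GCCUAAUAAAGCUUGCAUUAAACC"
hits = find_motif(example_utr)
print(hits)
```

Positions are 0-based offsets into the input sequence. Any other user-defined pattern can be passed as a compiled regular expression through the `pattern` argument.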
Das, Shouvik; Singh, Mohar; Srivastava, Rishi; Bajaj, Deepak; Saxena, Maneesha S.; Rana, Jai C.; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.
2016-01-01
The present study used a whole-genome, NGS resequencing-based mQTL-seq (multiple QTL-seq) strategy in two inter-specific mapping populations (Pusa 1103 × ILWC 46 and Pusa 256 × ILWC 46) to scan the major genomic region(s) underlying QTL(s) governing pod number trait in chickpea. Essentially, the whole-genome resequencing of low and high pod number-containing parental accessions and homozygous individuals (constituting bulks) from each of these two mapping populations discovered >8 million high-quality homozygous SNPs with respect to the reference kabuli chickpea. The functional significance of the physically mapped SNPs was apparent from the identified 2,264 non-synonymous and 23,550 regulatory SNPs, with 8–10% of the genes carrying these SNPs corresponding to transcription factors and disease resistance-related proteins. The utilization of these mined SNPs in Δ (SNP index)-led QTL-seq analysis and their correlation between two mapping populations based on mQTL-seq narrowed down two major genomic regions (CaqaPN4.1: 867.8 kb and CaqaPN4.2: 1.8 Mb) harbouring robust pod number QTLs into high-resolution short QTL intervals (CaqbPN4.1: 637.5 kb and CaqbPN4.2: 1.28 Mb) on chickpea chromosome 4. The integration of one novel robust mQTL-seq-derived QTL with QTL region-specific association analysis delineated one pentatricopeptide repeat (PPR) gene containing regulatory (C/T) and coding (C/A) SNPs at a major QTL region regulating pod number in chickpea. This target gene exhibited anther, mature pollen and pod-specific expression, including pronounced up-regulation (∼3.5-fold) of transcript expression in high pod number-containing parental accessions and homozygous individuals of two mapping populations, especially during pollen and pod development. 
The proposed mQTL-seq-driven combinatorial strategy has profound efficacy in rapid genome-wide scanning of potential candidate gene(s) underlying trait-associated high-resolution robust QTL(s), thereby expediting genomics-assisted breeding and genetic enhancement of crop plants, including chickpea. PMID:26685680
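The abstract above does not spell out how the Δ (SNP index) is computed. The sketch below follows the definition commonly used in QTL-seq analyses, where the SNP-index of a bulk is the fraction of reads carrying the non-reference allele and Δ is the difference between the two bulks. The function names, the direction of the subtraction (high bulk minus low bulk) and the read counts are assumptions for illustration, not values or code from the study.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads in one bulk that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def delta_snp_index(high_bulk, low_bulk):
    """SNP-index of the high bulk minus SNP-index of the low bulk.

    Each argument is an (alt_reads, total_reads) pair for one SNP position.
    The sign convention is an assumption made for this sketch.
    """
    return snp_index(*high_bulk) - snp_index(*low_bulk)


# Made-up read counts for a single SNP, for illustration only.
delta = delta_snp_index(high_bulk=(45, 50), low_bulk=(5, 50))
print(round(delta, 3))
```

In practice such values are usually averaged over sliding windows along each chromosome before candidate intervals are called; that step is omitted here.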
SUMIYAMA, KENTA; MIYAKE, TSUTOMU; GRIMWOOD, JANE; STUART, ANDREW; DICKSON, MARK; SCHMUTZ, JEREMY; RUDDLE, FRANK H.; MYERS, RICHARD M.; AMEMIYA, CHRIS T.
2013-01-01
The mammalian Dlx3 and Dlx4 genes are configured as a bigene cluster, and their respective expression patterns are controlled temporally and spatially by cis-elements that largely reside within the intergenic region of the cluster. Previous work revealed that there are conspicuously conserved elements within the intergenic region of the Dlx3–4 bigene clusters of mouse and human. In this paper we have extended these analyses to include 12 additional mammalian taxa (including a marsupial and a monotreme) in order to better define the nature and molecular evolutionary trends of the coding and non-coding functional elements among morphologically divergent mammals. Dlx3–4 regions were fully sequenced from 12 divergent taxa of interest. We identified three theria-specific amino acid replacements in the homeodomain of the Dlx4 gene, which functions in the placenta. Sequence analyses of constrained nucleotide sites in the intergenic non-coding region showed that many of the intergenic conserved elements are highly conserved and have evolved slowly within the mammals. In contrast, a branchial arch/craniofacial enhancer, I37-2, exhibited accelerated evolution at the branch between the monotreme and therian common ancestor despite being highly conserved among therian species. Functional analysis of I37-2 in transgenic mice has shown that the equivalent region of the platypus fails to drive transcriptional activity in branchial arches. These observations, taken together with our molecular evolutionary data, suggest that theria-specific episodic changes in the I37-2 element may have contributed to craniofacial innovation at the base of the mammalian lineage. PMID:22951979
Knight, Jo; Spain, Sarah L; Capon, Francesca; Hayday, Adrian; Nestle, Frank O; Clop, Alex; Barker, Jonathan N; Weale, Michael E; Trembath, Richard C
2012-12-01
Psoriasis is a common, chronic, inflammatory skin disorder. A number of genetic loci have been shown to confer risk for psoriasis. Collectively, these offer an integrated model for the inherited basis for susceptibility to psoriasis that combines altered skin barrier function together with the dysregulation of innate immune pathogen sensing and adaptive immunity. The major histocompatibility complex (MHC) harbours the psoriasis susceptibility region which exhibits the largest effect size, driven in part by variation contained on the HLA-Cw*0602 allele. However, the resolution of the number and genomic location of potential independent risk loci is hampered by extensive linkage disequilibrium across the region. We leveraged the power of large psoriasis case and control data sets and the statistical approach of conditional analysis to identify potential further association signals distributed across the MHC. In addition to the major locus at HLA-C (P = 2.20 × 10^-236), we observed and replicated four additional independent signals for disease association, three of which are novel. We detected evidence for association at SNPs rs2507971 (P = 6.73 × 10^-14), rs9260313 (P = 7.93 × 10^-09), rs66609536 (P = 3.54 × 10^-07) and rs380924 (P = 6.24 × 10^-06), located within the class I region of the MHC, with each observation replicated in an independent sample (P ≤ 0.01). The previously identified locus is close to MICA; the other three lie near MICB, HLA-A and HCG9 (a non-coding RNA gene). The identification of disease associations with both MICA and MICB is particularly intriguing, since each encodes an MHC class I-related protein with potent immunological function.
Bowden, Deborah L; Vargas-Caro, Carolina; Ovenden, Jennifer R; Bennett, Michael B; Bustamante, Carlos
2016-11-01
The complete mitochondrial genome of the grey nurse shark Carcharias taurus is described from 25 963 828 sequences obtained using Illumina NGS technology. Total length of the mitogenome is 16 715 bp, consisting of 2 rRNAs, 13 protein-coding regions, 22 tRNAs and 2 non-coding regions, thus updating the previously published mitogenome for this species. The phylogenomic reconstruction inferred from the mitogenome of 15 species of Lamniform and Carcharhiniform sharks supports the inclusion of C. taurus in a clade with the Lamnidae and Cetorhinidae. This complete mitogenome contributes to ongoing investigation into the monophyly of the Family Odontaspididae.
Whole mitochondrial genome sequence for an osteoarthritis model of Guinea pig (Caviidae; Cavia).
Cui, Xin-Gang; Liu, Cheng-Yao; Wei, Bo; Zhao, Wen-Jian; Zhang, Wen-Feng
2016-11-01
Animal models play an important role in osteoarthritis studies. Here, the complete mitochondrial genome sequence of the Guinea pig was reported for the first time. The total length of the mitogenome was 16,797 bp. It contained the typical structure, including two ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and one non-coding control region (D-loop region). The overall composition of the mitogenome was estimated to be 34.9% for A, 26.1% for T, 26.0% for C and 13.0% for G, showing an A-T (61.0%)-rich feature. This mitochondrial genome sequence will provide a new genetic resource for osteoarthritis research.
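The base-composition figures in the abstract above can be checked with simple arithmetic. The sketch below shows one way to compute per-base percentages and A+T content from a DNA string, then applies the A+T sum to the four reported percentages. The function names are illustrative assumptions, and no actual mitogenome sequence is included.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of each of A, T, C and G in a DNA sequence."""
    seq = sequence.upper()
    counts = Counter(base for base in seq if base in "ATCG")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, T, C or G")
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


def at_content(composition):
    """Sum of the A and T percentages."""
    return composition["A"] + composition["T"]


# Percentages as reported in the abstract.
reported = {"A": 34.9, "T": 26.1, "C": 26.0, "G": 13.0}
reported_at = round(at_content(reported), 1)
print(reported_at)
```

The reported A and T values sum to the stated 61.0%, and the four percentages together account for 100% of the sequence.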
Musculoskeletal disorder costs and medical claim filing in the US retail trade sector.
Bhattacharya, Anasua; Leigh, J Paul
2011-01-01
The average costs of Musculoskeletal Disorder (MSD) and odds ratios for filing medical claims related to MSD were examined. The medical claims were identified by ICD 9 codes for four US Census regions within retail trade. Large private firms' medical claims data from Thomson Reuters Inc. MarketScan databases for the years 2003 through 2006 were used. Average costs were highest for claims related to the lumbar region (ICD 9 Code: 724.02) and the number of claims was largest for low back syndrome (ICD 9 Code: 724.2). Whereas the odds of filing an MSD claim did not vary greatly over time, average costs declined over time. The odds of filing claims rose with age and were higher for females and southerners than for men and non-southerners. Total estimated national medical costs for MSDs within retail trade were $389 million (2007 USD).
Chen, Caihui; Zheng, Yongjie; Liu, Sian; Zhong, Yongda; Wu, Yanfang; Li, Jiang; Xu, Li-An; Xu, Meng
2017-01-01
Cinnamomum camphora, a member of the Lauraceae family, is a valuable aromatic and timber tree that is indigenous to the south of China and Japan. All parts of Cinnamomum camphora have secretory cells containing different volatile chemical compounds that are utilized as herbal medicines and essential oils. Here, we report the complete sequencing of the chloroplast genome of Cinnamomum camphora using Illumina technology. The chloroplast genome of Cinnamomum camphora is 152,570 bp in length and characterized by a relatively conserved quadripartite structure containing a large single copy region of 93,705 bp, a small single copy region of 19,093 bp and two inverted repeat (IR) regions of 19,886 bp each. Overall, the genome contained 123 coding regions, of which 15 were repeated in the IR regions. An analysis of chloroplast sequence divergence revealed that the small single copy region was highly variable among the different genera in the Lauraceae family. A total of 40 repeat structures and 83 simple sequence repeats were detected in both the coding and non-coding regions. A phylogenetic analysis indicated that Calycanthus is most closely related to Lauraceae, both being members of Laurales, which forms a sister group to Magnoliids. The complete sequence of the chloroplast of Cinnamomum camphora will aid in in-depth taxonomical studies of the Lauraceae family in the future. The genetic sequence information will also have valuable applications for chloroplast genetic engineering.
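The region sizes quoted in the abstract above can be checked against the stated genome length. A quadripartite chloroplast genome consists of one large single copy region, one small single copy region and two inverted repeat copies, so the total is their sum. The sketch assumes that the 19,886 bp figure refers to each IR copy; the variable and function names are illustrative.

```python
# Region sizes in bp, as reported in the abstract.
LSC = 93_705   # large single copy region
SSC = 19_093   # small single copy region
IR = 19_886    # one inverted repeat copy (assumed to be the per-copy length)


def quadripartite_length(lsc, ssc, ir):
    """Total genome length for one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total)
```

Under that assumption the total agrees with the reported genome length of 152,570 bp.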
Wheeler, Bayly S
2013-12-01
Transposons are mobile genetic elements that are a major constituent of most genomes. Organisms regulate transposable element expression, transposition, and insertion site preference, mitigating the genome instability caused by uncontrolled transposition. A recent burst of research has demonstrated the critical role of small non-coding RNAs in regulating transposition in fungi, plants, and animals. While mechanistically distinct, these pathways work through a conserved paradigm. The presence of a transposon is communicated by the presence of its RNA or by its integration into specific genomic loci. These signals are then translated into small non-coding RNAs that guide epigenetic modifications and gene silencing back to the transposon. In addition to being regulated by the host, transposable elements are themselves capable of influencing host gene expression. Transposon expression is responsive to environmental signals, and many transposons are activated by various cellular stresses. TEs can confer local gene regulation by acting as enhancers and can also confer global gene regulation through their non-coding RNAs. Thus, transposable elements can act as stress-responsive regulators that control host gene expression in cis and trans.
Seifert, Steven A; Oakes, Jennifer A; Boyer, Leslie V
2007-01-01
Non-native (exotic) snake exposures in the United States have not been systematically characterized. The Toxic Exposure Surveillance System (TESS) database of the American Association of Poison Control Centers was analyzed to quantify the number and types, demographic associations, clinical presentations, managements and outcomes, and the health resource utilization of non-native snake exposures. From 1995 through 2004, there were 399 non-native exposures in the TESS database. Of these, 350 snakes (87%) were identified by genus and species, comprising at least 77 different varieties. Roughly equal percentages of snakes originated in Asia, Africa and Latin America, with a smaller number from the Middle-East, Australia, and Europe. Nearly half were viperids and a little more than a third were elapids. The vast majority of exposed individuals were adults. However, almost 15% were aged 17 years or less, and almost 7% were children aged 5 years or younger. Eighty-four percent were males. The vast majority of exposures occurred at the victim's own residence. Over 50% were evaluated at a healthcare facility, with 28.7% admitted to an ICU. Overall, 26% of patients were coded as receiving antivenom treatment. Coded outcomes were similar between viperid and elapid envenomations. There were three deaths, two involving viperid snakes and one elapid. Enhancements to the TESS database are required for better precision in and more complete characterization of non-native snake envenomations.
Bakera, Beata; Makowska, Bogna; Groszyk, Jolanta; Niziołek, Michał; Orczyk, Wacław; Bolibok-Brągoszewska, Hanna; Hromada-Judycka, Aneta; Rakoczy-Trojanowska, Monika
2015-08-01
Benzoxazinoids (BX) are major secondary metabolites of gramineous plants that play an important role in disease resistance and allelopathy. They also have many other unique properties including anti-bacterial and anti-fungal activity, and the ability to reduce alpha-amylase activity. The biosynthesis and modification of BX are controlled by the genes Bx1–Bx10, GT and glu, and the majority of these Bx genes have been mapped in maize, wheat and rye. However, the genetic basis of BX biosynthesis remains largely uncharacterized apart from some data from maize and wheat. The aim of this study was to isolate, sequence and characterize five genes (ScBx1, ScBx2, ScBx3, ScBx4 and ScBx5) encoding enzymes involved in the synthesis of DIBOA, an important defense compound of rye. Using a modified 3D procedure of BAC library screening, seven BAC clones containing all of the ScBx genes were isolated and sequenced. Bioinformatic analyses of the resulting contigs were used to examine the structure and other features of these genes, including their promoters, introns and 3'UTRs. Comparative analysis showed that the ScBx genes are similar to those of other Poaceae species, especially to the TaBx genes. The polymorphisms present both in the coding sequences and non-coding regions of ScBx in relation to other Bx genes are predicted to have an impact on the expression, structure and properties of the encoded proteins.
Wang, Pei; Song, Fan; Cai, Wanzhi
2014-01-01
Insect mitochondrial genomes are important for understanding molecular evolution and for phylogenetic and phylogeographic studies of insects. The Miridae are the largest family of Heteroptera, encompassing more than 11,000 described species, and are of great economic importance. To better understand the diversity and the evolution of plant bugs, we sequenced five new mitochondrial genomes and present the first comparative analysis of the nine mitochondrial genomes of mirids available to date. Our results showed that gene content, gene arrangement, base composition and sequences of mitochondrial transcription termination factor were conserved in plant bugs. Intra-genus species shared more conserved genomic characteristics, such as nucleotide and amino acid composition of protein-coding genes, secondary structure and anticodon mutations of tRNAs, and non-coding sequences. The control region possessed several distinct characteristics, including variable size, abundant tandem repetitions, and intra-genus conservation, and was useful in evolutionary and population genetic studies. The AGG codon reassignments between serine and lysine were investigated in the genus Adelphocoris and other cimicomorphans. Our analysis revealed correlated evolution between reassignments of the AGG codon and specific point mutations at the anticodons of tRNALys and tRNASer(AGN). Phylogenetic analysis indicated that mitochondrial genome sequences were useful in resolving family-level relationships of Cimicomorpha. Comparative evolutionary analysis of plant bug mitochondrial genomes allowed the identification of previously neglected coding genes or non-coding regions as potential molecular markers. The finding of the AGG codon reassignments between serine and lysine indicated the parallel evolution of the genetic code in Hemiptera mitochondrial genomes. PMID:24988409
Herdewyn, Sarah; Zhao, Hui; Moisse, Matthieu; Race, Valérie; Matthijs, Gert; Reumers, Joke; Kusters, Benno; Schelhaas, Helenius J; van den Berg, Leonard H; Goris, An; Robberecht, Wim; Lambrechts, Diether; Van Damme, Philip
2012-06-01
Motor neuron degeneration in amyotrophic lateral sclerosis (ALS) has a familial cause in 10% of patients. Despite significant advances in the genetics of the disease, many families remain unexplained. We performed whole-genome sequencing in five family members from a pedigree with autosomal-dominant classical ALS. A family-based elimination approach was used to identify novel coding variants segregating with the disease. This list of variants was effectively shortened by genotyping these variants in 2 additional unaffected family members and 1500 unrelated population-specific controls. A novel rare coding variant in SPAG8 on chromosome 9p13.3 segregated with the disease and was not observed in controls. Mutations in SPAG8 were not encountered in 34 other unexplained ALS pedigrees, including 1 with linkage to chromosome 9p13.2-23.3. The shared haplotype containing the SPAG8 variant in this small pedigree was 22.7 Mb and overlapped with the core 9p21 linkage locus for ALS and frontotemporal dementia. Based on differences in coverage depth of known variable tandem repeat regions between affected and non-affected family members, the shared haplotype was found to contain an expanded hexanucleotide (GGGGCC)(n) repeat in C9orf72 in the affected members. Our results demonstrate that rare coding variants identified by whole-genome sequencing can tag a shared haplotype containing a non-coding pathogenic mutation and that changes in coverage depth can be used to reveal tandem repeat expansions. It also confirms (GGGGCC)n repeat expansions in C9orf72 as a cause of familial ALS.
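The idea of comparing coverage depth between affected and non-affected family members at known tandem repeat regions can be sketched as below. This is not the authors' pipeline: the region names, depth values and the fold-change threshold are all made up for illustration, and a real analysis would first normalise depth per sample and use proper statistics.

```python
from statistics import mean


def depth_ratio(affected, unaffected):
    """Ratio of mean coverage depth in affected vs non-affected samples."""
    return mean(affected) / mean(unaffected)


def flag_regions(depths, min_fold=1.5):
    """Return regions whose depth differs between groups by at least min_fold.

    depths maps a region name to {"affected": [...], "unaffected": [...]}.
    Both increases and decreases are flagged, since the sketch makes no
    assumption about the direction of the change.
    """
    flagged = []
    for region, groups in depths.items():
        ratio = depth_ratio(groups["affected"], groups["unaffected"])
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            flagged.append(region)
    return flagged


# Hypothetical per-sample mean depths at two repeat regions.
toy_depths = {
    "repeat_region_a": {"affected": [30, 32, 31], "unaffected": [30, 31, 29]},
    "repeat_region_b": {"affected": [12, 10, 11], "unaffected": [30, 33, 30]},
}
candidates = flag_regions(toy_depths)
print(candidates)
```

Flagged regions would only be candidates for follow-up, for example by a targeted assay of the repeat itself.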
Reddy, Sushma; Kimball, Rebecca T; Pandey, Akanksha; Hosner, Peter A; Braun, Michael J; Hackett, Shannon J; Han, Kin-Lan; Harshman, John; Huddleston, Christopher J; Kingston, Sarah; Marks, Ben D; Miglia, Kathleen J; Moore, William S; Sheldon, Frederick H; Witt, Christopher C; Yuri, Tamaki; Braun, Edward L
2017-09-01
Phylogenomics, the use of large-scale data matrices in phylogenetic analyses, has been viewed as the ultimate solution to the problem of resolving difficult nodes in the tree of life. However, it has become clear that analyses of these large genomic data sets can also result in conflicting estimates of phylogeny. Here, we use the early divergences in Neoaves, the largest clade of extant birds, as a "model system" to understand the basis for incongruence among phylogenomic trees. We were motivated by the observation that trees from two recent avian phylogenomic studies exhibit conflicts. Those studies used different strategies: 1) collecting many characters [~42 megabase pairs (Mbp) of sequence data] from 48 birds, sometimes including only one taxon for each major clade; and 2) collecting fewer characters (~0.4 Mbp) from 198 birds, selected to subdivide long branches. However, the studies also used different data types: the taxon-poor data matrix comprised 68% non-coding sequences whereas coding exons dominated the taxon-rich data matrix. This difference raises the question of whether the primary reason for incongruence is the number of sites, the number of taxa, or the data type. To test among these alternative hypotheses, we assembled a novel, large-scale data matrix comprising 90% non-coding sequences from 235 bird species. Although increased taxon sampling appeared to have a positive impact on phylogenetic analyses, the most important variable was data type. Indeed, by analyzing different subsets of the taxa in our data matrix we found that increased taxon sampling actually resulted in increased congruence with the tree from the previous taxon-poor study (which had a majority of non-coding data) instead of the taxon-rich study (which largely used coding data). 
We suggest that the observed differences in the estimates of topology for these studies reflect data-type effects due to violations of the models used in phylogenetic analyses, some of which may be difficult to detect. If incongruence among trees estimated using phylogenomic methods largely reflects problems with model fit, developing more "biologically-realistic" models is likely to be critical for efforts to reconstruct the tree of life. [Birds; coding exons; GTR model; model fit; Neoaves; non-coding DNA; phylogenomics; taxon sampling.]
NASA Technical Reports Server (NTRS)
Baumeister, Joseph F.
1994-01-01
A non-flowing, electrically heated test rig was developed to verify computer codes that calculate radiant energy propagation from nozzle geometries that represent aircraft propulsion nozzle systems. Since there are a variety of analysis tools used to evaluate thermal radiation propagation from partially enclosed nozzle surfaces, an experimental benchmark test case was developed for code comparison. This paper briefly describes the nozzle test rig and the developed analytical nozzle geometry used to compare the experimental and predicted thermal radiation results. A major objective of this effort was to make available the experimental results and the analytical model in a format to facilitate conversion to existing computer code formats. For code validation purposes this nozzle geometry represents one validation case for one set of analysis conditions. Since each computer code has advantages and disadvantages based on scope, requirements, and desired accuracy, the usefulness of this single nozzle baseline validation case can be limited for some code comparisons.
Practice parameters and financial factors impacting developmental-behavioral pediatrics.
Adair, Robin; Perrin, Ellen; Hubbard, Carol; Savageau, Judith A
2010-01-01
Little has been published about the professional activities of developmental-behavioral (DB) pediatricians. To better understand the settings in which DB pediatricians work, allocation of their professional time, and how financial considerations impact their practice, the Society for Developmental and Behavioral Pediatrics surveyed its membership. An extensive on-line three-part survey was conducted in 2006-2007 assessing sociodemographic characteristics, practice descriptors, coding and billing practices, productivity goals and perceived pressures among the Society for Developmental and Behavioral Pediatrics' 438 physician members. Of the pediatricians responding, representing all regions of the United States, 93% were DB pediatrics subspecialty board certified or eligible. The majority were practicing DB pediatrics full-time (73%), and 67% were exclusively in academic settings. All reported seeing patients, 84% reported teaching, 76% reported having administrative responsibilities, and 46% reported conducting research. Despite having non-clinical responsibilities, full-time equivalent positions included an average of 25 hours per week in direct patient care and 14.5 hours per week (37% of clinical time) in indirect patient care. Only 42% reported working with multidisciplinary teams. Salaries varied widely within and across regions. Deficits in billing/coding practices, awareness of personal clinical productivity, and familiarity with national productivity benchmarks were identified. DB pediatricians work in diverse settings nationwide. They devote considerable time to indirect patient care, which is poorly reimbursed in general and relative to direct patient care. The results of this survey offer opportunities for provider, institutional and payer education.
LOOPREF: A Fluid Code for the Simulation of Coronal Loops
NASA Technical Reports Server (NTRS)
deFainchtein, Rosalinda; Antiochos, Spiro; Spicer, Daniel
1998-01-01
This report documents the code LOOPREF. LOOPREF is a semi-one-dimensional finite element code that is especially well suited to simulate coronal-loop phenomena. It has a full implementation of adaptive mesh refinement (AMR), which is crucial for this type of simulation. The AMR routines are an improved version of AMR1D. LOOPREF's versatility makes it suitable to simulate a wide variety of problems. In addition to efficiently providing very high resolution in rapidly changing regions of the domain, it is equipped to treat loops of variable cross section, any non-linear form of heat conduction, shocks, gravitational effects, and radiative loss.
Yang, Zichang; Shi, Xiaonan; Li, Ce; Wang, Xiaoxun; Hou, Kezuo; Li, Zhi; Zhang, Xiaojie; Fan, Yibo; Qu, Xiujuan; Che, Xiaofang; Liu, Yunpeng
2018-05-01
A variety of solid tumors are surrounded by a hypoxic microenvironment, which is known to be associated with high metastatic capability and resistance to various clinical therapies, contributing to a poor survival rate for cancer patients. Although the majority of previous studies on tumor-associated hypoxia have focused on acute hypoxia, chronic hypoxia more closely mimics the actual hypoxic microenvironment of a tumor. In this study, two novel hypoxia-resistant gastric cancer (HRGC) cell lines which could grow normally in 2% oxygen were established. The long non-coding RNA UCA1 was upregulated in HRGC cells, which promoted their migration. Bioinformatics analysis and a luciferase reporter assay showed that miR-7-5p could bind to specific sites of UCA1 to regulate the target EGFR through competitive endogenous RNA function. UCA1 directly interacted with miR-7-5p and decreased the binding of miR-7-5p to the EGFR 3'-untranslated region, which suppressed the degradation of EGFR mRNA by miR-7-5p. Therefore, long-term hypoxia induced UCA1 to promote cell migration by enhancing the expression of EGFR. This study thus reveals a new mechanism by which a hypoxic microenvironment promotes tumor metastasis, and highlights UCA1 as a potential biomarker for predicting the metastasis of gastric cancer to guide clinical treatment.
Poverty, wealth, and health care utilization: a geographic assessment.
Cooper, Richard A; Cooper, Matthew A; McGinley, Emily L; Fan, Xiaolin; Rosenthal, J Thomas
2012-10-01
Geographic variation has been of interest to both health planners and social epidemiologists. However, while the major focus of interest of planners has been on variation in health care spending, social epidemiologists have focused on health; and while social epidemiologists have observed strong associations between poor health and poverty, planners have concluded that income is not an important determinant of variation in spending. These different conclusions stem, at least in part, from differences in approach. Health planners have generally studied variation among large regions, such as states, counties, or hospital referral regions (HRRs), while epidemiologists have tended to study local areas, such as ZIP codes and census tracts. To better understand the basis for geographic variation in hospital utilization, we drew upon both approaches. We disaggregated counties and HRRs into their constituent ZIP codes and census tracts and examined the interrelationships between income, disability, and hospital utilization at both the regional and local levels, using statistical and geomapping tools. Our studies centered on the Milwaukee and Los Angeles HRRs, where per capita health care utilization has been greater than elsewhere in their states. We compared Milwaukee to other HRRs in Wisconsin and Los Angeles to the other populous counties of California and to a region in California of comparable size and diversity, stretching from San Francisco to Sacramento (termed "San-Framento"). When studied at the ZIP code level, we found steep, curvilinear relationships between lower income and both increased hospital utilization and increasing percentages of individuals reporting disabilities. These associations were also evident on geomaps. 
They were strongest among populations of working-age adults but weaker among seniors, for whom income proved to be a poor proxy for poverty and whose residential locations deviated from the major underlying income patterns. Among working-age adults, virtually all of the excess utilization in Milwaukee was attributable to very high utilization in Milwaukee's segregated "poverty corridor." Similarly, the greater rate of hospital use in Los Angeles than in San-Framento could be explained by proportionately more low-income ZIP codes in Los Angeles and fewer in San-Framento. Indeed, when only high-income ZIP codes were assessed, there was little variation in hospital utilization among California's 18 most populous counties. We estimated that had utilization within each region been at the rate of its high-income ZIP codes, overall utilization would have been 35 % less among working-age adults and 20 % less among seniors. These studies reveal the importance of disaggregating large geographic units into their constituent ZIP codes in order to understand variation in health care utilization among them. They demonstrate the strong association between low ZIP code income and both higher percentages of disability and greater hospital utilization. And they suggest that, given the large contribution of the poorest neighborhoods to aggregate utilization, it will be difficult to curb the growth of health care spending without addressing the underlying social determinants of health.
Abdollahi-Arpanahi, Rostam; Morota, Gota; Valente, Bruno D; Kranis, Andreas; Rosa, Guilherme J M; Gianola, Daniel
2016-02-03
Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80 % of the human genome has a biochemical function. Similar studies on the chicken genome are lacking; thus, assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5' and 3' untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. Contribution of each class of genomic regions to dominance variance was also considered. Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The marked dominance genetic variation explained by each class of genomic regions was similar and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. 
All genic and non-genic regions contributed to phenotypic variation for the three traits studied. Overall, the contribution of additive genetic variance to the total genetic variance was much greater than that of dominance variance. Our results show that all genomic regions are important for the prediction of the targeted traits, and the whole-genome approach was reaffirmed as the best tool for genome-enabled prediction of quantitative traits.
Parrish, C R; Coia, G; Hill, A; Müllbacher, A; Westaway, E G; Blanden, R V
1991-07-01
A series of recombinant vaccinia viruses expressing various parts of the entire Kunjin virus (KUN) coding region was used to analyse the cytotoxic T (Tc) cell responses to KUN. CBA/H mice inoculated with KUN or West Nile virus were shown to develop responses to KUN or various vaccinia virus expression constructs in either primary cytotoxic assays, or after secondary stimulation of the Tc cells in vitro with KUN antigens. Tc cells from CBA mice showed the strongest response to target cells infected with recombinant vaccinia viruses expressing parts of the KUN NS3 and NS4A proteins, and only a weak response to the other structural or non-structural proteins. Further analysis of deleted versions of the NS3-NS4A region showed that the main epitope recognized was derived from a sequence of 99 amino acids spanning parts of NS3 and NS4A. No other major epitopes were detected by Tc cells from CBA mice in the remaining 3333 amino acids of the KUN polypeptide.
Evidence for regulation of columnar habit in apple by a putative 2OG-Fe(II) oxygenase.
Wolters, Pieter J; Schouten, Henk J; Velasco, Riccardo; Si-Ammour, Azeddine; Baldi, Paolo
2013-12-01
Understanding the genetic mechanisms controlling columnar-type growth in the apple mutant 'Wijcik' will provide insights on how tree architecture and growth are regulated in fruit trees. In apple, columnar-type growth is controlled by a single major gene at the Columnar (Co) locus. By comparing the genomic sequence of the Co region of 'Wijcik' with its wild-type 'McIntosh', a novel non-coding DNA element of 1956 bp specific to Pyreae was found to be inserted in an intergenic region of 'Wijcik'. Expression analysis of selected genes located in the vicinity of the insertion revealed the upregulation of the MdCo31 gene encoding a putative 2OG-Fe(II) oxygenase in axillary buds of 'Wijcik'. Constitutive expression of MdCo31 in Arabidopsis thaliana resulted in compact plants with shortened floral internodes, a phenotype reminiscent of the one observed in columnar apple trees. We conclude that MdCo31 is a strong candidate gene for the control of columnar growth in 'Wijcik'. No claim to original European Union works. New Phytologist © 2013 New Phytologist Trust.
Non-coding RNA networks in cancer.
Anastasiadou, Eleni; Jacob, Leni S; Slack, Frank J
2018-01-01
Thousands of unique non-coding RNA (ncRNA) sequences exist within cells. Work from the past decade has altered our perception of ncRNAs from 'junk' transcriptional products to functional regulatory molecules that mediate cellular processes including chromatin remodelling, transcription, post-transcriptional modifications and signal transduction. The networks in which ncRNAs engage can influence numerous molecular targets to drive specific cell biological responses and fates. Consequently, ncRNAs act as key regulators of physiological programmes in developmental and disease contexts. Particularly relevant in cancer, ncRNAs have been identified as oncogenic drivers and tumour suppressors in every major cancer type. Thus, a deeper understanding of the complex networks of interactions that ncRNAs coordinate would provide a unique opportunity to design better therapeutic interventions.
Zlotnik, Alexander; Cuchi, Miguel Alfaro; Pérez Pérez, Maria Carmen
Public healthcare providers in all Spanish Regions - Autonomous Communities (ACs) use All Patients Diagnosis-Related Groups (AP-DRGs) for billing non-insured patients, cost accounting and inpatient efficiency indicators. A national migration to All Patients Refined Diagnosis-Related Groups (APR-DRGs) has been scheduled for 2016. The analysis was performed on 202,912 inpatient care episodes ranging from 2005 to 2010. All episodes were grouped using AP-DRG v25.0 and APR-DRG v24.0. Normalised DRG weight variations for an AP-DRG to APR-DRG migration scenario were calculated and compared. Major differences exist between normalised weights for inpatient episodes depending on the DRG family used. The usage of the APR-DRG system in Spain without any adjustments, as it was developed in the United States, should be approached with care. In order to avoid reverse incentives and provider financial risks, coding practices should be reviewed and structural differences between DRG families taken into account.
RNA editing differently affects protein-coding genes in D. melanogaster and H. sapiens.
Grassi, Luigi; Leoni, Guido; Tramontano, Anna
2015-07-14
When an RNA editing event occurs within a coding sequence it can lead to a different encoded amino acid. The biological significance of these events remains an open question: they can modulate protein functionality, increase the complexity of transcriptomes or arise from a loose specificity of the involved enzymes. We analysed the editing events in coding regions that do or do not change the encoded amino acid (nonsynonymous and synonymous events, respectively) in D. melanogaster and in H. sapiens and compared them with the appropriate random models. Interestingly, our results show that the phenomenon has rather different characteristics in the two organisms. For example, we confirm the observation that editing events occur more frequently in non-coding than in coding regions, and report that this effect is much more evident in H. sapiens. Additionally, in this latter organism, editing events tend to affect less conserved residues. The less frequently occurring editing events in Drosophila tend to avoid drastic amino acid changes. Interestingly, we find that, in Drosophila, changes from less frequently used codons to more frequently used ones are favoured, while this is not the case in H. sapiens.
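The distinction between synonymous and nonsynonymous editing events can be illustrated with a small sketch. It assumes A-to-I editing, read as an A-to-G substitution in the coding strand, and the standard genetic code. The function name and the example codons are illustrative and not taken from the study.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_edit(codon, pos):
    """Classify an assumed A-to-G edit at 0-based position `pos` of a DNA codon.

    Returns (label, original_aa, edited_aa), where label is 'synonymous'
    or 'nonsynonymous'.
    """
    codon = codon.upper()
    if len(codon) != 3 or codon[pos] != "A":
        raise ValueError("expected a 3-nt codon with an A at the edited position")
    edited = codon[:pos] + "G" + codon[pos + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    label = "synonymous" if before == after else "nonsynonymous"
    return label, before, after


print(classify_edit("AAA", 2))  # AAA -> AAG, lysine is unchanged
print(classify_edit("CAG", 1))  # CAG -> CGG, glutamine becomes arginine
```

Counting both labels over all observed editing sites, and over sites drawn from a random model, would give the kind of comparison the abstract describes.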
Causes of Death Data in the Global Burden of Disease Estimates for Ischemic and Hemorrhagic Stroke.
Truelsen, Thomas; Krarup, Lars-Henrik; Iversen, Helle K; Mensah, George A; Feigin, Valery L; Sposato, Luciano A; Naghavi, Mohsen
2015-01-01
Stroke mortality estimates in the Global Burden of Disease (GBD) study are based on routine mortality statistics and redistribution of ill-defined codes that cannot be a cause of death, the so-called 'garbage codes' (GCs). This study describes the contribution of these codes to stroke mortality estimates. All available mortality data were compiled and non-specific cause codes were redistributed based on literature review and statistical methods. Ill-defined codes were redistributed to their specific cause of disease by age, sex, country and year. The reassignment was done based on the International Classification of Diseases and the pathology behind each code by checking multiple causes of death and literature review. Unspecified stroke and primary and secondary hypertension are leading contributing 'GCs' to stroke mortality estimates for hemorrhagic stroke (HS) and ischemic stroke (IS). There were marked differences in the fraction of death assigned to IS and HS for unspecified stroke and hypertension between GBD regions and between age groups. A large proportion of stroke fatalities are derived from the redistribution of 'unspecified stroke' and 'hypertension' with marked regional differences. Future advancements in stroke certification, data collections and statistical analyses may improve the estimation of the global stroke burden. © 2015 S. Karger AG, Basel.
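The idea of redistributing ill-defined codes onto specific causes can be sketched with a simple proportional rule. This is a deliberate simplification and not the GBD procedure, which the abstract says was stratified by age, sex, country and year and informed by literature review. The function name and the counts are hypothetical.

```python
def redistribute(specific_deaths, garbage_deaths, targets):
    """Spread `garbage_deaths` over `targets` in proportion to the deaths
    already assigned to each target cause (a simplifying assumption)."""
    base = sum(specific_deaths[cause] for cause in targets)
    if base == 0:
        raise ValueError("no deaths in target causes to define proportions")
    result = dict(specific_deaths)
    for cause in targets:
        result[cause] += garbage_deaths * specific_deaths[cause] / base
    return result


# Made-up counts: 200 'unspecified stroke' deaths split between IS and HS.
observed = {"ischemic_stroke": 300, "hemorrhagic_stroke": 100}
adjusted = redistribute(observed, 200, ["ischemic_stroke", "hemorrhagic_stroke"])
print(adjusted)
```

Because the redistributed share depends on the proportions within each stratum, the fraction assigned to IS and HS can differ markedly between regions and age groups, which is the pattern the abstract reports.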
Liaw, Yu-Ching; Chen, Cheng-Hsu; Shu, Kuo-Hsiung; Fang, Chiung-Yao; Ou, Wei-Chih; Chen, Pei-Lain; Shen, Cheng-Huang; Lin, Mien-Chun; Chang, Deching; Wang, Meilin
2012-12-01
Kidney cells are the common host for JC virus (JCV) and BK virus (BKV). Reactivation of JCV and/or BKV in patients after organ transplantation, such as renal transplantation, may cause hemorrhagic cystitis and polyomavirus-associated nephropathy. Furthermore, JCV and BKV may be shed in the urine after reactivation in the kidney. Rearranged as well as archetypal non-coding control regions (NCCRs) of JCV and BKV have been frequently identified in human samples. In this study, three JC/BK recombined NCCR sequences were identified in the urine of a patient who had undergone renal transplantation. They were designated as JC-BK hybrids 1, 2, and 3. The three JC/BK recombinant NCCRs contain up-stream JCV as well as down-stream BKV sequences. Deletions of both JCV and BKV sequences were found in these recombined NCCRs. Recombination of DNA sequences between JCV and BKV may occur during co-infection due to the relatively high homology of the two viral genomes.
Non-coding recurrent mutations in chronic lymphocytic leukaemia.
Puente, Xose S; Beà, Silvia; Valdés-Mas, Rafael; Villamor, Neus; Gutiérrez-Abril, Jesús; Martín-Subero, José I; Munar, Marta; Rubio-Pérez, Carlota; Jares, Pedro; Aymerich, Marta; Baumann, Tycho; Beekman, Renée; Belver, Laura; Carrio, Anna; Castellano, Giancarlo; Clot, Guillem; Colado, Enrique; Colomer, Dolors; Costa, Dolors; Delgado, Julio; Enjuanes, Anna; Estivill, Xavier; Ferrando, Adolfo A; Gelpí, Josep L; González, Blanca; González, Santiago; González, Marcos; Gut, Marta; Hernández-Rivas, Jesús M; López-Guerra, Mónica; Martín-García, David; Navarro, Alba; Nicolás, Pilar; Orozco, Modesto; Payer, Ángel R; Pinyol, Magda; Pisano, David G; Puente, Diana A; Queirós, Ana C; Quesada, Víctor; Romeo-Casabona, Carlos M; Royo, Cristina; Royo, Romina; Rozman, María; Russiñol, Nuria; Salaverría, Itziar; Stamatopoulos, Kostas; Stunnenberg, Hendrik G; Tamborero, David; Terol, María J; Valencia, Alfonso; López-Bigas, Nuria; Torrents, David; Gut, Ivo; López-Guillermo, Armando; López-Otín, Carlos; Campo, Elías
2015-10-22
Chronic lymphocytic leukaemia (CLL) is a frequent disease in which the genetic alterations determining the clinicobiological behaviour are not fully understood. Here we describe a comprehensive evaluation of the genomic landscape of 452 CLL cases and 54 patients with monoclonal B-lymphocytosis, a precursor disorder. We extend the number of CLL driver alterations, including changes in ZNF292, ZMYM3, ARID1A and PTPN11. We also identify novel recurrent mutations in non-coding regions, including the 3' region of NOTCH1, which cause aberrant splicing events, increase NOTCH1 activity and result in a more aggressive disease. In addition, mutations in an enhancer located on chromosome 9p13 result in reduced expression of the B-cell-specific transcription factor PAX5. The accumulative number of driver alterations (0 to ≥4) discriminated between patients with differences in clinical behaviour. This study provides an integrated portrait of the CLL genomic landscape, identifies new recurrent driver mutations of the disease, and suggests clinical interventions that may improve the management of this neoplasia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.
2001-01-01
Using "long-PCR" we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.
Zhang, Yong; Zhang, Fan; Zhu, Shuangli; Chen, Li; Yan, Dongmei; Wang, Dongyan; Tang, Ruiyan; Zhu, Hui; Hou, Xiaohui; An, Hongqiu; Zhang, Hong; Xu, Wenbo
2010-02-01
A type 2 vaccine-related poliovirus (strain CHN3024), differing from the Sabin 2 strain by 0.44% in the VP1 coding region, was isolated from a patient with vaccine-associated paralytic poliomyelitis. Sequences downstream of nucleotide position 6735 (3D(pol) coding region) were derived from an unidentified sequence; no close match for a potential parent was found, but it could be classified into a non-polio human enterovirus species C (HEV-C) phylogeny. The virus differed antigenically from the parental Sabin strain, having an amino acid substitution in the neutralizing antigenic site 1. The similarity between CHN3024 and Sabin 2 sequences suggests that the recombination was recent; this is supported by the estimation that the initiating OPV dose was given only 36-75 days before sampling. The patient's clinical manifestations, intratypic differentiation examination, and whole-genome sequencing showed that this recombinant exhibited characteristics of neurovirulent vaccine-derived polioviruses (VDPV), which may, thus, pose a potential threat to a polio-free world.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leong, JoAnn Ching
The nucleotide sequence of the IHNV glycoprotein gene has been determined from a cDNA clone containing the entire coding region. The glycoprotein cDNA clone contained a leader sequence of 48 bases, a coding region of 1524 nucleotides, and 39 bases at the 3' end. The entire cDNA clone contains 1609 nucleotides and encodes a protein of 508 amino acids. The deduced amino acid sequence gave a translated molecular weight of 56,795 daltons. A hydropathicity profile of the deduced amino acid sequence indicated that there were two major hydrophobic domains: one, at the N-terminus, delineating a signal peptide of 18 amino acids, and the other, at the C-terminus, delineating the transmembrane region. Five possible sites of N-linked glycosylation were identified. Although no nucleic acid homology existed between the IHNV glycoprotein gene and the glycoprotein genes of rabies and VSV, there was significant homology at the amino acid level between all three rhabdovirus glycoproteins.
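Possible N-linked glycosylation sites in a deduced amino acid sequence are conventionally located by scanning for the sequon N-X-S/T, where X is any residue except proline. The sketch below shows that scan. The function name is illustrative and the example peptide is made up, not the IHNV glycoprotein sequence.

```python
import re

# Lookahead keeps the match to the asparagine itself, so overlapping
# sequons (for example in "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein):
    """Return 1-based positions of asparagines in N-X-S/T sequons (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


example_peptide = "MNGTAANPSKNKTNNSS"   # hypothetical sequence for illustration
sites = n_glycosylation_sites(example_peptide)
print(sites)
```

A match only marks a candidate site. Whether it is actually glycosylated depends on protein folding and cellular context, which is why the abstract speaks of "possible" sites.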
Lai, Edward Chia-Cheng; Man, Kenneth K C; Chaiyakunapruk, Nathorn; Cheng, Ching-Lan; Chien, Hsu-Chih; Chui, Celine S L; Dilokthornsakul, Piyameth; Hardy, N Chantelle; Hsieh, Cheng-Yang; Hsu, Chung Y; Kubota, Kiyoshi; Lin, Tzu-Chieh; Liu, Yanfang; Park, Byung Joo; Pratt, Nicole; Roughead, Elizabeth E; Shin, Ju-Young; Watcharathanakij, Sawaeng; Wen, Jin; Wong, Ian C K; Yang, Yea-Huei Kao; Zhang, Yinghong; Setoguchi, Soko
2015-11-01
This study describes the availability and characteristics of databases in Asian-Pacific countries and assesses the feasibility of a distributed network approach in the region. A web-based survey was conducted among investigators using healthcare databases in the Asia-Pacific countries. Potential survey participants were identified through the Asian Pharmacoepidemiology Network. Investigators from a total of 11 databases participated in the survey. Database sources included four nationwide claims databases from Japan, South Korea, and Taiwan; two nationwide electronic health records from Hong Kong and Singapore; a regional electronic health record from western China; two electronic health records from Thailand; and cancer and stroke registries from Taiwan. We identified 11 databases with capabilities for distributed network approaches. Many country-specific coding systems and terminologies have been already converted to international coding systems. The harmonization of health expenditure data is a major obstacle for future investigations attempting to evaluate issues related to medical costs.
Decoding sORF translation - from small proteins to gene regulation.
Cabrera-Quio, Luis Enrique; Herberg, Sarah; Pauli, Andrea
2016-11-01
Translation is best known as the fundamental mechanism by which the ribosome converts a sequence of nucleotides into a string of amino acids. Extensive research over many years has elucidated the key principles of translation, and the majority of translated regions were thought to be known. The recent discovery of wide-spread translation outside of annotated protein-coding open reading frames (ORFs) came therefore as a surprise, raising the intriguing possibility that these newly discovered translated regions might have unrecognized protein-coding or gene-regulatory functions. Here, we highlight recent findings that provide evidence that some of these newly discovered translated short ORFs (sORFs) encode functional, previously missed small proteins, while others have regulatory roles. Based on known examples we will also speculate about putative additional roles and the potentially much wider impact that these translated regions might have on cellular homeostasis and gene regulation.
Designing and Assessing Learning
ERIC Educational Resources Information Center
Quan, Hong; Liu, Dandan; Cun, Xiangqin; Lu, Yingchun
2009-01-01
This paper analyses the design, implementation and assessment of a level 2 module for non-English major students in higher vocational and professional education. 1132001 is the code of a module that uses active methods to teach college English in China. The paper specifically reflects on the module's advantages and shortcomings for developing and improving learning…
NASA Astrophysics Data System (ADS)
Testa, P.; Polito, V.; De Pontieu, B.; Carlsson, M.; Reale, F.; Allred, J. C.; Hansteen, V. H.
2017-12-01
We investigate coronal heating properties in active region cores in non-flaring conditions, using high spatial, spectral, and temporal resolution chromospheric/transition region/coronal observations coupled with detailed modeling. We will focus, in particular, on observations with the Interface Region Imaging Spectrograph (IRIS), joint with observations with Hinode (XRT and EIS) and SDO/AIA. We will discuss how these observations and models (1D HD and 3D MHD, with the RADYN and Bifrost codes) provide useful diagnostics of the coronal heating processes and mechanisms of energy transport.
Intergenic disease-associated regions are abundant in novel transcripts.
Bartonicek, N; Clark, M B; Quek, X C; Torpy, J R; Pritchard, A L; Maag, J L V; Gloss, B S; Crawford, J; Taft, R J; Hayward, N K; Montgomery, G W; Mattick, J S; Mercer, T R; Dinger, M E
2017-12-28
Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.
Comparative architecture of silks, fibrous proteins and their encoding genes in insects and spiders.
Craig, Catherine L; Riekel, Christian
2002-12-01
The known silk fibroins and fibrous glues are thought to be encoded by members of the same gene family. All silk fibroins sequenced to date contain regions of long-range order (crystalline regions) and/or short-range order (non-crystalline regions). All of the sequenced fibroin silks (Flag or silk from flagelliform gland in spiders; Fhc or heavy chain fibroin silks produced by Lepidoptera larvae) are made up of hierarchically organized, repetitive arrays of amino acids. Fhc fibroin genes are characterized by a similar molecular genetic architecture of two exons and one intron, but the organization and size of these units differs. The Flag, Ser (sericin gene) and BR (Balbiani ring genes; both fibrous proteins) genes are made up of multiple exons and introns. Sequences coding for crystalline and non-crystalline protein domains are integrated in the repetitive regions of Fhc and MA exons, but not in the protein glues Ser1 and BR-1. Genetic 'hot-spots' promote recombination errors in Fhc, MA, and Flag. Codon bias, structural constraint, point mutations, and shortened coding arrays may be alternative means of stabilizing precursor mRNA transcripts. Differential regulation of gene expression and selective splicing of the mRNA transcript may allow rapid adaptation of silk functional properties to different physical environments.
Video coding for 3D-HEVC based on saliency information
NASA Astrophysics Data System (ADS)
Yu, Fang; An, Ping; Yang, Chao; You, Zhixiang; Shen, Liquan
2016-11-01
As an extension of High Efficiency Video Coding (HEVC), 3D-HEVC has been widely researched under the impetus of the new generation coding standard in recent years. Compared with H.264/AVC, its compression efficiency is doubled while keeping the same video quality. However, its higher encoding complexity and longer encoding time are not negligible. To reduce the computational complexity and guarantee the subjective quality of virtual views, this paper presents a novel video coding method for 3D-HEVC based on the saliency information, which is an important part of the Human Visual System (HVS). First of all, the relationship between the current coding unit and its adjacent units is used to adjust the maximum depth of each largest coding unit (LCU) and determine the SKIP mode reasonably. Then, according to the saliency information of each frame image, the texture and its corresponding depth map will be divided into three regions, that is, salient area, middle area and non-salient area. Afterwards, different quantization parameters will be assigned to different regions to conduct low complexity coding. Finally, the compressed video will generate new view point videos through the renderer tool. As shown in our experiments, the proposed method saves more bit rate than other approaches and achieves up to 38% encoding time reduction without subjective quality loss in compression or rendering.
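The region-dependent quantization step described above can be sketched as a mapping from a saliency score to a quantization parameter (QP). The thresholds and QP offsets below are assumptions chosen only for illustration, since the abstract does not give the values used. The only fixed fact is that HEVC QP values for 8-bit video lie between 0 and 51.

```python
# Hypothetical QP offsets per region: finer quantization where viewers look.
QP_OFFSETS = {"salient": -2, "middle": 0, "non_salient": 3}


def region_for(saliency, low=0.33, high=0.66):
    """Map a saliency score in [0, 1] to one of three regions.
    The thresholds are assumptions made for this sketch."""
    if saliency >= high:
        return "salient"
    if saliency >= low:
        return "middle"
    return "non_salient"


def qp_for(base_qp, saliency):
    """Return the QP for a block, clamped to the 8-bit HEVC range 0..51."""
    qp = base_qp + QP_OFFSETS[region_for(saliency)]
    return max(0, min(51, qp))


for score in (0.9, 0.5, 0.1):
    print(score, region_for(score), qp_for(32, score))
```

A lower QP spends more bits on salient blocks, while a higher QP in non-salient blocks recovers bit rate where distortion is less noticeable.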
Non-sanctioning of illegal tackles in South African youth community rugby.
Brown, J C; Boucher, S J; Lambert, M; Viljoen, W; Readhead, C; Hendricks, S; Kraak, W J
2018-06-01
The tackle event in rugby union ('rugby') contributes to the majority of players' injuries. Referees can reduce this risk by sanctioning dangerous tackles. A study in elite adult rugby suggests that referees only sanction a minority of illegal tackles. The aim of this study was to assess if this finding was similar in youth community rugby. Observational study. Using EncodePro, 99 South African Rugby Union U18 Youth Week tournament matches were coded between 2011 and 2015. All tackles were coded by a researcher and an international referee to ensure that laws were interpreted correctly. The inter- and intra-rater reliabilities were 0.97-1.00. A regression analysis compared the non-sanctioned rates over time. In total, 12 216 tackles were coded, of which less than 1% (n=113) were 'illegal'. The majority of the 113 illegal tackles were front-on (75%), high tackles (72%) and occurred in the 2nd/4th quarters (29% each). Of the illegal tackles, only 59% were sanctioned. The proportions of illegal tackles and sanctioning of these illegal tackles to all tackles improved by 0.2% per year from 2011-2015 (p<0.05). In these youth community rugby players, 59% of illegal tackles were not sanctioned appropriately. This was better than a previous study in elite adult rugby, where only 7% of illegal tackles were penalised. Moreover, the rates of illegal tackles and non-sanctioned illegal tackles both improved over time. However, it is critical that referees consistently enforce all laws to enhance injury prevention efforts. Further studies should investigate the reasons for non-sanctioning. Copyright © 2017 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Hundreds of conserved non-coding genomic regions are independently lost in mammals
Hiller, Michael; Schaar, Bruce T.; Bejerano, Gill
2012-01-01
Conserved non-protein-coding DNA elements (CNEs) often encode cis-regulatory elements and are rarely lost during evolution. However, CNE losses that do occur can be associated with phenotypic changes, exemplified by pelvic spine loss in sticklebacks. Using a computational strategy to detect complete loss of CNEs in mammalian genomes while strictly controlling for artifacts, we find >600 CNEs that are independently lost in at least two mammalian lineages, including a spinal cord enhancer near GDF11. We observed several genomic regions where multiple independent CNE loss events happened; the most extreme is the DIAPH2 locus. We show that CNE losses often involve deletions and that CNE loss frequencies are non-uniform. Similar to less pleiotropic enhancers, we find that independently lost CNEs are shorter, slightly less constrained and evolutionarily younger than CNEs without detected losses. This suggests that independently lost CNEs are less pleiotropic and that pleiotropic constraints contribute to non-uniform CNE loss frequencies. We also detected 35 CNEs that are independently lost in the human lineage and in other mammals. Our study uncovers an interesting aspect of the evolution of functional DNA in mammalian genomes. Experiments are necessary to test if these independently lost CNEs are associated with parallel phenotype changes in mammals. PMID:23042682
Khorsandi, Shirin Elizabeth; Salehi, Siamak; Cortes, Miriam; Vilca-Melendez, Hector; Menon, Krishna; Srinivasan, Parthi; Prachalias, Andreas; Jassem, Wayel; Heaton, Nigel
2018-02-15
Mitochondria have their own genomic, transcriptomic and proteomic machinery but are unable to be autonomous, needing both nuclear and mitochondrial genomes. The aim of this work was to use computational biology to explore the involvement of mitochondrial microRNAs (MitomiRs) and their interactions with the mitochondrial proteome in a clinical model of primary non-function (PNF) of the donor after cardiac death (DCD) liver. Archival array data on the differential expression of miRNA in DCD PNF were re-analyzed using a number of publicly available computational algorithms. Ten MitomiRs were identified as important in DCD PNF, 7 with predicted interaction of their seed sequence with the mitochondrial transcriptome, which included both coding and non-coding areas of the hypervariability region 1 (HVR1) and control region. Considering miRNA regulation of the nuclear-encoded mitochondrial proteome, 7 hypothetical small proteins were identified with homolog function that ranged from co-factor for formation of ATP synthase and redox balance to an importin/exportin protein. In silico, unconventional seed interactions, both non-canonical and alternative seed sites, appear to be of greater importance in MitomiR regulation of the mitochondrial genome. Additionally, a number of novel small proteins of relevance in transplantation have been identified which need further characterization.
Correlation approach to identify coding regions in DNA sequences
NASA Technical Reports Server (NTRS)
Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1994-01-01
Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
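The correlation property the abstract relies on is usually quantified with a "DNA walk": each pyrimidine is a step up, each purine a step down, and the fluctuation F(l) of the walk's displacement over windows of length l is measured. The sketch below illustrates that measure as used in the related literature. It is not the authors' coding-region finder, and the function names are illustrative.

```python
import math

# Purine/pyrimidine rule: pyrimidines (C, T) step up, purines (A, G) step down.
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq):
    """Cumulative displacement y(0..n) of the DNA walk for `seq`."""
    y = [0]
    for base in seq.upper():
        y.append(y[-1] + STEP[base])
    return y


def fluctuation(seq, l):
    """Root-mean-square fluctuation F(l) of the displacement over windows
    of length l: sqrt(<dy^2> - <dy>^2), with dy = y(i + l) - y(i)."""
    y = dna_walk(seq)
    deltas = [y[i + l] - y[i] for i in range(len(y) - l)]
    if not deltas:
        raise ValueError("window length exceeds sequence length")
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    return math.sqrt(max(mean_sq - mean * mean, 0.0))


print(dna_walk("CCTT"))
print(fluctuation("CACACACA", 1), fluctuation("CACACACA", 2))
```

For an uncorrelated sequence F(l) grows roughly as l^0.5. Estimating the exponent from a log-log fit of F(l) against l inside a sliding window, and flagging windows whose exponent stays near 0.5, is one way such a measure could be turned into a coding-region scan.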
Ivancic-Jelecki, Jelena; Slovic, Anamarija; Šantak, Maja; Tešović, Goran; Forcic, Dubravko
2016-07-29
The canonical genome organization of measles virus (MV) is characterized by total size of 15 894 nucleotides (nts) and defined length of every genomic region, both coding and non-coding. Only rarely have reports of strains possessing non-canonical genomic properties (possessing indels, with or without the change of total genome length) been published. The observed mutations are mutually compensatory in a sense that the total genome length remains polyhexameric. Although programmed and highly precise pseudo-templated nucleotide additions during transcription are inherent to polymerases of all viruses belonging to family Paramyxoviridae, a similar mechanism that would serve to non-randomly correct genome length, if an indel has occurred during replication, has so far not been described in the context of a complete virus genome. We compiled all complete MV genomic sequences (64 in total) available in open access sequence databases. Multiple sequence comparisons and phylogenetic analyses were performed with the aim of exploring whether non-recombinant and non-evolutionary linked measles strains that show deviations from canonical genome organization possess a common genetic characteristic. In 11 MV sequences we detected deviations from canonical genome organization due to short indels located within homopolymeric stretches or next to them. In nine out of 11 identified non-canonical MV sequences, a common feature was observed: one mutation, either an insertion or a deletion, was located in a 28 nts long region in F gene 5' untranslated region (positions 5051-5078 in genomic cDNA of canonical strains). This segment is composed of five tandemly linked homopolymeric stretches, its consensus sequence is G6-7C7-8A6-7G1-3C5-6. Although none of the mononucleotide repeats within this segment has fixed length, the total number of nts in canonical strains is always 28. 
These nine non-canonical strains, as well as the tenth (not mutated in 5051-5078 segment), can be grouped in three clusters, based on their passage histories/epidemiological data/genetic similarities. There are no indications that the 3 clusters are evolutionary linked, other than the fact that they all belong to clade D. A common narrow genomic region was found to be mutated in different, non-related, wild type strains suggesting that this region might have a function in non-random genome length corrections occurring during MV replication.
Yong, Hoi-Sen; Song, Sze-Looi; Lim, Phaik-Eem; Chan, Kok-Gan; Chow, Wan-Loo; Eamsobhana, Praphathip
2015-01-01
The whole mitochondrial genome of the pest fruit fly Bactrocera arecae was obtained from next-generation sequencing of genomic DNA. It had a total length of 15,900 bp, consisting of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The control region (952 bp) was flanked by rrnS and trnI genes. The start codons included 6 ATG, 3 ATT and 1 each of ATA, ATC, GTG and TCG. Eight TAA, two TAG, one incomplete TA and two incomplete T stop codons were represented in the protein-coding genes. The cloverleaf structure for trnS1 lacked the D-loop, and that of trnN and trnF lacked the TΨC-loop. Molecular phylogeny based on 13 protein-coding genes was concordant with 37 mitochondrial genes, with B. arecae having closest genetic affinity to B. tryoni. The subgenus Bactrocera of Dacini tribe and the Dacinae subfamily (Dacini and Ceratitidini tribes) were monophyletic. The whole mitogenome of B. arecae will serve as a useful dataset for studying the genetics, systematics and phylogenetic relationships of the many species of Bactrocera genus in particular, and tephritid fruit flies in general. PMID:26472633
2014-01-01
The linear algebraic concept of a subspace plays a significant role in recent techniques of spectrum estimation. In this article, the authors use the noise subspace concept to find hidden periodicities in DNA sequences. With the vast growth of genomic sequence data, the demand to accurately identify protein-coding regions in DNA is rising. Several DNA feature extraction techniques drawing on various fields have emerged in the recent past, among which the application of digital signal processing tools is of prime importance. It is known that coding segments have a 3-base periodicity, while non-coding regions lack this feature. One of the most important subspace-based spectrum analysis techniques is the least-norm method. The least-norm estimator developed in this paper shows sharp period-3 peaks in coding regions while completely eliminating background noise. The proposed method was compared with the existing sliding discrete Fourier transform (SDFT) method, popularly known as the modified periodogram method, on several genes from various organisms, and the results show that the proposed method is a better and more effective approach to gene prediction. Resolution, quality factor, sensitivity, specificity, miss rate, and wrong rate are used to establish the superiority of the least-norm gene prediction method over the existing method. PMID:24386895
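The period-3 property mentioned above can be sketched with a plain windowed Fourier measure. This is the simple baseline idea, not the least-norm estimator of the paper: each base is turned into a binary indicator sequence, the Fourier coefficient at frequency 1/3 is computed for each, and the squared magnitudes are summed. The function names, the default window width and the normalisation by the squared window length are assumptions chosen for illustration.

```python
import cmath


def period3_power(window):
    """Summed spectral power at frequency 1/3 over the four base-indicator sequences."""
    n = len(window)
    if n == 0:
        raise ValueError("empty window")
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of the 0/1 indicator sequence for this base
        coeff = sum(
            cmath.exp(-2j * cmath.pi * i / 3)
            for i, symbol in enumerate(window.upper())
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total / (n * n)


def sliding_period3(seq, width=351, step=3):
    """Return (start, power) pairs for windows of `width` bases moved by `step`."""
    return [
        (start, period3_power(seq[start:start + width]))
        for start in range(0, len(seq) - width + 1, step)
    ]


if __name__ == "__main__":
    periodic = "ATG" * 30   # perfectly 3-periodic toy sequence
    flat = "A" * 90         # no 3-base structure
    print(period3_power(periodic), period3_power(flat))
```

Plotting the second element of each pair against the window start gives the kind of period-3 profile in which coding regions are expected to stand out as peaks; the abstract's claim is that a subspace-based estimator sharpens those peaks relative to this type of periodogram.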
Rozhdestvensky, Timofey S.; Robeck, Thomas; Galiveti, Chenna R.; Raabe, Carsten A.; Seeger, Birte; Wolters, Anna; Gubar, Leonid V.; Brosius, Jürgen; Skryabin, Boris V.
2016-01-01
Prader-Willi syndrome (PWS) is a neurogenetic disorder caused by loss of paternally expressed genes on chromosome 15q11-q13. The PWS-critical region (PWScr) contains an array of non-protein coding IPW-A exons hosting intronic SNORD116 snoRNA genes. Deletion of PWScr is associated with PWS in humans and growth retardation in mice exhibiting ~15% postnatal lethality in C57BL/6 background. Here we analysed a knock-in mouse containing a 5′HPRT-LoxP-NeoR cassette (5′LoxP) inserted upstream of the PWScr. When the insertion was inherited maternally in a paternal PWScr-deletion mouse model (PWScrp−/m5′LoxP), we observed compensation of growth retardation and postnatal lethality. Genomic methylation pattern and expression of protein-coding genes remained unaltered at the PWS-locus of PWScrp−/m5′LoxP mice. Interestingly, ubiquitous Snord116 and IPW-A exon transcription from the originally silent maternal chromosome was detected. In situ hybridization indicated that PWScrp−/m5′LoxP mice expressed Snord116 in brain areas similar to wild type animals. Our results suggest that the lack of PWScr RNA expression in certain brain areas could be a primary cause of the growth retardation phenotype in mice. We propose that activation of disease-associated genes on imprinted regions could lead to general therapeutic strategies in associated diseases. PMID:26848093
Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript
Rose, Dominic; Stadler, Peter F.
2011-01-01
Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny and analyze patterns of sequence conservation and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364
Wilson, Anthony B; Whittington, Camilla M; Bahr, Angela
2014-12-20
The genes of the major histocompatibility complex (MHC/MH) have attracted considerable scientific interest due to their exceptional levels of variability and important function as part of the adaptive immune system. Despite a large number of studies on MH class II diversity of both model and non-model organisms, most research has focused on patterns of genetic variability at individual loci, failing to capture the functional diversity of the biologically active dimeric molecule. Here, we take a systematic approach to the study of MH variation, analyzing patterns of genetic variation at MH class IIα and IIβ loci of the seahorse, which together form the immunologically active peptide binding cleft of the MH class II molecule. The seahorse carries a minimal class II system, consisting of single copies of both MH class IIα and IIβ, which are physically linked and inherited in a Mendelian fashion. Both genes are ubiquitously expressed and detectible in the brood pouch of male seahorses throughout pregnancy. Genetic variability of the two genes is high, dominated by non-synonymous variation concentrated in their peptide-binding regions. Coding variation outside these regions is negligible, a pattern thought to be driven by intra- and interlocus recombination. Despite the tight physical linkage of MH IIα and IIβ loci, recombination has produced novel composite alleles, increasing functional diversity at sites responsible for antigen recognition. Antigen recognition by the adaptive immune system of the seahorse is enhanced by high variability at both MH class IIα and IIβ loci. Strong positive selection on sites involved in pathogen recognition, coupled with high levels of intra- and interlocus recombination, produce a patchwork pattern of genetic variation driven by genetic hitchhiking. Studies focusing on variation at individual MH loci may unintentionally overlook an important component of ecologically relevant variation.
Pre-mRNA Introns as a Model for Cryptographic Algorithm: Theory and Experiments
NASA Astrophysics Data System (ADS)
Regoli, Massimo
2010-01-01
The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for the algorithm comes from the observation of nature, in particular of RNA behavior and some of its properties. RNA sequences contain sections called introns. Introns, a term derived from "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in pre-mRNA are not clear and are the subject of extensive research by biologists; in our case, we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behaviour in the access to the secret key used to code the messages. In the RNA-Crypto System algorithm, the introns are sections of the ciphered message that carry non-coding information, as in the precursor mRNA.
A Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences
NASA Astrophysics Data System (ADS)
Regoli, Massimo
2009-02-01
The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for the algorithm comes from the observation of nature, in particular of RNA behavior and some of its properties. RNA sequences contain sections called introns. Introns, a term derived from "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in pre-mRNA are not clear and are the subject of extensive research by biologists; in our case, we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behaviour in the access to the secret key used to code the messages. In the RNA-Crypto System algorithm, the introns are sections of the ciphered message that carry non-coding information, as in the precursor mRNA.
Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture
Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; Zarrabeitia, 
Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent
2016-01-01
SUMMARY The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants [1–8], and rare, population-specific, coding variants [9]. Here we identify novel non-coding genetic variants with large effects on BMD (total n = 53,236) and fracture (total n = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000Genomes reference panel (n = 26,534), and de-novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD [8] (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], meta-analysis P = 2×10^−14), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2×10^−11; 98,742 cases and 409,511 controls). Using an En1Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turn-over. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, meta-analysis P = 1×10^−11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. 
These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population. PMID:26367794
The NACP/synuclein gene: chromosomal assignment and screening for alterations in Alzheimer disease.
Campion, D; Martin, C; Heilig, R; Charbonnier, F; Moreau, V; Flaman, J M; Petit, J L; Hannequin, D; Brice, A; Frebourg, T
1995-03-20
The major component of the vascular and plaque amyloid deposits in Alzheimer disease is the amyloid beta peptide (A beta). A second intrinsic component of amyloid, the NAC (non-A beta component of amyloid) peptide, has recently been identified, and its precursor protein was named NACP. A computer homology search allowed us to establish that the human NACP gene was homologous to the rat synuclein gene. We mapped the NACP/synuclein gene to chromosome 4 and cloned three alternatively spliced transcripts in lymphocytes derived from a normal subject. We analyzed by RT-PCR and direct sequencing the entire coding region of the NACP/synuclein gene in a group of patients with familial early onset Alzheimer disease. No mutation was found in 26 unrelated patients. Further studies are required to investigate the implication of the NACP/synuclein gene in Alzheimer disease.
The NACP/synuclein gene: Chromosomal assignment and screening for alterations in Alzheimer disease
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campion, D.; Martin, C.; Charbonnier, F.
1995-03-20
The major component of the vascular and plaque amyloid deposits in Alzheimer disease is the amyloid β peptide (Aβ). A second intrinsic component of amyloid, the NAC (non-Aβ component of amyloid) peptide, has recently been identified, and its precursor protein was named NACP. A computer homology search allowed us to establish that the human NACP gene was homologous to the rat synuclein gene. We mapped the NACP/synuclein gene to chromosome 4 and cloned three alternatively spliced transcripts in lymphocytes derived from a normal subject. We analyzed by RT-PCR and direct sequencing the entire coding region of the NACP/synuclein gene in a group of patients with familial early onset Alzheimer disease. No mutation was found in 26 unrelated patients. Further studies are required to investigate the implication of the NACP/synuclein gene in Alzheimer disease. 21 refs., 3 tabs.
RNAcentral: an international database of ncRNA sequences
Williams, Kelly Porter
2014-10-28
The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.
Behavioral analysis of malicious code through network traffic and system call monitoring
NASA Astrophysics Data System (ADS)
Grégio, André R. A.; Fernandes Filho, Dario S.; Afonso, Vitor M.; Santos, Rafael D. C.; Jino, Mario; de Geus, Paulo L.
2011-06-01
Malicious code (malware) that spreads through the Internet, such as viruses, worms and trojans, is a major threat to information security and a profitable business for criminals. Several approaches analyze malware by monitoring its actions while it runs in a controlled environment, which helps to identify malicious behaviors. In this article we propose a tool that analyzes malware behavior in a non-intrusive and effective way, extends the analysis to malware samples that bypass current approaches, and fixes some issues with those approaches.
Daware, Anurag; Das, Sweta; Srivastava, Rishi; Badoni, Saurabh; Singh, Ashok K.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.
2016-01-01
Development and use of genome-wide informative simple sequence repeat (SSR) markers and novel integrated genomic strategies are vital to drive genomics-assisted breeding applications and for efficient dissection of quantitative trait loci (QTLs) underlying complex traits in rice. The present study developed 6244 genome-wide informative SSR markers exhibiting in silico fragment length polymorphism based on repeat-unit variations among genomic sequences of 11 indica, japonica, aus, and wild rice accessions. These markers were mapped on diverse coding and non-coding sequence components of known cloned/candidate genes annotated from 12 chromosomes and revealed a much higher amplification (97%) and polymorphic potential (88%) along with wider genetic/functional diversity level (16–74% with a mean 53%) especially among accessions belonging to indica cultivar group, suggesting their utility in large-scale genomics-assisted breeding applications in rice. A high-density 3791 SSR markers-anchored genetic linkage map (IR 64 × Sonasal) spanning 2060 cM total map-length with an average inter-marker distance of 0.54 cM was generated. This reference genetic map identified six major genomic regions harboring robust QTLs (31% combined phenotypic variation explained with a 5.7–8.7 LOD) governing grain weight on six rice chromosomes. One strong grain weight major QTL region (OsqGW5.1) was narrowed-down by integrating traditional QTL mapping with high-resolution QTL region-specific integrated SSR and single nucleotide polymorphism markers-based QTL-seq analysis and differential expression profiling. This led us to delineate two natural allelic variants in two known cis-regulatory elements (RAV1AAT and CARGCW8GAT) of glycosyl hydrolase and serine carboxypeptidase genes exhibiting pronounced seed-specific differential regulation in low (Sonasal) and high (IR 64) grain weight mapping parental accessions. 
Our genome-wide SSR marker resource (polymorphic within/between diverse cultivar groups) and integrated genomic strategy can efficiently scan functionally relevant potential molecular tags (markers, candidate genes and alleles) regulating complex agronomic traits (grain weight) and expedite marker-assisted genetic enhancement in rice. PMID:27833617
Wang, Ze-Huan; Peng, Hua; Kilian, Norbert
2013-01-01
The first comprehensive molecular phylogenetic reconstruction of the Cichorieae subtribe Lactucinae is provided. Sequences for two datasets, one of the nuclear rDNA ITS region, the other of five concatenated non-coding chloroplast DNA markers including the petD region and the psbA-trnH, 5′trnL(UAA)-trnF, rpl32-trnL(UAG) and trnQ(UUG)-5′rps16 spacers, were, with few exceptions, newly generated for 130 samples of 78 species. The sampling spans the entire subtribe Lactucinae while focusing on its Chinese centre of diversity; more than 3/4 of the Chinese Lactucinae species are represented. The nuclear and plastid phylogenies inferred from the two independent datasets show various hard topological incongruences. They concern the internal topology of major lineages, in one case the placement of taxa in major lineages, the relationships between major lineages and even the circumscription of the subtribe, indicating potential events of ancient as well as of more recent reticulation and chloroplast capture in the evolution of the subtribe. The core of the subtribe is clearly monophyletic, consisting of the six lineages, Cicerbita, Cicerbita II, Lactuca, Melanoseris, Notoseris and Paraprenanthes. The Faberia lineage and the monospecific Prenanthes purpurea lineage are part of a monophyletic subtribe Lactucinae only in the nuclear or plastid phylogeny, respectively. Morphological and karyological support for their placement is considered. In the light of the molecular phylogenetic reconstruction and of additional morphological data, the conflicting taxonomies of the Chinese Lactuca alliance are discussed and it is concluded that the major lineages revealed are best treated at generic rank. An improved species level taxonomy of the Chinese Lactucinae is outlined; new synonymies and some new combinations are provided. PMID:24376566
Polyomavirus BK non-coding control region rearrangements in health and disease.
Sharma, Preety M; Gupta, Gaurav; Vats, Abhay; Shapiro, Ron; Randhawa, Parmjeet S
2007-08-01
BK virus is an increasingly recognized pathogen in transplanted patients. DNA sequencing of this virus shows considerable genomic variability. To understand the clinical significance of rearrangements in the non-coding control region (NCCR) of BK virus (BKV), we report a meta-analysis of 507 sequences, including 40 sequences generated in our own laboratory, for associations between rearrangements and disease, tissue tropism, geographic origin, and viral genotype. NCCR rearrangements were less frequent in (a) asymptomatic BKV viruria compared to patients with viral nephropathy (1.7% vs. 22.5%), and (b) viral genotype 1 compared to other genotypes (2.4% vs. 11.2%). Rearrangements were more common in malignancy (78.6%) and in Norwegians (45.7%), and less common in East Indians (0%) and Japanese (4.3%). A surprising number of rearranged sequences were reported from mononuclear cells of healthy subjects, whereas most plasma sequences were archetypal. This difference could not be related to potential recombinase activity in lymphocytes, as consensus recombination signal sequences could not be found in the NCCR. NCCR rearrangements are neither a necessary nor a sufficient condition for clinical disease. BKV nephropathy and hemorrhagic cystitis are not associated with any unique NCCR configuration or nucleotide sequence.
Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dash, Paban Kumar, E-mail: pabandash@rediffmail.com; Sharma, Shashi; Soni, Manisha
Highlights: • Complete genome of Indian DENV-2 was deciphered for the first time in this study. • The recent Indian DENV-2 revealed presence of many unique amino acid residues. • Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. • Circulation of a unique clade of DENV-2 in South Asia was identified. Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of South East Asia, including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960) from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2, respectively. Unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins were observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection at a few amino acid sites of the genes encoding structural and non-structural proteins. The molecular phylogenetic analysis, based on comparison of both the complete coding region and the envelope protein gene with globally diverse DENV-2 viruses, classified the recent Indian isolates into a unique South Asian clade within the Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in the 1970s characterized the evolution of DENV-2 in India. 
The present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a unique clade in South Asia.
The complete mitochondrial genome of the Feral Rock Pigeon (Columba livia breed feral).
Li, Chun-Hong; Liu, Fang; Wang, Li
2014-10-01
In the present work, we report the complete mitochondrial genome sequence of the feral rock pigeon for the first time. The total length of the mitogenome was 17,239 bp, with a base composition of 30.3% A, 24.0% T, 31.9% C, and 13.8% G; the genome is A+T-rich (54.3%). It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region (D-loop region). The arrangement of all genes was identical to that of typical pigeon mitochondrial genomes. The complete mitochondrial genome sequence of the feral rock pigeon will serve as an important data set of germplasm resources for further study.
Meher, J K; Meher, P K; Dash, G N; Raval, M K
2012-01-01
The first step in gene identification problem based on genomic signal processing is to convert character strings into numerical sequences. These numerical sequences are then analysed spectrally or using digital filtering techniques for the period-3 peaks, which are present in exons (coding areas) and absent in introns (non-coding areas). In this paper, we have shown that single-indicator sequences can be generated by encoding schemes based on physico-chemical properties. Two new methods are proposed for generating single-indicator sequences based on hydration energy and dipole moments. The proposed methods produce high peak at exon locations and effectively suppress false exons (intron regions having greater peak than exon regions) resulting in high discriminating factor, sensitivity and specificity.
The complete mitochondrial genome sequence of the Datong yak (Bos grunniens).
Wu, Xiaoyun; Chu, Min; Liang, Chunnian; Ding, Xuezhi; Guo, Xian; Bao, Pengjia; Yan, Ping
2016-01-01
Datong yak is a famous artificially cultivated breed in China. In the present work, we report the complete mitochondrial genome sequence of Datong yak for the first time. The total length of the mitogenome is 16,323 bp, containing 13 protein-coding genes, 22 tRNA genes, two rRNA genes and one non-coding region (D-loop region). The gene order of the Datong yak mitogenome is identical to that observed in most other vertebrates. The overall base composition is 33.71% A, 25.80% C, 13.21% G and 27.27% T, with an A + T content of 60.98%. The complete mitogenome sequence information of Datong yak can provide useful data for further studies on molecular breeding and taxonomic status.
Characterization of the complete mitochondrial genome sequence of Gannan yak (Bos grunniens).
Wu, Xiaoyun; Ding, Xuezhi; Chu, Min; Guo, Xian; Bao, Pengjia; Liang, Chunnian; Yan, Ping
2016-01-01
Gannan yak is the native breed of Gansu province in China. In this work, the complete mitochondrial genome sequence of Gannan yak was determined for the first time. The mitogenome is 16,322 bp long, with a base composition of 33.74% A, 25.84% T, 13.18% C, and 27.24% G. It contains 13 protein-coding genes, 22 tRNA genes, two rRNA genes and one non-coding region (D-loop region). The gene order of the Gannan yak mitogenome is identical to that observed in most other vertebrates. The complete mitogenome sequence information of Gannan yak can provide useful data for further studies on protection of genetic resources and phylogenetic relationships within Bos grunniens.
Véliz, David; Vega-Retter, Caren; Quezada-Romegialli, Claudio
2016-01-01
The complete sequence of the mitochondrial genome for the Chilean silverside Basilichthys microlepidotus is reported for the first time. The entire mitochondrial genome was 16,544 bp in length (GenBank accession no. KM245937); gene composition and arrangement conformed to that reported for most fishes and contained the typical structure of 2 rRNAs, 13 protein-coding genes, 22 tRNAs and a non-coding region. The assembled mitogenome was validated against sequences of COI and Control Region previously sequenced in our lab, functional genes from RNA-Seq data for the same species and the mitogenomes of two other atherinopsid species available in GenBank.
Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.
1992-01-01
The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat; the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
A deep learning method for lincRNA detection using auto-encoder algorithm.
Yu, Ning; Yu, Zeng; Pan, Yi
2017-12-06
RNA sequencing (RNA-seq) enables scientists to develop novel data-driven methods for discovering more unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by new deep learning methods. By scanning the newly found data sets from RNA-seq, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two observations motivate the adoption of knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in the biological process as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with a support vector machine and a traditional neural network. In addition, the method is validated on the newly discovered lincRNA data set, and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data.
Driven by the newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application of deep learning techniques to identifying lincRNA transcription sequences.
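The abstract does not state which encoding schemes were compared. As an illustration only, the sketch below shows one common choice, one-hot encoding, which turns a DNA window into a fixed-length numeric vector that could be fed to an auto-encoder. The function name and the handling of ambiguous bases are assumptions.

```python
BASES = "ACGT"

def one_hot(dna):
    """Encode a DNA string as a flat list of 0/1 values, four per base."""
    encoded = []
    for base in dna.upper():
        # Symbols outside A, C, G, T (for example N) become four zeros.
        encoded.extend(1.0 if base == b else 0.0 for b in BASES)
    return encoded
```

A window of length L becomes a vector of length 4L, so every input to the network has the same dimension when windows have the same length.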
Systematic screening for mutations in the promoter and the coding region of the 5-HT1A gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erdmann, J.; Shimron-Abarbanell, D.; Cichon, S.
1995-10-09
In the present study we sought to identify genetic variation in the 5-HT1A receptor gene which through alteration of protein function or level of expression might contribute to the genetic predisposition to neuropsychiatric diseases. Genomic DNA samples from 159 unrelated subjects (including 45 schizophrenic, 46 bipolar affective, and 43 patients with Tourette's syndrome, as well as 25 healthy controls) were investigated by single-strand conformation analysis. Overlapping PCR (polymerase chain reaction) fragments covered the whole coding sequence as well as the 5' untranslated region of the 5-HT1A gene. The region upstream to the coding sequence we investigated contains a functional promoter. We found two rare nucleotide sequence variants. Both mutations are located in the coding region of the gene: a coding mutation (A→G) in nucleotide position 82 which leads to an amino acid exchange (Ile→Val) in position 28 of the receptor protein and a silent mutation (C→T) in nucleotide position 549. The occurrence of the Ile-28-Val substitution was studied in an extended sample of patients (n = 352) and controls (n = 210) but was found in similar frequencies in all groups. Thus, this mutation is unlikely to play a significant role in the genetic predisposition to the diseases investigated. In conclusion, our study does not provide evidence that the 5-HT1A gene plays either a major or a minor role in the genetic predisposition to schizophrenia, bipolar affective disorder, or Tourette's syndrome. 29 refs., 4 figs., 1 tab.
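The relation between nucleotide position 82 and amino acid position 28 follows from reading the coding sequence in triplets. A minimal sketch, assuming 1-based positions counted from the first base of the start codon (function names are hypothetical):

```python
def codon_number(nt_position):
    """1-based codon index for a 1-based position in the coding sequence."""
    return (nt_position - 1) // 3 + 1

def position_in_codon(nt_position):
    """Position within the codon: 1, 2 or 3."""
    return (nt_position - 1) % 3 + 1
```

Under this numbering, nucleotide 82 is the first base of codon 28, which matches the reported Ile-28-Val exchange, and nucleotide 549 is the third base of codon 183, a position where substitutions are often silent.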
Hidden Structural Codes in Protein Intrinsic Disorder.
Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo
2017-10-17
Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovered the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could not have been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.
Dong, Xiaomin; Chen, Kenian; Cuevas-Diaz Duran, Raquel; You, Yanan; Sloan, Steven A.; Zhang, Ye; Zong, Shan; Cao, Qilin; Barres, Ben A.; Wu, Jia Qian
2015-01-01
Long non-coding RNAs (lncRNAs) (> 200 bp) play crucial roles in transcriptional regulation during numerous biological processes. However, it is challenging to comprehensively identify lncRNAs, because they are often expressed at low levels and with more cell-type specificity than are protein-coding genes. In the present study, we performed ab initio transcriptome reconstruction using eight purified cell populations from mouse cortex and detected more than 5000 lncRNAs. Predicting the functions of lncRNAs using cell-type specific data revealed their potential functional roles in Central Nervous System (CNS) development. We performed motif searches in ENCODE DNase I digital footprint data and Mouse ENCODE promoters to infer transcription factor (TF) occupancy. By integrating TF binding and cell-type specific transcriptomic data, we constructed a novel framework that is useful for systematically identifying lncRNAs that are potentially essential for brain cell fate determination. Based on this integrative analysis, we identified lncRNAs that are regulated during Oligodendrocyte Precursor Cell (OPC) differentiation from Neural Stem Cells (NSCs) and that are likely to be involved in oligodendrogenesis. The top candidate, lnc-OPC, shows highly specific expression in OPCs and remarkable sequence conservation among placental mammals. Interestingly, lnc-OPC is significantly up-regulated in glial progenitors from experimental autoimmune encephalomyelitis (EAE) mouse models compared to wild-type mice. OLIG2-binding sites in the upstream regulatory region of lnc-OPC were identified by ChIP (chromatin immunoprecipitation)-Sequencing and validated by luciferase assays. Loss-of-function experiments confirmed that lnc-OPC plays a functional role in OPC genesis. Overall, our results substantiated the role of lncRNA in OPC fate determination and provided an unprecedented data source for future functional investigations in CNS cell types. 
We present our datasets and analysis results via the interactive genome browser at our laboratory website that is freely accessible to the research community. This is the first lncRNA expression database of collective populations of glia, vascular cells, and neurons. We anticipate that these studies will advance the knowledge of this major class of non-coding genes and their potential roles in neurological development and diseases. PMID:26683846
Zhang, Yimei; Li, Shuai; Wang, Fei; Chen, Zhuang; Chen, Jie; Wang, Liqun
2018-09-01
Toxicity of heavy metals from industrialization poses a critical concern, and analysis of sources associated with potential human health risks is of unique significance. Assessing the human health risk of pollution sources (factored health risk) concurrently in the whole region and in its sub regions can provide more instructive information to protect specific potential victims. In this research, we establish a new expression model of human health risk based on quantitative analysis of source contributions at different spatial scales. The larger scale grids and their spatial codes are used to initially identify the level of pollution risk, the type of pollution source and the sensitive population at high risk. The smaller scale grids and their spatial codes are used to identify the contribution of various sources of pollution to each sub region (larger grid) and to assess the health risks posed by each source for each sub region. The results of the case study show that, for children (a sensitive population, taking school and residential areas as the major regions of activity), the major pollution sources are the abandoned lead-acid battery plant (ALP), traffic emission and agricultural activity. The new models and results of this research present effective spatial information and a useful model for quantifying the hazards of source categories and human health risk at complex industrial systems in the future.
Sieverding, Maia; Onyango, Cynthia; Suchman, Lauren
2018-01-01
Incorporating private healthcare providers into social health insurance schemes is an important means towards achieving universal health coverage in low and middle income countries. However, little research has been conducted about why private providers choose to participate in social health insurance systems in such contexts, or their experiences with these systems. We explored private providers' perceptions of and experiences with participation in two different social health insurance schemes in Sub-Saharan Africa: the National Health Insurance Scheme (NHIS) in Ghana and the National Hospital Insurance Fund (NHIF) in Kenya. In-depth interviews were held with providers working at 79 facilities of varying sizes in three regions of Kenya (N = 52) and three regions of Ghana (N = 27). Most providers were members of a social franchise network. Interviews covered providers' reasons for (non) enrollment in the health insurance system, their experiences with the accreditation process, and benefits and challenges with the system. Interviews were coded in Atlas.ti using an open coding approach and analyzed thematically. Most providers in Ghana were NHIS-accredited and perceived accreditation to be essential to their businesses, despite challenges they encountered due to long delays in claims reimbursement. In Kenya, fewer than half of providers were NHIF-accredited and several said that their clientele were not NHIF enrolled. Understanding of how the NHIF functioned was generally low. The lengthy and cumbersome accreditation process also emerged as a major barrier to providers' participation in the NHIF in Kenya, but the NHIS accreditation process was not a major concern for providers in Ghana. In expanding social health insurance, coordinated efforts are needed to increase coverage rates among underserved populations while also accrediting the private providers who serve those populations.
Market pressure was a key force driving providers to gain and maintain accreditation in both countries. Developing mechanisms to engage private providers as stakeholders in social health insurance schemes is important to incentivizing their participation and addressing their concerns.
Adefenwa, Mufliat A; Peters, Sunday O; Agaviezor, Brilliant O; Wheto, Matthew; Adekoya, Khalid O; Okpeku, Moses; Oboh, Bola; Williams, Gabriel O; Adebambo, Olufunmilayo A; Singh, Mahipal; Thomas, Bolaji; De Donato, Marcos; Imumorin, Ikhide G
2013-07-01
The agouti-signaling protein (ASIP) plays a major role in mammalian pigmentation as an antagonist to melanocortin-1 receptor gene to stimulate pheomelanin synthesis, a major pigment conferring mammalian coat color. We sequenced a 352 bp fragment of ASIP gene spanning part of exon 2 and part of intron 2 in 215 animals representing six goat breeds from Nigeria and the United States: West African Dwarf, predominantly black; Red Sokoto, mostly red; and Sahel, mostly white from Nigeria; black and white Alpine, brown and white Spanish and white Saanen from the US. Twenty haplotypes from nine mutations representing three intronic, one silent and five missense (p.S19R, p.N35K, p.L36V, p.M42L and p.L45W) mutations were identified in Nigerian goats. Approximately 89% of Nigerian goats carry haplotype 1 (TGCCATCCG) which seems to be the wild type configuration of mutations in this region of the gene. Although we found no association between these polymorphisms in the ASIP gene and coat color in Nigerian goats, in-silico functional analysis predicts putative deleterious functional impact of the p.L45W mutation on the basic amino-terminal domain of ASIP. In the American goats, two intronic mutations, g.293G>A and g.327C>A, were identified in the Alpine breed, although the g.293G>A mutation is common to American and Nigerian goat populations. All Saanen and Sahel goats in this study belong to haplotype 1 of both populations, which seems to be the wild-type composite ASIP haplotype. Overall, there was no clear association of this portion of the ASIP gene interrogated in this study with coat color variation. Therefore, additional genomic analyses of promoter sequence, the entire coding and non-coding regions of the ASIP gene will be required to obtain a definite conclusion.
Fiedler, Jan; Baker, Andrew H; Dimmeler, Stefanie; Heymans, Stephane; Mayr, Manuel; Thum, Thomas
2018-05-23
Non-coding RNAs are increasingly recognized not only as regulators of various biological functions but also as targets for a new generation of RNA therapeutics and biomarkers. We hereby review recent insights relating to non-coding RNAs including microRNAs (e.g. miR-126, miR-146a), long non-coding RNAs (e.g. MIR503HG, GATA6-AS, SMILR) and circular RNAs (e.g. cZNF292) and their role in vascular diseases. This includes identification and therapeutic use of hypoxia-regulated non-coding RNAs and endogenous non-coding RNAs that regulate intrinsic smooth muscle cell signalling, age-related non-coding RNAs and non-coding RNAs involved in the regulation of mitochondrial biology and metabolic control. Finally, we discuss non-coding RNA species with biomarker potential.
Biological significance of long non-coding RNA FTX expression in human colorectal cancer.
Guo, Xiao-Bo; Hua, Zhu; Li, Chen; Peng, Li-Pan; Wang, Jing-Shen; Wang, Bo; Zhi, Qiao-Ming
2015-01-01
The purpose of this study was to determine the expression of long non-coding RNA (lncRNA) FTX and analyze its prognostic and biological significance in colorectal cancer (CRC). A quantitative reverse transcription PCR was performed to detect the expression of long non-coding RNA FTX in 35 pairs of colorectal cancer and corresponding noncancerous tissues. The expression of long non-coding RNA FTX was detected in 187 colorectal cancer tissues and its correlations with clinicopathological factors of patients were examined. Univariate and multivariate analyses were performed to analyze the prognostic significance of Long Non-coding RNA FTX expression. The effects of long non-coding RNA FTX expression on malignant phenotypes of colorectal cancer cells and its possible biological significances were further determined. Long non-coding RNA FTX was significantly upregulated in colorectal cancer tissues, and low long non-coding RNA FTX expression was significantly correlated with differentiation grade, lymph vascular invasion, and clinical stage. Patients with high long non-coding RNA FTX showed poorer overall survival than those with low long non-coding RNA FTX. Multivariate analyses indicated that status of long non-coding RNA FTX was an independent prognostic factor for patients. Functional analyses showed that upregulation of long non-coding RNA FTX significantly promoted growth, migration, invasion, and increased colony formation in colorectal cancer cells. Therefore, long non-coding RNA FTX may be a potential biomarker for predicting the survival of colorectal cancer patients and might be a molecular target for treatment of human colorectal cancer.
Obwaller, A; Duchêne, M; Bruhn, H; Steipe, B; Tripp, C; Kraft, D; Wiedermann, G; Auer, H; Aspöck, H
2001-05-01
Myosins from nematode parasites elicit strong humoral and cellular immune responses and have been investigated as vaccine candidates. In this study we cloned and sequenced a cDNA coding for myosin heavy chain from Toxocara canis, a nematode parasite of canids which may also infect humans and cause various unspecific symptoms. To determine the major antigenic regions the myosin heavy chain was systematically dissected into ten overlapping recombinant fusion polypeptides which were purified by metal chelate chromatography. Single fragments were then tested for their IgG reactivity in sera from toxocarosis patients and healthy probands. Two regions, one region at the mid to carboxy-terminal end of the head domain and one region in the rod domain, were identified as major antigens, which in combination were positive with 86% of the sera. The other domains were less reactive. This shows that the patients' IgG reactivity was not directed evenly against all parts of the molecule, but was rather clustered in few regions.
Complete genome of the cotton bacteria blight pathogen Xanthomonas citri pv. malvacearum strain MSCT
USDA-ARS's Scientific Manuscript database
Xanthomonas citri pv. malvacearum (Xcm) is a major pathogen of Gossypium hirsutum. In this study we report the complete genome of the Xcm strain MSCT assembled from long read DNA sequencing technology. The MSCT genome is the first Xcm genome that has complete coding regions for Xcm transcriptional a...
NASA Technical Reports Server (NTRS)
Pindera, Marek-Jerzy; Salzar, Robert S.
1996-01-01
The objective of this work was the development of efficient, user-friendly computer codes for optimizing fabrication-induced residual stresses in metal matrix composites through the use of homogeneous and heterogeneous interfacial layer architectures and processing parameter variation. To satisfy this objective, three major computer codes have been developed and delivered to the NASA-Lewis Research Center, namely MCCM, OPTCOMP, and OPTCOMP2. MCCM is a general research-oriented code for investigating the effects of microstructural details, such as layered morphology of SCS-6 SiC fibers and multiple homogeneous interfacial layers, on the inelastic response of unidirectional metal matrix composites under axisymmetric thermomechanical loading. OPTCOMP and OPTCOMP2 combine the major analysis module resident in MCCM with a commercially-available optimization algorithm and are driven by user-friendly interfaces which facilitate input data construction and program execution. OPTCOMP enables the user to identify those dimensions, geometric arrangements and thermoelastoplastic properties of homogeneous interfacial layers that minimize thermal residual stresses for the specified set of constraints. OPTCOMP2 provides additional flexibility in the residual stress optimization through variation of the processing parameters (time, temperature, external pressure and axial load) as well as the microstructure of the interfacial region which is treated as a heterogeneous two-phase composite. Overviews of the capabilities of these codes are provided together with a summary of results that addresses the effects of various microstructural details of the fiber, interfacial layers and matrix region on the optimization of fabrication-induced residual stresses in metal matrix composites.
The functional role of long non-coding RNA in digestive system carcinomas.
Wang, Guang-Yu; Zhu, Yuan-Yuan; Zhang, Yan-Qiao
2014-09-01
In recent years, long non-coding RNAs (lncRNAs) have emerged as either oncogenes or tumor suppressor genes. Recent evidence suggests that lncRNAs play a very important role in digestive system carcinomas. However, the biological function of lncRNAs in the vast majority of digestive system carcinomas remains unclear. Recently, an increasing number of studies have begun to explore the molecular mechanisms and regulatory networks through which lncRNAs are implicated in tumorigenesis. In this review, we highlight the emerging functional role of lncRNAs in digestive system carcinomas. It is becoming clear that lncRNAs will be exciting and potentially useful for the diagnosis and treatment of digestive system carcinomas; some of these lncRNAs might function as both diagnostic markers and treatment targets.
QuASAR-MPRA: accurate allele-specific analysis for massively parallel reporter assays.
Kalita, Cynthia A; Moyerbrailean, Gregory A; Brown, Christopher; Wen, Xiaoquan; Luca, Francesca; Pique-Regi, Roger
2018-03-01
The majority of the human genome is composed of non-coding regions containing regulatory elements such as enhancers, which are crucial for controlling gene expression. Many variants associated with complex traits are in these regions, and may disrupt gene regulatory sequences. Consequently, it is important to not only identify true enhancers but also to test if a variant within an enhancer affects gene regulation. Recently, allele-specific analysis in high-throughput reporter assays, such as massively parallel reporter assays (MPRAs), has been used to functionally validate non-coding variants. However, we are still missing high-quality and robust data analysis tools for these datasets. We have further developed our method for allele-specific analysis QuASAR (quantitative allele-specific analysis of reads) to analyze allele-specific signals in barcoded read counts data from MPRA. Using this approach, we can take into account the uncertainty on the original plasmid proportions, over-dispersion, and sequencing errors. The provided allelic skew estimate and its standard error also simplify meta-analysis of replicate experiments. Additionally, we show that a beta-binomial distribution better models the variability present in the allelic imbalance of these synthetic reporters and results in a test that is statistically well calibrated under the null. Applying this approach to the MPRA data, we found 602 SNPs with significant (false discovery rate 10%) allele-specific regulatory function in LCLs. We also show that we can combine MPRA with QuASAR estimates to validate existing experimental and computational annotations of regulatory variants. Our study shows that with appropriate data analysis tools, we can improve the power to detect allelic effects in high-throughput reporter assays. Availability: http://github.com/piquelab/QuASAR/tree/master/mpra. Contact: fluca@wayne.edu or rpique@wayne.edu. Supplementary data are available online at Bioinformatics.
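To illustrate why a beta-binomial model suits over-dispersed allele counts, the sketch below computes a two-sided p-value for allelic skew. It is not the QuASAR-MPRA implementation: the function names, the `dispersion` parameter and its default value are assumptions made for illustration, and `expected_ref` stands in for the reference-allele proportion measured in the plasmid library.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_pmf(k, n, a, b):
    """Beta-binomial probability of k successes in n trials."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

def allelic_skew_pvalue(ref_count, alt_count, expected_ref=0.5,
                        dispersion=50.0):
    """Two-sided p-value: total probability of outcomes that are no more
    likely than the observed reference count under the null."""
    n = ref_count + alt_count
    a = expected_ref * dispersion          # larger dispersion means the model
    b = (1.0 - expected_ref) * dispersion  # is closer to a plain binomial
    observed = betabinom_pmf(ref_count, n, a, b)
    total = sum(p for p in (betabinom_pmf(k, n, a, b) for k in range(n + 1))
                if p <= observed * (1.0 + 1e-9))
    return min(1.0, total)
```

Compared with a binomial test, the extra variance of the beta-binomial makes moderate imbalances less surprising, which is the kind of calibration under the null that the abstract refers to.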
The complete mitochondrial genome of the stomatopod crustacean Squilla mantis
Cook, Charles E
2005-01-01
Background Animal mitochondrial genomes are physically separate from the much larger nuclear genomes and have proven useful both for phylogenetic studies and for understanding genome evolution. Within the phylum Arthropoda the subphylum Crustacea includes over 50,000 named species with immense variation in body plans and habitats, yet only 23 complete mitochondrial genomes are available from this subphylum. Results I describe here the complete mitochondrial genome of the crustacean Squilla mantis (Crustacea: Malacostraca: Stomatopoda). This 15994-nucleotide genome, the first described from a hoplocarid, contains the standard complement of 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and a non-coding AT-rich region that is found in most other metazoans. The gene order is identical to that considered ancestral for hexapods and crustaceans. The 70% AT base composition is within the range described for other arthropods. A single unusual feature of the genome is a 230 nucleotide non-coding region between a serine transfer RNA and the nad1 gene, which has no apparent function. I also compare gene order, nucleotide composition, and codon usage of the S. mantis genome and eight other malacostracan crustaceans. A translocation of the histidine transfer RNA gene is shared by three taxa in the order Decapoda, infraorder Brachyura: Callinectes sapidus, Portunus trituberculatus and Pseudocarcinus gigas. This translocation may be diagnostic for the Brachyura. For all nine taxa nucleotide composition is biased towards AT-richness, as expected for arthropods, and is within the range reported for other arthropods. Codon usage is biased, and much of this bias is probably due to the skew in nucleotide composition towards AT-richness. Conclusion The mitochondrial genome of Squilla mantis contains one unusual feature, a 230 base pair non-coding region that has so far not been described in any other malacostracan.
Comparisons with other Malacostraca show that all nine genomes, like most other mitochondrial genomes, share a bias toward AT-richness and a related bias in codon usage. The nine malacostracans included in this analysis are not representative of the diversity of the class Malacostraca, and additional malacostracan sequences would surely reveal other unusual genomic features that could be useful in understanding mitochondrial evolution in this taxon. PMID:16091132
DOE Office of Scientific and Technical Information (OSTI.GOV)
Myhre, Marit Renee; Olsen, Gunn-Hege; Gosert, Rainer
High-level replication of polyomavirus BK (BKV) in kidney transplant recipients is associated with the emergence of BKV variants with rearranged (rr) non-coding control region (NCCR) increasing viral early gene expression and cytopathology. Cloning and sequencing revealed the presence of a BKV quasispecies which included non-functional variants when assayed in a recombinant virus assay. Here we report that the rr-NCCR of BKV variants RH-3 and RH-12, both bearing a NCCR deletion including the 5' end of the agnoprotein coding sequence, mediated early and late viral reporter gene expression in kidney cells. However, in a recombinant virus they failed to produce infectious progeny despite large T-antigen and VP1 expression and the formation of nuclear virus-like particles. Infectious progeny was generated when the agnogene was reconstructed in cis or agnoprotein provided in trans from a co-existing BKV rr-NCCR variant. We conclude that complementation can rescue non-functional BKV variants in vitro and possibly in vivo.
Mutation Screening of 1,237 Cancer Genes across Six Model Cell Lines of Basal-Like Breast Cancer.
Olsson, Eleonor; Winter, Christof; George, Anthony; Chen, Yilun; Törngren, Therese; Bendahl, Pär-Ola; Borg, Åke; Gruvberger-Saal, Sofia K; Saal, Lao H
2015-01-01
Basal-like breast cancer is an aggressive subtype generally characterized by poor prognosis and lack of expression of the three most important clinical biomarkers, estrogen receptor, progesterone receptor, and HER2. Cell lines serve as useful model systems to study cancer biology in vitro and in vivo. We performed mutational profiling of six basal-like breast cancer cell lines (HCC38, HCC1143, HCC1187, HCC1395, HCC1954, and HCC1937) and their matched normal lymphocyte DNA using targeted capture and next-generation sequencing of 1,237 cancer-associated genes, including all exons, UTRs and upstream flanking regions. In total, 658 somatic variants were identified, of which 378 were non-silent (average 63 per cell line, range 37-146) and 315 were novel (not present in the Catalogue of Somatic Mutations in Cancer database; COSMIC). 125 novel mutations were confirmed by Sanger sequencing (59 exonic, 48 3'UTR and 10 5'UTR, 1 splicing), with a validation rate of 94% of high confidence variants. Of 36 mutations previously reported for these cell lines but not detected in our exome data, 36% could not be detected by Sanger sequencing. The base replacements C/G>A/T, C/G>G/C, C/G>T/A and A/T>G/C were significantly more frequent in the coding regions compared to the non-coding regions (OR 3.2, 95% CI 2.0-5.3, P<0.0001; OR 4.3, 95% CI 2.9-6.6, P<0.0001; OR 2.4, 95% CI 1.8-3.1, P<0.0001; OR 1.8, 95% CI 1.2-2.7, P = 0.024, respectively). The single nucleotide variants within the context of T[C]T/A[G]A and T[C]A/T[G]A were more frequent in the coding than in the non-coding regions (OR 3.7, 95% CI 2.2-6.1, P<0.0001; OR 3.8, 95% CI 2.0-7.2, P = 0.001, respectively). Copy number estimations were derived from the targeted regions and correlated well to Affymetrix SNP array copy number data (Pearson correlation 0.82 to 0.96 for all compared cell lines; P<0.0001).
These mutation calls across 1,237 cancer-associated genes and identification of novel variants will aid in the design and interpretation of biological experiments using these six basal-like breast cancer cell lines.
Bustamante, Carlos; Ovenden, Jennifer R
2016-01-01
The silver gemfish Rexea solandri is an important economic resource but Vulnerable to overfishing in Australian waters. The complete mitochondrial genome sequence is described from 1.6 million reads obtained via next generation sequencing. The total length of the mitogenome is 16,350 bp comprising 2 rRNA, 13 protein-coding genes, 22 tRNA and 2 non-coding regions. The mitogenome sequence was validated against sequences of PCR fragments and BLAST queries of Genbank. Gene order was equivalent to that found in marine fishes.
Vargas-Caro, Carolina; Bustamante, Carlos; Lamilla, Julio; Bennett, Michael B; Ovenden, Jennifer R
2016-07-01
The complete mitochondrial genome of the roughskin skate Dipturus trachyderma is described from 1 455 724 sequences obtained using Illumina NGS technology. Total length of the mitogenome was 16 909 base pairs, comprising 2 rRNAs, 13 protein-coding genes, 22 tRNAs and 2 non-coding regions. Phylogenetic analysis based on mtDNA revealed low genetic divergence among longnose skates, in particular, those dwelling the continental shelf and slope off the coasts of Chile and Argentina.
Bellis, Jennifer R; Kirkham, Jamie J; Nunn, Anthony J; Pirmohamed, Munir
2014-12-17
National Health Service (NHS) hospitals in the UK use a system of coding for patient episodes. The coding system used is the International Classification of Disease (ICD-10). There are ICD-10 codes which may be associated with adverse drug reactions (ADRs) and there is a possibility of using these codes for ADR surveillance. This study aimed to determine whether ADRs prospectively identified in children admitted to a paediatric hospital were coded appropriately using ICD-10. The electronic admission abstract for each patient with at least one ADR was reviewed. A record was made of whether the ADR(s) had been coded using ICD-10. Of 241 ADRs, 76 (31.5%) were coded using at least one ICD-10 ADR code. Of the oncology ADRs, 70/115 (61%) were coded using an ICD-10 ADR code compared with 6/126 (4.8%) non-oncology ADRs (difference in proportions 56%, 95% CI 46.2% to 65.8%; p < 0.001). The majority of ADRs detected in a prospective study at a paediatric centre would not have been identified if the study had relied on ICD-10 codes as a single means of detection. Data derived from administrative healthcare databases are not reliable for identifying ADRs by themselves, but may complement other methods of detection.
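The difference in proportions reported above can be checked with a few lines of arithmetic. The sketch below uses the counts given in the abstract (70/115 and 6/126) and assumes a simple Wald (normal-approximation) interval; the abstract does not say which interval method the authors used, so the function name and the choice of method are illustrative only.

```python
import math

def diff_in_proportions(x1, n1, x2, n2, z=1.96):
    """Difference p1 - p2 with a Wald (normal-approximation) interval.

    The Wald method is an assumption; the source does not name its method.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference of two independent proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se

# Counts taken from the abstract: oncology vs non-oncology ADRs coded with ICD-10
diff, lower, upper = diff_in_proportions(70, 115, 6, 126)
print(f"difference {diff:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

Under this assumption the result lands within a fraction of a percentage point of the reported 56% (46.2% to 65.8%); the small gap at the lower bound is consistent with rounding the difference before applying the margin.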
Mahajan, Anubha; Sim, Xueling; Ng, Hui Jin; Manning, Alisa; Rivas, Manuel A.; Highland, Heather M.; Locke, Adam E.; Grarup, Niels; Im, Hae Kyung; Cingolani, Pablo; Flannick, Jason; Fontanillas, Pierre; Fuchsberger, Christian; Gaulton, Kyle J.; Teslovich, Tanya M.; Rayner, N. William; Robertson, Neil R.; Beer, Nicola L.; Rundle, Jana K.; Bork-Jensen, Jette; Ladenvall, Claes; Blancher, Christine; Buck, David; Buck, Gemma; Burtt, Noël P.; Gabriel, Stacey; Gjesing, Anette P.; Groves, Christopher J.; Hollensted, Mette; Huyghe, Jeroen R.; Jackson, Anne U.; Jun, Goo; Justesen, Johanne Marie; Mangino, Massimo; Murphy, Jacquelyn; Neville, Matt; Onofrio, Robert; Small, Kerrin S.; Stringham, Heather M.; Syvänen, Ann-Christine; Trakalo, Joseph; Abecasis, Goncalo; Bell, Graeme I.; Blangero, John; Cox, Nancy J.; Duggirala, Ravindranath; Hanis, Craig L.; Seielstad, Mark; Wilson, James G.; Christensen, Cramer; Brandslund, Ivan; Rauramaa, Rainer; Surdulescu, Gabriela L.; Doney, Alex S. F.; Lannfelt, Lars; Linneberg, Allan; Isomaa, Bo; Tuomi, Tiinamaija; Jørgensen, Marit E.; Jørgensen, Torben; Kuusisto, Johanna; Uusitupa, Matti; Salomaa, Veikko; Spector, Timothy D.; Morris, Andrew D.; Palmer, Colin N. A.; Collins, Francis S.; Mohlke, Karen L.; Bergman, Richard N.; Ingelsson, Erik; Lind, Lars; Tuomilehto, Jaakko; Hansen, Torben; Watanabe, Richard M.; Prokopenko, Inga; Dupuis, Josee; Karpe, Fredrik; Groop, Leif; Laakso, Markku; Pedersen, Oluf; Florez, Jose C.; Morris, Andrew P.; Altshuler, David; Meigs, James B.; Boehnke, Michael; McCarthy, Mark I.; Lindgren, Cecilia M.; Gloyn, Anna L.
2015-01-01
Genome wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P<5×10-7) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF)=1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF = 0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights. PMID:25625282
Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.
Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W
2018-05-31
In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. Developments in high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES, based on supervised deep learning approaches, for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates the potential of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.
Biological significance of long non-coding RNA FTX expression in human colorectal cancer
Guo, Xiao-Bo; Hua, Zhu; Li, Chen; Peng, Li-Pan; Wang, Jing-Shen; Wang, Bo; Zhi, Qiao-Ming
2015-01-01
The purpose of this study was to determine the expression of long non-coding RNA (lncRNA) FTX and analyze its prognostic and biological significance in colorectal cancer (CRC). A quantitative reverse transcription PCR was performed to detect the expression of long non-coding RNA FTX in 35 pairs of colorectal cancer and corresponding noncancerous tissues. The expression of long non-coding RNA FTX was detected in 187 colorectal cancer tissues and its correlations with clinicopathological factors of patients were examined. Univariate and multivariate analyses were performed to analyze the prognostic significance of Long Non-coding RNA FTX expression. The effects of long non-coding RNA FTX expression on malignant phenotypes of colorectal cancer cells and its possible biological significances were further determined. Long non-coding RNA FTX was significantly upregulated in colorectal cancer tissues, and low long non-coding RNA FTX expression was significantly correlated with differentiation grade, lymph vascular invasion, and clinical stage. Patients with high long non-coding RNA FTX showed poorer overall survival than those with low long non-coding RNA FTX. Multivariate analyses indicated that status of long non-coding RNA FTX was an independent prognostic factor for patients. Functional analyses showed that upregulation of long non-coding RNA FTX significantly promoted growth, migration, invasion, and increased colony formation in colorectal cancer cells. Therefore, long non-coding RNA FTX may be a potential biomarker for predicting the survival of colorectal cancer patients and might be a molecular target for treatment of human colorectal cancer. PMID:26629053
NASA Astrophysics Data System (ADS)
Córsico, A. H.; Benvenuto, O. G.
Recently in our Observatory we have developed a new stellar pulsation code, independently of other workers. The program computes eigenvalues (eigenfrequencies) and eigenfunctions of non-radial modes in spherical unperturbed stellar models. To accomplish these calculations, the fourth-order eigenvalue problem (in the linear adiabatic approach) is solved by means of the well-known Henyey technique applied to the finite-difference scheme that replaces the differential equations of the problem. In order to test the code, we have computed numerous eigenmodes in polytropic configurations for several values of the index n. In this communication we show the excellent agreement between our results and the best available in the literature. We also present results for oscillations in models of white dwarf stars with homogeneous chemical composition (pure helium). These models have been obtained with the stellar evolution code of our Observatory. The calculations outlined above constitute a first preliminary step in a larger project whose main purpose is the study of the pulsational properties of DA, DB and DO white dwarf stars. Detailed investigations have demonstrated that such objects pulsate in non-radial g-modes with eigenperiods in the range 100-2000 sec.
The Unexpected Tuners: Are LncRNAs Regulating Host Translation during Infections?
Knap, Primoz; Tebaldi, Toma; Di Leva, Francesca; Biagioli, Marta; Dalla Serra, Mauro; Viero, Gabriella
2017-01-01
Pathogenic bacteria produce powerful virulent factors, such as pore-forming toxins, that promote their survival and cause serious damage to the host. Host cells reply to membrane stresses and ionic imbalance by modifying gene expression at the epigenetic, transcriptional and translational level, to recover from the toxin attack. The fact that the majority of the human transcriptome encodes for non-coding RNAs (ncRNAs) raises the question: do host cells deploy non-coding transcripts to rapidly control the most energy-consuming process in cells—i.e., host translation—to counteract the infection? Here, we discuss the intriguing possibility that membrane-damaging toxins induce, in the host, the expression of toxin-specific long non-coding RNAs (lncRNAs), which act as sponges for other molecules, encoding small peptides or binding target mRNAs to depress their translation efficiency. Unravelling the function of host-produced lncRNAs upon bacterial infection or membrane damage requires an improved understanding of host lncRNA expression patterns, their association with polysomes and their function during this stress. This field of investigation holds a unique opportunity to reveal unpredicted scenarios and novel approaches to counteract antibiotic-resistant infections. PMID:29469820
González, Carolina; Tabernero, David; Cortese, Maria Francesca; Gregori, Josep; Casillas, Rosario; Riveiro-Barciela, Mar; Godoy, Cristina; Sopena, Sara; Rando, Ariadna; Yll, Marçal; Lopez-Martinez, Rosa; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco
2018-05-21
To detect hyper-conserved regions in the hepatitis B virus (HBV) X gene (HBX) 5' region that could be candidates for gene therapy. The study included 27 chronic hepatitis B treatment-naive patients in various clinical stages (from chronic infection to cirrhosis and hepatocellular carcinoma, both HBeAg-negative and HBeAg-positive), and infected with HBV genotypes A-F and H. In a serum sample from each patient with viremia > 3.5 log IU/mL, the HBX 5' end region [nucleotide (nt) 1255-1611] was PCR-amplified and submitted to next-generation sequencing (NGS). We assessed genotype variants by phylogenetic analysis, and evaluated conservation of this region by calculating the information content of each nucleotide position in a multiple alignment of all unique sequences (haplotypes) obtained by NGS. Conservation at the HBx protein amino acid (aa) level was also analyzed. NGS yielded 1333069 sequences from the 27 samples, with a median of 4578 sequences/sample (2487-9279, IQR 2817). In 14/27 patients (51.8%), phylogenetic analysis of viral nucleotide haplotypes showed a complex mixture of genotypic variants. Analysis of the information content in the haplotype multiple alignments detected 2 hyper-conserved nucleotide regions, one in the HBX upstream non-coding region (nt 1255-1286) and the other in the 5' end coding region (nt 1519-1603). This last region coded for a conserved amino acid region (aa 63-76) that partially overlaps a Kunitz-like domain. Two hyper-conserved regions detected in the HBX 5' end may be of value for targeted gene therapy, regardless of the patients' clinical stage or HBV genotype.
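The per-position information content mentioned above is commonly computed as 2 bits minus the Shannon entropy of the nucleotide frequencies in each alignment column. The sketch below illustrates that standard definition on a toy alignment; it is not the authors' pipeline, the function name and toy sequences are hypothetical, and it assumes gap-free columns over A, C, G, T with each unique haplotype weighted equally.

```python
import math
from collections import Counter

def information_content(haplotypes):
    """Information content (bits) of each column of an aligned set of sequences.

    IC = log2(4) - H, where H is the Shannon entropy of the column.
    Assumes equal-length, gap-free sequences over A/C/G/T (an assumption,
    not a detail taken from the study).
    """
    length = len(haplotypes[0])
    ic = []
    for i in range(length):
        counts = Counter(seq[i] for seq in haplotypes)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        ic.append(2.0 - entropy)
    return ic

# Toy alignment (hypothetical data): columns 0 and 1 are fully conserved
toy = ["ACGT", "ACGA", "ACCT", "ACTT"]
ic = information_content(toy)
print([round(v, 3) for v in ic])
```

A fully conserved column scores the maximum of 2 bits, so a hyper-conserved region would appear as a run of positions at or near that maximum.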
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species
Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha
2011-01-01
Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible in the generation of functional variation of proteins expressed which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
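Screening a sequence for simple sequence repeats can be sketched with a back-referencing regular expression. The abstract does not name the tool or the repeat thresholds the authors used, so the function, the minimum repeat counts, and the example sequence below are all hypothetical.

```python
import re

def find_ssrs(seq, unit_len, min_repeats):
    """Return (start, unit, repeat_count) for tandem repeats of a given unit length.

    Thresholds are illustrative assumptions, not values from the study.
    """
    # ([ACGT]{n}) captures one unit; \1{m,} requires at least m further copies
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        # Skip units that are just a shorter motif repeated (e.g. "AAA")
        if any(unit_len % k == 0 and unit == unit[:k] * (unit_len // k)
               for k in range(1, unit_len)):
            continue
        hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

# Hypothetical sequence: four copies of ACG followed by a run of nine A's
example = "TTC" + "ACG" * 4 + "TT" + "A" * 9 + "C"
tri = find_ssrs(example, 3, 4)
mono = find_ssrs(example, 1, 6)
print(tri, mono)
```

Counting hits per unit length in coding versus non-coding intervals would then give the kind of distribution the study compares against a random model.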
VLF Trimpi modelling on the path NWC-Dunedin using both finite element and 3D Born modelling
NASA Astrophysics Data System (ADS)
Nunn, D.; Hayakawa, K. B. M.
1998-10-01
This paper investigates the numerical modelling of VLF Trimpis, produced by a D region inhomogeneity on the great circle path. Two different codes are used to model Trimpis on the path NWC-Dunedin. The first is a 2D Finite Element Method Code (FEM), whose solutions are rigorous and valid in the strong scattering or non-Born limit. The second code is a 3D model that invokes the Born approximation. The predicted Trimpis from these codes compare very closely, thus confirming the validity of both models. The modal scattering matrices for both codes are analysed in some detail and are found to have a comparable structure. They indicate strong scattering between the dominant TM modes. Analysis of the scattering matrix from the FEM code shows that departure from linear Born behaviour occurs when the inhomogeneity has a horizontal scale size of about 100 km and a maximum electron density enhancement at 75 km altitude of about 6 electrons.
He, Hongjuan; Xiu, Youcheng; Guo, Jing; Liu, Hui; Liu, Qi; Zeng, Tiebo; Chen, Yan; Zhang, Yan; Wu, Qiong
2013-01-01
Long non-coding RNAs (lncRNAs), a key group of non-coding RNAs, have gained wide attention. Though lncRNAs have been functionally annotated and systematically explored in higher mammals, few have undergone systematic identification and annotation. Owing to their expression specificity, known lncRNAs expressed in embryonic brain tissues remain limited. Considering that a large number of lncRNAs are only transcribed in brain tissues, studies of lncRNAs in the developing brain are therefore of special interest. Here, publicly available RNA-sequencing (RNA-seq) data in embryonic brain are integrated to identify thousands of embryonic brain lncRNAs by a customized pipeline. A significant proportion of novel transcripts have not been annotated by available genomic resources. The putative embryonic brain lncRNAs are shorter in length, less spliced and show less conservation than known genes. The expression of putative lncRNAs is on average one tenth that of known coding genes, while comparable with that of known lncRNAs. From chromatin data, putative embryonic brain lncRNAs are associated with active chromatin marks, comparable with known lncRNAs. Embryonic brain expressed lncRNAs are also indicated to have expression though not evident in adult brain. Gene Ontology analysis of putative embryonic brain lncRNAs suggests that they are associated with brain development. The putative lncRNAs are shown to be related to possible cis-regulatory roles in imprinting, and are even themselves deemed to be imprinted lncRNAs. Re-analysis of one knockdown dataset suggests that four regulators are associated with lncRNAs. Taken together, the identification and systematic analysis of putative lncRNAs would provide novel insights into uncharacterized mouse non-coding regions and the relationships with mammalian embryonic brain development. PMID:23967161
Enuka, Yehoshua; Lauriola, Mattia; Feldman, Morris E.; Sas-Chen, Aldema; Ulitsky, Igor; Yarden, Yosef
2016-01-01
Circular RNAs (circRNAs) are widespread circles of non-coding RNAs with largely unknown function. Because stimulation of mammary cells with the epidermal growth factor (EGF) leads to dynamic changes in the abundance of coding and non-coding RNA molecules, and culminates in the acquisition of a robust migratory phenotype, this cellular model might disclose functions of circRNAs. Here we show that circRNAs of EGF-stimulated mammary cells are stably expressed, while mRNAs and microRNAs change within minutes. In general, the circRNAs we detected are relatively long-lived and weakly expressed. Interestingly, they are almost ubiquitously co-expressed with the corresponding linear transcripts, and the respective, shared promoter regions are more active compared to genes producing linear isoforms with no detectable circRNAs. These findings imply that altered abundance of circRNAs, unlike changes in the levels of other RNAs, might not play critical roles in signaling cascades and downstream transcriptional networks that rapidly commit cells to specific outcomes. PMID:26657629
Non-contact assessment of melanin distribution via multispectral temporal illumination coding
NASA Astrophysics Data System (ADS)
Amelard, Robert; Scharfenberger, Christian; Wong, Alexander; Clausi, David A.
2015-03-01
Melanin is a pigment that is highly absorptive in the UV and visible electromagnetic spectra. It is responsible for perceived skin tone, and protects against harmful UV effects. Abnormal melanin distribution is often an indicator for melanoma. We propose a novel approach for non-contact melanin distribution via multispectral temporal illumination coding to estimate the two-dimensional melanin distribution based on its absorptive characteristics. In the proposed system, a novel multispectral, cross-polarized, temporally-coded illumination sequence is synchronized with a camera to measure reflectance under both multispectral and ambient illumination. This allows us to eliminate the ambient illumination contribution from the acquired reflectance measurements, and also to determine the melanin distribution in an observed region based on the spectral properties of melanin using the Beer-Lambert law. Using this information, melanin distribution maps can be generated for objective, quantitative assessment of skin type of individuals. We show that the melanin distribution map correctly identifies areas with high melanin densities (e.g., nevi).
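The two steps described above, removing the ambient contribution and then applying the Beer-Lambert law across wavelengths, can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function name, the white-reference normalisation, the least-squares fit, and the extinction values are all hypothetical.

```python
import math

def melanin_index(lit, ambient, reference, extinction):
    """Estimate a relative melanin amount for one pixel.

    lit        : intensity per wavelength with the coded illumination on
    ambient    : intensity per wavelength with the illumination off
    reference  : illumination-only intensity from a white target (assumed available)
    extinction : relative melanin extinction per wavelength (hypothetical values)
    """
    absorbance = []
    for l, a, r in zip(lit, ambient, reference):
        reflectance = max(l - a, 1e-9) / r           # ambient-subtracted reflectance
        absorbance.append(-math.log10(reflectance))  # Beer-Lambert: A = eps * c * d
    # Least-squares fit of A(lambda) = eps(lambda) * x, with x = c * d
    num = sum(e * A for e, A in zip(extinction, absorbance))
    den = sum(e * e for e in extinction)
    return num / den

# Synthetic check: build measurements from a known value and recover it
eps = [3.0, 2.0, 1.0]        # hypothetical extinction at three wavelengths
true_x = 0.2
ref = [100.0, 100.0, 100.0]
amb = [5.0, 5.0, 5.0]
lit = [a + r * 10 ** (-e * true_x) for a, r, e in zip(amb, ref, eps)]
estimate = melanin_index(lit, amb, ref, eps)
print(round(estimate, 3))
```

Applying the same function to every pixel would yield a two-dimensional map; a real system would also need to account for other chromophores such as haemoglobin, which this sketch ignores.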
Methodology for fast detection of false sharing in threaded scientific codes
Chung, I-Hsin; Cong, Guojing; Murata, Hiroki; Negishi, Yasushi; Wen, Hui-Fang
2014-11-25
A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
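The final check described above, whether different portions of one cache line are accessed, can be illustrated on a recorded access trace. The sketch below is a simplified stand-in rather than the patented method: the trace format, the 64-byte line size, and the rule that at least one access must be a write are assumptions made for illustration.

```python
from collections import defaultdict

CACHE_LINE = 64  # bytes; assumed line size

def false_sharing_lines(trace):
    """Return cache-line indices touched by different threads at disjoint bytes.

    trace: iterable of (thread_id, address, size, is_write) tuples
           (hypothetical format; accesses are assumed not to span two lines).
    """
    per_line = defaultdict(list)
    for thread, addr, size, is_write in trace:
        per_line[addr // CACHE_LINE].append((thread, addr, addr + size, is_write))

    flagged = set()
    for line, accesses in per_line.items():
        for i, (t1, s1, e1, w1) in enumerate(accesses):
            for t2, s2, e2, w2 in accesses[i + 1:]:
                disjoint = e1 <= s2 or e2 <= s1
                # Different threads, different bytes, at least one write
                if t1 != t2 and disjoint and (w1 or w2):
                    flagged.add(line)
    return sorted(flagged)

trace = [
    (0, 0, 8, True), (1, 8, 8, True),        # line 0: disjoint writes, flagged
    (0, 128, 8, True), (1, 128, 8, False),   # line 2: same bytes, true sharing
    (0, 256, 8, False), (1, 264, 8, False),  # line 4: reads only, harmless
]
flagged = false_sharing_lines(trace)
print(flagged)
```

Restricting the trace to the instructions already marked as at risk, as the staged approach above does, keeps this pairwise comparison small.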
The identification of cis-regulatory elements: A review from a machine learning perspective.
Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W
2015-12-01
The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
O'Leary, Valerie Bríd; Maugg, Doris; Smida, Jan; Baumhoer, Daniel; Nathrath, Michaela; Ovsepian, Saak Victor; Atkinson, Michael John
2017-10-20
Breakage of the fragile site FRA16D disrupts the WWOX (WW Domain Containing Oxidoreductase) tumor suppressor gene in osteosarcoma. However, the frequency of breakage is not sufficient to explain the rate of WWOX loss in pathogenesis. The involvement of non-coding RNA transcripts is proposed due to their accumulation at fragile sites, where they are advocated to influence specific chromosomal regions associated with malignancy. The long ncRNA PARTICLE (promoter of MAT2A antisense radiation-induced circulating long non-coding RNA) is transiently elevated in response to irradiation and influences epigenetic silencing modification within WWOX. It now emerges that elevated PARTICLE levels are significantly associated with FRA16D non-breakage in OS patients. Although not associated with overall survival, high PARTICLE levels were found to be significantly linked to metastasis-free outcome. The transcription of both PARTICLE and WWOX is transiently responsive to exposure to low doses of radiation in osteosarcoma cell lines. Herein, a relationship between WWOX and PARTICLE transcription is suggested in human osteosarcoma cell lines representing alternative genetic backgrounds. PARTICLE over-expression ameliorated WWOX promoter activity in U2OS harboring FRA16D non-breakage. It can be concluded that the lncRNA PARTICLE influences the WWOX tumor suppressor and, in the absence of WWOX FRA16D breakage, is associated with OS metastasis-free survival.
Causes of Death Data in the Global Burden of Disease Estimates for Ischemic and Hemorrhagic Stroke
Truelsen, Thomas; Krarup, Lars-Henrik; Iversen, Helle; Mensah, George A.; Feigin, Valery; Sposato, Luciano; Naghavi, Mohsen
2015-01-01
Background Stroke mortality estimates in the Global Burden of Disease (GBD) study are based on routine mortality statistics and redistribution of ill-defined codes that cannot be a cause of death, the so-called “garbage codes”. This study describes the contribution of these codes to stroke mortality estimates. Methods All available mortality data were compiled and non-specific cause codes were redistributed based on literature review and statistical methods. Ill-defined codes were redistributed to their specific cause of disease by age, sex, country, and year. The reassignment was done based on the international classification of diseases and the pathology behind each code by checking multiple causes of death and literature review. Results Unspecified stroke, and primary and secondary hypertension are leading contributing “garbage codes” to stroke mortality estimates for intracranial hemorrhagic stroke and ischemic stroke. There were marked differences in the fraction of death assigned to ischemic stroke and hemorrhagic stroke for unspecified stroke and hypertension between GBD regions and between age groups. Conclusions A large proportion of stroke fatalities is derived from the redistribution of “unspecified stroke” and “hypertension” with marked regional differences. Future advancements in stroke certification, data collections, and statistical analyses may improve the estimation of the global stroke burden. PMID:26505189
Nucleic Acid Chaperone Activity of the ORF1 Protein from the Mouse LINE-1 Retrotransposon
Martin, Sandra L.; Bushman, Frederic D.
2001-01-01
Non-LTR retrotransposons such as L1 elements are major components of the mammalian genome, but their mechanism of replication is incompletely understood. Like retroviruses and LTR-containing retrotransposons, non-LTR retrotransposons replicate by reverse transcription of an RNA intermediate. The details of cDNA priming and integration, however, differ between these two classes. In retroviruses, the nucleocapsid (NC) protein has been shown to assist reverse transcription by acting as a “nucleic acid chaperone,” promoting the formation of the most stable duplexes between nucleic acid molecules. A protein-coding region with an NC-like sequence is present in most non-LTR retrotransposons, but no such sequence is evident in mammalian L1 elements or other members of its class. Here we investigated the ORF1 protein from mouse L1 and found that it does in fact display nucleic acid chaperone activities in vitro. L1 ORF1p (i) promoted annealing of complementary DNA strands, (ii) facilitated strand exchange to form the most stable hybrids in competitive displacement assays, and (iii) facilitated melting of an imperfect duplex but stabilized perfect duplexes. These findings suggest a role for L1 ORF1p in mediating nucleic acid strand transfer steps during L1 reverse transcription. PMID:11134335
Photon migration in non-scattering tissue and the effects on image reconstruction
NASA Astrophysics Data System (ADS)
Dehghani, H.; Delpy, D. T.; Arridge, S. R.
1999-12-01
Photon propagation in tissue can be calculated using the relationship described by the transport equation. For scattering tissue this relationship is often simplified and expressed in terms of the diffusion approximation. This approximation, however, is not valid for non-scattering regions, for example cerebrospinal fluid (CSF) below the skull. This study looks at the effects of a thin clear layer in a simple model representing the head and examines its effect on image reconstruction. Specifically, boundary photon intensities (total number of photons exiting at a point on the boundary due to a source input at another point on the boundary) are calculated using the transport equation and compared with data calculated using the diffusion approximation for both non-scattering and scattering regions. The effect of non-scattering regions on the calculated boundary photon intensities is presented together with the advantages and restrictions of the transport code used. Reconstructed images are then presented where the forward problem is solved using the transport equation for a simple two-dimensional system containing a non-scattering ring and the inverse problem is solved using the diffusion approximation to the transport equation.
Voigt, Karsten; Sharma, Cynthia M; Mitschke, Jan; Joke Lambrecht, S; Voß, Björn; Hess, Wolfgang R; Steglich, Claudia
2014-01-01
Prochlorococcus is a genus of abundant and ecologically important marine cyanobacteria. Here, we present a comprehensive comparison of the structure and composition of the transcriptomes of two Prochlorococcus strains, which, despite their similarities, have adapted their gene pool to specific environmental constraints. We present genome-wide maps of transcriptional start sites (TSS) for both organisms, which are representatives of the two most diverse clades within the two major ecotypes adapted to high- and low-light conditions, respectively. Our data suggest antisense transcription for three-quarters of all genes, which is substantially more than that observed in other bacteria. We discovered hundreds of TSS within genes, most notably within 16 of the 29 prochlorosin genes, in strain MIT9313. A direct comparison revealed very little conservation in the location of TSS and the nature of non-coding transcripts between both strains. We detected extremely short 5′ untranslated regions with a median length of only 27 and 29 nt for MED4 and MIT9313, respectively, and for 8% of all protein-coding genes the median distance to the start codon is only 10 nt or even shorter. These findings and the absence of an obvious Shine–Dalgarno motif suggest that leaderless translation and ribosomal protein S1-dependent translation constitute alternative mechanisms for translation initiation in Prochlorococcus. We conclude that genome-wide antisense transcription is a major component of the transcriptional output from these relatively small genomes and that a hitherto unrecognized high degree of complexity and variability of gene expression exists in their transcriptional architecture. PMID:24739626
2015-01-01
Conformational polymorphism of DNA is a major causative factor behind several incurable trinucleotide repeat expansion disorders that arise from overexpansion of trinucleotide repeats located in coding/non-coding regions of specific genes. Hairpin DNA structures that are formed due to overexpansion of CAG repeat lead to Huntington’s disorder and spinocerebellar ataxias. Nonetheless, DNA hairpin stem structure that generally embraces B-form with canonical base pairs is poorly understood in the context of periodic noncanonical A…A mismatch as found in CAG repeat overexpansion. Molecular dynamics simulations on DNA hairpin stems containing A…A mismatches in a CAG repeat overexpansion show that A…A dictates local Z-form irrespective of starting glycosyl conformation, in sharp contrast to canonical DNA duplex. Transition from B-to-Z is due to the mechanistic effect that originates from its pronounced nonisostericity with flanking canonical base pairs facilitated by base extrusion, backbone and/or base flipping. Based on these structural insights we envisage that such an unusual DNA structure of the CAG hairpin stem may have a role in disease pathogenesis. As this is the first study that delineates the influence of a single A…A mismatch in reversing DNA helicity, it would further have an impact on understanding DNA mismatch repair. PMID:25876062
Raymond, Frédéric; Boisvert, Sébastien; Roy, Gaétan; Ritt, Jean-François; Légaré, Danielle; Isnard, Amandine; Stanke, Mario; Olivier, Martin; Tremblay, Michel J.; Papadopoulou, Barbara; Ouellette, Marc; Corbeil, Jacques
2012-01-01
The Leishmania tarentolae Parrot-TarII strain genome sequence was resolved to a mean 16-fold coverage by next-generation DNA sequencing technologies. This is the first genome of a kinetoplastid protozoan non-pathogenic to humans to be described, thus providing an opportunity for comparison with the completed genomes of pathogenic Leishmania species. A high synteny was observed between all sequenced Leishmania species. A limited number of chromosomal regions diverged between L. tarentolae and L. infantum, while remaining syntenic to L. major. Globally, >90% of the L. tarentolae gene content was shared with the other Leishmania species. We identified 95 predicted coding sequences unique to L. tarentolae and 250 genes that were absent from L. tarentolae. Interestingly, many of the latter genes were expressed in the intracellular amastigote stage of pathogenic species. In addition, genes coding for products involved in antioxidant defence or participating in vesicular-mediated protein transport were underrepresented in L. tarentolae. In contrast to other Leishmania genomes, two gene families were expanded in L. tarentolae, namely the zinc metallo-peptidase surface glycoprotein GP63 and the promastigote surface antigen PSA31C. Overall, L. tarentolae's gene content appears better adapted to the promastigote insect stage than to the amastigote mammalian stage. PMID:21998295
Schmidt, Ellen M; Zhang, Ji; Zhou, Wei; Chen, Jin; Mohlke, Karen L; Chen, Y Eugene; Willer, Cristen J
2015-08-15
The majority of variation identified by genome wide association studies falls in non-coding genomic regions and is hypothesized to impact regulatory elements that modulate gene expression. Here we present a statistically rigorous software tool GREGOR (Genomic Regulatory Elements and Gwas Overlap algoRithm) for evaluating enrichment of any set of genetic variants with any set of regulatory features. Using variants from five phenotypes, we describe a data-driven approach to determine the tissue and cell types most relevant to a trait of interest and to identify the subset of regulatory features likely impacted by these variants. Last, we experimentally evaluate six predicted functional variants at six lipid-associated loci and demonstrate significant evidence for allele-specific impact on expression levels. GREGOR systematically evaluates enrichment of genetic variation with the vast collection of regulatory data available to explore novel biological mechanisms of disease and guide us toward the functional variant at trait-associated loci. GREGOR, including source code, documentation, examples, and executables, is available at http://genome.sph.umich.edu/wiki/GREGOR. Supplementary data are available at Bioinformatics online.
Gharoro, E P; Enabudoso, E J; Sodje, D K J
2011-01-01
The objective of the study is to investigate the prevalence and risk factors of non-consensual sex/rape in Benin. We surveyed 580 females in the University Community of Benin; 414 questionnaires were sufficiently completed for analysis. Seventy-six (18.4%) respondents reported that they had been victims of non-consensual sex (NCS), 36 in their current relationship. The unmarried single respondents had the lowest mean age at NCS experience of 18 years, while the divorced victims had the highest mean age of 32.5 (P = 0.000). There was a major exposure peak age at 19 years with a smaller peak at 25. The majority of sex offenders were the victims' present partners, followed by husbands (22.2%). The father was the perpetrator on one (2.78%) occasion, while armed robbers raped two of the victims. Eighteen of the seventy-six respondents made a formal report. Cumulatively, 95.4% of the respondents felt that reporting was futile; four (5.3%) felt it was not all a bad experience. The risk of being infected with the HIV/AIDS virus was the worst fear. Ninety-five of four hundred and fourteen respondents want the public and parents to be educated, 64 would like the penal code to be tougher and better implemented, while 64 (14.0%) want a dress code for the University community. The self-reported incidence of NCS is high; the majority of cases were not formally reported, as most of the sex offenders were the (ex)partners of the victims. There was a condoned sense of futility and frustration in reporting.
2018-01-01
FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13), is a member of lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence duplicated in chr22, and are specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes. PMID:29668722
Onufrak, Stephen; Wilking, Cara; Cradock, Angie
2018-01-01
We examined community-level characteristics associated with free drinking water access policies in U.S. municipalities using data from a nationally representative survey of city managers/officials from 2,029 local governments in 2014. Outcomes were 4 free drinking water access policies. Explanatory measures were population size, rural/urban status, census region, poverty prevalence, education, and racial/ethnic composition. We used multivariable logistic regression to test differences and presented only significant findings. Many (56.3%) local governments had at least one community plan with a written objective to provide free drinking water in outdoor areas; municipalities in the Northeast and South regions and municipalities with ≤ 50% of non-Hispanic whites were less likely and municipalities with larger population size were more likely to have a plan. About 59% had policies/budget provisions for free drinking water in parks/outdoor recreation areas; municipalities in the Northeast and South regions were less likely and municipalities with larger population size were more likely to have them. Only 9.3% provided development incentives for placing drinking fountains in outdoor, publicly accessible areas; municipalities with larger population size were more likely to do so. Only 7.7% had a municipal plumbing code with a drinking fountain standard that differed from the statewide plumbing code; municipalities with a lower proportion of non-Hispanic whites were more likely to have one. In conclusion, over half of municipalities had written plans or a provision for providing free drinking water in parks, but providing development incentives or having a local plumbing code provision was rare. PMID:29713617
Park, Sohyun; Onufrak, Stephen; Wilking, Cara; Cradock, Angie
2018-04-01
Wagner, Josiah T; Herrejon Chavez, Florisela; Podrabsky, Jason E
2016-01-01
The annual killifish Austrofundulus limnaeus inhabits ephemeral ponds in regions of Venezuela, South America. Permanent populations of A. limnaeus are maintained by production of stress-tolerant embryos that are able to persist in the desiccated sediment. Previous work has demonstrated that A. limnaeus have a remarkable ability to tolerate extended periods of anoxia and desiccating conditions. After considering temperature, A. limnaeus embryos have the highest known tolerance to anoxia when compared to any other vertebrate yet studied. Oxygen is completely essential for the process of oxidative phosphorylation by mitochondria, the intracellular organelle responsible for the majority of adenosine triphosphate production. Thus, understanding the unique properties of A. limnaeus mitochondria is of great interest. In this work, we describe the first complete mitochondrial genome (mtgenome) sequence of a single adult A. limnaeus individual and compare both coding and non-coding regions to several other closely related fish mtgenomes. Mitochondrial features were predicted using MitoAnnotator and polyadenylation sites were predicted using RNAseq mapping. To estimate the responsiveness of A. limnaeus mitochondria to anoxia treatment, we measure relative mitochondrial DNA copy number and total citrate synthase activity in both relatively anoxia-tolerant and anoxia-sensitive embryonic stages. Our cross-species comparative approach identifies unique features of ND1, ND5, ND6, and ATPase-6 that may facilitate the unique phenotype of A. limnaeus embryos. Additionally, we do not find evidence for mitochondrial degradation or biogenesis during anoxia/reoxygenation treatment in A. limnaeus embryos, suggesting that anoxia-tolerant mitochondria do not respond to anoxia in a manner similar to anoxia-sensitive mitochondria.
Swalla, B J; Just, M A; Pederson, E L; Jeffery, W R
1999-04-01
The Manx gene is required for the development of the tail and other chordate features in the ascidian tadpole larva. To determine the structure of the Manx gene, we isolated and sequenced genomic clones from the tailed ascidian Molgula oculata. The Manx gene contains 9 exons and encodes both major and minor Manx mRNAs, which differ in the length of their 5' untranslated regions. The coding region of the single-copy bobcat gene, which encodes a DEAD-box RNA helicase, is embedded within the first Manx intron. The organization of the bobcat and Manx transcription units was determined by comparing genomic and cDNA clones. The Manx-bobcat gene locus has an unusual organization in which a non-coding first exon is alternatively spliced at the 5' end of two different mRNAs. The bobcat and Manx genes are expressed coordinately during oogenesis and embryogenesis, but not during spermatogenesis, in which bobcat mRNA accumulates independently of Manx mRNA. Similar to Manx, zygotic bobcat transcripts accumulate in the embryonic primordia responsible for generating chordate features, including the dorsal neural tube and notochord, are downregulated during embryogenesis in the tailless species Molgula occulta and are upregulated in M. occulta X M. oculata hybrids, which restore these chordate features. Antisense experiments indicate that zygotic bobcat expression is required for development of the same suite of chordate features as Manx. The results show that the Manx-bobcat gene complex has a role in the development of chordate features in ascidian tadpole larvae.
The complete sequence of mitochondrial genome of polled yak (Bos grunniens).
Chu, Min; Wu, Xiaoyun; Liang, Chunnian; Pei, Jie; Ding, Xuezhi; Guo, Xian; Bao, Pengjia; Yan, Ping
2016-05-01
The hornless trait is also known as polled. Although the POLL locus has been assigned to a 1.36-Mb interval in the centromeric region of BTA1 (Georges et al., 1993; Drögemüller et al., 2005), and Liu et al. (2014) reported that a 147-kb segment containing three protein-coding genes was the most likely location of the POLL mutation in domestic yaks, the underlying genetic basis for the polled trait is still unknown. In this work, the complete mitochondrial genome sequence of polled yak was determined for the first time. The mitogenome is 16,324 bp long, with a base composition of 33.72% A, 27.25% T, 25.83% C, and 13.20% G. It contains 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and 1 non-coding region (D-loop region). The gene order of the polled yak mitogenome is identical to that observed in most other vertebrates. The complete mitogenome sequence of polled yak will provide useful data for further studies on the protection of genetic resources and on phylogenetic relationships within Bos grunniens.
Self-organizing approach for meta-genomes.
Zhu, Jianfeng; Zheng, Wei-Mou
2014-12-01
We extend the self-organizing approach for annotation of a bacterial genome to analyze the raw sequencing data of the human gut metagenome without sequence assembling. The original approach divides the genomic sequence of a bacterium into non-overlapping segments of equal length and assigns to each segment one of seven 'phases', among which one is for the noncoding regions, three for the direct coding regions to indicate the three possible codon positions of the segment starting site, and three for the reverse coding regions. The noncoding phase and the six coding phases are described by two frequency tables of the 64 triplet types or 'codon usages'. A set of codon usages can be used to update the phase assignment and vice versa. An iteration after an initialization leads to a convergent phase assignment to give an annotation of the genome. In the extension of the approach to a metagenome, we consider a mixture model of a number of categories described by different codon usages. The Illumina Genome Analyzer sequencing data of the total DNA from faecal samples are then examined to understand the diversity of the human gut microbiome.
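The alternation described in this abstract (codon usages update the phase assignment, and the assignment updates the codon usages) can be sketched roughly as below. This is only an illustrative reading of the abstract, not the authors' implementation: the function names, the random initialization, the add-one pseudo-counts, the mean log-likelihood score and the default segment length are all assumptions, and the input is assumed to contain only A, C, G and T.

```python
import math
import random
from itertools import product

TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]  # the 64 triplet types
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def triplets(segment, phase):
    """Triplets of a segment as read under a phase (assumed encoding):
    0 = noncoding, 1-3 = direct strand with frame offset 0-2,
    4-6 = reverse strand with frame offset 0-2."""
    if phase == 0:
        seq, offset = segment, 0
    elif phase <= 3:
        seq, offset = segment, phase - 1
    else:
        seq, offset = revcomp(segment), phase - 4
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def estimate_tables(segments, phases):
    """Re-estimate the two triplet frequency tables from a phase assignment."""
    counts = {"non": dict.fromkeys(TRIPLETS, 1.0),   # add-one pseudo-counts
              "cod": dict.fromkeys(TRIPLETS, 1.0)}
    for segment, phase in zip(segments, phases):
        table = counts["non" if phase == 0 else "cod"]
        for t in triplets(segment, phase):
            table[t] += 1.0
    return {name: {t: n / sum(c.values()) for t, n in c.items()}
            for name, c in counts.items()}


def assign_phases(segments, tables):
    """Give each segment the phase with the best mean log-likelihood."""
    phases = []
    for segment in segments:
        def score(phase):
            table = tables["non" if phase == 0 else "cod"]
            ts = triplets(segment, phase)
            return sum(math.log(table[t]) for t in ts) / len(ts)
        phases.append(max(range(7), key=score))
    return phases


def self_organize(seq, seg_len=90, max_iter=50, seed=0):
    """Alternate table estimation and phase assignment until nothing changes."""
    rng = random.Random(seed)
    segments = [seq[i:i + seg_len]
                for i in range(0, len(seq) - seg_len + 1, seg_len)]
    phases = [rng.randrange(7) for _ in segments]   # arbitrary initialization
    for _ in range(max_iter):
        new_phases = assign_phases(segments, estimate_tables(segments, phases))
        if new_phases == phases:
            break
        phases = new_phases
    return phases, estimate_tables(segments, phases)
```

Calling `self_organize(genome_string)` returns one phase label per segment together with the two fitted tables. The metagenome extension mentioned in the abstract would replace the single pair of tables with one pair per mixture category, which this sketch does not attempt.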
Wang, Jiajia; Li, Hu; Dai, Renhuai
2017-12-01
Here, we describe the first complete mitochondrial genome (mitogenome) sequence of the leafhopper Taharana fasciana (Coelidiinae). The mitogenome sequence contains 15,161 bp with an A + T content of 77.9%. It includes 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and one non-coding (A + T-rich) region; in addition, a repeat region is also present (GenBank accession no. KY886913). These genes/regions are in the same order as in the inferred insect ancestral mitogenome. All protein-coding genes have ATN as the start codon, and TAA or single T as the stop codons, except the gene ND3, which ends with TAG. Furthermore, we predicted the secondary structures of the rRNAs in T. fasciana. Six domains (domain III is absent in arthropods) and 41 helices were predicted for 16S rRNA, and 12S rRNA comprised three structural domains and 24 helices. Phylogenetic tree analysis confirmed that T. fasciana and other members of the Cicadellidae are clustered into a clade, and it identified the relationships among the subfamilies Deltocephalinae, Coelidiinae, Idiocerinae, Cicadellinae, and Typhlocybinae.
Sost, independent of the non-coding enhancer ECR5, is required for bone mechanoadaptation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robling, Alexander G.; Kang, Kyung Shin; Bullock, Whitney A.
Here, sclerostin (Sost) is a negative regulator of bone formation that acts upon the Wnt signaling pathway. Sost is mechanically regulated at both mRNA and protein level such that loading represses and unloading enhances Sost expression, in osteocytes and in circulation. The non-coding evolutionarily conserved enhancer ECR5 has been previously reported as a transcriptional regulatory element required for modulating Sost expression in osteocytes. Here we explored the mechanisms by which ECR5, or several other putative transcriptional enhancers regulate Sost expression, in response to mechanical stimulation. We found that in vivo ulna loading is equally osteoanabolic in wildtype and Sost–/– mice, although Sost is required for proper distribution of load-induced bone formation to regions of high strain. Using Luciferase reporters carrying the ECR5 non-coding enhancer and heterologous or homologous hSOST promoters, we found that ECR5 is mechanosensitive in vitro and that ECR5-driven Luciferase activity decreases in osteoblasts exposed to oscillatory fluid flow. Yet, ECR5–/– mice showed similar magnitude of load-induced bone formation and similar periosteal distribution of bone formation to high-strain regions compared to wildtype mice. Further, we found that in contrast to Sost–/– mice, which are resistant to disuse-induced bone loss, ECR5–/– mice lose bone upon unloading to a degree similar to wildtype control mice. ECR5 deletion did not abrogate positive effects of unloading on Sost, suggesting that additional transcriptional regulators and regulatory elements contribute to load-induced regulation of Sost.
Sost, independent of the non-coding enhancer ECR5, is required for bone mechanoadaptation
Robling, Alexander G.; Kang, Kyung Shin; Bullock, Whitney A.; ...
2016-09-04
77 FR 11163 - Notice of Buy American Waiver Under the American Recovery and Reinvestment Act of 2009
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-24
... holding power balanced anchors that will be used in the Alaska Region Research Vessel (ARRV). These... and section 176.80 of Title 2 of the Code of Federal Regulations, the National Science Foundation (NSF... being funded by the Foundation's Major Research Equipment and Facilities Construction (MREFC) account...
Potassium bromate (KBrO3) is a rat renal carcinogen and a major drinking water disinfection by-product from ozonization. While KBrO3 is a human nephro- and neuro-toxicant, its carcinogenicity in humans is unknown. Clear cell renal tumors, the common form of human renal carcinomas...
Gill, Andrew B; Anandappa, Gayathri; Patterson, Andrew J; Priest, Andrew N; Graves, Martin J; Janowitz, Tobias; Jodrell, Duncan I; Eisen, Tim; Lomas, David J
2015-02-01
This study introduces the use of 'error-category mapping' in the interpretation of pharmacokinetic (PK) model parameter results derived from dynamic contrast-enhanced (DCE-) MRI data. Eleven patients with metastatic renal cell carcinoma were enrolled in a multiparametric study of the treatment effects of bevacizumab. For the purposes of the present analysis, DCE-MRI data from two identical pre-treatment examinations were analysed by application of the extended Tofts model (eTM), using in turn a model arterial input function (AIF), an individually-measured AIF and a sample-average AIF. PK model parameter maps were calculated. Errors in the signal-to-gadolinium concentration ([Gd]) conversion process and the model-fitting process itself were assigned to category codes on a voxel-by-voxel basis, thereby forming a colour-coded 'error-category map' for each imaged slice. These maps were found to be repeatable between patient visits and showed that the eTM converged adequately in the majority of voxels in all the tumours studied. However, the maps also clearly indicated sub-regions of low Gd uptake and of non-convergence of the model in nearly all tumours. The non-physical condition ve ≥ 1 was the most frequently indicated error category and appeared sensitive to the form of AIF used. This simple method for visualisation of errors in DCE-MRI could be used as a routine quality-control technique and also has the potential to reveal otherwise hidden patterns of failure in PK model applications.
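The voxel-by-voxel assignment of category codes described in this abstract might look something like the sketch below. The abstract names three kinds of failure (low Gd uptake, non-convergence of the fit, and the non-physical condition ve ≥ 1); everything else here is assumed for illustration, including the numeric codes, the order in which the checks are applied, the `min_peak_gd` threshold and the plain nested-list slice layout.

```python
from enum import IntEnum


class ErrorCategory(IntEnum):
    """Hypothetical category codes; the study's actual codes are not given."""
    OK = 0
    LOW_UPTAKE = 1        # too little gadolinium uptake to fit reliably
    NOT_CONVERGED = 2     # model fit did not converge
    VE_NONPHYSICAL = 3    # fitted ve >= 1


def categorise_voxel(peak_gd, converged, ve, min_peak_gd=0.05):
    """Return one category for a voxel; the first failing check wins."""
    if peak_gd < min_peak_gd:          # threshold value is an assumption
        return ErrorCategory.LOW_UPTAKE
    if not converged:
        return ErrorCategory.NOT_CONVERGED
    if ve >= 1.0:
        return ErrorCategory.VE_NONPHYSICAL
    return ErrorCategory.OK


def error_category_map(peak_gd, converged, ve, min_peak_gd=0.05):
    """Build an integer-coded map for one slice given as nested lists."""
    return [
        [int(categorise_voxel(g, c, v, min_peak_gd))
         for g, c, v in zip(g_row, c_row, v_row)]
        for g_row, c_row, v_row in zip(peak_gd, converged, ve)
    ]
```

The resulting integer grid can be passed to any plotting routine with a discrete colour map to obtain the colour-coded display the abstract describes.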
Nuclear export of RNA: Different sizes, shapes and functions.
Williams, Tobias; Ngo, Linh H; Wickramasinghe, Vihandha O
2018-03-01
Export of protein-coding and non-coding RNA molecules from the nucleus to the cytoplasm is critical for gene expression. This necessitates the continuous transport of RNA species of different size, shape and function through nuclear pore complexes via export receptors and adaptor proteins. Here, we provide an overview of the major RNA export pathways in humans, highlighting the similarities and differences between each. Its importance is underscored by the growing appreciation that deregulation of RNA export pathways is associated with human diseases like cancer.
Giakountis, Antonis; Moulos, Panagiotis; Zarkou, Vasiliki; Oikonomou, Christina; Harokopos, Vaggelis; Hatzigeorgiou, Artemis G; Reczko, Martin; Hatzis, Pantelis
2016-06-21
The canonical Wnt pathway plays a central role in stem cell maintenance, differentiation, and proliferation in the intestinal epithelium. Constitutive, aberrant activity of the TCF4/β-catenin transcriptional complex is the primary transforming factor in colorectal cancer. We identify a nuclear long non-coding RNA, termed WiNTRLINC1, as a direct target of TCF4/β-catenin in colorectal cancer cells. WiNTRLINC1 positively regulates the expression of its genomic neighbor ASCL2, a transcription factor that controls intestinal stem cell fate. WiNTRLINC1 interacts with TCF4/β-catenin to mediate the juxtaposition of its promoter with the regulatory regions of ASCL2. ASCL2, in turn, regulates WiNTRLINC1 transcriptionally, closing a feedforward regulatory loop that controls stem cell-related gene expression. This regulatory circuitry is highly amplified in colorectal cancer and correlates with increased metastatic potential and decreased patient survival. Our results uncover the interplay between non-coding RNA-mediated regulation and Wnt signaling and point to the diagnostic and therapeutic potential of WiNTRLINC1.
VaDiR: an integrated approach to Variant Detection in RNA.
Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy
2018-02-01
Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA(VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
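One way to picture the Tier 1 set in this abstract is as the variants reported by all three callers, scored against DNA-validated calls. The sketch below takes that reading as an assumption (the abstract says only "the combination of all 3 methods"); the function names and the `(chrom, pos, ref, alt)` tuple representation are hypothetical and are not part of VaDiR itself.

```python
def tier1(call_sets):
    """Variants present in every caller's set (assumed meaning of Tier 1).

    call_sets maps a caller name to an iterable of (chrom, pos, ref, alt).
    """
    sets = [set(calls) for calls in call_sets.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def precision_recall(called, truth):
    """Precision and recall of a call set against a validated truth set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy usage with made-up variants
a = ("chr1", 100, "A", "G")
b = ("chr2", 5, "C", "T")
c = ("chr3", 42, "G", "A")
calls = {"SNPiR": {a, b}, "RVBoost": {a, c}, "MuTect2": {a, b, c}}
consensus = tier1(calls)
precision, recall = precision_recall(consensus, truth={a, b})
```

Requiring agreement among all callers tends to trade recall for precision, which is consistent with the abstract's report that Tier 1 gave the highest precision while a broader combination gave the highest recall.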
Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila
2010-07-16
Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimation of the degree of sequence conservation and the mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species, Musa acuminata (A genome) and Musa balbisiana (B genome), contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease, but little is known about the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions and in duplicated non-coding intergenic sequences, including low complexity regions, and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes, highlighting the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. The high level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
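The abstract above does not say which formula or substitution rate was used for the divergence-time estimate. As a hedged illustration only, the conventional molecular-clock relation for two lineages diverging from a common ancestor is shown below. The symbols are assumptions introduced here: K_s is the number of synonymous substitutions per synonymous site between the two haplotypes, r is the synonymous substitution rate per site per year, and T is the divergence time in years.

```latex
T \approx \frac{K_s}{2\,r}
```

The factor of 2 reflects that substitutions accumulate independently along both lineages after they split.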
Incremental Parallelization of Non-Data-Parallel Programs Using the Charon Message-Passing Library
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.
2000-01-01
Message passing is among the most popular techniques for parallelizing scientific programs on distributed-memory architectures. The reasons for its success are wide availability (MPI), efficiency, and full tuning control provided to the programmer. A major drawback, however, is that incremental parallelization, as offered by compiler directives, is not generally possible, because all data structures have to be changed throughout the program simultaneously. Charon remedies this situation through mappings between distributed and non-distributed data. It allows breaking up the parallelization into small steps, guaranteeing correctness at every stage. Several tools are available to help convert legacy codes into high-performance message-passing programs. They usually target data-parallel applications, whose loops carrying most of the work can be distributed among all processors without much dependency analysis. Others do a full dependency analysis and then convert the code virtually automatically. Even more toolkits are available that aid construction from scratch of message passing programs. None, however, allows piecemeal translation of codes with complex data dependencies (i.e. non-data-parallel programs) into message passing codes. The Charon library (available in both C and Fortran) provides incremental parallelization capabilities by linking legacy code arrays with distributed arrays. During the conversion process, non-distributed and distributed arrays exist side by side, and simple mapping functions allow the programmer to switch between the two in any location in the program. Charon also provides wrapper functions that leave the structure of the legacy code intact, but that allow execution on truly distributed data. 
Finally, the library provides a rich set of communication functions that support virtually all patterns of remote data demands in realistic structured grid scientific programs, including transposition, nearest-neighbor communication, pipelining, gather/scatter, and redistribution. At the end of the conversion process most intermediate Charon function calls will have been removed, the non-distributed arrays will have been deleted, and virtually the only remaining Charon function calls are the high-level, highly optimized communications. Distribution of the data is under complete control of the programmer, although a wide range of useful distributions is easily available through predefined functions. A crucial aspect of the library is that it does not allocate space for distributed arrays, but accepts programmer-specified memory. This has two major consequences. First, codes parallelized using Charon do not suffer from encapsulation; user data is always directly accessible. This provides high efficiency, and also retains the possibility of using message passing directly for highly irregular communications. Second, non-distributed arrays can be interpreted as (trivial) distributions in the Charon sense, which allows them to be mapped to truly distributed arrays, and vice versa. This is the mechanism that enables incremental parallelization. In this paper we provide a brief introduction to the library and then focus on the actual steps in the parallelization process, using some representative examples from, among others, the NAS Parallel Benchmarks. We show how a complicated two-dimensional pipeline (the prototypical non-data-parallel algorithm) can be constructed with ease. To demonstrate the flexibility of the library, we give examples of the stepwise, efficient parallel implementation of nonlocal boundary conditions common in aircraft simulations, as well as the construction of the sequence of grids required for multigrid.
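The idea of mapping between non-distributed and distributed arrays can be sketched in a few lines. This is a minimal single-process illustration, not the Charon API: the names `block_bounds`, `distribute`, `collect`, `legacy_step` and `local_step` are hypothetical, the block distribution is an assumed example, and no message passing takes place. It only shows why being able to switch between the two representations lets one converted step be checked against the untouched legacy code.

```python
def block_bounds(n, nprocs, rank):
    """Half-open index range [start, stop) owned by `rank` in a block distribution."""
    base, extra = divmod(n, nprocs)
    start = rank * base + min(rank, extra)
    size = base + (1 if rank < extra else 0)
    return start, start + size


def distribute(global_array, nprocs):
    """Map a non-distributed array to per-rank blocks (hypothetical helper)."""
    n = len(global_array)
    return [global_array[slice(*block_bounds(n, nprocs, r))] for r in range(nprocs)]


def collect(blocks):
    """Inverse mapping: reassemble the non-distributed array from its blocks."""
    out = []
    for block in blocks:
        out.extend(block)
    return out


def legacy_step(a):
    """Stand-in for a piece of legacy code working on the whole array."""
    return [2 * x + 1 for x in a]


def local_step(block):
    """The same computation, applied by one rank to the block it owns."""
    return [2 * x + 1 for x in block]


data = list(range(10))
# Convert one step at a time: run it on the blocks, map back, compare with legacy.
converted = collect([local_step(b) for b in distribute(data, 4)])
assert converted == legacy_step(data)
```

In a real message-passing code each rank would hold only its own block; the comparison against the legacy result is the kind of per-stage correctness check that the incremental approach makes possible.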
ICF target 2D modeling using Monte Carlo SNB electron thermal transport in DRACO
NASA Astrophysics Data System (ADS)
Chenhall, Jeffrey; Cao, Duc; Moses, Gregory
2016-10-01
The iSNB (implicit Schurtz Nicolai Busquet) multigroup diffusion electron thermal transport method is adapted into a Monte Carlo (MC) transport method to better model angular and long mean free path non-local effects. The MC model was first implemented in the 1D LILAC code to verify consistency with the iSNB model. Implementation of the MC SNB model in the 2D DRACO code enables higher fidelity non-local thermal transport modeling in 2D implosions such as polar drive experiments on NIF. The final step is to optimize the MC model by hybridizing it with an MC version of the iSNB diffusion method. The hybrid method will combine the efficiency of a diffusion method in intermediate mean free path regions with the accuracy of a transport method in long mean free path regions, allowing for improved computational efficiency while maintaining accuracy. Work to date on the method will be presented. This work was supported by Sandia National Laboratories and the Univ. of Rochester Laboratory for Laser Energetics.
Brečević, Lukrecija; Rinčić, Martina; Krsnik, Željka; Sedmak, Goran; Hamid, Ahmed B.; Kosyakova, Nadezda; Galić, Ivan; Liehr, Thomas; Borovečki, Fran
2015-01-01
We describe an as yet unreported neocentric small supernumerary marker chromosome (sSMC) derived from chromosome 1p21.3p21.2. It was present in 80% of the lymphocytes in a male patient with intellectual disability, severe speech deficit, mild dysmorphic features, and hyperactivity with elements of autism spectrum disorder (ASD). Several important neurodevelopmental genes are affected by the 3.56 Mb copy number gain of 1p21.3p21.2, which may be considered reciprocal in gene content to the recently recognized 1p21.3 microdeletion syndrome. Both 1p21.3 deletions and the presented duplication display overlapping symptoms, fitting the same disorder category. Contribution of coding and non-coding genes to the phenotype is discussed in the light of cellular and intercellular homeostasis disequilibrium. In line with this the presented 1p21.3p21.2 copy number gain correlated to 1p21.3 microdeletion syndrome verifies the hypothesis of a cumulative effect of the number of deregulated genes - homeostasis disequilibrium leading to overlapping phenotypes between microdeletion and microduplication syndromes. Although miR-137 appears to be the major player in the 1p21.3p21.2 region, deregulation of the DPYD (dihydropyrimidine dehydrogenase) gene may potentially affect neighboring genes underlying the overlapping symptoms present in both the copy number loss and copy number gain of 1p21. Namely, the all-in approach revealed that DPYD is a complex gene whose expression is epigenetically regulated by long non-coding RNAs (lncRNAs) within the locus. Furthermore, the long interspersed nuclear element-1 (LINE-1) L1MC1 transposon inserted in DPYD intronic transcript 1 (DPYD-IT1) lncRNA with its parasites, TcMAR-Tigger5b and pair of Alu repeats appears to be the “weakest link” within the DPYD gene liable to break. 
Identification of the precise mechanism through which DPYD is epigenetically regulated, and underlying reasons why exactly the break (FRA1E) happens, will consequently pave the way toward preventing severe toxicity to the antineoplastic drug 5-fluorouracil (5-FU) and development of the causative therapy for the dihydropyrimidine dehydrogenase deficiency. PMID:28123791
A Model of Ethical Decision Making from a Multicultural Perspective
ERIC Educational Resources Information Center
Frame, Marsha Wiggins; Williams, Carmen Braun
2005-01-01
Because shifts in the world's ethnic and racial demographics mean that the majority of the world's population is non-White (M. D'Andrea & P. Arredondo, 1997), it is imperative that counselors develop a means for working ethically with a diverse clientele. In this article, the authors argue that the current Code of Ethics and Standards of Practice…
USDA-ARS's Scientific Manuscript database
Background: Small non-coding RNAs (smRNAs) are known to have major roles in gene regulation in eukaryotes. In plants, knowledge of the biogenesis and mechanisms of action of smRNA classes including microRNAs (miRNAs), short interfering RNAs (siRNAs), and trans-acting siRNAs (tasiRNAs) has been gaine...
The non-coding RNA landscape of human hematopoiesis and leukemia.
Schwarzer, Adrian; Emmrich, Stephan; Schmidt, Franziska; Beck, Dominik; Ng, Michelle; Reimer, Christina; Adams, Felix Ferdinand; Grasedieck, Sarah; Witte, Damian; Käbler, Sebastian; Wong, Jason W H; Shah, Anushi; Huang, Yizhou; Jammal, Razan; Maroz, Aliaksandra; Jongen-Lavrencic, Mojca; Schambach, Axel; Kuchenbauer, Florian; Pimanda, John E; Reinhardt, Dirk; Heckl, Dirk; Klusmann, Jan-Henning
2017-08-09
Non-coding RNAs have emerged as crucial regulators of gene expression and cell fate decisions. However, their expression patterns and regulatory functions during normal and malignant human hematopoiesis are incompletely understood. Here we present a comprehensive resource defining the non-coding RNA landscape of the human hematopoietic system. Based on highly specific non-coding RNA expression portraits per blood cell population, we identify unique fingerprint non-coding RNAs, such as LINC00173 in granulocytes, and assign these to critical regulatory circuits involved in blood homeostasis. Following the incorporation of acute myeloid leukemia samples into the landscape, we further uncover prognostically relevant non-coding RNA stem cell signatures shared between acute myeloid leukemia blasts and healthy hematopoietic stem cells. Our findings highlight the importance of the non-coding transcriptome in the formation and maintenance of the human blood hierarchy. While micro-RNAs are known regulators of haematopoiesis and leukemogenesis, the role of long non-coding RNAs is less clear. Here the authors provide a non-coding RNA expression landscape of the human hematopoietic system, highlighting their role in the formation and maintenance of the human blood hierarchy.
Region 9 NPL Sites (Superfund Sites 2013)
NPL site POINT locations for the US EPA Region 9. NPL (National Priorities List) sites are hazardous waste sites that are eligible for extensive long-term cleanup under the Superfund program. Eligibility is determined by a scoring method called the Hazard Ranking System. Sites with high scores are listed on the NPL. The majority of the locations are derived from polygon centroids of digitized site boundaries. The remaining locations were generated from address geocoding and digitizing. Areas covered by this data set include Arizona, California, Nevada, Hawaii, Guam, American Samoa, Northern Marianas and Trust Territories. Attributes include NPL status codes, NPL industry type codes and environmental indicators. The related table, NPL_Contaminants, contains information about contaminated media types and chemicals. This is a one-to-many relate and can be related to the feature class using the relationship classes under the Feature Data Set ENVIRO_CONTAMINANT.
Shaffer, Christopher D.; Chen, Elizabeth J.; Quisenberry, Thomas J.; Ko, Kevin; Braverman, John M.; Giarla, Thomas C.; Mortimer, Nathan T.; Reed, Laura K.; Smith, Sheryl T.; Robic, Srebrenka; McCartha, Shannon R.; Perry, Danielle R.; Prescod, Lindsay M.; Sheppard, Zenyth A.; Saville, Ken J.; McClish, Allison; Morlock, Emily A.; Sochor, Victoria R.; Stanton, Brittney; Veysey-White, Isaac C.; Revie, Dennis; Jimenez, Luis A.; Palomino, Jennifer J.; Patao, Melissa D.; Patao, Shane M.; Himelblau, Edward T.; Campbell, Jaclyn D.; Hertz, Alexandra L.; McEvilly, Maddison F.; Wagner, Allison R.; Youngblom, James; Bedi, Baljit; Bettincourt, Jeffery; Duso, Erin; Her, Maiye; Hilton, William; House, Samantha; Karimi, Masud; Kumimoto, Kevin; Lee, Rebekah; Lopez, Darryl; Odisho, George; Prasad, Ricky; Robbins, Holly Lyn; Sandhu, Tanveer; Selfridge, Tracy; Tsukashima, Kara; Yosif, Hani; Kokan, Nighat P.; Britt, Latia; Zoellner, Alycia; Spana, Eric P.; Chlebina, Ben T.; Chong, Insun; Friedman, Harrison; Mammo, Danny A.; Ng, Chun L.; Nikam, Vinayak S.; Schwartz, Nicholas U.; Xu, Thomas Q.; Burg, Martin G.; Batten, Spencer M.; Corbeill, Lindsay M.; Enoch, Erica; Ensign, Jesse J.; Franks, Mary E.; Haiker, Breanna; Ingles, Judith A.; Kirkland, Lyndsay D.; Lorenz-Guertin, Joshua M.; Matthews, Jordan; Mittig, Cody M.; Monsma, Nicholaus; Olson, Katherine J.; Perez-Aragon, Guillermo; Ramic, Alen; Ramirez, Jordan R.; Scheiber, Christopher; Schneider, Patrick A.; Schultz, Devon E.; Simon, Matthew; Spencer, Eric; Wernette, Adam C.; Wykle, Maxine E.; Zavala-Arellano, Elizabeth; McDonald, Mitchell J.; Ostby, Kristine; Wendland, Peter; DiAngelo, Justin R.; Ceasrine, Alexis M.; Cox, Amanda H.; Docherty, James E.B.; Gingras, Robert M.; Grieb, Stephanie M.; Pavia, Michael J.; Personius, Casey L.; Polak, Grzegorz L.; Beach, Dale L.; Cerritos, Heaven L.; Horansky, Edward A.; Sharif, Karim A.; Moran, Ryan; Parrish, Susan; Bickford, Kirsten; Bland, Jennifer; Broussard, Juliana; Campbell, Kerry; Deibel, 
Katelynn E.; Forka, Richard; Lemke, Monika C.; Nelson, Marlee B.; O'Keeffe, Catherine; Ramey, S. Mariel; Schmidt, Luke; Villegas, Paola; Jones, Christopher J.; Christ, Stephanie L.; Mamari, Sami; Rinaldi, Adam S.; Stity, Ghazal; Hark, Amy T.; Scheuerman, Mark; Silver Key, S. Catherine; McRae, Briana D.; Haberman, Adam S.; Asinof, Sam; Carrington, Harriette; Drumm, Kelly; Embry, Terrance; McGuire, Richard; Miller-Foreman, Drew; Rosen, Stella; Safa, Nadia; Schultz, Darrin; Segal, Matt; Shevin, Yakov; Svoronos, Petros; Vuong, Tam; Skuse, Gary; Paetkau, Don W.; Bridgman, Rachael K.; Brown, Charlotte M.; Carroll, Alicia R.; Gifford, Francesca M.; Gillespie, Julie Beth; Herman, Susan E.; Holtcamp, Krystal L.; Host, Misha A.; Hussey, Gabrielle; Kramer, Danielle M.; Lawrence, Joan Q.; Martin, Madeline M.; Niemiec, Ellen N.; O'Reilly, Ashleigh P.; Pahl, Olivia A.; Quintana, Guadalupe; Rettie, Elizabeth A.S.; Richardson, Torie L.; Rodriguez, Arianne E.; Rodriguez, Mona O.; Schiraldi, Laura; Smith, Joanna J.; Sugrue, Kelsey F.; Suriano, Lindsey J.; Takach, Kaitlyn E.; Vasquez, Arielle M.; Velez, Ximena; Villafuerte, Elizabeth J.; Vives, Laura T.; Zellmer, Victoria R.; Hauke, Jeanette; Hauser, Charles R.; Barker, Karolyn; Cannon, Laurie; Parsamian, Perouza; Parsons, Samantha; Wichman, Zachariah; Bazinet, Christopher W.; Johnson, Diana E.; Bangura, Abubakarr; Black, Jordan A.; Chevee, Victoria; Einsteen, Sarah A.; Hilton, Sarah K.; Kollmer, Max; Nadendla, Rahul; Stamm, Joyce; Fafara-Thompson, Antoinette E.; Gygi, Amber M.; Ogawa, Emmy E.; Van Camp, Matt; Kocsisova, Zuzana; Leatherman, Judith L.; Modahl, Cassie M.; Rubin, Michael R.; Apiz-Saab, Susana S.; Arias-Mejias, Suzette M.; Carrion-Ortiz, Carlos F.; Claudio-Vazquez, Patricia N.; Espada-Green, Debbie M.; Feliciano-Camacho, Marium; Gonzalez-Bonilla, Karina M.; Taboas-Arroyo, Mariela; Vargas-Franco, Dorianmarie; Montañez-Gonzalez, Raquel; Perez-Otero, Joseph; Rivera-Burgos, Myrielis; Rivera-Rosario, Francisco J.; Eisler, 
Heather L.; Alexander, Jackie; Begley, Samatha K.; Gabbard, Deana; Allen, Robert J.; Aung, Wint Yan; Barshop, William D.; Boozalis, Amanda; Chu, Vanessa P.; Davis, Jeremy S.; Duggal, Ryan N.; Franklin, Robert; Gavinski, Katherine; Gebreyesus, Heran; Gong, Henry Z.; Greenstein, Rachel A.; Guo, Averill D.; Hanson, Casey; Homa, Kaitlin E.; Hsu, Simon C.; Huang, Yi; Huo, Lucy; Jacobs, Sarah; Jia, Sasha; Jung, Kyle L.; Wai-Chee Kong, Sarah; Kroll, Matthew R.; Lee, Brandon M.; Lee, Paul F.; Levine, Kevin M.; Li, Amy S.; Liu, Chengyu; Liu, Max Mian; Lousararian, Adam P.; Lowery, Peter B.; Mallya, Allyson P.; Marcus, Joseph E.; Ng, Patrick C.; Nguyen, Hien P.; Patel, Ruchik; Precht, Hashini; Rastogi, Suchita; Sarezky, Jonathan M.; Schefkind, Adam; Schultz, Michael B.; Shen, Delia; Skorupa, Tara; Spies, Nicholas C.; Stancu, Gabriel; Vivian Tsang, Hiu Man; Turski, Alice L.; Venkat, Rohit; Waldman, Leah E.; Wang, Kaidi; Wang, Tracy; Wei, Jeffrey W.; Wu, Dennis Y.; Xiong, David D.; Yu, Jack; Zhou, Karen; McNeil, Gerard P.; Fernandez, Robert W.; Menzies, Patrick Gomez; Gu, Tingting; Buhler, Jeremy; Mardis, Elaine R.; Elgin, Sarah C.R.
2017-01-01
The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (∼5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains. PMID:28667019
Gulyaeva, Anastasia; Hoogendoorn, Erik; Giles, Julia; Samborskiy, Dmitry
2017-01-01
ABSTRACT In five experimentally characterized arterivirus species, the 5′-end genome coding region encodes the most divergent nonstructural proteins (nsp's), nsp1 and nsp2, which include papain-like proteases (PLPs) and other poorly characterized domains. These are involved in regulation of transcription, polyprotein processing, and virus-host interaction. Here we present results of a bioinformatics analysis of this region of 14 arterivirus species, including that of the most distantly related virus, wobbly possum disease virus (WPDV), determined by a modified 5′ rapid amplification of cDNA ends (RACE) protocol. By combining profile-profile comparisons and phylogeny reconstruction, we identified an association of the four distinct domain layouts of nsp1-nsp2 with major phylogenetic lineages, implicating domain gain, including duplication, and loss in the early nsp1 evolution. Specifically, WPDV encodes highly divergent homologs of PLP1a, PLP1b, PLP1c, and PLP2, with PLP1a lacking the catalytic Cys residue, but does not encode nsp1 Zn finger (ZnF) and “nuclease” domains, which are conserved in other arteriviruses. Unexpectedly, our analysis revealed that the only catalytically active nsp1 PLP of equine arteritis virus (EAV), known as PLP1b, is most similar to PLP1c and thus is likely to be a PLP1b paralog. In all non-WPDV arteriviruses, PLP1b/c and PLP1a show contrasting patterns of conservation, with the N- and C-terminal subdomains, respectively, being enriched with conserved residues, which is indicative of different functional specializations. The least conserved domain of nsp2, the hypervariable region (HVR), has its size varied 5-fold and includes up to four copies of a novel PxPxPR motif that is potentially recognized by SH3 domain-containing proteins. Apparently, only EAV lacks the signal that directs −2 ribosomal frameshifting in the nsp2 coding region. 
IMPORTANCE Arteriviruses comprise a family of mammalian enveloped positive-strand RNA viruses that include some of the most economically important pathogens of swine. Most of our knowledge about this family has been obtained through characterization of viruses from five species: Equine arteritis virus, Simian hemorrhagic fever virus, Lactate dehydrogenase-elevating virus, Porcine respiratory and reproductive syndrome virus 1, and Porcine respiratory and reproductive syndrome virus 2. Here we present the results of comparative genomics analyses of viruses from all known 14 arterivirus species, including the most distantly related virus, WPDV, whose genome sequence was completed in this study. Our analysis focused on the multifunctional 5′-end genome coding region that encodes multidomain nonstructural proteins 1 and 2. Using diverse bioinformatics techniques, we identified many patterns of evolutionary conservation that are specific to members of distinct arterivirus species, both characterized and novel, or their groups. They are likely associated with structural and functional determinants important for virus replication and virus-host interaction. PMID:28053107
Obesity surgery in England: an examination of the health episode statistics 1996-2005.
Ells, Louisa J; Macknight, Neil; Wilkinson, John R
2007-03-01
The authors examined the uptake of obesity surgery across England. Data were analyzed from the Hospital Episode Statistics covering all 9 government office regions with a total population of 49.1 million. The data analyzed covered the 9 years from 1996/97 to 2004/05. 1,465 records were identified with a primary diagnostic code for obesity and an operation code for obesity surgery. The surgery was performed mostly in women (male to female ratio of 1:5), who were predominantly middle-aged (average 40.4 years +/- SD 9.00), the majority of whom reside in local authority districts ranked within the lowest two deprivation quintiles. The availability of obesity surgery varied considerably across the 9 different regions of England, although the number of operations increased nationally over time. Access to this intervention is highly variable and does not appear to reflect estimated regional differences in morbid obesity. This specialist service may benefit from more effective national organization, to ensure appropriate capacity and eliminate inequalities in service delivery.
Perceived Academic Entitlement of Non-Traditional Students in Higher Education
ERIC Educational Resources Information Center
Johnson, Jeffery M.
2014-01-01
The purpose of this quantitative research study was to determine if perceived academic entitlement exists among non-traditional students in higher education. The study examined students enrolled as juniors and seniors at two of the regional campuses of a major public university and students enrolled at a regional university in the southern United…
Fan, Zenghua; Zhao, Meng; Joshi, Parth D.; Li, Ping; Zhang, Yan; Guo, Weimin; Xu, Yichi; Wang, Haifang; Zhao, Zhihu
2017-01-01
Abstract Circadian rhythm exerts its influence on animal physiology and behavior by regulating gene expression at various levels. Here we systematically explored circadian long non-coding RNAs (lncRNAs) in mouse liver and examined their circadian regulation. We found that a significant proportion of circadian lncRNAs are expressed at enhancer regions, mostly bound by two key circadian transcription factors, BMAL1 and REV-ERBα. These circadian lncRNAs showed circadian phases similar to those of their nearby genes. The extent of their nuclear localization is higher than that of protein coding genes but lower than that of enhancer RNAs. The association between enhancers and circadian lncRNAs is also observed in tissues other than liver. Comparative analysis between mouse and rat circadian liver transcriptomes showed that circadian transcription at lncRNA loci tends to be conserved despite the low sequence conservation of lncRNAs. One such circadian lncRNA, termed lnc-Crot, led us to identify a super-enhancer region interacting with a cluster of genes involved in circadian regulation of metabolism through long-range interactions. Further experiments showed that the lnc-Crot locus has enhancer function independent of lnc-Crot's transcription. Our results suggest that the enhancer-associated circadian lncRNAs mark the genomic loci modulating long-range circadian gene regulation and shed new light on the evolutionary origin of lncRNAs. PMID:28335007
Subramaniam, Saravanan; Mohapatra, Jajati K; Das, Biswajit; Sharma, Gaurav K; Biswal, Jitendra K; Mahajan, Sonalika; Misri, Jyoti; Dash, Bana B; Pattnaik, Bramhadev
2015-07-01
Foot-and-mouth disease virus (FMDV) serotype Asia1 was first reported in India in 1951, where three major genetic lineages (B, C and D) of this serotype have been described until now. In this study, the capsid protein coding regions of serotype Asia1 viruses (n = 99) from India were analyzed, with emphasis on the viruses circulating since 2007. All of the isolates (n = 50) recovered during 2007-2013 were found to group within the re-emerging cluster of lineage C (designated as sublineage C(R)). The evolutionary rate of sublineage C(R) was estimated to be slightly higher than that of the serotype as a whole, and the time of the most recent common ancestor for this cluster was estimated to be approximately 2001. In comparison to the older isolates of lineage C (1993-2001), the re-emerging viruses showed variation at eight amino acid positions, including substitutions at the antigenically critical residues 79 and 131 of VP2. However, no direct correlation was found between sequence variations and antigenic relationships. The number of codons under positive selection and the nature of the selection pressure varied widely among the structural proteins, implying a heterogeneous pattern of evolution in serotype Asia1. While episodic diversifying selection appears to play a major role in shaping the evolution of VP1 and VP3, selection pressure acting on codons of VP2 is largely pervasive. Further, episodic positive selection appears to be responsible for the early diversification of lineage C. Recombination events identified in the structural protein coding region indicate a probable role for recombination in the adaptive evolution of serotype Asia1 viruses.
Kim, Sangkyu; Welsh, David A; Myers, Leann; Cherry, Katie E; Wyckoff, Jennifer; Jazwinski, S Michal
2015-02-28
We have completed a genome-wide linkage scan for healthy aging using data collected from a family study, followed by fine-mapping by association in a separate population, the first such attempt reported. The family cohort consisted of parents of age 90 or above and their children ranging in age from 50 to 80. As a quantitative measure of healthy aging, we used a frailty index, called FI34, based on 34 health and function variables. The linkage scan found a single significant linkage peak on chromosome 12. Using an independent cohort of unrelated nonagenarians, we carried out a fine-scale association mapping of the region suggestive of linkage and identified three sites associated with healthy aging. These healthy-aging sites (HASs) are located in intergenic regions at 12q13-14. HAS-1 has been previously associated with multiple diseases, and an enhancer was recently mapped and experimentally validated within the site. HAS-2 is a previously uncharacterized site possessing genomic features suggestive of enhancer activity. HAS-3 contains features associated with Polycomb repression. The HASs also contain variants associated with exceptional longevity, based on a separate analysis. Our results provide insight into functional genomic networks involving non-coding regulatory elements that are involved in healthy aging and longevity.
Kim, Sangkyu; Welsh, David A.; Myers, Leann; Cherry, Katie E.; Wyckoff, Jennifer; Jazwinski, S. Michal
2015-01-01
We have completed a genome-wide linkage scan for healthy aging using data collected from a family study, followed by fine-mapping by association in a separate population, the first such attempt reported. The family cohort consisted of parents of age 90 or above and their children ranging in age from 50 to 80. As a quantitative measure of healthy aging, we used a frailty index, called FI34, based on 34 health and function variables. The linkage scan found a single significant linkage peak on chromosome 12. Using an independent cohort of unrelated nonagenarians, we carried out a fine-scale association mapping of the region suggestive of linkage and identified three sites associated with healthy aging. These healthy-aging sites (HASs) are located in intergenic regions at 12q13–14. HAS-1 has been previously associated with multiple diseases, and an enhancer was recently mapped and experimentally validated within the site. HAS-2 is a previously uncharacterized site possessing genomic features suggestive of enhancer activity. HAS-3 contains features associated with Polycomb repression. The HASs also contain variants associated with exceptional longevity, based on a separate analysis. Our results provide insight into functional genomic networks involving non-coding regulatory elements that are involved in healthy aging and longevity. PMID:25682868
Implementation of non-axisymmetric mesh system in the gyrokinetic PIC code (XGC) for Stellarators
NASA Astrophysics Data System (ADS)
Moritaka, Toseo; Hager, Robert; Cole, Micheal; Chang, Choong-Seock; Lazerson, Samuel; Ku, Seung-Hoe; Ishiguro, Seiji
2017-10-01
Gyrokinetic simulation is a powerful tool to investigate turbulent and neoclassical transport based on first principles of plasma kinetics. The gyrokinetic PIC code XGC has been developed for integrated simulations that cover the entire region of Tokamaks. Complicated field line and boundary structures should be taken into account to demonstrate edge plasma dynamics under the influence of the X-point and vessel components. XGC employs a gyrokinetic Poisson solver on an unstructured triangle mesh to deal with this difficulty. We introduce numerical schemes newly developed for XGC simulation in non-axisymmetric Stellarator geometry. Triangle meshes in each poloidal plane are defined by the PEST poloidal angle in the VMEC equilibrium so that they have the same regular structure in the straight field line coordinate. The electric charge of each marker particle is distributed to the triangles specified by the field-following projection to the neighboring poloidal planes. 3D spline interpolation in a cylindrical mesh is also used to obtain the equilibrium magnetic field at the particle position. These schemes capture the anisotropic plasma dynamics and the resulting potential structure with high accuracy. The triangle meshes can smoothly connect to unstructured meshes in the edge region. We will present the validation test in the core region of the Large Helical Device and discuss future challenges toward edge simulations.
Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.
Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D
2017-12-03
A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae. Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae. We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae, but that the occurrence number of X motifs is significantly higher than that of R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different from that observed in the set of putative or dubious genes with no experimental evidence. 
These results, taken together, represent the first evidence for a significant enrichment of X motifs in the genes of an extant organism. They raise two hypotheses: the X motifs may be evolutionary relics of the primitive codes used for translation, or they may continue to play a functional role in the complex processes of genome decoding and protein synthesis.
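The frame-by-frame search for X motifs described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 20 trinucleotides of X are listed here as they are commonly given in the circular code literature (the abstract itself does not enumerate them, so check them against the paper), and the `min_codons` threshold and function names are assumptions made for the example.

```python
# The circular code X as commonly listed in the literature (assumption: not given in the abstract).
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def x_motifs(seq, frame=0, min_codons=4):
    """Return (start, motif) for each maximal run of at least `min_codons`
    consecutive trinucleotides of X, read in the given frame (0, 1 or 2).
    `min_codons` is a hypothetical length threshold for this sketch."""
    motifs = []
    run_start = None
    n_codons = (len(seq) - frame) // 3
    # One extra iteration (k == n_codons) closes a run that reaches the end.
    for k in range(n_codons + 1):
        i = frame + 3 * k
        in_x = k < n_codons and seq[i:i + 3] in X
        if in_x and run_start is None:
            run_start = i
        elif not in_x and run_start is not None:
            if (i - run_start) // 3 >= min_codons:
                motifs.append((run_start, seq[run_start:i]))
            run_start = None
    return motifs


# Toy sequence: five X trinucleotides in frame 0, followed by AAA (not in X).
toy = "GGTGAACTGGCCTACAAA"
hits_by_frame = {frame: x_motifs(toy, frame) for frame in range(3)}
```

On the toy sequence the run is found only in frame 0, which is the frame-retrieval behaviour the abstract attributes to motifs of a circular code. Self-complementarity can be checked directly with `all(revcomp(t) in X for t in X)`.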
Comprehensive Analysis of Genome Rearrangements in Eight Human Malignant Tumor Tissues
Wang, Chong
2016-01-01
Carcinogenesis is a complex multifactorial, multistage process, but the precise mechanisms are not well understood. In this study, we performed a genome-wide analysis of the copy number variation (CNV), breakpoint region (BPR) and fragile sites in 2,737 tumor samples from eight tumor entities and in 432 normal samples. CNV detection and BPR identification revealed that BPRs tended to accumulate in specific genomic regions in tumor samples, whereas they were dispersed genome-wide in the normal samples. Hotspots were observed at which segments with similar copy number alterations overlapped and BPRs clustered adjacently. Evaluation of BPR occurrence frequency showed that at least one BPR was detected in roughly 15% or more of samples for each tumor entity, whereas BPRs occurred in at most 12% of the normal samples. 127 of 2,716 tumor-relevant BPRs (termed ‘common BPRs’) also exhibited a noticeable occurrence frequency in the normal samples. Colocalization assessment identified 20,077 CNV-affecting genes, of which 169 are known tumor-related genes. The most noteworthy genes are KIAA0513, important for immunologic, synaptic and apoptotic signal pathways; the intergenic non-coding RNA RP11-115C21.2, possibly acting as an oncogene or tumor suppressor by changing the structure of chromatin; ADAM32, likely important in cancer cell proliferation and progression through ectodomain shedding of diverse growth factors; and the well-known tumor suppressor gene p53. The BPR distributions indicate that CNV mutations are likely non-random in tumor genomes. The marked recurrence of BPRs at specific regions supports common progression mechanisms in tumors. The presence of hotspots together with common BPRs, despite their small group size, implies a relation between fragile sites and cancer-gene alteration. Our data further suggest that both protein-coding and non-coding genes possessing a range of biological functions might play a causative or functional role in tumor biology.
This research enhances our understanding of the mechanisms for tumorigenesis and progression. PMID:27391163
Niu, Zhitao; Pan, Jiajia; Zhu, Shuying; Li, Ludan; Xue, Qingyun; Liu, Wei; Ding, Xiaoyu
2017-01-01
Apostasioideae consists of only two genera, Apostasia and Neuwiedia, which are mainly distributed in Southeast Asia and northern Australia. The floral structure, taxonomy, biogeography, and genome variation of Apostasioideae have been intensively studied. However, detailed analyses of plastome composition and structure and comparisons with those of other orchid subfamilies have not yet been conducted. Here, the complete plastome sequences of Apostasia wallichii and Neuwiedia singapureana were sequenced and compared with 43 previously published photosynthetic orchid plastomes to characterize the plastome structure and evolution in the orchids. Unlike many orchid plastomes (e.g., Paphiopedilum and Vanilla), the plastomes of Apostasioideae contain a full set of 11 functional NADH dehydrogenase (ndh) genes. The distribution of repeat sequences and simple sequence repeat elements supported the view that the mutation rate of non-coding regions was higher than that of coding regions. The 10 loci with the highest degrees of sequence variability (ndhA intron, matK-5′trnK, clpP-psbB, rps8-rpl14, trnT-trnL, 3′trnK-matK, clpP intron, psbK-trnK, trnS-psbC, and ndhF-rpl32) were identified as mutational hotspots for the Apostasia plastome. Furthermore, our results revealed that plastid genes exhibited a variable evolution rate within and among different orchid genera. Considering the diversified evolution of both coding and non-coding regions, we suggested that the plastome-wide evolution of orchid species was disproportional. Additionally, the sequences flanking the inverted repeat/small single copy (IR/SSC) junctions of photosynthetic orchid plastomes were categorized into three types according to the presence/absence of ndh genes. Different evolutionary dynamics for each of the three IR/SSC types of photosynthetic orchid plastomes were also proposed. PMID:29046685
Rao, Shu-Quan; Hu, Hui-Ling; Ye, Ning; Shen, Yan; Xu, Qi
2015-08-01
The heritability of schizophrenia has been reported to be as high as ~80%, but the contribution of the genetic variants identified so far to this heritability remains to be estimated. Long non-coding RNAs (lncRNAs) are involved in multiple processes critical to normal cellular function, and dysfunction of the lncRNA MIAT may contribute to the pathophysiology of schizophrenia. However, genetic evidence for the involvement of lncRNAs in schizophrenia has not been documented. Here, we conducted a two-stage association analysis on 8 tag SNPs that cover the whole MIAT locus in two independent Han Chinese schizophrenia case-control cohorts (discovery sample from Shanxi Province: 1093 patients with paranoid schizophrenia and 1180 control subjects; replication cohort from Jilin Province: 1255 cases and 1209 healthy controls). In the discovery stage, a significant genetic association with paranoid schizophrenia was observed for rs1894720 (χ²=74.20, P=7.1E-18), whose minor allele (T) had an OR of 1.70 (95% CI=1.50-1.91). This association was confirmed in the replication cohort (χ²=22.66, P=1.9E-06, OR=1.32, 95% CI=1.18-1.49). In addition, a weak genotypic association was detected for rs4274 (χ²=4.96, df=2, P=0.03); the AA carriers showed increased disease risk (OR=1.30, 95% CI=1.03-1.64). No significant association was found between any haplotype and paranoid schizophrenia. The present study showed that lncRNA MIAT is a novel susceptibility gene for paranoid schizophrenia in the Chinese Han population. Considering that most lncRNAs are located in non-coding regions, our result may explain why most susceptibility loci for schizophrenia identified by genome-wide association studies lie outside coding regions. Copyright © 2015 Elsevier B.V. All rights reserved.
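The link between the reported χ² statistics and P values for rs1894720 can be illustrated with a short calculation. This sketch assumes the allelic tests have one degree of freedom (the abstract states df only for the genotypic test), in which case the upper-tail probability reduces to a complementary error function; the function name is illustrative.

```python
import math


def chi2_pvalue_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the statistic is the square of a standard normal variable, so
    P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# Statistics reported for rs1894720 (df = 1 is an assumption of this sketch).
p_discovery = chi2_pvalue_df1(74.20)
p_replication = chi2_pvalue_df1(22.66)
print(f"discovery: {p_discovery:.1e}, replication: {p_replication:.1e}")
```

Under that assumption both values should land close to the reported P=7.1E-18 and P=1.9E-06.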
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...
2016-09-20
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
Moralli, Daniela; Nudel, Ron; Chan, May T M; Green, Catherine M; Volpi, Emanuela V; Benítez-Burraco, Antonio; Newbury, Dianne F; García-Bellido, Paloma
2015-01-01
We report on a young female, who presents with a severe speech and language disorder and a balanced de novo complex chromosomal rearrangement, likely to have resulted from a chromosome 7 pericentromeric inversion, followed by a chromosome 7 and 11 translocation. Using molecular cytogenetics, we mapped the four breakpoints to 7p21.1-15.3 (chromosome position: 20,954,043-21,001,537, hg19), 7q31 (chromosome position: 114,528,369-114,556,605, hg19), 7q21.3 (chromosome position: 93,884,065-93,933,453, hg19) and 11p12 (chromosome position: 38,601,145-38,621,572, hg19). These regions contain only non-coding transcripts (ENSG00000232790 on 7p21.1 and TCONS_00013886, TCONS_00013887, TCONS_00014353, TCONS_00013888 on 7q21) indicating that no coding sequences are directly disrupted. The breakpoint on 7q31 mapped 200 kb downstream of FOXP2, a well-known language gene. No splice site or non-synonymous coding variants were found in the FOXP2 coding sequence. We were unable to detect any changes in the expression level of FOXP2 in fibroblast cells derived from the proband, although this may be the result of the low expression level of FOXP2 in these cells. We conclude that the phenotype observed in this patient either arises from a subtle change in FOXP2 regulation due to the disruption of a downstream element controlling its expression, or from the direct disruption of non-coding RNAs.
Singh, Kh Dhanachandra; Karthikeyan, Muthusamy
2014-12-01
The renin-angiotensin-aldosterone system (RAAS) plays a key role in the regulation of blood pressure (BP). Mutations in the genes that encode components of the RAAS have played a significant role in genetic susceptibility to hypertension and have been intensively scrutinized. The identification of such probably causal mutations not only provides insight into the RAAS but may also yield antihypertensive therapeutic targets and diagnostic markers. Analyzing a huge dataset of SNPs that contains both functional and neutral variants is challenging, because testing every SNP experimentally to determine its biological significance is impractical. To explore the functional significance of genetic mutations (SNPs), we adopted a combination of sequence-based and sequence-structure-based SNP analysis algorithms. Out of 3864 SNPs reported in dbSNP, we found 108 missense SNPs in the coding region, with the remainder in the non-coding region. In this study, we report a coding-region SNP as deleterious only when three or more tools predict it to be deleterious and it shows a high RMSD from the native structure. Based on these analyses, we identified as deleterious two SNPs of the REN gene, eight SNPs of the AGT gene, three SNPs of the ACE gene, two SNPs of the AT1R gene, three SNPs of the CYP11B2 gene and three SNPs of the CMA1 gene, all in the coding region. This type of study will help reduce the cost and time needed to identify potential SNPs and to select candidates from the SNP pool for experimental study.
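The consensus rule stated in this abstract (three or more tools agreeing, plus a high RMSD from the native structure) can be sketched as a simple filter. The tool names and the RMSD cutoff below are hypothetical placeholders, since the abstract names neither the tools nor a threshold.

```python
def is_deleterious(tool_calls, rmsd, min_tools=3, rmsd_cutoff=2.0):
    """Apply a consensus rule to one missense SNP.

    tool_calls  -- dict mapping a tool name to True if that tool predicts
                   the SNP to be deleterious (tool names are hypothetical).
    rmsd        -- RMSD of the mutant model from the native structure.
    min_tools   -- minimum number of agreeing tools (three in the abstract).
    rmsd_cutoff -- hypothetical threshold for "high RMSD"; the abstract
                   does not state a value.
    """
    n_hits = sum(1 for call in tool_calls.values() if call)
    return n_hits >= min_tools and rmsd >= rmsd_cutoff


# Illustrative inputs only; these are not results from the study.
example_calls = {"tool_a": True, "tool_b": True, "tool_c": True, "tool_d": False}
flagged = is_deleterious(example_calls, rmsd=2.5)
```

Requiring both conditions keeps the candidate list short, which matches the stated aim of narrowing the SNP pool before experimental work.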
Zhou, Xia; Tambo, Ernest; Su, Jing; Fang, Qiang; Ruan, Wei; Chen, Jun-Hu; Yin, Ming-Bo; Zhou, Xiao-Nong
2017-10-01
The Plasmodium vivax merozoite surface protein-1 (PvMSP1) gene codes for a major malaria vaccine candidate antigen. However, its polymorphic nature represents an obstacle to the design of a protective vaccine. In this study, we analyzed the genetic polymorphism and natural selection of the C-terminal 42 kDa fragment within the PvMSP1 gene (PvMSP142) from 77 P. vivax isolates, collected from imported cases in the China-Myanmar border (CMB) areas of Yunnan province and from inland cases in Anhui, Yunnan, and Zhejiang provinces in China during 2009-2012. In total, 41 haplotypes were identified, 30 of which were new. The differences between the rates of non-synonymous and synonymous mutations suggest that PvMSP142 has evolved under natural selection, and that a high selective pressure acted preferentially on regions identified within PvMSP133. Our results also demonstrated that PvMSP142 of P. vivax isolates collected in the China-Myanmar border areas displays higher genetic polymorphism than that of isolates collected from inland China. Such results have significant implications for understanding the dynamics of the P. vivax population and may be useful for China's malaria elimination campaign strategies.
Haplotype diversity in the manatee Trichechus manatus in Cuba: preliminary results
Hernandez-Martinez, Damir; Alvarez-Aleman, Anmari; Bonde, Robert K.; Powell, James A.; Garcia-Machado, Erik
2013-01-01
The aim of this analysis was to obtain information regarding the mtDNA haplotype composition of the manatee (T. manatus) occupying the Cuban archipelago. A fragment of 410 bp of the non-coding region was analyzed for 12 individual manatees from Cuba and one from Florida, USA. Only two haplotypes were identified. Haplotype A1, found exclusively in Florida (including in the sample analyzed here) but also found in Mexico, the Dominican Republic and Puerto Rico, was the most frequent haplotype (11 of the 12 samples from Cuba) and widely distributed. The second haplotype, A3, previously referred to as endemic to Belize, was identified in an individual stranded in Isabela de Sagua, in northern Cuba. These preliminary results provide information about three major aspects of manatee biology: (1) the mtDNA genetic diversity of T. manatus in Cuba seems low compared to other regions of the Caribbean; (2) the Cuban population likely belongs to the group comprising Florida and portions of the Greater Antilles; and (3) the territories of Belize and Cuba have exchanged individuals at present or in the relatively recent past.
MtDNA profile of West Africa Guineans: towards a better understanding of the Senegambia region.
Rosa, Alexandra; Brehm, António; Kivisild, Toomas; Metspalu, Ene; Villems, Richard
2004-07-01
The matrilineal genetic composition of 372 samples from the Republic of Guiné-Bissau (West African coast) was studied using RFLPs and partial sequencing of the mtDNA control and coding region. The majority of the mtDNA lineages of Guineans (94%) belong to West African specific sub-clusters of L0-L3 haplogroups. A new L3 sub-cluster (L3h) that is found in both eastern and western Africa is present at moderately low frequencies in Guinean populations. A non-random distribution of haplogroups U5 in the Fula group, the U6 among the "Brame" linguistic family and M1 in the Balanta-Djola group, suggests a correlation between the genetic and linguistic affiliation of Guinean populations. The presence of M1 in Balanta populations supports the earlier suggestion of their Sudanese origin. Haplogroups U5 and U6, on the other hand, were found to be restricted to populations that are thought to represent the descendants of a southern expansion of Berbers. Particular haplotypes, found almost exclusively in East-African populations, were found in some ethnic groups with an oral tradition claiming Sudanese origin.
Benoit, Jamie; Ayoub, Albert; Rakic, Pasko
2016-11-01
Histone acetylation is considered a major epigenetic process that affects brain development and synaptic plasticity, as well as learning and memory. The transcriptional effectors and morphological changes responsible for plasticity as a result of long-term modifications to histone acetylation are not fully understood. To this end, we pharmacologically inhibited histone deacetylation using Trichostatin A in adult (6-month-old) mice and found significant increases in the levels of the acetylated histone marks H3Lys9, H3Lys14 and H4Lys12. High-resolution transcriptome analysis of diverse brain regions uncovered few differences in gene expression between treated and control animals, none of which were plasticity related. Instead, after increased histone acetylation, we detected a large number of novel transcriptionally active regions, which correspond to long non-coding RNAs (lncRNAs). We also surprisingly found no significant changes in dendritic spine plasticity in layers 1 and 2/3 of the visual cortex using long-term in vivo two-photon imaging. Our results indicate that chronic pharmacologically induced histone acetylation can be decoupled from gene expression and instead, may potentially exert a post-transcriptional effect through the differential production of lncRNAs.
Tilman, Gaëlle; Arnoult, Nausica; Lenglez, Sandrine; Van Beneden, Amandine; Loriot, Axelle; De Smet, Charles; Decottignies, Anabelle
2012-08-01
Epigenetic dysfunctions, including DNA methylation alterations, play major roles in cancer initiation and progression. Although it is well established that gene promoter demethylation activates transcription, it remains unclear whether hypomethylation of repetitive heterochromatin similarly affects expression of non-coding RNA from these loci. Understanding how repetitive non-coding RNAs are transcriptionally regulated is important given that their established upregulation by the heat shock (HS) pathway suggests important functions in cellular response to stress, possibly by promoting heterochromatin reconstruction. We found that, although pericentromeric satellite 2 (Sat2) DNA hypomethylation is detected in a majority of cancer cell lines of various origins, DNA methylation loss does not constitutively hyperactivate Sat2 expression, and also does not facilitate Sat2 transcriptional induction upon heat shock. In melanoma tumor samples, our analysis revealed that the HS response, frequently upregulated in tumors, is probably the main determinant of Sat2 RNA expression in vivo. Next, we tested whether HS pathway hyperactivation may drive Sat2 demethylation. Strikingly, we found that both hyperthermia and hyperactivated RasV12 oncogene, another potent inducer of the HS pathway, reduced Sat2 methylation levels by up to 27% in human fibroblasts recovering from stress. Demethylation occurred locally on Sat2 repeats, resulting in a demethylation signature that was also detected in cancer cell lines with moderate genome-wide hypomethylation. We therefore propose that upregulation of Sat2 transcription in response to HS pathway hyperactivation during tumorigenesis may promote localized demethylation of the locus. This, in turn, may contribute to tumorigenesis, as demethylation of Sat2 was previously reported to favor chromosomal rearrangements.
Atkinson, Quentin D; Gray, Russell D; Drummond, Alexei J
2008-02-01
The relative timing and size of regional human population growth following our expansion from Africa remain unknown. Human mitochondrial DNA (mtDNA) diversity carries a legacy of our population history. Given a set of sequences, we can use coalescent theory to estimate past population size through time and draw inferences about human population history. However, recent work has challenged the validity of using mtDNA diversity to infer species population sizes. Here we use Bayesian coalescent inference methods, together with a global data set of 357 human mtDNA coding-region sequences, to infer human population sizes through time across 8 major geographic regions. Our estimates of relative population sizes show remarkable concordance with the contemporary regional distribution of humans across Africa, Eurasia, and the Americas, indicating that mtDNA diversity is a good predictor of population size in humans. Plots of population size through time show slow growth in sub-Saharan Africa beginning 143-193 kya, followed by a rapid expansion into Eurasia after the emergence of the first non-African mtDNA lineages 50-70 kya. Outside Africa, the earliest and fastest growth is inferred in Southern Asia approximately 52 kya, followed by a succession of growth phases in Northern and Central Asia (approximately 49 kya), Australia (approximately 48 kya), Europe (approximately 42 kya), the Middle East and North Africa (approximately 40 kya), New Guinea (approximately 39 kya), the Americas (approximately 18 kya), and a second expansion in Europe (approximately 10-15 kya). Comparisons of relative regional population sizes through time suggest that between approximately 45 and 20 kya most of humanity lived in Southern Asia. These findings not only support the use of mtDNA data for estimating human population size but also provide a unique picture of human prehistory and demonstrate the importance of Southern Asia to our recent evolutionary past.
DOE Office of Scientific and Technical Information (OSTI.GOV)
MacArthur, Stewart; Li, Xiao-Yong; Li, Jingyi
2009-05-15
BACKGROUND: We previously established that six sequence-specific transcription factors that initiate anterior/posterior patterning in Drosophila bind to overlapping sets of thousands of genomic regions in blastoderm embryos. While regions bound at high levels include known and probable functional targets, more poorly bound regions are preferentially associated with housekeeping genes and/or genes not transcribed in the blastoderm, and are frequently found in protein coding sequences or in less conserved non-coding DNA, suggesting that many are likely non-functional. RESULTS: Here we show that an additional 15 transcription factors that regulate other aspects of embryo patterning show a similar quantitative continuum of function and binding to thousands of genomic regions in vivo. Collectively, the 21 regulators show a surprisingly high overlap in the regions they bind given that they belong to 11 DNA binding domain families, specify distinct developmental fates, and can act via different cis-regulatory modules. We demonstrate, however, that quantitative differences in relative levels of binding to shared targets correlate with the known biological and transcriptional regulatory specificities of these factors. CONCLUSIONS: It is likely that the overlap in binding of biochemically and functionally unrelated transcription factors arises from the high concentrations of these proteins in nuclei, which, coupled with their broad DNA binding specificities, directs them to regions of open chromatin. We suggest that most animal transcription factors will be found to show a similar broad overlapping pattern of binding in vivo, with specificity achieved by modulating the amount, rather than the identity, of bound factor.
Gan, Han Ming; Tan, Mun Hua; Lee, Yin Peng; Austin, Christopher M
2016-05-01
The mitochondrial genome sequence of the Australian tadpole shrimp, Triops australiensis, is presented (GenBank Accession Number: NC_024439) and compared with other Triops species. Triops australiensis has a mitochondrial genome of 15,125 base pairs consisting of 13 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and a non-coding AT-rich region. The T. australiensis mitogenome is composed of 36.4% A, 16.1% C, 12.3% G and 35.1% T. The mitogenome gene order conforms to the primitive arrangement for branchiopod crustaceans, which is also conserved within the Pancrustacea.
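Base composition figures such as those quoted for the T. australiensis mitogenome are straightforward to compute from a sequence. The sketch below uses illustrative function names and a toy sequence rather than the actual NC_024439 record, and it ignores ambiguous bases, which is an assumption of this example.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T in a DNA sequence, ignoring other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq):
    """Combined A+T percentage of a DNA sequence."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Percentages reported in the abstract; their sum is 99.9 because of rounding.
reported = {"A": 36.4, "C": 16.1, "G": 12.3, "T": 35.1}
reported_at = reported["A"] + reported["T"]

toy_at = at_content("AATTCG")  # toy sequence, not the real mitogenome
```

From the reported figures, the A+T content of the whole mitogenome works out to about 71.5%.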
Austin, Christopher M; Tan, Mun Hua; Lee, Yin Peng; Croft, Laurence J; Meekan, Mark G; Pierce, Simon J; Gan, Han Ming
2016-01-01
The complete mitochondrial genome of the parasitic copepod Pandarus rhincodonicus was obtained from a partial genome scan using the HiSeq sequencing system. The Pandarus rhincodonicus mitogenome has 14,480 base pairs (62% A+T content) made up of 12 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and a putative 384 bp non-coding AT-rich region. This Pandarus mitogenome sequence is the first for the family Pandaridae, the second for the order Siphonostomatoida and the sixth for the Copepoda.
Industry self regulation of television food advertising: responsible or responsive?
King, Lesley; Hebden, Lana; Grunseit, Anne; Kelly, Bridget; Chapman, Kathy; Venugopal, Kamalesh
2011-06-01
This study evaluated the impact of the Australian Food and Grocery Council (AFGC) self-regulatory initiative on unhealthy food marketing to children, introduced in January 2009. The study compared patterns of food advertising by AFGC and non-AFGC signatory companies in 2009, 2007 and 2006 on three Sydney commercial free-to-air television channels. Data were collected across seven days in May 2006 and 2007, and four days in May 2009. Advertised foods were coded as core, non-core and miscellaneous. Regression for counts analyses was used to examine change in rates of advertisements across the sampled periods and differential change between AFGC-signatory or non-signatory companies between 2007 and 2009. Of 36 food companies that advertised during the 2009 sample period, 14 were AFGC signatories. The average number of food advertisements decreased significantly from 7.0 per hour in 2007 to 5.9 in 2009. There was a significant reduction in non-core food advertising from 2007 to 2009 by AFGC signatories compared with non-signatory companies overall and during peak times, when the largest numbers of children were viewing. There was no reduction in the rate of non-core food advertisements by all companies, and these advertisements continue to comprise the majority during peak viewing times. While some companies have responded to pressures to reduce unhealthy food advertising on television, the impact of the self-regulatory code is limited by the extent of uptake by food companies. The continued advertising of unhealthy foods indicates that this self-regulatory code does not adequately protect children.
Design of a double-anode magnetron-injection gun for the W-band gyrotron
NASA Astrophysics Data System (ADS)
Jang, Kwang Ho; Choi, Jin Joo; So, Joon Ho
2015-07-01
A double-anode magnetron-injection gun (MIG) was designed. The MIG is for a W-band 10-kW gyrotron. Analytic equations based on adiabatic theory and angular momentum conservation were used to examine the initial design parameters, such as the cathode angle and the radius of the beam-emitting surface. The MIG's performance was predicted by using an electron trajectory code, the EGUN code. The beam spread of the axial velocity, Δvz/vz, obtained from the EGUN code was observed to be 1.34% at α = 1.3. The cathode edge emission and the thermal effect were modeled. The cathode edge emission was found to have a major effect on the velocity spread. The electron beam's quality was significantly improved by affixing non-emissive cylinders to the cathode.
Wang, Quanxiu; Xie, Weibo; Xing, Hongkun; Yan, Ju; Meng, Xiangzhou; Li, Xinglei; Fu, Xiangkui; Xu, Jiuyue; Lian, Xingming; Yu, Sibin; Xing, Yongzhong; Wang, Gongwei
2015-06-01
Chlorophyll content is one of the most important physiological traits as it is closely related to leaf photosynthesis and crop yield potential. So far, few genes have been reported to be involved in natural variation of chlorophyll content in rice (Oryza sativa) and the extent of variations explored is very limited. We conducted a genome-wide association study (GWAS) using a diverse worldwide collection of 529 O. sativa accessions. A total of 46 significant association loci were identified. Three F2 mapping populations with parents selected from the association panel were tested for validation of GWAS signals. We clearly demonstrated that Grain number, plant height, and heading date7 (Ghd7) was a major locus for natural variation of chlorophyll content at the heading stage by combining evidence from near-isogenic lines and transgenic plants. The enhanced expression of Ghd7 decreased the chlorophyll content, mainly through down-regulating the expression of genes involved in the biosynthesis of chlorophyll and chloroplast. In addition, Narrow leaf1 (NAL1) corresponded to one significant association region repeatedly detected over two years. We revealed a high degree of polymorphism in the 5' UTR and four non-synonymous SNPs in the coding region of NAL1, and observed diverse effects of the major haplotypes. The loci or candidate genes identified would help to fine-tune and optimize the antenna size of canopies in rice breeding. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.
Massive black hole and gas dynamics in galaxy nuclei mergers - I. Numerical implementation
NASA Astrophysics Data System (ADS)
Lupi, Alessandro; Haardt, Francesco; Dotti, Massimo
2015-01-01
Numerical effects are known to plague adaptive mesh refinement (AMR) codes when treating massive particles, e.g. representing massive black holes (MBHs). In an evolving background, they can experience strong, spurious perturbations and then follow unphysical orbits. We study by means of numerical simulations the dynamical evolution of a pair of MBHs in the rapidly and violently evolving gaseous and stellar background that follows a galaxy major merger. We confirm that spurious numerical effects alter the MBH orbits in AMR simulations, and show that numerical issues are ultimately due to a drop in the spatial resolution during the simulation, drastically reducing the accuracy in the gravitational force computation. We therefore propose a new refinement criterion suited for massive particles, able to solve in a fast and precise way for their orbits in highly dynamical backgrounds. The new refinement criterion we designed enforces the region around each massive particle to remain at the maximum resolution allowed, independently of the local gas density. Such maximally resolved regions then follow the MBHs along their orbits, and effectively avoid all spurious effects caused by resolution changes. Our suite of high-resolution, AMR hydrodynamic simulations, including different prescriptions for the sub-grid gas physics, shows that the new refinement implementation has the advantage of not altering the physical evolution of the MBHs, accounting for all the non-trivial physical processes taking place in violent dynamical scenarios, such as the final stages of a galaxy major merger.
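The refinement criterion described here, keeping the neighbourhood of each massive particle at the maximum allowed resolution regardless of gas density, can be sketched as a per-cell flagging rule. This is a schematic illustration only and not the authors' implementation: the function name, the spherical region and the `radius` parameter are assumptions made for the example.

```python
import math


def needs_refinement(cell_center, cell_level, particles, radius, max_level):
    """Return True if a cell should be refined one more level.

    cell_center -- (x, y, z) coordinates of the cell centre.
    cell_level  -- current refinement level of the cell.
    particles   -- iterable of (x, y, z) positions of massive particles.
    radius      -- hypothetical size of the region kept maximally resolved.
    max_level   -- maximum refinement level allowed in the simulation.

    The local gas density is deliberately not an input: the flag depends
    only on distance to a massive particle and on the current level.
    """
    if cell_level >= max_level:
        return False
    return any(math.dist(cell_center, p) <= radius for p in particles)


# Two massive particles; positions and levels are illustrative values.
mbhs = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
near = needs_refinement((0.5, 0.0, 0.0), cell_level=3, particles=mbhs, radius=1.0, max_level=8)
far = needs_refinement((2.5, 0.0, 0.0), cell_level=3, particles=mbhs, radius=1.0, max_level=8)
```

Because the flag is re-evaluated from the current particle positions at every step, the maximally resolved region moves with each particle along its orbit.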
Hypermethylation in the ZBTB20 gene is associated with major depressive disorder
2014-01-01
Background Although genetic variation is believed to contribute to an individual’s susceptibility to major depressive disorder, genome-wide association studies have not yet identified associations that could explain the full etiology of the disease. Epigenetics is increasingly believed to play a major role in the development of common clinical phenotypes, including major depressive disorder. Results Genome-wide MeDIP-Sequencing was carried out on a total of 50 monozygotic twin pairs from the UK and Australia that are discordant for depression. We show that major depressive disorder is associated with significant hypermethylation within the coding region of ZBTB20, and is replicated in an independent cohort of 356 unrelated case-control individuals. The twins with major depressive disorder also show increased global variation in methylation in comparison with their unaffected co-twins. ZBTB20 plays an essential role in the specification of the Cornu Ammonis-1 field identity in the developing hippocampus, a region previously implicated in the development of major depressive disorder. Conclusions Our results suggest that aberrant methylation profiles affecting the hippocampus are associated with major depressive disorder and show the potential of the epigenetic twin model in neuro-psychiatric disease. PMID:24694013
Death Certification Errors and the Effect on Mortality Statistics.
McGivern, Lauri; Shulman, Leanne; Carney, Jan K; Shapiro, Steven; Bundock, Elizabeth
Errors in cause and manner of death on death certificates are common and affect families, mortality statistics, and public health research. The primary objective of this study was to characterize errors in the cause and manner of death on death certificates completed by non-Medical Examiners. A secondary objective was to determine the effects of errors on national mortality statistics. We retrospectively compared 601 death certificates completed between July 1, 2015, and January 31, 2016, from the Vermont Electronic Death Registration System with clinical summaries from medical records. Medical Examiners, blinded to original certificates, reviewed summaries, generated mock certificates, and compared mock certificates with original certificates. They then graded errors using a scale from 1 to 4 (higher numbers indicated increased impact on interpretation of the cause) to determine the prevalence of minor and major errors. They also compared International Classification of Diseases, 10th Revision (ICD-10) codes on original certificates with those on mock certificates. Of 601 original death certificates, 319 (53%) had errors; 305 (51%) had major errors; and 59 (10%) had minor errors. We found no significant differences by certifier type (physician vs nonphysician). We did find significant differences in major errors in place of death (P < .001). Certificates for deaths occurring in hospitals were more likely to have major errors than certificates for deaths occurring at a private residence (59% vs 39%, P < .001). A total of 580 (93%) death certificates had a change in ICD-10 codes between the original and mock certificates, of which 348 (60%) had a change in the underlying cause-of-death code. Error rates on death certificates in Vermont are high and extend to ICD-10 coding, thereby affecting national mortality statistics. Surveillance and certifier education must expand beyond local and state efforts.
Simplifying and standardizing underlying literal text for cause of death may improve accuracy, decrease coding errors, and improve national mortality statistics.
Davis, Matthew P; Carrieri, Claudia; Saini, Harpreet K; van Dongen, Stijn; Leonardi, Tommaso; Bussotti, Giovanni; Monahan, Jack M; Auchynnikava, Tania; Bitetti, Angelo; Rappsilber, Juri; Allshire, Robin C; Shkumatava, Alena; O'Carroll, Dónal; Enright, Anton J
2017-07-01
Spermatogenesis is associated with major and unique changes to chromosomes and chromatin. Here, we sought to understand the impact of these changes on spermatogenic transcriptomes. We show that long terminal repeats (LTRs) of specific mouse endogenous retroviruses (ERVs) drive the expression of many long non-coding transcripts (lncRNA). This process occurs post-mitotically predominantly in spermatocytes and round spermatids. We demonstrate that this transposon-driven lncRNA expression is a conserved feature of vertebrate spermatogenesis. We propose that transposon promoters are a mechanism by which the genome can explore novel transcriptional substrates, increasing evolutionary plasticity and allowing for the genesis of novel coding and non-coding genes. Accordingly, we show that a small fraction of these novel ERV-driven transcripts encode short open reading frames that produce detectable peptides. Finally, we find that distinct ERV elements from the same subfamilies act as differentially activated promoters in a tissue-specific context. In summary, we demonstrate that LTRs can act as tissue-specific promoters and contribute to post-mitotic spermatogenic transcriptome diversity. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.
Rosenthal, Marjorie S; Jeon, Sangchoon; Crowley, Angela A
2016-05-01
To determine frequency of non-compliance with child care regulations among family day care homes (FDCH) and identify the role of income in compliance. We analyzed non-compliance in 746 routine, unannounced inspection and re-inspection reports of FDCH collected by the Connecticut Department of Public Health licensing specialists in 2007-2008 and linked results to median income of zip code data. We grouped the 83 state regulations into 12 regulation categories, analyzed 11 categories, and used latent class analysis to classify each FDCH as high or low compliance for each category. We used logistic regression analysis to estimate the odds ratios of low compliance. Among the 746 FDCH inspections (594 first inspections and 152 re-inspections), we found high rates of non-compliance in inspection regulations in immunizations (32.9 %), water temperature (35.6 %) and hazards (30.0 %). Among the 11 regulation categories, 4 categories (indoor safety, emergency preparedness, child/family/staff documentation, and qualifications of provider) had regulations with high non-compliance. Median household income of FDCH zip code was lower for re-inspection sites than for inspection sites ($34,715 vs. $57,118, p < 0.0001) and FDCH in the lowest quartile of income had greater odds of low compliance in indoor safety (OR 1.86, 95 % CI 1.04, 3.35, p < 0.05). The majority of FDCH were in compliance with the majority of regulations, yet there are glaring non-compliance issues in inspections and re-inspections and there are income-based inequities that place children at higher risk who are already at high risk for suboptimal health outcomes.
Jia, Dongjie; Shen, Fei; Wang, Yi; Wu, Ting; Xu, Xuefeng; Zhang, Xinzhong; Han, Zhenhai
2018-05-11
Many efforts have been made to map quantitative trait loci (QTLs) to facilitate practical marker-assisted selection (MAS) in plants. In the present study, we identified four genome-wide major QTLs responsible for apple fruit acidity by MapQTL and BSA-seq analyses using two independent pedigree-based populations. Candidate genes were screened in major QTL regions, and three functional gene markers, including a non-synonymous A/G single nucleotide polymorphism (SNP) in the coding region of MdPP2CH, a 36-bp insertion in the promoter of MdSAUR37, and a previously reported SNP in MdALMTII, were validated to influence the malate content of apple fruits. In addition, MdPP2CH inactivated three vacuolar H+-ATPases (MdVHA-A3, MdVHA-B2 and MdVHA-D2) and one aluminium-activated malate transporter (MdALMTII) via dephosphorylation and negatively influenced fruit malate accumulation. The phosphatase activity of MdPP2CH was suppressed by MdSAUR37, which implied a higher hierarchy of genetic interaction. Therefore, the MdSAUR37/MdPP2CH/MdALMTII chain cascaded hierarchical epistatic genetic effects to precisely determine apple fruit malate content. An A/G SNP (-1010) in the MdMYB44 promoter region from a major QTL (qtl08.1) was closely associated with fruit malate content. The predicted phenotype values (PPVs) were estimated using the tentative genotype values of the gene markers, and the PPVs were significantly correlated with the observed phenotype values. Our findings provide an insight into plant genome-based selection in apples and will aid in conducting research to understand the physiological fundamentals of quantitative genetics. This article is protected by copyright. All rights reserved.
Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells
Carlile, Thomas M.; Rojas-Duran, Maria F.; Zinshteyn, Boris; Shin, Hakyung; Bartoli, Kristen M.; Gilbert, Wendy V.
2014-01-01
Post-transcriptional modification of RNA nucleosides occurs in all living organisms. Pseudouridine, the most abundant modified nucleoside in non-coding RNAs1, enhances the function of transfer RNA and ribosomal RNA by stabilizing RNA structure2–8. mRNAs were not known to contain pseudouridine, but artificial pseudouridylation dramatically affects mRNA function: it changes the genetic code by facilitating non-canonical base pairing in the ribosome decoding center9,10. However, without evidence of naturally occurring mRNA pseudouridylation, its physiological relevance was unclear. Here we present a comprehensive analysis of pseudouridylation in yeast and human RNAs using Pseudo-seq, a genome-wide, single-nucleotide-resolution method for pseudouridine identification. Pseudo-seq accurately identifies known modification sites as well as 100 novel sites in non-coding RNAs, and reveals hundreds of pseudouridylated sites in mRNAs. Genetic analysis allowed us to assign most of the new modification sites to one of seven conserved pseudouridine synthases, Pus1–4, 6, 7 and 9. Notably, the majority of pseudouridines in mRNA are regulated in response to environmental signals, such as nutrient deprivation in yeast and serum starvation in human cells. These results suggest a mechanism for the rapid and regulated rewiring of the genetic code through inducible mRNA modifications. Our findings reveal unanticipated roles for pseudouridylation and provide a resource for identifying the targets of pseudouridine synthases implicated in human disease11–13. PMID:25192136
Kawano, Tomonori
2013-03-01
There have been a wide variety of approaches for handling pieces of DNA as "unplugged" tools for digital information storage and processing, including a series of studies applied to security-related areas, such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes as media for storing digitally compressed image data are proposed for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical embedding of numeric data. As a model case of applying the image-coding DNA technique, numerically and biochemically combined protocols are employed for ciphering given "passwords" and/or secret numbers using DNA sequences. The "passwords" of interest were decomposed into single letters and translated into font images coded on separate DNA chains. Each chain has coding regions, in which the images are encoded based on the novel run-length encoding rule, and non-coding regions designed for biochemical editing and for the remodeling processes that reveal the hidden orientation of the letters composing the original "passwords." The latter processes require molecular biological tools for digestion and ligation of the fragmented DNA molecules, targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of numeric data of interest over the image-coding DNA are also discussed.
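The abstract does not spell out its run-length encoding rule, so the following is only a generic sketch of how a row of binary pixels could be run-length encoded into a DNA string. The base mapping (A=0, C=1, G=2, T=3), the two-digit base-4 run lengths, and the function names are all assumptions made for illustration, not the article's scheme.

```python
from itertools import groupby

BASES = "ACGT"  # assumed mapping: A=0, C=1, G=2, T=3


def to_dna(pixels, digits=2):
    """Encode a row of 0/1 pixels: the first base stores the first pixel
    value, then each run length follows as `digits` base-4 digits."""
    runs = [(value, len(list(group))) for value, group in groupby(pixels)]
    if not runs:
        return ""
    out = [BASES[runs[0][0]]]
    for _, length in runs:
        if length >= 4 ** digits:
            raise ValueError("run too long for the chosen digit width")
        chunk = ""
        for _ in range(digits):
            chunk = BASES[length % 4] + chunk
            length //= 4
        out.append(chunk)
    return "".join(out)


def from_dna(seq, digits=2):
    """Invert to_dna: rebuild the pixel row from the DNA string."""
    if not seq:
        return []
    value = BASES.index(seq[0])
    pixels = []
    for i in range(1, len(seq), digits):
        length = 0
        for base in seq[i:i + digits]:
            length = length * 4 + BASES.index(base)
        pixels.extend([value] * length)
        value = 1 - value  # runs of binary pixels alternate
    return pixels


row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
encoded = to_dna(row)
print(encoded)                   # AATAGCC
print(from_dna(encoded) == row)  # True
```

With two base-4 digits per run, a run can be at most 15 pixels long; a wider `digits` value trades sequence length for longer runs.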
NASA Astrophysics Data System (ADS)
Grenier, Christophe; Anbergen, Hauke; Bense, Victor; Chanzy, Quentin; Coon, Ethan; Collier, Nathaniel; Costard, François; Ferry, Michel; Frampton, Andrew; Frederick, Jennifer; Gonçalvès, Julio; Holmén, Johann; Jost, Anne; Kokh, Samuel; Kurylyk, Barret; McKenzie, Jeffrey; Molson, John; Mouche, Emmanuel; Orgogozo, Laurent; Pannetier, Romain; Rivière, Agnès; Roux, Nicolas; Rühaak, Wolfram; Scheidegger, Johanna; Selroos, Jan-Olof; Therrien, René; Vidstrand, Patrik; Voss, Clifford
2018-04-01
In high-elevation, boreal and arctic regions, hydrological processes and associated water bodies can be strongly influenced by the distribution of permafrost. Recent field and modeling studies indicate that a fully-coupled multidimensional thermo-hydraulic approach is required to accurately model the evolution of these permafrost-impacted landscapes and groundwater systems. However, the relatively new and complex numerical codes being developed for coupled non-linear freeze-thaw systems require verification. This issue is addressed by means of an intercomparison of thirteen numerical codes for two-dimensional test cases with several performance metrics (PMs). These codes comprise a wide range of numerical approaches, spatial and temporal discretization strategies, and computational efficiencies. Results suggest that the codes provide robust results for the test cases considered and that minor discrepancies are explained by computational precision. However, larger discrepancies are observed for some PMs resulting from differences in the governing equations, discretization issues, or in the freezing curve used by some codes.
1976-01-01
Lymphocytic choriomeningitis virus (LCMV) and ectromelia virus-specific T-cell-mediated cytotoxicity was assayed in various strain combinations using as targets peritoneal macrophages which have been shown to express Ia antigens. Virus-specific cytotoxicity was found only in H-2K- or D-region compatible combinations. I-region compatibility was not necessary nor alone sufficient for lysis. Six different I-region specificities had no obvious effect on the capacity to generate in vivo specific cytotoxicity (expressed in vitro) associated with Dd. Low LCMV-specific cytotoxic activity generated in DBA/2 mice was caused by the non-H-2 genetic background. This trait was inversely related to the infectious virus dose and recessive. Non-H-2 genes, possibly involved in controlling initial spread and multiplication of virus, seem to be, at least in the examples tested, more important in determining virus-specific cytotoxic T-cell activity in spleens than are Ir genes coded in H-2. PMID:1085331
Catalog of MicroRNA Seed Polymorphisms in Vertebrates
Calin, George Adrian; Horvat, Simon; Jiang, Zhihua; Dovc, Peter; Kunej, Tanja
2012-01-01
MicroRNAs (miRNAs) are a class of non-coding RNA that plays an important role in posttranscriptional regulation of mRNA. Evidence has shown that miRNA gene variability might interfere with its function resulting in phenotypic variation and disease susceptibility. A major role in miRNA target recognition is ascribed to complementarity with the miRNA seed region that can be affected by polymorphisms. In the present study, we developed an online tool for the detection of miRNA polymorphisms (miRNA SNiPer) in vertebrates (http://www.integratomics-time.com/miRNA-SNiPer) and generated a catalog of miRNA seed region polymorphisms (miR-seed-SNPs) consisting of 149 SNPs in six species. Although a majority of detected polymorphisms were due to point mutations, two consecutive nucleotide substitutions (double nucleotide polymorphisms, DNPs) were also identified in nine miRNAs. We determined that miR-SNPs are frequently located within the quantitative trait loci (QTL), chromosome fragile sites, and cancer susceptibility loci, indicating their potential role in the genetic control of various complex traits. To test this further, we performed an association analysis between the mmu-miR-717 seed SNP rs30372501, which is polymorphic in a large number of standard inbred strains, and all phenotypic traits in these strains deposited in the Mouse Phenome Database. Analysis showed a significant association between the mmu-miR-717 seed SNP and a diverse array of traits including behavior, blood-clinical chemistry, body weight size and growth, and immune system suggesting that seed SNPs can indeed have major pleiotropic effects. The bioinformatics analyses, data and tools developed in the present study can serve researchers as a starting point in testing more targeted hypotheses and designing experiments using optimal species or strains for further mechanistic studies. PMID:22303453
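As a rough illustration of what locating a polymorphism in the seed region involves, the sketch below checks whether a variant position in a mature miRNA falls inside the seed. It assumes the commonly used definition of the seed as nucleotides 2 to 8 of the mature sequence (1-based); the abstract does not state which definition the miRNA SNiPer tool uses, and the function names and the example sequence are illustrative only.

```python
SEED_START, SEED_END = 2, 8  # assumed seed definition, 1-based and inclusive


def seed_region(mature_seq):
    """Return the seed nucleotides of a mature miRNA sequence."""
    return mature_seq[SEED_START - 1:SEED_END]


def snp_in_seed(position):
    """True if a 1-based position within the mature miRNA lies in the seed."""
    return SEED_START <= position <= SEED_END


example = "UGAGGUAGUAGGUUGUAUAGUU"  # example sequence for illustration only
print(seed_region(example))  # GAGGUAG
print(snp_in_seed(5))        # True
print(snp_in_seed(12))       # False
```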
The complete mitochondrial genome of the ice pigeon (Columba livia breed ice).
Zhang, Rui-Hua; He, Wen-Xiao
2015-02-01
The ice pigeon is a breed of fancy pigeon developed over many years of selective breeding. In the present work, we report the complete mitochondrial genome sequence of ice pigeon for the first time. The total length of the mitogenome was 17,236 bp with the base composition of 30.2% for A, 24.0% for T, 31.9% for C, and 13.9% for G, and an A-T (54.2%)-rich feature was detected. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region (D-loop region). The arrangement of all genes was identical to the typical mitochondrial genomes of pigeon. The complete mitochondrial genome sequence of ice pigeon would serve as an important data set of the germplasm resources for further study.
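The reported A-T content is simply the sum of the A and T percentages (30.2% + 24.0% = 54.2%). A minimal sketch of how such a base composition could be computed from a sequence string follows; the function name and the toy sequence are assumptions for illustration, not the authors' pipeline or data.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in a DNA sequence,
    ignoring any other symbols (for example N)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


comp = base_composition("AACCCTTGAC")  # toy sequence, not real mitogenome data
at_content = comp["A"] + comp["T"]
print(comp)        # {'A': 30.0, 'C': 40.0, 'G': 10.0, 'T': 20.0}
print(at_content)  # 50.0
```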
Mitochondrial genome sequence of Egyptian swift Rock Pigeon (Columba livia breed Egyptian swift).
Li, Chun-Hong; Shi, Wei; Shi, Wan-Yu
2015-06-01
The Egyptian swift Rock Pigeon is a breed of fancy pigeon developed over many years of selective breeding. In this work, we report the complete mitochondrial genome sequence of Egyptian swift Rock Pigeon. The total length of the mitogenome was 17,239 bp and its overall base composition was estimated to be 30.2% for A, 24.0% for T, 31.9% for C and 13.9% for G, indicating an A-T (54.2%)-rich feature in the mitogenome. It contained the typical structure of 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a non-coding control region (D-loop region). The complete mitochondrial genome sequence of Egyptian swift Rock Pigeon would serve as an important data set of the germplasm resources for further study.
The complete mitochondrial genome of the Fancy Pigeon, Columba livia (Columbiformes: Columbidae).
Zhang, Rui-Hua; Xu, Ming-Ju; Wang, Cun-Lian; Xu, Tong; Wei, Dong; Liu, Bao-Jian; Wang, Guo-Hua
2015-02-01
The fancy pigeons are domesticated varieties of the rock pigeon developed over many years of selective breeding. In the present work, we report the complete mitochondrial genome sequence of fancy pigeon for the first time. The total length of the mitogenome was 17,233 bp with the base composition of 30.1% for A, 24.0% for T, 31.9% for C, and 14.0% for G, and an A-T (54.2%)-rich feature was detected. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region (D-loop region). The arrangement of all genes was identical to the typical mitochondrial genomes of pigeon. The complete mitochondrial genome sequence of fancy pigeon would serve as an important data set of the germplasm resources for further study.
Hemipteran Mitochondrial Genomes: Features, Structures and Implications for Phylogeny
Wang, Yuan; Chen, Jing; Jiang, Li-Yun; Qiao, Ge-Xia
2015-01-01
The study of Hemipteran mitochondrial genomes (mitogenomes) began with the Chagas disease vector, Triatoma dimidiata, in 2001. At present, 90 complete Hemipteran mitogenomes have been sequenced and annotated. This review examines the history of Hemipteran mitogenome research and summarizes their main features, including genome organization, nucleotide composition, protein-coding genes, tRNAs and rRNAs, and non-coding regions. Special attention is given to the comparative analysis of repeat regions. Gene rearrangements are an additional data type for a few families, and most mitogenomes are arranged in the same order as that of the proposed ancestral insect. We also discuss and provide insights on the phylogenetic analyses of a variety of taxonomic levels. This review is expected to further expand our understanding of research in this field and serve as a valuable reference resource. PMID:26039239
USDA-ARS's Scientific Manuscript database
The aneupolyploidy genome of sugarcane (Saccharum hybrids spp.) and lack of a classical genetic linkage map make genetics research most difficult for sugarcane. Whole genome sequencing and genetic characterization of sugarcane and related taxa are far behind other crops. In this study, universal PCR...
High-speed inlet research program and supporting analyses
NASA Technical Reports Server (NTRS)
Coltrin, Robert E.
1987-01-01
A Mach 5 cruise aircraft was studied in a joint program effort. The propulsion system chosen for this aircraft was an over-under turbojet/ramjet system. The ramjet portion of the inlet is to be tested in NASA Lewis' 10 x 10 SWT. Goals of the test program are to obtain performance data and bleed requirements, and also to obtain analysis code validation data. Supporting analysis of the inlet using a three-dimensional Navier-Stokes code (PEPSIS) indicates that sidewall shock/boundary layer interactions cause large separated regions in the corners underneath the cowl. Such separations generally lead to inlet unstart, and are thus a major concern. As a result of the analysis, additional bleed regions were added to the inlet model sidewalls and cowl to control separations in the corners. A two-dimensional analysis incorporating bleed on the ramp is also presented. Supporting experiments for the Mach 5 programs were conducted in the Lewis' 1 x 1 SWT. A small-scale model representing the inlet geometry up to the ramp shoulder and cowl lip was tested to verify the accelerator plate test technique and to obtain data on flow migration in the ramp and sidewall boundary layers. Another study explored several ramp bleed configurations to control boundary layer separations in that region. Design of a two-dimensional Mach 5 cruise inlet represents several major challenges including multimode operation and dual flow, high temperatures, and three-dimensional airflow effects.
Analysis of alternative cleavage and polyadenylation by 3′ region extraction and deep sequencing
Hoque, Mainul; Ji, Zhe; Zheng, Dinghai; Luo, Wenting; Li, Wencheng; You, Bei; Park, Ji Yeon; Yehia, Ghassan; Tian, Bin
2012-01-01
Alternative cleavage and polyadenylation (APA) leads to mRNA isoforms with different coding sequences (CDS) and/or 3′ untranslated regions (3′UTRs). Using 3′ Region Extraction And Deep Sequencing (3′READS), a method which addresses the internal priming and oligo(A) tail issues that commonly plague polyA site (pA) identification, we comprehensively mapped pAs in the mouse genome, thoroughly annotating 3′ ends of genes and revealing over five thousand pAs (~8% of total) flanked by A-rich sequences, which have hitherto been overlooked. About 79% of mRNA genes and 66% of long non-coding RNA (lncRNA) genes have APA; but these two gene types have distinct usage patterns for pAs in introns and upstream exons. Promoter-distal pAs become relatively more abundant during embryonic development and cell differentiation, a trend affecting pAs in both 3′-most exons and upstream regions. Upregulated isoforms generally have stronger pAs, suggesting global modulation of the 3′ end processing activity in development and differentiation. PMID:23241633
Region-of-interest determination and bit-rate conversion for H.264 video transcoding
NASA Astrophysics Data System (ADS)
Huang, Shu-Fen; Chen, Mei-Juan; Tai, Kuang-Han; Li, Mian-Shiuan
2013-12-01
This paper presents a video bit-rate transcoder for baseline profile in H.264/AVC standard to fit the available channel bandwidth for the client when transmitting video bit-streams via communication channels. To maintain visual quality for low bit-rate video efficiently, this study analyzes the decoded information in the transcoder and proposes a Bayesian theorem-based region-of-interest (ROI) determination algorithm. In addition, a curve fitting scheme is employed to find the models of video bit-rate conversion. The transcoded video will conform to the target bit-rate by re-quantization according to our proposed models. After integrating the ROI detection method and the bit-rate transcoding models, the ROI-based transcoder allocates more coding bits to ROI regions and reduces the complexity of the re-encoding procedure for non-ROI regions. Hence, it not only keeps the coding quality but improves the efficiency of the video transcoding for low target bit-rates and makes the real-time transcoding more practical. Experimental results show that the proposed framework gets significantly better visual quality.
Hypoxia-induced long non-coding RNA Malat1 is dispensable for renal ischemia/reperfusion-injury.
Kölling, Malte; Genschel, Celina; Kaucsar, Tamas; Hübner, Anika; Rong, Song; Schmitt, Roland; Sörensen-Zender, Inga; Haddad, George; Kistler, Andreas; Seeger, Harald; Kielstein, Jan T; Fliser, Danilo; Haller, Hermann; Wüthrich, Rudolf; Zörnig, Martin; Thum, Thomas; Lorenzen, Johan
2018-02-21
Renal ischemia-reperfusion (I/R) injury is a major cause of acute kidney injury (AKI). Non-coding RNAs are crucially involved in its pathophysiology. We identified hypoxia-induced long non-coding RNA Malat1 (Metastasis Associated Lung Adenocarcinoma Transcript 1) to be upregulated in renal I/R injury. We here elucidated the functional role of Malat1 in vitro and its potential contribution to kidney injury in vivo. Malat1 was upregulated in kidney biopsies and plasma of patients with AKI, in murine hypoxic kidney tissue as well as in cultured and ex vivo sorted hypoxic endothelial cells and tubular epithelial cells. Malat1 was transcriptionally activated by hypoxia-inducible factor 1-α. In vitro, Malat1 inhibition reduced proliferation and the number of endothelial cells in the S-phase of the cell cycle. In vivo, Malat1 knockout and wildtype mice showed similar degrees of outer medullary tubular epithelial injury, proliferation, capillary rarefaction, inflammation and fibrosis, survival and kidney function. Small-RNA sequencing and whole genome expression analysis revealed only minor changes between ischemic Malat1 knockout and wildtype mice. Contrary to previous studies, which suggested a prominent role of Malat1 in the induction of disease, we did not confirm an in vivo role of Malat1 concerning renal I/R-injury.
Wang, Po-Shun; Chou, Cheng-Han; Lin, Cheng-Han; Yao, Yun-Chin; Cheng, Hui-Chuan; Li, Hao-Yi; Chuang, Yu-Chung; Yang, Chia-Ning; Ger, Luo-Ping; Chen, Yu-Chia; Lin, Forn-Chia; Shen, Tang-Long; Hsiao, Michael; Lu, Pei-Jung
2018-05-14
Triple-negative breast cancer (TNBC) patients usually have a poor prognosis and poor survival because of metastasis. The major sites for TNBC metastasis include the lungs, brain, liver, and bone. Long non-coding RNAs (lncRNAs) are non-protein-coding transcripts longer than 200 nucleotides and have been reported as important regulators in BC metastasis. However, the underlying mechanisms by which lncRNAs regulate TNBC metastasis are not fully understood. Here we found that linc-ZNF469-3 was highly expressed in lung-metastatic LM2-4175 TNBC cells and that overexpression of linc-ZNF469-3 enhanced invasion ability and stemness properties in vitro and lung metastasis in vivo. Furthermore, we found that linc-ZNF469-3 physically interacted with miR-574-5p and that overexpression of miR-574-5p attenuated ZEB1 expression. Importantly, endogenous high expression of linc-ZNF469-3 and ZEB1 was correlated with tumor recurrence in TNBC patients with lung metastasis. Taken together, our findings suggest that linc-ZNF469-3 promotes lung metastasis of TNBC through the miR-574-5p-ZEB1 signaling axis and may be used as a potential prognostic marker for TNBC patients.
Heery, Richard; Finn, Stephen P.; Cuffe, Sinead; Gray, Steven G.
2017-01-01
Epithelial mesenchymal transition (EMT), the adoption by epithelial cells of a mesenchymal-like phenotype, is a process co-opted by carcinoma cells in order to initiate invasion and metastasis. In addition, it is becoming clear that it is instrumental both in the development of drug resistance by tumour cells and in the generation and maintenance of cancer stem cells. EMT is thus a pivotal process during tumour progression and poses a major barrier to the successful treatment of cancer. Non-coding RNAs (ncRNA) often utilize epigenetic programs to regulate both gene expression and chromatin structure. One type of ncRNA, called long non-coding RNAs (lncRNAs), has become increasingly recognized as both being highly dysregulated in cancer and playing a variety of different roles in tumourigenesis. Indeed, over the last few years, lncRNAs have rapidly emerged as key regulators of EMT in cancer. In this review, we discuss the lncRNAs that have been associated with the EMT process in cancer and the variety of molecular mechanisms and signalling pathways through which they regulate EMT, and finally discuss how these EMT-regulating lncRNAs impact on both anti-cancer drug resistance and the cancer stem cell phenotype. PMID:28430163
Bimolata, Waikhom; Kumar, Anirudh; Sundaram, Raman Meenakshi; Laha, Gouri Shankar; Qureshi, Insaf Ahmed; Reddy, Gajjala Ashok; Ghazi, Irfan Ahmad
2013-08-01
Xa27 is one of the important R-genes, effective against bacterial blight disease of rice caused by Xanthomonas oryzae pv. oryzae (Xoo). Using natural population of Oryza, we analyzed the sequence variation in the functionally important domains of Xa27 across the Oryza species. DNA sequences of Xa27 alleles from 27 rice accessions revealed higher nucleotide diversity among the reported R-genes of rice. Sequence polymorphism analysis revealed synonymous and non-synonymous mutations in addition to a number of InDels in non-coding regions of the gene. High sequence variation was observed in the promoter region including the 5'UTR with 'π' value 0.00916 and 'θw' = 0.01785. Comparative analysis of the identified Xa27 alleles with that of IRBB27 and IR24 indicated the operation of both positive selection (Ka/Ks > 1) and neutral selection (Ka/Ks ≈ 0). The genetic distances of alleles of the gene from Oryza nivara were nearer to IRBB27 as compared to IR24. We also found the presence of conserved and null UPT (upregulated by transcriptional activator) box in the isolated alleles. Considerable amino acid polymorphism was localized in the trans-membrane domain for which the functional significance is yet to be elucidated. However, the absence of functional UPT box in all the alleles except IRBB27 suggests the maintenance of single resistant allele throughout the natural population.
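The π and θw statistics quoted in this abstract are standard diversity estimators: π is the mean number of pairwise differences per site, and Watterson's θw is the number of segregating sites divided by the harmonic number a_n = 1 + 1/2 + ... + 1/(n-1) and by the sequence length. A minimal sketch follows, assuming equal-length, gap-free aligned sequences; the function names and toy alleles are illustrative and are not the study's data or software.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per site over all sequence pairs."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """theta_w per site: segregating sites / (a_n * alignment length)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating / (a_n * length)


alleles = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]  # toy alignment
pi = nucleotide_diversity(alleles)
theta_w = watterson_theta(alleles)
print(round(pi, 4), round(theta_w, 4))  # 0.1667 0.1667
```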
Sun, Jing; Wu, Wenbin; Tang, Huajun; Liu, Jianguo
2015-01-01
Despite heated debates over the safety of genetically modified (GM) food, GM crops have been expanding rapidly. Much research has focused on the expansion of GM crops. However, the spatiotemporal dynamics of non-genetically modified (non-GM) crops are not clear, although they may have significant environmental and agronomic impacts and important policy implications. To understand the dynamics of non-GM crops and to inform the debates among relevant stakeholders, we conducted spatiotemporal analyses of China’s major non-GM soybean production region, the Heilongjiang Province. Even though the total soybean planting area decreased from 2005 to 2010, surprisingly, there were hotspots of increase. The results also showed hotspots of loss as well as a large decline in the number and continuity of soybean plots. Since China is the largest non-GM soybean producer in the world, the decline of its major production region may signal the continual decline of global non-GM soybeans. PMID:26380899
Liu, Ye; Li, Nan; Zhang, Shoufeng; Zhang, Fei; Lian, Hai; Wang, Ying; Zhang, Jinxia; Hu, Rongliang
2013-12-01
The genome of Irkut virus, isolate IRKV-THChina12, the first non-rabies lyssavirus from China (of bat origin), has been completely sequenced. In general, coding and non-coding regions of this viral genome are similar to those of other lyssaviruses. However, alignment of the deduced amino acid sequences of the structural proteins of IRKV-THChina12 with those of other lyssavirus representatives revealed significant variability between viral species. The nucleoprotein and matrix protein were found to be the most conserved, followed by the large protein, glycoprotein and phosphoprotein. Differences in the antigenic sites in glycoprotein may result in only partial protection of the available rabies biologics against Irkut virus, which is of particular concern for pre- and post-exposure rabies prophylaxis. Copyright © 2013 Elsevier Inc. All rights reserved.
DataRocket: Interactive Visualisation of Data Structures
NASA Astrophysics Data System (ADS)
Parkes, Steve; Ramsay, Craig
2010-08-01
CodeRocket is a software engineering tool that provides cognitive support to the software engineer for reasoning about a method or procedure and for documenting the resulting code [1]. DataRocket is a software engineering tool designed to support visualisation and reasoning about program data structures. DataRocket is part of the CodeRocket family of software tools developed by Rapid Quality Systems [2], a spin-out company from the Space Technology Centre at the University of Dundee. CodeRocket and DataRocket integrate seamlessly with existing architectural design and coding tools and provide extensive documentation with little or no effort on behalf of the software engineer. Comprehensive, abstract, detailed design documentation is available early on in a project so that it can be used for design reviews with project managers and non-expert stakeholders. Code and documentation remain fully synchronised even when changes are implemented in the code without reference to the existing documentation. At the end of a project the press of a button suffices to produce the detailed design document. Existing legacy code can be easily imported into CodeRocket and DataRocket to reverse engineer detailed design documentation, making legacy code more manageable and adding substantially to its value. This paper introduces CodeRocket. It then explains the rationale for DataRocket and describes the key features of this new tool. Finally, the major benefits of DataRocket for different stakeholders are considered.
Tam, Vivian; Edge, Jennifer S; Hoffman, Steven J
2016-10-12
Shortages of health workers in low-income countries are exacerbated by the international migration of health workers to more affluent countries. This problem is compounded by the active recruitment of health workers by destination countries, particularly Australia, Canada, UK and USA. The World Health Organization (WHO) adopted a voluntary Code of Practice in May 2010 to mitigate tensions between health workers' right to migrate and the shortage of health workers in source countries. The first empirical impact evaluation of this Code was conducted 11 months after its adoption and demonstrated a lack of impact on health workforce recruitment policy and practice in the short term. This second empirical impact evaluation was conducted 4 years post-adoption using the same methodology to determine whether there have been any changes in the perceived utility, applicability, and implementation of the Code in the medium term. Forty-four respondents representing government, civil society and the private sector from Australia, Canada, UK and USA completed an email-based survey evaluating their awareness of the Code, perceived impact, changes to policy or recruitment practices resulting from the Code, and the effectiveness of non-binding Codes generally. The same survey instrument from the original study was used to facilitate direct comparability of responses. Key lessons were identified through thematic analysis. The main findings of the initial impact evaluation and the current one are unchanged. Both sets of key informants reported no significant policy or regulatory changes to health worker recruitment in their countries as a direct result of the Code due to its lack of incentives, institutional mechanisms and interest mobilizers. Participants emphasized the existence of previous bilateral and regional Codes, the WHO Code's non-binding nature, and the primacy of competing domestic healthcare priorities in explaining this perceived lack of impact.
The Code has probably still not produced the tangible improvements in health worker flows it aspired to achieve. Several actions, including a focus on developing bilateral codes, linking the Code to topical global priorities, and reframing the Code's purpose to emphasize health system sustainability, are proposed to improve the Code's uptake and impact.
Prediction of plant lncRNA by ensemble machine learning classifiers.
Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian
2018-05-02
In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. 
Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.
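The stacking idea described in this abstract (several base classifiers whose outputs feed a single meta-learner) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the base learners here are single-feature logistic regressions standing in for the gradient boosting and random forest models, the two features and six toy transcripts are hypothetical, and the meta-learner is trained on in-sample base predictions for brevity, whereas a careful implementation would use out-of-fold predictions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Logistic prediction; the last entry of w is the bias term."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])

def fit_logistic(rows, labels, epochs=20000, lr=0.5):
    """Fit logistic regression by batch gradient descent from zero weights."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

# Hypothetical toy data: two made-up sequence features per transcript;
# label 1 = validated lncRNA, 0 = other transcript.
X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30),
     (0.80, 0.90), (0.90, 0.80), (0.70, 0.70)]
y = [1, 1, 1, 0, 0, 0]

# Level 0: one base learner per feature (stand-ins for the real classifiers).
base_models = [fit_logistic([(row[j],) for row in X], y) for j in range(2)]

def base_outputs(row):
    """Probabilities from each base learner, used as meta-features."""
    return tuple(predict(w, (row[j],)) for j, w in enumerate(base_models))

# Level 1: the meta-learner combines the base-learner probabilities.
meta_model = fit_logistic([base_outputs(row) for row in X], y)

def stacked_score(row):
    return predict(meta_model, base_outputs(row))

# Rank candidates from most to least lncRNA-like.
ranked = sorted(X, key=stacked_score, reverse=True)
```

The final ranking step mirrors the stated use of the ensemble: ordering candidate transcripts for follow-up rather than making a hard yes/no call.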
Michikawa, Takehiro; Morokuma, Seiichi; Nitta, Hiroshi; Kato, Kiyoko; Yamazaki, Shin
2017-06-13
Numerous earlier studies examining the association of air pollution with maternal and foetal health estimated maternal exposure to air pollutants based on the women's residential addresses. However, residential addresses, which are personally identifiable information, are not always obtainable. Since a majority of pregnant women reside near their delivery hospitals, the concentrations of air pollutants at the respective delivery hospitals may be surrogate markers of pollutant exposure at home. We compared air pollutant concentrations measured at the nearest monitoring station to Kyushu University Hospital with those measured at the closest monitoring stations to the respective residential postal code regions of pregnant women in Fukuoka. Aggregated postal code data for the home addresses of pregnant women who delivered at Kyushu University Hospital in 2014 was obtained from Kyushu University Hospital. For each of the study's 695 women who resided in Fukuoka Prefecture, we assigned pollutant concentrations measured at the nearest monitoring station to Kyushu University Hospital and pollutant concentrations measured at the nearest monitoring station to their respective residential postal code regions. Among the 695 women, 584 (84.0%) resided in the proximity of the nearest monitoring station to the hospital or one of the four other stations (as the nearest stations to their respective residential postal code region) in Fukuoka city. Pearson's correlation for daily mean concentrations among the monitoring stations in Fukuoka city was strong for fine particulate matter (PM2.5), suspended particulate matter (SPM), and photochemical oxidants (Ox) (coefficients ≥0.9), but moderate for coarse particulate matter (the result of subtracting the PM2.5 from the SPM concentrations), nitrogen dioxide, and sulphur dioxide. Hospital-based and residence-based concentrations of PM2.5, SPM, and Ox were comparable.
For PM2.5, SPM, and Ox, exposure estimation based on the delivery hospital is likely to approximate that based on the home of pregnant women.
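Two calculations in this abstract are simple enough to sketch: the coarse particulate fraction obtained as SPM minus PM2.5, and Pearson's correlation between daily means at two stations. The Python below is a minimal illustration; the function names and the station readings are hypothetical values chosen for the example, not measurements from the study.

```python
import math

def coarse_pm(spm, pm25):
    """Coarse particulate matter as the daily difference SPM - PM2.5."""
    return [s - p for s, p in zip(spm, pm25)]

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily mean PM2.5 readings at a hospital-side station
# and a residence-side station.
hospital_pm25 = [12.0, 18.0, 25.0, 9.0, 15.0]
residence_pm25 = [11.0, 19.0, 24.0, 10.0, 14.0]
r = pearson(hospital_pm25, residence_pm25)
```

A coefficient of 0.9 or above would fall in the range the abstract describes as strong agreement between stations.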
Supersymmetric and non-supersymmetric models without catastrophic Goldstone bosons
NASA Astrophysics Data System (ADS)
Braathen, Johannes; Goodsell, Mark D.; Staub, Florian
2017-11-01
The calculation of the Higgs mass in general renormalisable field theories has been plagued by the so-called "Goldstone Boson Catastrophe," where light (would-be) Goldstone bosons give infra-red divergent loop integrals. In supersymmetric models, previous approaches included a workaround that ameliorated the problem for most, but not all, parameter space regions; while giving divergent results everywhere for non-supersymmetric models! We present an implementation of a general solution to the problem in the public code SARAH, along with new calculations of some necessary loop integrals and generic expressions. We discuss the validation of our code in the Standard Model, where we find remarkable agreement with the known results. We then show new applications in Split SUSY, the NMSSM, the Two-Higgs-Doublet Model, and the Georgi-Machacek model. In particular, we take some first steps to exploring where the habit of using tree-level mass relations in non-supersymmetric models breaks down, and show that the loop corrections usually become very large well before naive perturbativity bounds are reached.
Medical Ultrasound Video Coding with H.265/HEVC Based on ROI Extraction
Wu, Yueying; Liu, Pengyu; Gao, Yuan; Jia, Kebin
2016-01-01
High-efficiency video compression technology is of primary importance to the storage and transmission of digital medical video in modern medical communication systems. To further improve the compression performance of medical ultrasound video, two innovative technologies based on diagnostic region-of-interest (ROI) extraction using the high efficiency video coding (H.265/HEVC) standard are presented in this paper. First, an effective ROI extraction algorithm based on image textural features is proposed to strengthen the applicability of ROI detection results in the H.265/HEVC quad-tree coding structure. Second, a hierarchical coding method based on transform coefficient adjustment and a quantization parameter (QP) selection process is designed to implement the otherness encoding for ROIs and non-ROIs. Experimental results demonstrate that the proposed optimization strategy significantly improves the coding performance by achieving a BD-BR reduction of 13.52% and a BD-PSNR gain of 1.16 dB on average compared to H.265/HEVC (HM15.0). The proposed medical video coding algorithm is expected to satisfy low bit-rate compression requirements for modern medical communication systems. PMID:27814367
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-30
...] FDA's Public Database of Products With Orphan-Drug Designation: Replacing Non-Informative Code Names... replaced non-informative code names with descriptive identifiers on its public database of products that... on our public database with non-informative code names. After careful consideration of this matter...
Facts and updates about cardiovascular non-coding RNAs in heart failure.
Thum, Thomas
2015-09-01
About 11% of all deaths include heart failure as a contributing cause. The annual cost of heart failure amounts to US $34,000,000,000 in the United States alone. With the exception of heart transplantation, there is no curative therapy available. Only occasionally there are new areas in science that develop into completely new research fields. The topic on non-coding RNAs, including microRNAs, long non-coding RNAs, and circular RNAs, is such a field. In this short review, we will discuss the latest developments about non-coding RNAs in cardiovascular disease. MicroRNAs are short regulatory non-coding endogenous RNA species that are involved in virtually all cellular processes. Long non-coding RNAs also regulate gene and protein levels; however, by much more complicated and diverse mechanisms. In general, non-coding RNAs have been shown to be of great value as therapeutic targets in adverse cardiac remodelling and also as diagnostic and prognostic biomarkers for heart failure. In the future, non-coding RNA-based therapeutics are likely to enter the clinical reality offering a new treatment approach of heart failure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poliakov, Alexander; Couronne, Olivier
2002-11-04
Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates, which are then globally aligned using the AVID global alignment program. In the last step, conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
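The last step of the pipeline above, flagging conserved segments in a pairwise alignment, is commonly done with a sliding-window identity scan. The Python sketch below illustrates that general idea only; it is not the VISTA implementation, and the function name, the default window of 100 columns, and the 70% identity cutoff are assumptions chosen for illustration.

```python
def conserved_windows(aln_a, aln_b, window=100, min_identity=0.70):
    """Return start columns of windows whose identity meets the cutoff.

    aln_a and aln_b are two rows of a pairwise alignment of equal length;
    '-' marks a gap and never counts as a match.
    """
    hits = []
    for start in range(len(aln_a) - window + 1):
        matches = sum(
            1
            for a, b in zip(aln_a[start:start + window], aln_b[start:start + window])
            if a == b and a != "-"
        )
        if matches / window >= min_identity:
            hits.append(start)
    return hits

# Hypothetical toy alignment with a short window so the result is easy to check.
toy_hits = conserved_windows("ACGTACGTAA", "ACGTTTTTAA", window=4, min_identity=0.75)
```

To restrict the result to non-coding segments, the flagged windows would still need to be intersected with exon annotations, which this sketch leaves out.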
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, O.; Masters, C.; Lewis, M.B.
1994-09-01
In an 8-year-old girl and her father, both of whom have severe type III OI, we have previously used RNA/RNA hybrid analysis to demonstrate a mismatch in the region of α1(I) mRNA coding for aa 558-861. We used SSCP to further localize the abnormality to a subregion coding for aa 579-679. This region was subcloned and sequenced. Each patient's cDNA has a deletion of the sequences coding for the last residue of exon 34, and all of exons 35 and 36 (aa 604-639), followed by an insertion of 156 nt from the 3'-end of intron 36. PCR amplification of leukocyte DNA from the patients and the clinically normal paternal grandmother yielded two fragments: a 1007 bp fragment predicted from normal genomic sequences and a 445 bp fragment. Subcloning and sequencing of the shorter genomic PCR product confirmed the presence of a 565 bp genomic deletion from the end of exon 34 to the middle of intron 36. The abnormal protein is apparently synthesized and incorporated into helix. The inserted nucleotides are in frame with the collagenous sequence and contain no stop codons. They encode a 52 aa non-collagenous region. The fibroblast procollagen of the patients has both normal and electrophoretically delayed proα(I) bands. The electrophoretically delayed procollagen is very sensitive to pepsin or trypsin digestion, as predicted by its non-collagenous sequence, and cannot be visualized as collagen. This unique OI collagen mutation is an excellent candidate for molecular targeting to "turn off" a dominant mutant allele.
Maruyama, Atsushi; Mimura, Junsei; Itoh, Ken
2014-01-01
Recent studies have disclosed the function of enhancer RNAs (eRNAs), which are long non-coding RNAs transcribed from gene enhancer regions, in transcriptional regulation. However, it remains unclear whether eRNAs are involved in the regulation of human heme oxygenase-1 gene (HO-1) induction. Here, we report that multiple nuclear-enriched eRNAs are transcribed from the regions adjacent to two human HO-1 enhancers (i.e. the distal E2 and proximal E1 enhancers), and some of these eRNAs are induced by the oxidative stress-causing reagent diethyl maleate (DEM). We demonstrated that the expression of one forward direction (5′ to 3′) eRNA transcribed from the human HO-1 E2 enhancer region (named human HO-1 enhancer RNA E2-3; hereafter called eRNA E2-3) was induced by DEM in an NRF2-dependent manner in HeLa cells. Conversely, knockdown of BACH1, a repressor of HO-1 transcription, further increased DEM-inducible eRNA E2-3 transcription as well as HO-1 expression. In addition, we showed that knockdown of eRNA E2-3 selectively down-regulated DEM-induced HO-1 expression. Furthermore, eRNA E2-3 knockdown attenuated DEM-induced Pol II binding to the promoter and E2 enhancer regions of HO-1 without affecting NRF2 recruitment to the E2 enhancer. These findings indicate that eRNA E2-3 is functional and is required for HO-1 induction. PMID:25404134
Global Estimates of Trace Gas Fluxes Affected by Land Use Change and Irrigation of Major Crops
NASA Astrophysics Data System (ADS)
Ojima, D. S.; del Grosso, S.; Parton, W. J.; Keough, C.
2005-12-01
Cropland conversions have altered many fertile regions of the earth and have modified the biogeochemical and hydrological cycling in these regions. These croplands are significant sources of N trace gas emissions; however, the extent to which trace gas emissions change with land management and irrigation needs further analysis. We use the DAYCENT biogeochemical model, which is a daily time step version of the CENTURY model. DAYCENT simulates fluxes of N2O between croplands and the atmosphere for major crop types, and allows for a dynamic representation of GHG fluxes that accounts for environmental conditions, soil characteristics, climate, specific crop qualities, and fertilizer and irrigation management practices. DAYCENT is applied to all world cropland regions. Global datasets of weather, soils, native vegetation and cropping fractions were mapped to an approximate 2° x 2° resolution. Non-spatial data (such as planting date and fertilizer application rates) were assigned as point values for each region (i.e. country), and were assumed to be similar within crop types across the region. Three major crops were simulated (corn, wheat and soybeans) under both irrigated and non-irrigated conditions. Results indicate that N2O emissions for maize and soybean increase by 3 to 10%, whereas wheat emissions decline by about 1%, when irrigated systems are compared to non-irrigated systems.
Bayesian variable selection for post-analytic interrogation of susceptibility loci.
Chen, Siying; Nunez, Sara; Reilly, Muredach P; Foulkes, Andrea S
2017-06-01
Understanding the complex interplay among protein coding genes and regulatory elements requires rigorous interrogation with analytic tools designed for discerning the relative contributions of overlapping genomic regions. To this aim, we offer a novel application of Bayesian variable selection (BVS) for classifying genomic class level associations using existing large meta-analysis summary level resources. This approach is applied using the expectation maximization variable selection (EMVS) algorithm to typed and imputed SNPs across 502 protein coding genes (PCGs) and 220 long intergenic non-coding RNAs (lncRNAs) that overlap 45 known loci for coronary artery disease (CAD) using publicly available Global Lipids Genetics Consortium (GLGC) (Teslovich et al., 2010; Willer et al., 2013) meta-analysis summary statistics for low-density lipoprotein cholesterol (LDL-C). The analysis reveals 33 PCGs and three lncRNAs across 11 loci with >50% posterior probabilities for inclusion in an additive model of association. The findings are consistent with previous reports, while providing some new insight into the architecture of LDL-cholesterol to be investigated further. As genomic taxonomies continue to evolve, additional classes, such as enhancer elements and splicing regions, can easily be layered into the proposed analysis framework. Moreover, application of this approach to alternative publicly available meta-analysis resources, or more generally as a post-analytic strategy to further interrogate regions that are identified through single point analysis, is straightforward. All coding examples are implemented in R version 3.2.1 and provided as supplemental material. © 2016, The International Biometric Society.
Localization of TFIIB binding regions using serial analysis of chromatin occupancy
Yochum, Gregory S; Rajaraman, Veena; Cleland, Ryan; McWeeney, Shannon
2007-01-01
Background: RNA Polymerase II (RNAP II) is recruited to core promoters by the pre-initiation complex (PIC) of general transcription factors. Within the PIC, transcription factor for RNA polymerase IIB (TFIIB) determines the start site of transcription. TFIIB binding has not been localized, genome-wide, in metazoans. Serial analysis of chromatin occupancy (SACO) is an unbiased methodology used to empirically identify transcription factor binding regions. In this report, we use TFIIB and SACO to localize TFIIB binding regions across the rat genome. Results: A sample of the TFIIB SACO library was sequenced and 12,968 TFIIB genomic signature tags (GSTs) were assigned to the rat genome. GSTs are 20–22 base pair fragments that are derived from TFIIB bound chromatin. TFIIB localized to both non-protein coding and protein-coding loci. For 21% of the 1783 protein-coding genes in this sample of the SACO library, TFIIB binding mapped near the characterized 5' promoter that is upstream of the transcription start site (TSS). However, internal TFIIB binding positions were identified in 57% of the 1783 protein-coding genes. Internal positions are defined as those within an inclusive region greater than 2.5 kb downstream from the 5' TSS and 2.5 kb upstream from the transcription stop. We demonstrate that both TFIIB and TFIID (an additional component of PICs) bound to internal regions using chromatin immunoprecipitation (ChIP). The 5' cap of transcripts associated with internal TFIIB binding positions were identified using a cap-trapping assay. The 5' TSSs for internal transcripts were confirmed by primer extension. Additionally, an analysis of the functional annotation of mouse 3 (FANTOM3) databases indicates that internally initiated transcripts identified by TFIIB SACO in rat are conserved in mouse. 
Conclusion: Our finding that TFIIB binding is not restricted to the 5' upstream region indicates that the propensity for PIC to contribute to transcript diversity is far greater than previously appreciated. PMID:17997859
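The abstract's definition of an internal binding position (at least 2.5 kb downstream of the 5' TSS and at least 2.5 kb upstream of the transcription stop) can be written as a small Python check. This is a sketch of one reading of that definition: the function name is hypothetical, coordinates are assumed to be single-base genomic positions, and the boundaries are treated as inclusive, which the abstract leaves ambiguous.

```python
FLANK_BP = 2500  # 2.5 kb exclusion zone at each end of the gene

def is_internal(position, tss, stop, flank=FLANK_BP):
    """True if a binding position lies in the internal region of a gene.

    Works for either strand: on the minus strand the TSS coordinate is larger
    than the stop coordinate, and the symmetric flank makes sorting sufficient.
    """
    low, high = sorted((tss, stop))
    return low + flank <= position <= high - flank

# Hypothetical gene spanning 1,000 to 20,000 on the plus strand.
example = is_internal(5000, tss=1000, stop=20000)
```

Under this definition a gene shorter than 5 kb has no internal region at all, so every binding position in it is classed as non-internal.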
Identifying Neck and Back Pain in Administrative Data: Defining the right cohort
Siroka, Andrew M.; Shane, Andrea C.; Trafton, Jodie A.; Wagner, Todd H.
2017-01-01
Study design: We reviewed existing methods for identifying patients with neck and back pain in administrative data. We compared these methods using data from the Department of Veterans Affairs. Objective: To answer the following questions: 1) what diagnosis codes should be used to identify patients with neck and back pain in administrative data; 2) because the majority of complaints are characterized as non-specific or mechanical, what diagnosis codes should be used to identify patients with non-specific or mechanical problems in administrative data; and 3) what procedure and surgical codes should be used to identify patients who have undergone a surgical procedure on the neck or back. Summary of background data: Musculoskeletal neck and back pain are pervasive problems, associated with chronic pain, disability, and high rates of healthcare utilization. Administrative data have been widely used in formative research, which has largely relied on the original work of Volinn, Cherkin, Deyo and Einstadter and the Back Pain Patient Outcomes Assessment Team first published in 1992. Significant variation in reports of incidence, prevalence, and morbidity associated with these problems may be due to non-standard or conflicting methods to define study cohorts. Methods: A literature review produced seven methods for identifying neck and back pain in administrative data. These code lists were used to search VA data for patients with back and neck problems, and to further categorize each case by spinal segment involved, as non-specific/mechanical and as surgical or not. Results: There is considerable overlap in most algorithms. However, gaps remain. Conclusions: Gaps are evident in existing methods and a new framework to identify patients with neck and back pain in administrative data is proposed. PMID:22127268
Jin, Xiaoxin; Cai, Lifeng; Wang, Changfa; Deng, Xiaofeng; Yi, Shengen; Lei, Zhao; Xiao, Qiangsheng; Xu, Hongbo; Luo, Hongwu; Sun, Jichun
2018-02-23
Hepatocellular carcinoma is one of the most common solid tumors in the digestive system. The prognosis of patients with hepatocellular carcinoma is still poor due to the acquisition of multi-drug resistance. TNF Related Apoptosis Inducing Ligand (TRAIL), an attractive anticancer agent, selectively induces apoptosis in tumor cells through death receptors and the formation of the downstream death-inducing signaling complex, which activates apical caspases 3/8 and leads to apoptosis. However, hepatocellular carcinoma cells are resistant to TRAIL. Non-coding RNAs, including long non-coding RNAs (lncRNAs) and miRNAs, have been regarded as major regulators of normal development and diseases, including cancers. Moreover, lncRNAs and miRNAs have been reported to be associated with multi-drug resistance. In the present study, we investigated the mechanism by which TRAIL resistance of hepatocellular carcinoma is affected from the perspective of non-coding RNA regulation. We selected and validated candidate miRNAs, miR-24 and miR-221, that regulated caspase 3/8 expression through direct targeting, thereby affecting TRAIL-induced tumor cell apoptosis and the TRAIL resistance of hepatocellular carcinoma. In addition, we revealed that CASC2, a well-established tumor suppressive long non-coding RNA, could serve as a "sponge" of miR-24 and miR-221, thus modulating TRAIL-induced tumor cell apoptosis and the TRAIL resistance of hepatocellular carcinoma. Taken together, we demonstrated a CASC2/miR-24/miR-221 axis, which can affect the TRAIL resistance of hepatocellular carcinoma through regulating caspase 3/8; through acting as a "sponge" of miR-24 and miR-221, CASC2 may contribute to improving hepatocellular carcinoma TRAIL resistance, and finally promoting the treatment efficiency of TRAIL-based therapies.
Dental Faculty Accuracy When Using Diagnostic Codes: A Pilot Study.
Sutton, Jeanne C; Fay, Rose-Marie; Huynh, Carolyn P; Johnson, Cleverick D; Zhu, Liang; Quock, Ryan L
2017-05-01
The aim of this study was to examine the accuracy of dental faculty members' utilization of diagnostic codes and resulting treatment planning based on radiographic interproximal tooth radiolucencies. In 2015, 50 full-time and part-time general dentistry faculty members at one U.S. dental school were shown a sequence of 15 bitewing radiographs; one interproximal radiolucency was highlighted on each bitewing. For each radiographic lesion, participants were asked to choose the most appropriate diagnostic code (from a concise list of five codes, corresponding to lesion progression to outer/inner halves of enamel and outer/middle/pulpal thirds of dentin), acute treatment (attempt to arrest/remineralize non-invasively, operative intervention, or no treatment), and level of confidence in choices. Diagnostic and treatment choices of participants were compared to "gold standard" correct responses, as determined by expert radiology and operative faculty members, respectively. The majority of the participants selected the correct diagnostic code for lesions in the outer one-third of dentin (p<0.0001) and the pulpal one-third of dentin (p<0.0001). For lesions in the outer and inner halves of enamel and the middle one-third of dentin, the correct rates were moderate. However, the majority of the participants chose correct treatments on all types of lesions (correct rate 63.6-100%). Faculty members' confidence in their responses was generally high for all lesions, all above 90%. Diagnostic codes were appropriately assigned by participants for the very deepest lesions, but they were not assigned accurately for more incipient lesions (limited to enamel). Paradoxically, treatment choices were generally correct, regardless of diagnostic choices. Further calibration is needed to improve faculty use and teaching of diagnostic codes.
Polymorphism of BMP4 gene in Indian goat breeds differing in prolificacy.
Sharma, Rekha; Ahlawat, Sonika; Maitra, A; Roy, Manoranjan; Mandakmale, S; Tantia, M S
2013-12-10
Bone morphogenetic proteins (BMPs) are members of the TGF-β (transforming growth factor-beta) superfamily, of which BMP4 is the most important due to its crucial role in follicular growth and differentiation, cumulus expansion and ovulation. Reproduction is a crucial trait in goat breeding and based on the important role of BMP4 gene in reproduction it was considered as a possible candidate gene for the prolificacy of goats. The objective of the present study was to detect polymorphism in intronic, exonic and 3' un-translated regions of BMP4 gene in Indian goats. Nine different goat breeds (Barbari, Beetal, Black Bengal, Malabari, Jakhrana (Twinning>40%), Osmanabadi, Sangamneri (Twinning 20-30%), Sirohi and Ganjam (Twinning<10%)) differing in prolificacy and geographic distribution were employed for polymorphism scanning. Cattle sequence (AC_000167.1) was used to design primers for the amplification of a targeted region followed by direct DNA sequencing to identify the genetic variations. Single nucleotide polymorphisms (SNPs) were not detected in exon 3, the intronic region and the 3' flanking region. A SNP (G1534A) was identified in exon 2. It was a non-synonymous mutation resulting in an arginine to lysine change in a corresponding protein sequence. G to A transition at the 1534 locus revealed two genotypes GG and GA in the nine investigated goat breeds. The GG genotype was predominant with a genotype frequency of 0.98. The GA genotype was present in the Black Bengal as well as Jakhrana breed with a genotype frequency of 0.02. A microsatellite was identified in the 3' flanking region, only 20 nucleotides downstream from the termination site of the coding region, as a short sequence with more than nineteen continuous and repeated CA dinucleotides. Since the gene is highly evolutionarily conserved, identification of a non-synonymous SNP (G1534A) in the coding region gains further importance. 
To our knowledge, this is the first report of a mutation in the coding region of the caprine BMP4 gene. Whether reproductive traits in goats are associated with the BMP4 polymorphism needs to be established by association studies in more populations. © 2013 Elsevier B.V. All rights reserved.
Amundsen, M S; Kirkeby, T M G; Giri, S; Koju, R; Krishna, S S; Ystgaard, B; Solligård, E; Risnes, K
2016-12-01
Recent global burden of disease reports find that a major proportion of deaths and disability worldwide can be attributed to alcohol use. Thus, it may be surprising that very few studies have reported on the burden of alcohol-related disease in low income settings. A recent review of the evidence on non-communicable disease (NCD) burden in Nepal concluded that data are still lacking, particularly to describe the burden of alcohol-related diseases (ARDs). Therefore, here we report on NCD burden, and specifically ARDs, in hospitalized patients at a regional hospital in Nepal. We conducted a retrospective chart review that included detailed information on all patients discharged during a four-month period. A local database that included sociodemographic information and diagnoses at discharge was established. All doctor-assigned discharge diagnoses were retrospectively assigned ICD-10 codes. A total of 1,139 hospitalized adult patients were included in the study and one third of these had NCDs (n = 332). The main NCDs were chronic obstructive pulmonary disease (COPD) (n = 148, 45%) and ARDs (n = 57, 17%). Patients with ARD often presented with signs of liver cirrhosis and were typically younger men, with a median age of 43 years, from specific ethnic groups. These data demonstrate that severe alcohol-related organ failure in relatively young men contributed to a high proportion of NCDs in a regional hospital in Nepal. These findings are novel and alarming and warrant further studies that can establish the burden of ARDs and alcohol use in Nepal and other similar low-income countries. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Functional bottlenecks for generation of HIV-1 intersubtype Env recombinants.
Bagaya, Bernard S; Vega, José F; Tian, Meijuan; Nickel, Gabrielle C; Li, Yuejin; Krebs, Kendall C; Arts, Eric J; Gao, Yong
2015-05-23
Intersubtype recombination is a powerful driving force for HIV evolution, impacting both HIV-1 diversity within an infected individual and within the global epidemic. This study examines if viral protein function/fitness is the major constraint shaping selection of recombination hotspots in replication-competent HIV-1 progeny. A better understanding of the interplay between viral protein structure-function and recombination may provide insights into both vaccine design and drug development. In vitro HIV-1 dual infections were used to recombine subtypes A and D isolates and examine breakpoints in the Env glycoproteins. The entire env genes of 21 A/D recombinants with breakpoints in gp120 were non-functional when cloned into the laboratory strain, NL4-3. Likewise, cloning of A/D gp120 coding regions also produced dead viruses with non-functional Envs. 4/9 replication-competent viruses with functional Env's were obtained when just the V1-V5 regions of these same A/D recombinants (i.e. same A/D breakpoints as above) were cloned into NL4-3. These findings on functional A/D Env recombinants combined with structural models of Env suggest a conserved interplay between the C1 domain with C5 domain of gp120 and extracellular domain of gp41. Models also reveal a co-evolution within C1, C5, and ecto-gp41 domains which might explain the paucity of intersubtype recombination in the gp120 V1-V5 regions, despite their hypervariability. At least HIV-1 A/D intersubtype recombination in gp120 may result in a C1 from one subtype incompatible with a C5/gp41 from another subtype.
NASA Astrophysics Data System (ADS)
Brandt, Jørgen; Silver, Jeremy David; Heile Christensen, Jesper; Skou Andersen, Mikael; Geels, Camilla; Gross, Allan; Buus Hansen, Ayoe; Mantzius Hansen, Kaj; Brandt Hedegaard, Gitte; Ambelas Skjøth, Carsten
2010-05-01
Air pollution has significant negative impacts on human health and well-being, which entail substantial economic consequences. We have developed an integrated model system, EVA (External Valuation of Air pollution), to assess health-related economic externalities of air pollution resulting from specific emission sources/sectors. The EVA system was initially developed to assess externalities from power production, but in this study it is extended to evaluate costs at the national level. The EVA system integrates a regional-scale atmospheric chemistry transport model (DEHM), address-level population data, exposure-response functions and monetary values applicable for Danish/European conditions. Traditionally, systems that assess economic costs of health impacts from air pollution assume linear approximations in the source-receptor relationships. However, atmospheric chemistry is non-linear and therefore the uncertainty involved in the linear assumption can be large. The EVA system has been developed to take into account the non-linear processes by using a comprehensive, state-of-the-art chemical transport model when calculating how specific changes to emissions affect air pollution levels and the subsequent impacts on human health and cost. Furthermore, we present a new "tagging" method, developed to examine how specific emission sources influence air pollution levels without assuming linearity of the non-linear behaviour of atmospheric chemistry. This method is more precise than the traditional approach based on taking the difference between two concentration fields. Using the EVA system, we have estimated the total external costs from the main emission sectors in Denmark, representing the ten major SNAP codes. Finally, we assess the impacts and external costs of emissions from international ship traffic around Denmark, since there is a high volume of ship traffic in the region.
NPTFit: A Code Package for Non-Poissonian Template Fitting
NASA Astrophysics Data System (ADS)
Mishra-Sharma, Siddharth; Rodd, Nicholas L.; Safdi, Benjamin R.
2017-06-01
We present NPTFit, an open-source code package, written in Python and Cython, for performing non-Poissonian template fits (NPTFs). The NPTF is a recently developed statistical procedure for characterizing the contribution of unresolved point sources (PSs) to astrophysical data sets. The NPTF was first applied to Fermi gamma-ray data to provide evidence that the excess of ~GeV gamma-rays observed in the inner regions of the Milky Way likely arises from a population of sub-threshold point sources, and the NPTF has since found additional applications studying sub-threshold extragalactic sources at high Galactic latitudes. The NPTF generalizes traditional astrophysical template fits to allow for the ability to search for populations of unresolved PSs that may follow a given spatial distribution. NPTFit builds upon the framework of the fluctuation analyses developed in X-ray astronomy, thus it likely has applications beyond those demonstrated with gamma-ray data. The NPTFit package utilizes novel computational methods to perform the NPTF efficiently. The code is available at http://github.com/bsafdi/NPTFit and up-to-date and extensive documentation may be found at http://nptfit.readthedocs.io.
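As background for the "traditional astrophysical template fits" that the NPTF generalizes, the following is a minimal sketch of a purely Poissonian template likelihood. It does not use or reproduce the NPTFit API; the function name, the toy templates and the counts are all hypothetical and chosen only for illustration.

```python
import math

def poisson_template_loglike(counts, templates, norms):
    """Poisson log-likelihood of pixel counts for a sum of scaled templates.

    counts    : observed photon counts per pixel (non-negative integers)
    templates : list of spatial templates, one value per pixel
    norms     : one normalisation per template (the fit parameters)
    """
    total = 0.0
    for p, n in enumerate(counts):
        # Expected counts in pixel p: sum over templates of norm * template value.
        mu = sum(a * t[p] for a, t in zip(norms, templates))
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total

# Hypothetical four-pixel map: an isotropic template plus a centrally peaked one.
counts = [3, 5, 2, 8]
templates = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 0.0, 3.0]]
good = poisson_template_loglike(counts, templates, [2.5, 2.0])
poor = poisson_template_loglike(counts, templates, [10.0, 0.1])
```

Comparing `good` with `poor` shows how the likelihood rewards normalisations that track the data. In the non-Poissonian case described in the abstract, the per-pixel Poisson term would be replaced by a distribution that accounts for unresolved point sources, which this sketch does not attempt.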
Liu, Zhongliang; Hui, Yi; Shi, Lei; Chen, Zhenyu; Xu, Xiangjie; Chi, Liankai; Fan, Beibei; Fang, Yujiang; Liu, Yang; Ma, Lin; Wang, Yiran; Xiao, Lei; Zhang, Quanbin; Jin, Guohua; Liu, Ling; Zhang, Xiaoqing
2016-09-13
Loss-of-function studies in human pluripotent stem cells (hPSCs) require efficient methodologies for lesion of genes of interest. Here, we introduce a donor-free paired gRNA-guided CRISPR/Cas9 knockout strategy (paired-KO) for efficient and rapid gene ablation in hPSCs. Through paired-KO, we succeeded in targeting all genes of interest with high biallelic targeting efficiencies. More importantly, during paired-KO, the cleaved DNA was repaired mostly through direct end joining without insertions/deletions (precise ligation), and thus makes the lesion product predictable. The paired-KO remained highly efficient for one-step targeting of multiple genes and was also efficient for targeting of microRNA, while for long non-coding RNA over 8 kb, cleavage of a short fragment of the core promoter region was sufficient to eradicate downstream gene transcription. This work suggests that the paired-KO strategy is a simple and robust system for loss-of-function studies for both coding and non-coding genes in hPSCs. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
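The predictability of the precise-ligation product can be illustrated with a short sketch. It assumes both guides target the forward strand, that Cas9 makes a blunt cut 3 bp upstream of the PAM, and that the two ends rejoin without indels; the sequences and function names are hypothetical and are not taken from the study.

```python
def cut_index(seq, protospacer):
    """Index of the blunt Cas9 cut, 3 bp upstream of the PAM.

    Assumes the protospacer lies on the forward strand of seq and is
    immediately followed by its PAM.
    """
    start = seq.index(protospacer)
    return start + len(protospacer) - 3

def precise_ligation_product(seq, protospacer_a, protospacer_b):
    """Sequence expected when the fragment between two cuts is lost
    and the ends are joined without insertions or deletions."""
    left, right = sorted((cut_index(seq, protospacer_a),
                          cut_index(seq, protospacer_b)))
    return seq[:left] + seq[right:]

# Hypothetical locus: flank + guide A + PAM + spacer + guide B + PAM + flank.
guide_a = "GATTACAGATTACAGATTAC"
guide_b = "ACTGACTGACTGACTGACTT"
locus = "TTTT" + guide_a + "TGG" + "CCCCCCCCCC" + guide_b + "AGG" + "AAAA"
product = precise_ligation_product(locus, guide_a, guide_b)
```

Here `product` is 33 bp shorter than `locus`, the distance between the two cut sites. A guide on the reverse strand would need its cut position computed from the reverse complement, which is omitted here.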
Quantized phase coding and connected region labeling for absolute phase retrieval.
Chen, Xiangcheng; Wang, Yuwei; Wang, Yajun; Ma, Mengchao; Zeng, Chunnian
2016-12-12
This paper proposes an absolute phase retrieval method for complex object measurement based on quantized phase coding and connected region labeling. A specific code sequence is embedded into the quantized phase of three coded fringes. Connected regions of different codes are labeled and assigned 3-digit codes combining the code of the current period with those of its neighbors. Wrapped phase spanning more than 36 periods can be restored with reference to the code sequence. Experimental results verify the capability of the proposed method to measure multiple isolated objects.
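The lookup step can be sketched as follows, assuming that every window of three consecutive codes in the sequence is unique so that a 3-digit code identifies its fringe period. The code sequence below is invented for illustration and is far shorter than the 36 periods mentioned above; the function names are hypothetical.

```python
import math

def build_lookup(code_sequence):
    """Map each (previous, current, next) code triple to its period index."""
    lookup = {}
    for k in range(1, len(code_sequence) - 1):
        triple = tuple(code_sequence[k - 1:k + 2])
        if triple in lookup:
            raise ValueError(f"triple {triple} is not unique in the sequence")
        lookup[triple] = k
    return lookup

def absolute_phase(wrapped_phase, triple, lookup):
    """Unwrap by adding 2*pi times the period index found from the triple."""
    return wrapped_phase + 2 * math.pi * lookup[triple]

# Hypothetical code sequence with four quantization levels (0..3).
sequence = [0, 0, 1, 0, 2, 1, 1, 2, 0, 3]
lookup = build_lookup(sequence)
phi = absolute_phase(0.5, (0, 2, 1), lookup)
```

The first and last periods have no complete triple in this sketch and would need separate handling.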
Jaramillo-Correa, J P; Bousquet, J; Beaulieu, J; Isabel, N; Perron, M; Bouillé, M
2003-05-01
Primers previously developed to amplify specific non-coding regions of the mitochondrial genome in Angiosperms, and new primers for additional non-coding mtDNA regions, were tested for their ability to direct DNA amplification in 12 conifer taxa and to detect sequence-tagged-site (STS) polymorphisms within and among eight species in Picea. Out of 12 primer pairs, nine were successful at amplifying mtDNA in most of the taxa surveyed. In conifers, indels and substitutions were observed for several loci, allowing them to distinguish between families, genera and, in some cases, between species within genera. In Picea, interspecific polymorphism was detected for four loci, while intraspecific variation was observed for three of the mtDNA regions studied. One of these (SSU rRNA V1 region) exhibited indel polymorphisms, and the two others ( nad1 intron b/c and nad5 intron1) revealed restriction differences after digestion with Sau3AI (PCR-RFLP). A fourth locus, the nad4L- orf25 intergenic region, showed a multibanding pattern for most of the spruce species, suggesting a possible gene duplication. Maternal inheritance, expected for mtDNA in conifers, was observed for all polymorphic markers except the intergenic region nad4L- orf25. Pooling of the variation observed with the remaining three markers resulted in two to six different mtDNA haplotypes within the different species of Picea. Evidence for intra-genomic recombination was observed in at least two taxa. Thus, these mitotypes are likely to be more informative than single-locus haplotypes. They should be particularly useful for the study of biogeography and the dynamics of hybrid zones.
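To illustrate how a restriction difference shows up in PCR-RFLP, the sketch below digests a linear amplicon in silico at Sau3AI sites (recognition sequence GATC, cut immediately 5' of the G) and returns fragment lengths. The amplicon sequences are invented and the function name is hypothetical.

```python
def sau3ai_fragments(amplicon):
    """Fragment lengths after complete Sau3AI digestion of a linear amplicon."""
    site = "GATC"
    amplicon = amplicon.upper()
    cuts = []
    pos = amplicon.find(site)
    while pos != -1:
        cuts.append(pos)  # the enzyme cuts just before the G of GATC
        pos = amplicon.find(site, pos + 1)
    bounds = [0] + cuts + [len(amplicon)]
    # Drop zero-length pieces, which occur if a site sits at the very start.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Two hypothetical alleles differing by one substitution in the second site.
allele_1 = "AAAAGATCTTTTTTGATCCC"
allele_2 = "AAAAGATCTTTTTTGACCCC"
pattern_1 = sau3ai_fragments(allele_1)
pattern_2 = sau3ai_fragments(allele_2)
```

A single substitution that destroys one site merges two fragments into one, which is the kind of banding difference scored on a gel.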
Millwood, Iona Y; Bennett, Derrick A; Walters, Robin G; Clarke, Robert; Waterworth, Dawn; Johnson, Toby; Chen, Yiping; Yang, Ling; Guo, Yu; Bian, Zheng; Hacker, Alex; Yeo, Astrid; Parish, Sarah; Hill, Michael R; Chissoe, Stephanie; Peto, Richard; Cardon, Lon; Collins, Rory; Li, Liming; Chen, Zhengming
2016-01-01
Background: Lipoprotein-associated phospholipase A2 (Lp-PLA2) has been implicated in development of atherosclerosis; however, recent randomized trials of Lp-PLA2 inhibition reported no beneficial effects on vascular diseases. In East Asians, a loss-of-function variant in the PLA2G7 gene can be used to assess the effects of genetically determined lower Lp-PLA2. Methods: PLA2G7 V279F (rs76863441) was genotyped in 91 428 individuals randomly selected from the China Kadoorie Biobank of 0.5 M participants recruited in 2004–08 from 10 regions of China, with 7 years’ follow-up. Linear regression was used to assess effects of V279F on baseline traits. Logistic regression was conducted for a range of vascular and non-vascular diseases, including 41 ICD-10 coded disease categories. Results: PLA2G7 V279F frequency was 5% overall (range 3–7% by region), and 9691 (11%) participants had at least one loss-of-function variant. V279F was not associated with baseline blood pressure, adiposity, blood glucose or lung function. V279F was not associated with major vascular events [7141 events; odds ratio (OR) = 0.98 per F variant, 95% confidence interval (CI) 0.90-1.06] or other vascular outcomes, including major coronary events (922 events; 0.96, 0.79-1.18) and stroke (5967 events; 1.00, 0.92-1.09). Individuals with V279F had lower risks of diabetes (7031 events; 0.91, 0.84-0.98) and asthma (182 events; 0.53, 0.28-0.98), but there was no association after adjustment for multiple testing. Conclusions: Lifelong lower Lp-PLA2 activity was not associated with major risks of vascular or non-vascular diseases in Chinese adults. Using functional genetic variants in large-scale prospective studies with linkage to a range of health outcomes is a valuable approach to inform drug development and repositioning. PMID:27301456
Sällberg, M; Rudén, U; Wahren, B; Magnius, L O
1993-01-01
Antibody binding to antigenic regions of hepatitis C virus (HCV) envelope 1 (E1; residues 183-380), E2/non-structural (NS) 1 (residues 380-437), NS1 (residues 643-690), and NS4 (residues 1684-1751) proteins was assayed for 50 sera with antibodies to HCV (anti-HCV) and for 46 sera without anti-HCV. Thirty-four peptides, 18 residues long with an eight-amino acid overlap within each HCV region, were synthesized and tested with all 96 sera. Within the E region 183-380, the major binding site was mapped to residues 203-220, and was recognized by eight sera. Within the E2/NS1 region 380-437, the peptide covering residues 410-427 was recognized by two sera, and within the NS1 region 643-690, peptides covering residues 663-690 were recognized by four sera. Within the NS4 region 1684-1751, 27 sera were reactive to one or more of the NS4 peptides, and 21 out of these were reactive with peptide 1694-1711. One part of the major binding site could be mapped to residues 1701-1704, with the sequence Leu-Tyr-Arg-Glu. The IgG1, IgG3 and IgG4 subclasses were reactive with the five antigenic regions of HCV core, residues 1-18, 11-28, 21-38, 51-68 and 101-118. Reactivity to the major envelope site consisted almost exclusively of IgG3, and reactivity to the major site of NS4 consisted only of IgG1. Thus, a non-restricted IgG response to linear HCV-encoded binding sites was found to the core protein, whereas IgG subclass-restricted linear binding sites were found within the E1 protein, and within the NS4 protein. PMID:7680297
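The peptide layout described above (18-residue peptides overlapping by eight residues, so each start is offset by ten) can be reproduced with a short sketch. The function name and region labels are illustrative; only the residue boundaries come from the abstract.

```python
def overlapping_peptides(first, last, length=18, overlap=8):
    """Residue ranges of fixed-length peptides tiling first..last inclusive."""
    step = length - overlap
    windows = []
    start = first
    while start + length - 1 <= last:
        windows.append((start, start + length - 1))
        start += step
    return windows

# Region boundaries as stated in the abstract.
regions = {
    "E1": (183, 380),
    "E2/NS1": (380, 437),
    "NS1": (643, 690),
    "NS4": (1684, 1751),
}
panel = {name: overlapping_peptides(*bounds) for name, bounds in regions.items()}
total = sum(len(windows) for windows in panel.values())
```

Under these assumptions the tiling gives 19, 5, 4 and 6 peptides for the four regions, 34 in total, and it includes the peptides named in the text such as 203-220, 410-427 and 1694-1711.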
Xiao, Yong; Xia, Wei; Yang, Yaodong; Mason, Annaliese S.; Lei, Xintao; Ma, Zilong
2013-01-01
MicroRNAs (miRNAs) are important regulators of gene expression at the post-transcriptional level in a wide range of species. Highly conserved miRNAs regulate ancestral transcription factors common to all plants, and control important basic processes such as cell division and meristem function. We selected 21 conserved miRNA families to analyze the distribution and maintenance of miRNAs. Recently, the first genome sequence in Palmaceae was released: date palm (Phoenix dactylifera). We conducted a systematic miRNA analysis in date palm, computationally identifying and characterizing the distribution and duplication of conserved miRNAs in this species compared to other published plant genomes. A total of 81 miRNAs belonging to 18 miRNA families were identified in date palm. The majority of miRNAs in date palm and seven other well-studied plant species were located in intergenic regions and located 4 to 5 kb away from the nearest protein-coding genes. Sequence comparison showed that 67% of date palm miRNA members were present in duplicated segments, and that 135 pairs of miRNA-containing segments were duplicated in Arabidopsis, tomato, orange, rice, apple, poplar and soybean with a high similarity of non coding sequences between duplicated segments, indicating genomic duplication was a major force for expansion of conserved miRNAs. Duplicated miRNA pairs in date palm showed divergence in pre-miRNA sequence and in number of promoters, implying that these duplicated pairs may have undergone divergent evolution. Comparisons between date palm and the seven other plant species for the gain/loss of miR167 loci in an ancient segment shared between monocots and dicots suggested that these conserved miRNAs were highly influenced by and diverged as a result of genomic duplication events. PMID:23951162
del Val, Coral; Rivas, Elena; Torres-Quesada, Omar; Toro, Nicolás; Jiménez-Zurdo, José I
2007-01-01
Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related α-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5′-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of α-proteobacteria with their eukaryotic hosts. PMID:17971083
Tosetti, Valentina; Sassone, Jenny; Ferri, Anna L. M.; Taiana, Michela; Bedini, Gloria; Nava, Sara; Brenna, Greta; Di Resta, Chiara; Pareyson, Davide; Di Giulio, Anna Maria; Carelli, Stephana; Parati, Eugenio A.; Gorio, Alfredo
2017-01-01
The complex architecture of the adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at the transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR) is a transcription factor that is already expressed during early embryonic days. However, the role of AR in the regulation of gene expression at early embryonic stages is yet to be determined. Long non-coding RNA (lncRNA) Sox2 overlapping transcript (Sox2OT) plays a crucial role in gene expression control during development but its transcriptional regulation is still to be clearly defined. Here, using bicalutamide to pharmacologically inactivate AR, we investigated whether AR participates in the regulation of the transcription of the lncRNA Sox2OT at early embryonic stages. We identified a new DNA binding region upstream of the Sox2 locus containing three androgen response elements (ARE), and found that AR binds this sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT. PMID:28704421
Basu, Swaraj; Larsson, Erik
2018-05-31
Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
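One simple way to screen for negative associations between neighbouring coding and non-coding genes is to correlate their expression across tumors and keep strongly negative pairs. The sketch below uses a Pearson correlation and an arbitrary threshold purely as assumptions; the abstract does not state which statistic or cut-off the authors used, and the gene names and values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def negative_pairs(expression, pairs, threshold=-0.5):
    """Return (coding, noncoding, r) for pairs with r at or below threshold."""
    hits = []
    for coding, noncoding in pairs:
        r = pearson(expression[coding], expression[noncoding])
        if r <= threshold:
            hits.append((coding, noncoding, r))
    return hits

# Hypothetical expression values across five tumors.
expression = {
    "GENE_A": [1, 2, 3, 4, 5], "LNC_A": [5, 4, 3, 2, 1],
    "GENE_B": [1, 2, 3, 4, 5], "LNC_B": [1, 3, 2, 5, 4],
}
hits = negative_pairs(expression, [("GENE_A", "LNC_A"), ("GENE_B", "LNC_B")])
```

Only the first pair is retained here. A real analysis would also need to restrict pairs to genomic neighbours and correct for multiple testing, neither of which is shown.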
González, Rodrigo M; Ricardi, Martiniano M; Iusem, Norberto D
2011-05-20
Eukaryotic DNA methylation is one of the most studied epigenetic processes, as it results in a direct and heritable covalent modification triggered by external stimuli. In contrast to mammals, plant DNA methylation, which is stimulated by external cues exemplified by various types of abiotic stress, is often found not only at CG sites but also at CNG (N denoting A, C or T) and CNN (asymmetric) sites. A genome-wide analysis of DNA methylation in Arabidopsis has shown that CNN methylation is preferentially concentrated in transposon genes and non-coding repetitive elements. We are particularly interested in investigating the epigenetics of plant species with larger and more complex genomes than Arabidopsis, particularly with regard to the associated alterations elicited by abiotic stress. We describe the existence of CNN-methylated epialleles that span Asr1, a non-transposon, protein-coding gene from tomato plants that lacks an orthologous counterpart in Arabidopsis. In addition, to test the hypothesis of a link between epigenetic modifications and the adaptation of crop plants to abiotic stress, we exhaustively explored the cytosine methylation status in leaf Asr1 DNA, a model gene in our system, resulting from water-deficit stress conditions imposed on tomato plants. We found that drought conditions brought about removal of methyl marks at approximately 75 of the 110 asymmetric (CNN) sites analysed, concomitantly with a decrease of the repressive H3K27me3 epigenetic mark and a large induction of expression at the RNA level. When pinpointing those sites, we observed that demethylation occurred mostly in the intronic region. These results demonstrate a novel genomic distribution of CNN methylation, namely in the transcribed region of a protein-coding, non-repetitive gene, and the changes in those epigenetic marks that are caused by water stress.
These findings may represent a general mechanism for the acquisition of new epialleles in somatic cells, which are pivotal for regulating gene expression in plants.
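The three sequence contexts named in this abstract can be told apart with a short sketch that classifies each cytosine on one strand as CG, CNG or CNN, taking N as A, C or T as defined above. The test sequence and function name are invented for illustration.

```python
def cytosine_contexts(seq):
    """Map the position of each cytosine to 'CG', 'CNG' or 'CNN'.

    Only the given strand is scanned. Cytosines too close to the 3' end
    to be classified are skipped.
    """
    seq = seq.upper()
    contexts = {}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1:i + 3]
        if downstream[:1] == "G":
            contexts[i] = "CG"
        elif len(downstream) == 2:
            # The next base is A, C or T; the one after decides the class.
            contexts[i] = "CNG" if downstream[1] == "G" else "CNN"
    return contexts

# Hypothetical fragment with one cytosine in each context.
contexts = cytosine_contexts("CGACAGCTT")
```

With such a classification in hand, the reported loss of methyl marks at roughly 75 of 110 CNN sites corresponds to about 68% of the asymmetric sites analysed.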
Petersen, Michael B; Grigoriadou, Maria; Koutroumpe, Maria; Kokotas, Haris
2012-07-01
Non-syndromic hearing loss is one of the most common hereditary diseases in humans and is genetically heterogeneous. Mutations in the GJB2 gene, encoding connexin 26 (Cx26), are a major cause of non-syndromic recessive hearing impairment in many countries, and their prevalence depends largely on ethnic group. Due to the high frequency of the c.35delG GJB2 mutation in the Greek population, we have previously suggested that Greek patients with sensorineural, non-syndromic deafness should be tested for the c.35delG mutation and that the coding region of the GJB2 gene should be sequenced in c.35delG heterozygotes. Here we report on the clinical and molecular genetic evaluation of a family suffering from prelingual, sensorineural, non-syndromic deafness. A novel c.247_249delTTC (p.F83del) GJB2 mutation was detected in compound heterozygosity with the c.35delG GJB2 mutation in the proband and was later confirmed in the father, while the mother was homozygous for the c.35delG GJB2 mutation. We conclude that compound heterozygosity of the novel c.247_249delTTC (p.F83del) and the c.35delG mutations in the GJB2 gene was the cause of deafness in the proband and his father. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
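The correspondence between the cDNA and protein names of the novel variant is simple arithmetic, sketched below: coding positions 247 to 249 form codon 83, and removing exactly three bases keeps the reading frame. The function names are illustrative, and positions are assumed to be counted from the A of the start codon.

```python
def codon_number(cdna_pos):
    """Codon containing a 1-based coding-sequence position."""
    return (cdna_pos - 1) // 3 + 1

def describe_deletion(first, last):
    """Codons touched by a deletion and whether it preserves the frame."""
    codons = sorted({codon_number(p) for p in range(first, last + 1)})
    in_frame = (last - first + 1) % 3 == 0
    return codons, in_frame

novel = describe_deletion(247, 249)   # c.247_249delTTC
common = describe_deletion(35, 35)    # c.35delG
```

The first call gives codon 83 with the frame preserved, consistent with p.F83del. The second shows that a single-base deletion in codon 12 is not a multiple of three and therefore shifts the frame.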
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.
Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.
2015-01-01
In prokaryotes, simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) cause phase and antigenic variation. Although an increased abundance of heptameric repeats has been noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular, G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms, repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation in the number of repetitive units was observed, with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferentially located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and widespread quadruplex-forming repeat sequences in bacteria. PMID:26695179
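As an illustration of how tandem runs of the GGGNATC unit described above might be located and counted in a sequence, here is a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, only the given strand is scanned (a reverse-complement pass would be needed for the opposite strand), and a run is assumed to mean directly adjacent units with no spacer.

```python
import re
from collections import Counter

# One repeat unit: GGG, any nucleotide, then ATC (the GGGNATC consensus).
UNIT = r"GGG[ACGT]ATC"
UNIT_LENGTH = 7

# Two or more directly adjacent units; the minimum of two mirrors the
# reported range of two up to 26 units (an assumption about what counts as a run).
TANDEM = re.compile(rf"(?:{UNIT}){{2,}}")


def find_tandem_repeats(seq):
    """Return (start, unit_count) for each run of adjacent GGGNATC units.

    Hypothetical helper: scans only the strand that is passed in.
    """
    seq = seq.upper()
    return [(m.start(), len(m.group()) // UNIT_LENGTH)
            for m in TANDEM.finditer(seq)]


def unit_count_histogram(seq):
    """Count how many runs have each number of units."""
    return Counter(count for _, count in find_tandem_repeats(seq))


if __name__ == "__main__":
    # Toy sequence, not real genome data: one run of four units, one of two.
    toy = "TT" + "GGGAATC" * 4 + "ACGT" + "GGGTATCGGGCATC" + "AA"
    print(find_tandem_repeats(toy))
    print(unit_count_histogram(toy))
```

A histogram of this kind over a whole genome is where a preference for four units would show up as a peak.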
Li, Juan; Chen, Fen; Sugiyama, Hiromu; Blair, David; Lin, Rui-Qing; Zhu, Xing-Quan
2015-07-01
In the present study, near-complete mitochondrial (mt) genome sequences for Schistosoma japonicum from different regions in the Philippines and Japan were amplified and sequenced. Comparisons among S. japonicum from the Philippines, Japan, and China revealed a geographically based length difference in mt genomes, but the mt genomic organization and gene arrangement were the same. Sequence differences among samples from the Philippines and among all samples from the three endemic areas were 0.57-2.12% and 0.76-3.85%, respectively. The most variable part of the mt genome was the non-coding region. In the coding portion of the genome, protein-coding genes varied more than rRNA genes and tRNAs. The near-complete mt genome sequences for Philippine specimens were identical in length (14,091 bp), which was 4 bp longer than those of S. japonicum samples from Japan and China. This indel provides a unique genetic marker for S. japonicum samples from the Philippines. Phylogenetic analyses based on the concatenated amino acids of 12 protein-coding genes showed that samples of S. japonicum clustered according to their geographical origins. The identified mitochondrial indel marker will be useful for tracing the source of S. japonicum infection in humans and animals in Southeast Asia.
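The pairwise sequence differences quoted above are percentages; one common way to obtain such a figure is the proportion of mismatched sites between two aligned sequences. The abstract does not say how the differences were computed, so the sketch below is only an assumption: the function name is hypothetical, the inputs are assumed to be pre-aligned and of equal length, and alignment columns containing a gap in either sequence are skipped.

```python
def percent_difference(aligned_a, aligned_b):
    """Percentage of mismatched sites between two pre-aligned sequences.

    Hypothetical helper: columns where either sequence has a gap ('-')
    are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip indel columns
        compared += 1
        if x != y:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable (gap-free) columns")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Toy alignment, not real S. japonicum data.
    print(percent_difference("ACGTACGTAC", "ACGTACGTAA"))
```

Because gap columns are skipped here, a length difference such as the 4 bp indel would not contribute to the percentage; it would be scored separately as a presence/absence marker.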
Survival in commercially insured multiple sclerosis patients and comparator subjects in the U.S.
Kaufman, D W; Reshef, S; Golub, H L; Peucker, M; Corwin, M J; Goodin, D S; Knappertz, V; Pleimes, D; Cutter, G
2014-05-01
To compare survival in patients with multiple sclerosis (MS) from a U.S. commercial health insurance database with a matched cohort of non-MS subjects, 30,402 MS patients and 89,818 non-MS subjects (comparators) in the OptumInsight Research (OIR) database from 1996 to 2009 were included. An MS diagnosis required at least 3 consecutive months of database reporting, with two or more ICD-9 codes of 340 at least 30 days apart, or the combination of 1 ICD-9-340 code and at least 1 MS disease-modifying treatment (DMT) code. Comparators required the absence of ICD-9-340 and DMT codes throughout database reporting. Up to three comparators were matched to each patient for age in the year of the first relevant code (the index year; at least 3 months of reporting in that year were required), sex, and region of residence in the index year. Deaths were ascertained from the National Death Index and the Social Security Administration Death Master File. Subjects not identified as deceased were assumed to be alive through the end of 2009. Annual mortality rates were 899/100,000 among MS patients and 446/100,000 among comparators. Standardized mortality ratios compared to the U.S. population were 1.70 and 0.80, respectively. Kaplan-Meier analysis yielded a median survival from birth that was 6 years lower among MS patients than among comparators. The results show, for the first time in a U.S. population, a survival disadvantage for contemporary MS patients compared to non-MS subjects from the same healthcare system. The 6-year decrement in lifespan parallels a recent report from British Columbia. Copyright © 2013 Elsevier B.V. All rights reserved.
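The claims-based MS case definition above can be read as a small decision rule. The following Python sketch is one possible reading, not the study's actual code: the function and parameter names are hypothetical, the 3-month reporting requirement is assumed to apply to both branches of the rule, and "1 ICD-9-340 code" is assumed to mean at least one.

```python
from datetime import date


def meets_ms_definition(icd340_dates, dmt_dates, consecutive_months_reported):
    """Apply one reading of the claims-based MS case definition.

    icd340_dates: dates of claims carrying ICD-9 code 340
    dmt_dates: dates of claims for an MS disease-modifying treatment
    consecutive_months_reported: longest run of consecutive months of
        database reporting (assumed to be precomputed elsewhere)
    """
    # Assumption: the reporting requirement gates both branches.
    if consecutive_months_reported < 3:
        return False
    if not icd340_dates:
        return False

    # Branch 1: two or more ICD-9-340 codes at least 30 days apart.
    # If any such pair exists, the earliest and latest codes form one.
    if (max(icd340_dates) - min(icd340_dates)).days >= 30:
        return True

    # Branch 2: at least one ICD-9-340 code plus at least one DMT code.
    return len(dmt_dates) >= 1


if __name__ == "__main__":
    # Toy records, not data from the study.
    print(meets_ms_definition([date(2000, 1, 1), date(2000, 2, 15)], [], 6))
    print(meets_ms_definition([date(2000, 1, 1)], [date(2000, 3, 1)], 6))
    print(meets_ms_definition([date(2000, 1, 1), date(2000, 1, 10)], [], 6))
```

Under this reading, a comparator is simply a subject with no ICD-9-340 dates and no DMT dates at any point in database reporting.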