genome structure expression: Topics by Science.gov

Sample records for genome structure expression

Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

PubMed

Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

2015-02-10

Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus

PubMed Central

Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.

2001-01-01

Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts 1,200 and 3,500 nucleotides in length that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. Parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505
Refolding strategies from inclusion bodies in a structural genomics project.

PubMed

Trésaugues, Lionel; Collinet, Bruno; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Liger, Dominique; Zhou, Cong-Zhao; Janin, Joël; Van Tilbeurgh, Herman; Quevillon-Cheruel, Sophie

2004-01-01

The South-Paris Yeast Structural Genomics Project aims at systematically expressing, purifying and determining the structure of S. cerevisiae proteins with no detectable homology to proteins of known structure. We brought 250 yeast ORFs to expression in E. coli, but 37% of them form inclusion bodies. This important fraction of proteins that are well expressed but lost for structural studies prompted us to test methodologies to recover these proteins. Three different strategies were explored in parallel on a set of 20 proteins: (1) refolding from solubilized inclusion bodies using an original and fast 96-well plates screening test, (2) co-expression of the targets in E. coli with DnaK-DnaJ-GrpE and GroEL-GroES chaperones, and (3) use of the cell-free expression system. Most of the tested proteins (17/20) could be resolubilized at least by one approach, but the subsequent purification proved to be difficult for most of them.
Genome Structures and Transcriptomes Signify Niche Adaptation for the Multiple-Ion-Tolerant Extremophyte Schrenkiella parvula

PubMed Central

Oh, Dong-Ha; Hong, Hyewon; Lee, Sang Yeol; Yun, Dae-Jin; Bohnert, Hans J.; Dassanayake, Maheshi

2014-01-01

Schrenkiella parvula (formerly Thellungiella parvula), a close relative of Arabidopsis (Arabidopsis thaliana) and Brassica crop species, thrives on the shores of Lake Tuz, Turkey, where soils accumulate high concentrations of multiple-ion salts. Despite the stark differences in adaptations to extreme salt stresses, the genomes of S. parvula and Arabidopsis show extensive synteny. S. parvula completes its life cycle in the presence of Na+, K+, Mg2+, Li+, and borate at soil concentrations lethal to Arabidopsis. Genome structural variations, including tandem duplications and translocations of genes, interrupt the colinearity observed throughout the S. parvula and Arabidopsis genomes. Structural variations distinguish homologous gene pairs characterized by divergent promoter sequences and basal-level expression strengths. Comparative RNA sequencing reveals the enrichment of ion-transport functions among genes with higher expression in S. parvula, while pathogen defense-related genes show higher expression in Arabidopsis. Key stress-related ion transporter genes in S. parvula showed increased copy number, higher transcript dosage, and evidence for subfunctionalization. This extremophyte offers a framework to identify the requisite adjustments of genomic architecture and expression control for a set of genes found in most plants in a way to support distinct niche adaptation and lifestyles. PMID:24563282
Genome-wide characterization of the Pectate Lyase-like (PLL) genes in Brassica rapa.

PubMed

Jiang, Jingjing; Yao, Lina; Miao, Ying; Cao, Jiashu

2013-11-01

Pectate lyases (PL) depolymerize demethylated pectin (pectate, EC 4.2.2.2) by catalyzing the eliminative cleavage of α-1,4-glycosidic linked galacturonan. Pectate Lyase-like (PLL) genes are one of the largest and most complex families in plants. However, studies on the phylogeny, gene structure, and expression of PLL genes are limited. To understand the potential functions of PLL genes in plants, we characterized their intron-exon structure, phylogenetic relationships, and protein structures, and measured their expression patterns in various tissues, specifically the reproductive tissues in Brassica rapa. Sequence alignments revealed two characteristic motifs in PLL genes. The chromosome location analysis indicated that 18 of the 46 PLL genes were located in the least fractionated sub-genome (LF) of B. rapa, while 16 were located in the medium fractionated sub-genome (MF1) and 12 in the more fractionated sub-genome (MF2). Quantitative RT-PCR analysis showed that BrPLL genes were expressed in various tissues, with most of them being expressed in flowers. Detailed qRT-PCR analysis identified 11 pollen specific PLL genes and several other genes with unique spatial expression patterns. In addition, some duplicated genes showed similar expression patterns. The phylogenetic analysis identified three PLL gene subfamilies in plants, among which subfamily II might have evolved from gene neofunctionalization or subfunctionalization. Therefore, this study opens the possibility for exploring the roles of PLL genes during plant development.
Cloning, production, and purification of proteins for a medium-scale structural genomics project.

PubMed

Quevillon-Cheruel, Sophie; Collinet, Bruno; Trésaugues, Lionel; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Zhou, Cong-Zhao; Liger, Dominique; Bettache, Nabila; Poupon, Anne; Aboulfath, Ilham; Leulliot, Nicolas; Janin, Joël; van Tilbeurgh, Herman

2007-01-01

The South-Paris Yeast Structural Genomics Pilot Project (http://www.genomics.eu.org) aims at systematically expressing, purifying, and determining the three-dimensional structures of Saccharomyces cerevisiae proteins. We have already cloned 240 yeast open reading frames in the Escherichia coli pET system. Eighty-two percent of the targets can be expressed in E. coli, and 61% yield soluble protein. We have currently purified 58 proteins. Twelve X-ray structures have been solved, six are in progress, and six other proteins gave crystals. In this chapter, we present the general experimental flowchart applied for this project. One of the main difficulties encountered in this pilot project was the low solubility of a great number of target proteins. We have developed parallel strategies to recover these proteins from inclusion bodies, including refolding, coexpression with chaperones, and an in vitro expression system. A limited proteolysis protocol, developed to localize flexible regions in proteins that could hinder crystallization, is also described.
National Academy of Sciences and Academy of Sciences of the USSR workshop on structure of the eucaryotic genome and regulation of its expression. Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1990-12-31

This report provides a brief overview of the Workshop on Structure of the Eukaryotic Genome and Regulation of its Expression held in Tbilisi, Georgia, USSR. The report describes the presentations made at the meeting but also goes on to describe the state of molecular biology and genetics research in the Soviet Union and makes recommendations on how to improve future such meetings.
Aquatic Plant Genomics: Advances, Applications, and Prospects

PubMed Central

Li, Gaojie; Yang, Jingjing

2017-01-01

Genomics is a discipline in genetics that studies the genome composition of organisms and the precise structure of genes and their expression and regulation. Genomics research has resolved many problems where other biological methods have failed. Here, we summarize advances in aquatic plant genomics with a focus on molecular markers, the genes related to photosynthesis and stress tolerance, comparative study of genomes and genome/transcriptome sequencing technology. PMID:28900619
Rose spring dwarf-associated virus has RNA structural and gene-expression features like those of Barley yellow dwarf virus

PubMed Central

Salem, Nida’ M.; Miller, W. Allen; Rowhani, Adib; Golino, Deborah A.; Moyne, Anne-Laure; Falk, Bryce W.

2015-01-01

We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus clones generated by 5′- and 3′-RACE, showed the RSDaV genomic RNA to be 5,808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3′-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5′ ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5′ end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3′ cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae. PMID:18329064
Rose spring dwarf-associated virus has RNA structural and gene-expression features like those of Barley yellow dwarf virus.

PubMed

Salem, Nida' M; Miller, W Allen; Rowhani, Adib; Golino, Deborah A; Moyne, Anne-Laure; Falk, Bryce W

2008-06-05

We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus clones generated by 5'- and 3'-RACE, showed the RSDaV genomic RNA to be 5808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3'-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5' ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5' end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3' cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae.
Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System.

PubMed

Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

2017-01-01

Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins, etc.) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.
Complete mitochondrial genome of Concholepas concholepas inferred by 454 pyrosequencing and mtDNA expression in two mollusc populations.

PubMed

Núñez-Acuña, Gustavo; Aguilar-Espinoza, Andrea; Gallardo-Escárate, Cristian

2013-03-01

Despite the great relevance of mitochondrial genome analysis in evolutionary studies, there is scarce information on how the transcripts associated with the mitogenome are expressed and their role in the genetic structuring of populations. This work reports the complete mitochondrial genome of the marine gastropod Concholepas concholepas, obtained by 454 pyrosequencing, and an analysis of mitochondrial transcripts of two populations 1000 km apart along the Chilean coast. The mitochondrion of C. concholepas is 15,495 base pairs (bp) in size and contains the 37 subunits characteristic of metazoans, as well as a non-coding region of 330 bp. In silico analysis of mitochondrial gene variability showed significant differences among populations. In terms of levels of relative abundance of transcripts associated with mitochondrion in the two populations (assessed by qPCR), the genes associated with complexes III and IV of the mitochondrial genome had the highest levels of expression in the northern population while transcripts associated with the ATP synthase complex had the highest levels of expression in the southern population. Moreover, fifteen polymorphic SNPs were identified in silico between the mitogenomes of the two populations. Four of these markers implied different amino acid substitutions (non-synonymous SNPs). This work contributes novel information regarding the mitochondrial genome structure and mRNA expression levels of C. concholepas.
Genomic Hypomethylation in the Human Germline Associates with Selective Structural Mutability in the Human Genome

PubMed Central

Li, Jian; Harris, R. Alan; Cheung, Sau Wai; Coarfa, Cristian; Jeong, Mira; Goodell, Margaret A.; White, Lisa D.; Patel, Ankita; Kang, Sung-Hae; Shaw, Chad; Chinault, A. Craig; Gambin, Tomasz; Gambin, Anna; Lupski, James R.; Milosavljevic, Aleksandar

2012-01-01

The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR–mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease. PMID:22615578
Genome-wide identification and characterization of Glyceraldehyde-3-phosphate dehydrogenase genes family in wheat (Triticum aestivum).

PubMed

Zeng, Lingfeng; Deng, Rong; Guo, Ziping; Yang, Shushen; Deng, Xiping

2016-03-16

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is a central enzyme in glycolysis. We performed genome-wide identification of GAPDH genes in wheat and analyzed their structural characteristics and expression patterns under abiotic stress. A total of 22 GAPDH genes were identified in wheat cv. Chinese Spring; the phylogenetic and structure analysis showed that these GAPDH genes could be divided into four distinct subfamilies. The expression profiles of GAPDH genes showed tissue specificity across plant developmental stages. The qRT-PCR results revealed that wheat GAPDHs were involved in several abiotic stress responses. Wheat carried 22 GAPDH genes, representing four types of plant GAPDHs (gapA/B, gapC, gapCp and gapN). Whole genome duplication and segmental duplication might account for the expansion of wheat GAPDHs. Expression analysis implied that GAPDHs play roles in plant abiotic stress tolerance.
GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle.

PubMed

Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael

2010-01-01

GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3'-UTR GeneChips), genome-wide protein-DNA binding assays and protein-protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/
GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle

PubMed Central

Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael

2010-01-01

GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3′-UTR GeneChips), genome-wide protein–DNA binding assays and protein–protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/ PMID:21149299
Applications of the 1000 Genomes Project resources

PubMed Central

Zheng-Bradley, Xiangqun

2017-01-01

The 1000 Genomes Project created a valuable, worldwide reference for human genetic variation. Common uses of the 1000 Genomes dataset include genotype imputation supporting Genome-wide Association Studies, mapping expression Quantitative Trait Loci, filtering non-pathogenic variants from exome, whole genome and cancer genome sequencing projects, and genetic analysis of population structure and molecular evolution. In this article, we will highlight some of the multiple ways that the 1000 Genomes data can be and has been utilized for genetic studies. PMID:27436001
A genome-wide 20 K citrus microarray for gene expression analysis

PubMed Central

Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

2008-01-01

Background: Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results: We have designed and constructed a publicly available genome-wide cDNA microarray that includes 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, like general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion: This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers. Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global studies in citrus by using it to catalogue genes expressed in citrus globular embryos. PMID:18598343
Genomic Imprinting in Mammals

PubMed Central

Barlow, Denise P.

2014-01-01

Genomic imprinting affects a subset of genes in mammals and results in a monoallelic, parental-specific expression pattern. Most of these genes are located in clusters that are regulated through the use of insulators or long noncoding RNAs (lncRNAs). To distinguish the parental alleles, imprinted genes are epigenetically marked in gametes at imprinting control elements through the use of DNA methylation at the very least. Imprinted gene expression is subsequently conferred through lncRNAs, histone modifications, insulators, and higher-order chromatin structure. Such imprints are maintained after fertilization through these mechanisms despite extensive reprogramming of the mammalian genome. Genomic imprinting is an excellent model for understanding mammalian epigenetic regulation. PMID:24492710

Structural RNAs of known and unknown function identified in malaria parasites by comparative genomics and RNA analysis

PubMed Central

Chakrabarti, Kausik; Pearson, Michael; Grate, Leslie; Sterne-Weiler, Timothy; Deans, Jonathan; Donohue, John Paul; Ares, Manuel

2007-01-01

As the genomes of more eukaryotic pathogens are sequenced, understanding how molecular differences between parasite and host might be exploited to provide new therapies has become a major focus. Central to cell function are RNA-containing complexes involved in gene expression, such as the ribosome, the spliceosome, snoRNAs, RNase P, and telomerase, among others. In this article we identify by comparative genomics and validate by RNA analysis numerous previously unknown structural RNAs encoded by the Plasmodium falciparum genome, including the telomerase RNA, U3, 31 snoRNAs, as well as previously predicted spliceosomal snRNAs, SRP RNA, MRP RNA, and RNAse P RNA. Furthermore, we identify six new RNA coding genes of unknown function. To investigate the relationships of the RNA coding genes to other genomic features in related parasites, we developed a genome browser for P. falciparum (http://areslab.ucsc.edu/cgi-bin/hgGateway). Additional experiments provide evidence supporting the prediction that snoRNAs guide methylation of a specific position on U4 snRNA, as well as predicting an snRNA promoter element particular to Plasmodium sp. These findings should allow detailed structural comparisons between the RNA components of the gene expression machinery of the parasite and its vertebrate hosts. PMID:17901154
Characterization of embryo-specific genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1989-01-01

The objective of the proposed research is to characterize the structure and function of a set of genes whose expression is regulated in embryo development, and that are not expressed in mature tissues -- the embryonic genes. In the last two years, using cDNA clones, we have isolated 22 cDNA clones, and characterized the expression pattern of their corresponding RNA. At least 4 cDNA clones detect RNAs of embryonic genes. These cDNA clones detect RNAs expressed in somatic as well as zygotic embryos of carrot. Using the cDNA clones, we screened the genomic library of carrot embryo DNA, and isolated genomic clones for three genes. The structure and function of two genes DC 8 and DC 59 have been characterized and are reported in this paper.
Stable zymomonas mobilis xylose and arabinose fermenting strains

DOEpatents

Zhang, Min [Lakewood, CO]; Chou, Yat-Chen [Taipei, TW]

2008-04-08

The present invention briefly includes a transposon for stable insertion of foreign genes into a bacterial genome, comprising at least one operon having structural genes encoding enzymes selected from the group consisting of xylAxylB, araBAD and tal/tkt, and at least one promoter for expression of the structural genes in the bacterium, a pair of inverted insertion sequences, the operons contained inside the insertion sequences, and a transposase gene located outside of the insertion sequences. A plasmid shuttle vector for transformation of foreign genes into a bacterial genome, comprising at least one operon having structural genes encoding enzymes selected from the group consisting of xylAxylB, araBAD and tal/tkt, at least one promoter for expression of the structural genes in the bacterium, and at least two DNA fragments having homology with a gene in the bacterial genome to be transformed, is also provided. The transposon and shuttle vectors are useful in constructing significantly different Zymomonas mobilis strains, according to the present invention, which are useful in the conversion of the cellulose derived pentose sugars into fuels and chemicals, using traditional fermentation technology, because they are stable for expression in a non-selection medium.
Single cell Hi-C reveals cell-to-cell variability in chromosome structure

PubMed Central

Schoenfelder, Stefan; Yaffe, Eitan; Dean, Wendy; Laue, Ernest D.; Tanay, Amos; Fraser, Peter

2013-01-01

Large-scale chromosome structure and spatial nuclear arrangement have been linked to control of gene expression and DNA replication and repair. Genomic techniques based on chromosome conformation capture assess contacts for millions of loci simultaneously, but do so by averaging chromosome conformations from millions of nuclei. Here we introduce single cell Hi-C, combined with genome-wide statistical analysis and structural modeling of single copy X chromosomes, to show that individual chromosomes maintain domain organisation at the megabase scale, but show variable cell-to-cell chromosome territory structures at larger scales. Despite this structural stochasticity, localisation of active gene domains to boundaries of territories is a hallmark of chromosomal conformation. Single cell Hi-C data bridge current gaps between genomics and microscopy studies of chromosomes, demonstrating how modular organisation underlies dynamic chromosome structure, and how this structure is probabilistically linked with genome activity patterns. PMID:24067610
Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System

PubMed Central

Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

2017-01-01

Comparative genomics have facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc.) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252
Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Larsen, P. E.; Trivedi, G.; Sreedasyam, A.

2010-07-06

Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free-living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
Approximate matching of structured motifs in DNA sequences.

PubMed

El-Mabrouk, Nadia; Raffinot, Mathieu; Duchesne, Jean-Eudes; Lajoie, Mathieu; Luc, Nicolas

2005-04-01

Several methods have been developed for identifying more or less complex RNA structures in a genome. All these methods are based on the search for conserved primary and secondary sub-structures. In this paper, we present a simple formal representation of a helix, which is a combination of sequence and folding constraints, as a constrained regular expression. This representation allows us to develop a well-founded algorithm that searches for all approximate matches of a helix in a genome. The algorithm is based on an alignment graph constructed from several copies of a pushdown automaton, arranged one on top of another. This is a first attempt to take advantage of the possibilities of pushdown automata in the context of approximate matching. The worst-case time complexity is O(krpn), where k is the error threshold, n the size of the genome, p the size of the secondary expression, and r its number of union symbols. We then extend the algorithm to search for pseudo-knots and secondary structures containing an arbitrary number of helices.
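To make the idea of approximate helix matching concrete, the following is a minimal Python sketch. It is not the pushdown-automaton algorithm described in the abstract: it is a brute-force scan that only counts base-pair mismatches (no insertions or deletions), and the function name, parameters, and DNA-only complement table are assumptions introduced here for illustration.

```python
# Watson-Crick complements for DNA (assumption: no G-T wobble pairs).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_helices(seq, stem_len, min_loop, max_loop, max_mismatches):
    """Return (start, loop_len, mismatches) for each candidate hairpin.

    Hypothetical brute-force illustration: a helix is a 5' arm of
    stem_len bases, a loop of min_loop..max_loop bases, and a 3' arm
    that pairs with the 5' arm in reverse order.
    """
    hits = []
    n = len(seq)
    for start in range(n - 2 * stem_len - min_loop + 1):
        left = seq[start:start + stem_len]
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            if right_start + stem_len > n:
                break  # longer loops would run past the end of seq
            right = seq[right_start:right_start + stem_len]
            # Count positions where the arms fail to base-pair.
            mismatches = sum(
                1 for a, b in zip(left, reversed(right))
                if COMPLEMENT.get(a) != b
            )
            if mismatches <= max_mismatches:
                hits.append((start, loop, mismatches))
    return hits
```

For example, `find_helices("GGGCAAAAGCCC", 4, 4, 4, 0)` reports a perfect four-base stem around a four-base loop. The scan costs time proportional to the sequence length times the number of loop lengths times the stem length, which is why a dedicated algorithm such as the one in the abstract is attractive for genome-scale searches.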
Cell-free translational screening of an expression sequence tag library of Clonorchis sinensis for novel antigen discovery.

PubMed

Kasi, Devi; Catherine, Christy; Lee, Seung-Won; Lee, Kyung-Ho; Kim, Yu Jung; Ro Lee, Myeong; Ju, Jung Won; Kim, Dong-Myung

2017-05-01

The rapidly evolving cloning and sequencing technologies have enabled understanding of genomic structure of parasite genomes, opening up new ways of combatting parasite-related diseases. To make the most of the exponentially accumulating genomic data, however, it is crucial to analyze the proteins encoded by these genomic sequences. In this study, we adopted an engineered cell-free protein synthesis system for large-scale expression screening of an expression sequence tag (EST) library of Clonorchis sinensis to identify potential antigens that can be used for diagnosis and treatment of clonorchiasis. To allow high-throughput expression and identification of individual genes comprising the library, a cell-free synthesis reaction was designed such that both the template DNA and the expressed proteins were co-immobilized on the same microbeads, leading to microbead-based linkage of the genotype and phenotype. This reaction configuration allowed streamlined expression, recovery, and analysis of proteins. This approach enabled us to identify 21 antigenic proteins. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:832-837, 2017.
Mms1 is an assistant for regulating G-quadruplex DNA structures.

PubMed

Schwindt, Eike; Paeschke, Katrin

2018-06-01

The preservation of genome stability is fundamental for every cell. Genomic integrity is constantly challenged, and non-canonical nucleic acid structures are among those challenges. In recent years, scientists became aware of the impact of G-quadruplex (G4) structures on genome stability. It has been shown that folded G4-DNA structures cause changes in the cell, such as transcriptional up/down-regulation, replication stalling, or enhanced genome instability. Multiple helicases have been identified that regulate G4 structures and thereby preserve genome stability. Interestingly, although these helicases are mostly ubiquitously expressed, they show specificity for G4 regulation in certain cellular processes (e.g., DNA replication). To date, it is not clear how this process and target specificity of helicases are achieved. Recently, Mms1, a ubiquitin ligase complex protein, was identified as a novel G4-DNA-binding protein that supports genome stability by aiding Pif1 helicase binding to these regions. In this perspective review, we discuss whether G4-DNA interacting proteins are fundamental for helicase function and specificity at G4-DNA structures.
The Genomic and Transcriptomic Landscape of a HeLa Cell Line

PubMed Central

Landry, Jonathan J. M.; Pyl, Paul Theodor; Rausch, Tobias; Zichner, Thomas; Tekkedil, Manu M.; Stütz, Adrian M.; Jauch, Anna; Aiyar, Raeka S.; Pau, Gregoire; Delhomme, Nicolas; Gagneur, Julien; Korbel, Jan O.; Huber, Wolfgang; Steinmetz, Lars M.

2013-01-01

HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology. PMID:23550136
Applications of the 1000 Genomes Project resources.

PubMed

Zheng-Bradley, Xiangqun; Flicek, Paul

2017-05-01

The 1000 Genomes Project created a valuable, worldwide reference for human genetic variation. Common uses of the 1000 Genomes dataset include genotype imputation supporting Genome-wide Association Studies, mapping expression Quantitative Trait Loci, filtering non-pathogenic variants from exome, whole genome and cancer genome sequencing projects, and genetic analysis of population structure and molecular evolution. In this article, we will highlight some of the multiple ways that the 1000 Genomes data can be and has been utilized for genetic studies. © The Author 2016. Published by Oxford University Press.
Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

PubMed

Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

2015-12-11

High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.
Complete Mitochondrial Genome of the Medicinal Mushroom Ganoderma lucidum

PubMed Central

Chen, Haimei; Chen, Xiangdong; Lan, Jin; Liu, Chang

2013-01-01

Ganoderma lucidum is one of the well-known medicinal basidiomycetes worldwide. The mitochondrion, referred to as the second genome, is an organelle found in most eukaryotic cells and participates in critical cellular functions. Elucidating the structure and function of this genome is important to understand completely the genetic contents of G. lucidum. In this study, we assembled the mitochondrial genome of G. lucidum and analyzed the differential expression of its encoded genes across three developmental stages. The mitochondrial genome is a typical circular DNA molecule of 60,630 bp with a GC content of 26.67%. Genome annotation identified genes that encode 15 conserved proteins, 27 tRNAs, small and large rRNAs, four homing endonucleases, and two hypothetical proteins. Except for genes encoding trnW and two hypothetical proteins, all genes were located on the positive strand. For the repeat structure analysis, eight forward, two inverted, and three tandem repeats were detected. A pair of fragments with a total length around 5.5 kb was found in both the nuclear and mitochondrial genomes, which suggests the possible transfer of DNA sequences between two genomes. RNA-Seq data for samples derived from three stages, namely, mycelia, primordia, and fruiting bodies, were mapped to the mitochondrial genome and qualified. The protein-coding genes were more highly expressed in the mycelial or primordial stages than in the fruiting bodies. The rRNA abundances were significantly higher in all three stages. Two regions were transcribed but did not contain any identified protein or tRNA genes. Furthermore, three RNA-editing sites were detected. Genome synteny analysis showed that significant genome rearrangements occurred in the mitochondrial genomes. This study provides valuable information on the gene contents of the mitochondrial genome and their differential expression at various developmental stages of G. lucidum. The results contribute to the understanding of the functions and evolution of fungal mitochondrial DNA. PMID:23991034
The retrovirus HTLV-1 inserts an ectopic CTCF-binding site into the human genome.

PubMed

Satou, Yorifumi; Miyazato, Paola; Ishihara, Ko; Yaguchi, Hiroko; Melamed, Anat; Miura, Michi; Fukuda, Asami; Nosaka, Kisato; Watanabe, Takehisa; Rowan, Aileen G; Nakao, Mitsuyoshi; Bangham, Charles R M

2016-03-15

Human T-lymphotropic virus type 1 (HTLV-1) is a retrovirus that causes malignant and inflammatory diseases in ∼10% of infected people. A typical host has between 10^4 and 10^5 clones of HTLV-1-infected T lymphocytes, each clone distinguished by the genomic integration site of the single-copy HTLV-1 provirus. The HTLV-1 bZIP (HBZ) factor gene is constitutively expressed from the minus strand of the provirus, whereas plus-strand expression, required for viral propagation to uninfected cells, is suppressed or intermittent in vivo, allowing escape from host immune surveillance. It remains unknown what regulates this pattern of proviral transcription and latency. Here, we show that CTCF, a key regulator of chromatin structure and function, binds to the provirus at a sharp border in epigenetic modifications in the pX region of the HTLV-1 provirus in T cells naturally infected with HTLV-1. CTCF is a zinc-finger protein that binds to an insulator region in genomic DNA and plays a fundamental role in controlling higher order chromatin structure and gene expression in vertebrate cells. We show that CTCF bound to HTLV-1 acts as an enhancer blocker, regulates HTLV-1 mRNA splicing, and forms long-distance interactions with flanking host chromatin. CTCF-binding sites (CTCF-BSs) have been propagated throughout the genome by transposons in certain primate lineages, but CTCF binding has not previously been described in present-day exogenous retroviruses. The presence of an ectopic CTCF-BS introduced by the retrovirus in tens of thousands of genomic locations has the potential to cause widespread abnormalities in host cell chromatin structure and gene expression.
Molecular Pathways: Extracting Medical Knowledge from High Throughput Genomic Data

PubMed Central

Goldstein, Theodore; Paull, Evan O.; Ellis, Matthew J.; Stuart, Joshua M.

2013-01-01

High-throughput genomic data that measure RNA expression, DNA copy number, mutation status and protein levels provide us with insights into the molecular pathway structure of cancer. Genomic lesions (amplifications, deletions, mutations) and epigenetic modifications disrupt biochemical cellular pathways. While the number of possible lesions is vast, different genomic alterations may result in concordant expression and pathway activities, producing common tumor subtypes that share similar phenotypic outcomes. How can these data be translated into medical knowledge that provides prognostic and predictive information? First generation mRNA expression signatures such as Genomic Health's Oncotype DX already provide prognostic information, but do not provide therapeutic guidance beyond the current standard of care, which is often inadequate in high-risk patients. Rather than building molecular signatures based on gene expression levels, evidence is growing that signatures based on higher-level quantities such as from genetic pathways may provide important prognostic and diagnostic cues. We provide examples of how activities for molecular entities can be predicted from pathway analysis and how the composite of all such activities, referred to here as the “activitome,” helps connect genomic events to clinical factors in order to predict the drivers of poor outcome. PMID:23430023
Informational laws of genome structures

PubMed Central

Bonnici, Vincenzo; Manca, Vincenzo

2016-01-01

In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155
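As an illustration of the kind of quantity involved, the sketch below computes the Shannon entropy of the k-mer distribution of a sequence, with k chosen from the sequence length. It assumes that lg2(n) denotes the base-2 logarithm and rounds it to the nearest integer; both choices, and the function name, are assumptions made here. The sketch does not reproduce the specific informational indexes or laws of the paper.

```python
import math
from collections import Counter


def kmer_entropy(genome, k=None):
    """Return (k, entropy in bits) of the empirical k-mer distribution.

    If k is not given it is set to round(log2(n)), an assumed reading
    of "k = lg2(n)" from the abstract.
    """
    n = len(genome)
    if n == 0:
        raise ValueError("genome must not be empty")
    if k is None:
        k = max(1, round(math.log2(n)))
    # Count every overlapping window of length k.
    counts = Counter(genome[i:i + k] for i in range(n - k + 1))
    total = sum(counts.values())
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
    return k, entropy
```

Because a sequence of length n contains n - k + 1 overlapping k-mers, the empirical entropy can never exceed log2(n - k + 1) bits. Comparing the observed value with that of a random sequence of the same length is the kind of contrast the abstract alludes to.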
Informational laws of genome structures

NASA Astrophysics Data System (ADS)

Bonnici, Vincenzo; Manca, Vincenzo

2016-06-01

In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
NCBI GEO: archive for high-throughput functional genomic data.

PubMed

Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Edgar, Ron

2009-01-01

The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as 'Minimum Information About a Microarray Experiment' (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

PubMed

Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

2015-04-23

With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.
Structural Studies on Varicella Zoster Virus

DTIC Science & Technology

1990-08-20

Table of contents (excerpt): Introduction; The VZV Genome; VZV Proteins; VZV Transcription; The Structure of HSV; Herpesvirus Expression ... VZV virion ... Figure 2. A drawing of the VZV nucleocapsid. Figure 3. A comparison of the structure of the genomes of HSV-1 and VZV. Figure 4 ... VZV, and purified VZ virions probed with antibody against VZV IE62 (the HSV-1 ICP4 equivalent) ... Figure 48. Autoradiograph of VZV IE62

Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

Cancer.gov

Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.
Genome evolution and speciation genetics of clawed frogs (Xenopus and Silurana).

PubMed

Evans, Ben J

2008-05-01

Speciation of clawed frogs occurred through bifurcation and reticulation of evolutionary lineages, and resulted in extant species with different ploidy levels. Duplicate gene evolution and expression in these animals provides a unique perspective into the earliest genomic transformations after vertebrate whole genome duplication (WGD) and suggests that functional constraints are relaxed compared to before duplication but still consistently strong for millions of years following WGD. Additionally, extensive quantitative expression divergence between duplicate genes occurred after WGD. Diversification of clawed frogs was potentially catalyzed by transposition and divergent resolution--processes that occur through different genetic mechanisms but that have analogous implications for genome structure. How sex determination is maintained after genome duplication is fundamental to our understanding of why allopolyploidization is so prevalent in this group, and why clawed frogs violate Haldane's Rule for hybrid sterility. Future studies of expression subfunctionalization in polyploids will shed light on the role and purviews of cis- and trans-regulatory elements in gene regulation.
The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

PubMed Central

Mello, C.V.; Clayton, D.F.

2014-01-01

High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curation, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907
HIV promoter integration site primarily modulates transcriptional burst size rather than frequency.

PubMed

Skupsky, Ron; Burnett, John C; Foley, Jonathan E; Schaffer, David V; Arkin, Adam P

2010-09-30

Mammalian gene expression patterns, and their variability across populations of cells, are regulated by factors specific to each gene in concert with its surrounding cellular and genomic environment. Lentiviruses such as HIV integrate their genomes into semi-random genomic locations in the cells they infect, and the resulting viral gene expression provides a natural system to dissect the contributions of genomic environment to transcriptional regulation. Previously, we showed that expression heterogeneity and its modulation by specific host factors at HIV integration sites are key determinants of infected-cell fate and a possible source of latent infections. Here, we assess the integration context dependence of expression heterogeneity from diverse single integrations of an HIV-promoter/GFP-reporter cassette in Jurkat T-cells. Systematically fitting a stochastic model of gene expression to our data reveals an underlying transcriptional dynamic, by which multiple transcripts are produced during short, infrequent bursts, that quantitatively accounts for the wide, highly skewed protein expression distributions observed in each of our clonal cell populations. Interestingly, we find that the size of transcriptional bursts is the primary systematic covariate over integration sites, varying from a few to tens of transcripts across integration sites, and correlating well with mean expression. In contrast, burst frequencies are scattered about a typical value of several per cell-division time and demonstrate little correlation with the clonal means. This pattern of modulation generates consistently noisy distributions over the sampled integration positions, with large expression variability relative to the mean maintained even for the most productive integrations, and could contribute to specifying heterogeneous, integration-site-dependent viral production patterns in HIV-infected cells. Genomic environment thus emerges as a significant control parameter for gene expression variation that may contribute to structuring mammalian genomes, as well as be exploited for survival by integrating viruses.
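The distinction between burst size and burst frequency in the abstract above can be illustrated with a minimal sketch. It assumes a simplified one-species model that is not the authors' fitted model: bursts arrive as a Poisson process with frequency f, each burst adds a geometrically distributed number of molecules with mean b, and each molecule decays at rate gamma. All function names and parameter values are hypothetical. Under these assumptions the steady-state mean is f*b/gamma and the Fano factor (variance/mean) is 1 + b, so a larger burst size raises both the mean and the noise, while a higher burst frequency raises only the mean.

```python
import math
import random


def analytic_moments(f, b, gamma):
    """Steady-state mean and Fano factor of the assumed bursty model."""
    mean = f * b / gamma
    fano = 1.0 + b
    return mean, fano


def sample_burst(b, rng):
    """Draw a geometric burst size on {0, 1, 2, ...} with mean b."""
    p = 1.0 / (1.0 + b)
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log(1.0 - p)))


def simulate(f, b, gamma, t_end, seed=0):
    """Gillespie simulation; returns time-averaged mean and Fano factor."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    sum_n = sum_n2 = 0.0  # time-weighted sums of n and n^2
    while True:
        total_rate = f + gamma * n
        dt = rng.expovariate(total_rate)
        if t + dt >= t_end:
            # Close the final interval at t_end and stop.
            sum_n += n * (t_end - t)
            sum_n2 += n * n * (t_end - t)
            break
        sum_n += n * dt
        sum_n2 += n * n * dt
        t += dt
        if rng.random() * total_rate < f:
            n += sample_burst(b, rng)  # burst event
        else:
            n -= 1  # single-molecule decay (never chosen when n == 0)
    mean = sum_n / t_end
    variance = sum_n2 / t_end - mean * mean
    return mean, variance / mean
```

Comparing `simulate(2.0, 5.0, 1.0, 5000.0)` with `analytic_moments(2.0, 5.0, 1.0)` should show the simulated values approaching a mean of 10 and a Fano factor of 6; doubling f instead of b doubles the mean but leaves the Fano factor unchanged.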
Domain selection combined with improved cloning strategy for high throughput expression of higher eukaryotic proteins

PubMed Central

Chen, Yunjia; Qiu, Shihong; Luan, Chi-Hao; Luo, Ming

2007-01-01

Background Expression of higher eukaryotic genes as soluble, stable recombinant proteins is still a bottleneck step in biochemical and structural studies of novel proteins today. Correct identification of stable domains/fragments within the open reading frame (ORF), combined with proper cloning strategies, can greatly enhance the success rate when higher eukaryotic proteins are expressed as these domains/fragments. Furthermore, a HTP cloning pipeline incorporated with bioinformatics domain/fragment selection methods will be beneficial to studies of structure and function genomics/proteomics. Results With bioinformatics tools, we developed a domain/domain boundary prediction (DDBP) method, which was trained by available experimental data. Combined with an improved cloning strategy, DDBP had been applied to 57 proteins from C. elegans. Expression and purification results showed there was a 10-fold increase in terms of obtaining purified proteins. Based on the DDBP method, the improved GATEWAY cloning strategy and a robotic platform, we constructed a high throughput (HTP) cloning pipeline, including PCR primer design, PCR, BP reaction, transformation, plating, colony picking and entry clones extraction, which have been successfully applied to 90 C. elegans genes, 88 Brucella genes, and 188 human genes. More than 97% of the targeted genes were obtained as entry clones. This pipeline has a modular design and can adopt different operations for a variety of cloning/expression strategies. Conclusion The DDBP method and improved cloning strategy were satisfactory. The cloning pipeline, combined with our recombinant protein HTP expression pipeline and the crystal screening robots, constitutes a complete platform for structure genomics/proteomics. This platform will increase the success rate of purification and crystallization dramatically and promote the further advancement of structure genomics/proteomics. PMID:17663785
An unusual internal ribosomal entry site of inverted symmetry directs expression of a potato leafroll polerovirus replication-associated protein

PubMed Central

Jaag, Hannah Miriam; Kawchuk, Lawrence; Rohde, Wolfgang; Fischer, Rainer; Emans, Neil; Prüfer, Dirk

2003-01-01

Potato leafroll polerovirus (PLRV) genomic RNA acts as a polycistronic mRNA for the production of proteins P0, P1, and P2 translated from the 5′-proximal half of the genome. Within the P1 coding region we identified a 5-kDa replication-associated protein 1 (Rap1) essential for viral multiplication. An internal ribosome entry site (IRES) with unusual structure and location was identified that regulates Rap1 translation. Core structural elements for internal ribosome entry include a conserved AUG codon and a downstream GGAGAGAGAGG motif with inverted symmetry. Reporter gene expression in potato protoplasts confirmed the internal ribosome entry function. Unlike known IRES motifs, the PLRV IRES is located completely within the coding region of Rap1 at the center of the PLRV genome. PMID:12835413
Transcriptome profile of a bovine respiratory disease pathogen: Mannheimia haemolytica PHL213

PubMed Central

2012-01-01

Background Computational methods for structural gene annotation have propelled gene discovery but face certain drawbacks with regard to prokaryotic genome annotation. Identification of transcriptional start sites, demarcating overlapping gene boundaries, and identifying regulatory elements such as small RNA are not accurate using these approaches. In this study, we re-visit the structural annotation of Mannheimia haemolytica PHL213, a bovine respiratory disease pathogen. M. haemolytica is one of the causative agents of bovine respiratory disease that results in about $3 billion annual losses to the cattle industry. We used RNA-Seq and analyzed the data using freely-available computational methods and resources. The aim was to identify previously unannotated regions of the genome using RNA-Seq based expression profile to complement the existing annotation of this pathogen. Results Using the Illumina Genome Analyzer, we generated 9,055,826 reads (average length ~76 bp) and aligned them to the reference genome using Bowtie. The transcribed regions were analyzed using SAMTOOLS and custom Perl scripts in conjunction with BLAST searches and available gene annotation information. The single nucleotide resolution map enabled the identification of 14 novel protein coding regions as well as 44 potential novel sRNA. The basal transcription profile revealed that 2,506 of the 2,837 annotated regions were expressed in vitro, at 95.25% coverage, representing all broad functional gene categories in the genome. The expression profile also helped identify 518 potential operon structures involving 1,086 co-expressed pairs. We also identified 11 proteins with mutated/alternate start codons. Conclusions The application of RNA-Seq based transcriptome profiling to structural gene annotation helped correct existing annotation errors and identify potential novel protein coding regions and sRNA. We used computational tools to predict regulatory elements such as promoters and terminators associated with the novel expressed regions for further characterization of these novel functional elements. Our study complements the existing structural annotation of Mannheimia haemolytica PHL213 based on experimental evidence. Given the role of sRNA in virulence gene regulation and stress response, potential novel sRNA described in this study can form the framework for future studies to determine the role of sRNA, if any, in M. haemolytica pathogenesis. PMID:23046475
The evolution of an osmotically inducible dps in the genus Streptomyces.

PubMed

Facey, Paul D; Hitchings, Matthew D; Williams, Jason S; Skibinski, David O F; Dyson, Paul J; Del Sol, Ricardo

2013-01-01

Dps proteins are found almost ubiquitously in bacterial genomes and there is now an appreciation of their multifaceted roles in various stress responses. Previous studies have shown that this family of proteins assembles into dodecamers and that their quaternary structure is critical to their function. Moreover, the number of dps genes per bacterial genome is variable, even amongst closely related species; however, for many genera this enigma is yet to be satisfactorily explained. We reconstruct the most probable evolutionary history of Dps in Streptomyces genomes. Typically, these bacteria encode more than one Dps protein. We offer the explanation that variation in the number of dps per genome among closely related Streptomyces can be explained by gene duplication or lateral acquisition, and the former preceded a subsequent shift in expression patterns for one of the resultant paralogs. We show that the genome of S. coelicolor encodes three Dps proteins including a tailless Dps. Our in vivo observations show that the tailless protein, unlike the other two Dps in S. coelicolor, does not readily oligomerise. Phylogenetic and bioinformatic analyses combined with expression studies indicate that in several Streptomyces species at least one Dps is significantly over-expressed during osmotic shock, but the identity of the ortholog varies. In silico analysis of dps promoter regions coupled with gene expression studies of duplicated dps genes shows that paralogous gene pairs are expressed differentially and this correlates with the presence of a sigB promoter. Lastly, we identify a rare novel clade of Dps and show that a representative of these proteins in S. coelicolor possesses a dodecameric quaternary structure of high stability.
Genome-Wide Survey on Genomic Variation, Expression Divergence, and Evolution in Two Contrasting Rice Genotypes under High Salinity Stress

PubMed Central

Jiang, Shu-Ye; Ma, Ali; Ramamoorthy, Rengasamy; Ramachandran, Srinivasan

2013-01-01

Expression profiling is one of the most important tools for dissecting biological functions of genes, and the upregulation or downregulation of gene expression is sufficient for recreating phenotypic differences. Expression divergence of genes significantly contributes to phenotypic variations. However, little is known about the molecular basis of expression divergence and evolution among rice genotypes with contrasting phenotypes. In this study, we have implemented an integrative approach using bioinformatics and experimental analyses to provide insights into genomic variation, expression divergence, and evolution between salinity-sensitive rice variety Nipponbare and tolerant rice line Pokkali under normal and high salinity stress conditions. We have detected thousands of differentially expressed genes between these two genotypes and thousands of up- or downregulated genes under high salinity stress. Many genes were first detected with expression evidence using custom microarray analysis. Some gene families were preferentially regulated by high salinity stress and might play key roles in stress-responsive biological processes. Genomic variations in promoter regions resulting from single nucleotide polymorphisms, indels (1–10 bp of insertion/deletion), and structural variations significantly contributed to the expression divergence and regulation. Our data also showed that tandem and segmental duplication, CACTA and hAT elements played roles in the evolution of gene expression divergence and regulation between these two contrasting genotypes under normal or high salinity stress conditions. PMID:24121498
Sequence and expression variation in SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1): homeolog evolution in Indian Brassicas.

PubMed

Sri, Tanu; Mayee, Pratiksha; Singh, Anandita

2015-09-01

Whole genome sequence analyses allow unravelling of the evolutionary consequences of the meso-triplication event in Brassicaceae (∼14-20 million years ago (MYA)), such as differential gene fractionation and diversification in homeologous sub-genomes. This study presents a simple gene-centric approach involving microsynteny and natural genetic variation analysis for understanding SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1) homeolog evolution in Brassica. Analysis of microsynteny in Brassica rapa homeologous regions containing SOC1 revealed differential gene fractionation correlating to reported fractionation status of sub-genomes of origin, viz. least fractionated (LF), moderately fractionated 1 (MF1) and most fractionated (MF2), respectively. Screening 18 cultivars of 6 Brassica species led to the identification of 8 genomic and 27 transcript variants of SOC1, including splice-forms. Co-occurrence of both interrupted and intronless SOC1 genes was detected in a few Brassica species. In silico analysis characterised Brassica SOC1 as a MADS intervening, K-box, C-terminal (MIKC(C)) transcription factor, with highly conserved MADS and I domains relative to the K-box and C-terminal domain. Phylogenetic analyses and multiple sequence alignments depicting a shared pattern of silent/non-silent mutations assigned Brassica SOC1 homologs into groups based on shared diploid base genome. In addition, a sub-genome structure in uncharacterised Brassica genomes was inferred. Expression analysis of putative MF2 and LF (Brassica diploid base genome A (AA)) sub-genome-specific SOC1 homeologs of Brassica juncea revealed a near identical expression pattern. However, the MF2-specific homeolog exhibited significantly higher expression, implying regulatory diversification. In conclusion, evidence for polyploidy-induced sequence and regulatory evolution in Brassica SOC1 is presented, wherein differential homeolog expression is implied in functional diversification.
Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

PubMed

Wan, B; Yarbrough, J W; Schultz, T W

2008-01-01

This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 µM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
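The idea of ranking compounds by the similarity of their expression profiles can be sketched as follows. This is a minimal illustration that assumes Pearson correlation as the similarity measure; the abstract does not state which metric or clustering software was used, and the compound names and values below are hypothetical toy data, not results from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def most_similar_pair(profiles):
    """Return the pair of names whose profiles have the highest correlation."""
    return max(
        combinations(sorted(profiles), 2),
        key=lambda pair: pearson(profiles[pair[0]], profiles[pair[1]]),
    )


# Hypothetical toy profiles (one value per gene), for illustration only.
toy_profiles = {
    "compound_A": [1.0, 2.0, 3.0, 4.0],
    "compound_B": [2.0, 4.1, 5.9, 8.2],
    "compound_C": [4.0, 3.0, 2.0, 1.0],
}
```

Calling `most_similar_pair(toy_profiles)` picks out compound_A and compound_B, whose profiles rise together, whereas compound_C is anti-correlated with both. A full analysis would feed the pairwise distances into a hierarchical clustering routine rather than stopping at the single best pair.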
Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

This proposal develops scalable R / Bioconductor software infrastructure and data resources to integrate complex, heterogeneous, and large cancer genomic experiments. The falling cost of genomic assays facilitates collection of multiple data types (e.g., gene and transcript expression, structural variation, copy number, methylation, and microRNA data) from a set of clinical specimens. Furthermore, substantial resources are now available from large consortium activities like The Cancer Genome Atlas (TCGA).
DEFINING THE MANDATE OF PROTEOMICS IN THE POST-GENOMIC ERA: WORKSHOP REPORT

EPA Science Inventory

Research in proteomics is the next step after genomics in understanding life processes at the molecular level. In the largest sense proteomics encompasses knowledge of the structure, function and expression of all proteins in the biochemical or biological contexts of all organism...
Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

PubMed Central

Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

2003-01-01

Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Genome-wide analysis of the R2R3-MYB transcription factor gene family in sweet orange (Citrus sinensis).

PubMed

Liu, Chaoyang; Wang, Xia; Xu, Yuantao; Deng, Xiuxin; Xu, Qiang

2014-10-01

MYB transcription factors represent one of the largest gene families in plant genomes. Sweet orange (Citrus sinensis) is one of the most important fruit crops worldwide, and recently the genome has been sequenced. This provides an opportunity to investigate the organization and evolutionary characteristics of sweet orange MYB genes from a whole-genome view. In the present study, we identified 100 R2R3-MYB genes in the sweet orange genome. A comprehensive analysis of this gene family was performed, including the phylogeny, gene structure, chromosomal localization and expression pattern analyses. The 100 genes were divided into 29 subfamilies based on the sequence similarity and phylogeny, and the classification was also well supported by the highly conserved exon/intron structures and motif composition. The phylogenomic comparison of the MYB gene family among sweet orange and related plant species (Arabidopsis, cacao and papaya) suggested the existence of functional divergence during evolution. Expression profiling indicated that sweet orange R2R3-MYB genes exhibited distinct temporal and spatial expression patterns. Our analysis suggested that the sweet orange MYB genes may play important roles in different plant biological processes, some of which may be potentially involved in citrus fruit quality. These results will be useful for future functional analysis of the MYB gene family in sweet orange.
Segmental duplications: evolution and impact among the current Lepidoptera genomes.

PubMed

Zhao, Qian; Ma, Dongna; Vasseur, Liette; You, Minsheng

2017-07-06

Structural variation among genomes is now viewed to be as important as single nucleotide polymorphisms in influencing the phenotype and evolution of a species. Segmental duplication (SD) is defined as segments of DNA with homologous sequence. Here, we performed a systematic analysis of segmental duplications (SDs) among five lepidopteran reference genomes (Plutella xylostella, Danaus plexippus, Bombyx mori, Manduca sexta and Heliconius melpomene) to understand their potential impact on the evolution of these species. We find that the SD content differed substantially among species, ranging from 1.2% of the genome in B. mori to 15.2% in H. melpomene. Most SDs formed very high identity (similarity higher than 90%) blocks but had very few large blocks. Comparative analysis showed that most of the SDs arose after the divergence of each lineage and we found that P. xylostella and H. melpomene showed more duplications than other species, suggesting they might be able to tolerate extensive levels of variation in their genomes. Conserved ancestral and species-specific SD events were assessed, revealing multiple examples of the gain, loss or maintenance of SDs over time. SD content analysis showed that most of the genes embedded in SD regions belonged to species-specific SDs ("Unique" SDs). Functional analysis of these genes suggested their potential roles in the lineage-specific evolution. SDs and flanking regions often contained transposable elements (TEs) and this association suggested some involvement in SD formation. Further studies on comparison of gene expression level between SDs and non-SDs showed that the expression level of genes embedded in SDs was significantly lower, suggesting that structure changes in the genomes are involved in gene expression differences in species. The results showed that most of the SDs were "unique SDs", which originated after species formation. Functional analysis suggested that SDs might play different roles in different species. Our results provide a valuable resource beyond the genetic mutation to explore the genome structure for future Lepidoptera research.
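A figure such as "SD content as a percentage of the genome" can be computed by taking the union of the annotated SD intervals so that overlapping blocks are counted once. The sketch below assumes half-open [start, end) coordinates on a single sequence; the function names and the toy coordinates are hypothetical, and the abstract does not describe how the authors performed this calculation.

```python
def merged_length(intervals):
    """Total bases covered by [start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def sd_content_percent(intervals, genome_size):
    """Percentage of the genome covered by the given SD intervals."""
    return 100.0 * merged_length(intervals) / genome_size
```

For example, the toy intervals (0, 100), (50, 150) and (200, 250) cover 200 bases after merging, which is 20% of a hypothetical 1,000-base sequence.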
Gene expression patterns are correlated with genomic and genic structure in soybean

USDA-ARS's Scientific Manuscript database

Studies have indicated that exon and intron size, and intergenic distance are correlated with gene expression levels and expression breadth. Previous studies on these correlations in plants and animals have been conflicting. In this study next-generation sequence data of the soybean transcriptome wa...
Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus)

PubMed Central

Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou

2016-01-01

The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts. PMID:26907269
Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus).

PubMed

Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou

2016-02-23

The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.
Genomic structure and expression of STM2, the chromosome 1 familial Alzheimer disease gene.

PubMed

Levy-Lahad, E; Poorkaj, P; Wang, K; Fu, Y H; Oshima, J; Mulligan, J; Schellenberg, G D

1996-06-01

Mutations in the gene STM2 result in autosomal dominant familial Alzheimer disease. To screen for mutations and to identify regulatory elements for this gene, the genomic DNA sequence and intron-exon structure were determined. Twelve exons including 10 coding exons were identified in a genomic region spanning 23,737 bp. The first 2 exons encode the 5'-untranslated region. Expression analysis of STM2 indicates that two transcripts of 2.4 and 2.8 kb are found in skeletal muscle, pancreas, and heart. In addition, a splice variant of the 2.4-kb transcript was identified that is the result of the use of an alternative splice acceptor site located in exon 10. The use of this site results in a transcript lacking a single glutamate. The promoter for this gene and the alternatively spliced exons leading to the 2.8-kb form of the gene remain to be identified. Expression of STM2 was high in skeletal muscle and pancreas, with comparatively low levels observed in brain. This expression pattern is intriguing since in Alzheimer disease, pathology and degeneration are observed only in the central nervous system.

Genomic structure and expression of STM2, the chromosome 1 familial Alzheimer disease gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Levy-Lahad, E.; Wang, Kai; Fu, Ying Hui

1996-06-01

Mutations in the gene STM2 result in autosomal dominant familial Alzheimer disease. To screen for mutations and to identify regulatory elements for this gene, the genomic DNA sequence and intron-exon structure were determined. Twelve exons including 10 coding exons were identified in a genomic region spanning 23,737 bp. The first 2 exons encode the 5'-untranslated region. Expression analysis of STM2 indicates that two transcripts of 2.4 and 2.8 kb are found in skeletal muscle, pancreas, and heart. In addition, a splice variant of the 2.4-kb transcript was identified that is the result of the use of an alternative splice acceptor site located in exon 10. The use of this site results in a transcript lacking a single glutamate. The promoter for this gene and the alternatively spliced exons leading to the 2.8-kb form of the gene remain to be identified. Expression of STM2 was high in skeletal muscle and pancreas, with comparatively low levels observed in brain. This expression pattern is intriguing since in Alzheimer disease, pathology and degeneration are observed only in the central nervous system. 19 refs., 2 figs., 3 tabs.
Production of pseudoinfectious yellow fever virus with a two-component genome.

PubMed

Shustov, Alexandr V; Mason, Peter W; Frolov, Ilya

2007-11-01

Application of genetically modified, deficient-in-replication flaviviruses that are incapable of developing productive, spreading infection is a promising means of designing safe and effective vaccines. Here we describe a two-component genome yellow fever virus (YFV) replication system in which each of the genomes encodes complete sets of nonstructural proteins that form the replication complex but expresses either only capsid or prM/E instead of the entire structural polyprotein. Upon delivery to the same cell, these genomes produce together all of the viral structural proteins, and cells release a combination of virions with both types of genomes packaged into separate particles. In tissue culture, this modified YFV can be further passaged at an escalating scale by using a high multiplicity of infection (MOI). However, at a low MOI, only one of the genomes is delivered into the cells, and infection cannot spread. The replicating prM/E-encoding genome produces extracellular E protein in the form of secreted subviral particles that are known to be an effective immunogen. The presented strategy of developing viruses defective in replication might be applied to other flaviviruses, and these two-component genome viruses can be useful for diagnostic or vaccine applications, including the delivery and expression of heterologous genes. In addition, the achieved separation of the capsid-coding sequence and the cyclization signal in the YFV genome provides a new means for studying the mechanism of the flavivirus packaging process.
The identification and functional annotation of RNA structures conserved in vertebrates

PubMed Central

Seemann, Stefan E.; Mirza, Aashiq H.; Hansen, Claus; Bang-Berthelsen, Claus H.; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T.; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L.; Gorodkin, Jan

2017-01-01

Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human–mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3′ ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. PMID:28487280
Shaping skeletal growth by modular regulatory elements in the Bmp5 gene.

PubMed

Guenther, Catherine; Pantalena-Filho, Luiz; Kingsley, David M

2008-12-01

Cartilage and bone are formed into a remarkable range of shapes and sizes that underlie many anatomical adaptations to different lifestyles in vertebrates. Although the morphological blueprints for individual cartilage and bony structures must somehow be encoded in the genome, we currently know little about the detailed genomic mechanisms that direct precise growth patterns for particular bones. We have carried out large-scale enhancer surveys to identify the regulatory architecture controlling developmental expression of the mouse Bmp5 gene, which encodes a secreted signaling molecule required for normal morphology of specific skeletal features. Although Bmp5 is expressed in many skeletal precursors, different enhancers control expression in individual bones. Remarkably, we show here that different enhancers also exist for highly restricted spatial subdomains along the surface of individual skeletal structures, including ribs and nasal cartilages. Transgenic, null, and regulatory mutations confirm that these anatomy-specific sequences are sufficient to trigger local changes in skeletal morphology and are required for establishing normal growth rates on separate bone surfaces. Our findings suggest that individual bones are composite structures whose detailed growth patterns are built from many smaller lineage and gene expression domains. Individual enhancers in BMP genes provide a genomic mechanism for controlling precise growth domains in particular cartilages and bones, making it possible to separately regulate skeletal anatomy at highly specific locations in the body.
Genome-wide identification, classification, and expression analysis of the arabinogalactan protein gene family in rice (Oryza sativa L.)

PubMed Central

Zhao, Jie

2010-01-01

Arabinogalactan proteins (AGPs) comprise a family of hydroxyproline-rich glycoproteins that are implicated in plant growth and development. In this study, 69 AGPs are identified from the rice genome, including 13 classical AGPs, 15 arabinogalactan (AG) peptides, three non-classical AGPs, three early nodulin-like AGPs (eNod-like AGPs), eight non-specific lipid transfer protein-like AGPs (nsLTP-like AGPs), and 27 fasciclin-like AGPs (FLAs). The results from expressed sequence tags, microarrays, and massively parallel signature sequencing tags are used to analyse the expression of AGP-encoding genes, which is confirmed by real-time PCR. The results reveal that several rice AGP-encoding genes are predominantly expressed in anthers and display differential expression patterns in response to abscisic acid, gibberellic acid, and abiotic stresses. Based on the results obtained from this analysis, an attempt has been made to link the protein structures and expression patterns of rice AGP-encoding genes to their functions. Taken together, the genome-wide identification and expression analysis of the rice AGP gene family might facilitate further functional studies of rice AGPs. PMID:20423940
Chromosomal Rearrangements as Barriers to Genetic Homogenization between Archaic and Modern Humans

PubMed Central

Rogers, Rebekah L.

2015-01-01

Chromosomal rearrangements, which shuffle DNA throughout the genome, are an important source of divergence across taxa. Using a paired-end read approach with Illumina sequence data for archaic humans, I identify changes in genome structure that occurred recently in human evolution. Hundreds of rearrangements indicate genomic trafficking between the sex chromosomes and autosomes, raising the possibility of sex-specific changes. Additionally, genes adjacent to genome structure changes in Neanderthals are associated with testis-specific expression, consistent with evolutionary theory that new genes commonly form with expression in the testes. I identify one case of new-gene creation through transposition from the Y chromosome to chromosome 10 that combines the 5′-end of the testis-specific gene Fank1 with previously untranscribed sequence. This new transcript experienced copy number expansion in archaic genomes, indicating rapid genomic change. Among rearrangements identified in Neanderthals, 13% are transposition of selfish genetic elements, whereas 32% appear to be ectopic exchange between repeats. In Denisovan, the pattern is similar but numbers are significantly higher with 18% of rearrangements reflecting transposition and 40% ectopic exchange between distantly related repeats. There is an excess of divergent rearrangements relative to polymorphism in Denisovan, which might result from nonuniform rates of mutation, possibly reflecting a burst of transposable element activity in the lineage that led to Denisovan. Finally, loci containing genome structure changes show diminished rates of introgression from Neanderthals into modern humans, consistent with the hypothesis that rearrangements serve as barriers to gene flow during hybridization. Together, these results suggest that this previously unidentified source of genomic variation has important biological consequences in human evolution. PMID:26399483
Recurrent emergence of structural variants of LTR retrotransposon CsRn1 evolving novel expression strategy and their selective expansion in a carcinogenic liver fluke, Clonorchis sinensis.

PubMed

Kim, Seon-Hee; Kong, Yoon; Bae, Young-An

2017-06-01

Autonomous retrotransposons, in which replication and transcription are coupled, encode the essential gag and pol genes as a fusion or separate overlapping form(s) that are expressed in single transcripts regulated by a common upstream promoter. The element-specific expression strategies have driven development of relevant translational recoding mechanisms including ribosomal frameshifting to satisfy the protein stoichiometry critical for the assembly of infectious virus-like particles. Retrotransposons with different recoding strategies exhibit a mosaic distribution pattern across the diverse families of reverse transcribing elements, even though their respective distributions are substantially skewed towards certain family groups. However, only a few investigations to date have focused on the emergence of retrotransposons evolving novel expression strategy and causal genetic drivers of the structural variants. In this study, the bulk of genomic and transcribed sequences of a Ty3/gypsy-like CsRn1 retrotransposon in Clonorchis sinensis were analyzed for the comprehensive examination of its expression strategy. Our results demonstrated that structural variants with single open reading frame (ORF) have recurrently emerged from precedential CsRn1 copies encoding overlapping gag-pol ORFs by a single-nucleotide insertion in an upstream region of gag stop codon. In the parasite genome, some of the newly evolved variants appeared to undergo proliferative burst as active master lineages together with their ancestral copies. The genetic event was similarly observed in Opisthorchis viverrini, the closest neighbor of C. sinensis, whereas the resulting structural variants might have failed to overcome purifying selection and comprised minor remnant copies in the Opisthorchis genome. Copyright © 2017 Elsevier B.V. All rights reserved.
Expression of virus-encoded proteinases: functional and structural similarities with cellular enzymes.

PubMed Central

Dougherty, W G; Semler, B L

1993-01-01

Many viruses express their genome, or part of their genome, initially as a polyprotein precursor that undergoes proteolytic processing. Molecular genetic analyses of viral gene expression have revealed that many of these processing events are mediated by virus-encoded proteinases. Biochemical activity studies and structural analyses of these viral enzymes reveal that they have remarkable similarities to cellular proteinases. However, the viral proteinases have evolved unique features that permit them to function in a cellular environment. In this article, the current status of plant and animal virus proteinases is described along with their role in the viral replication cycle. The reactions catalyzed by viral proteinases are not simple enzyme-substrate interactions; rather, the processing steps are highly regulated, are coordinated with other viral processes, and frequently involve the participation of other factors. PMID:8302216
The somatic genomic landscape of chromophobe renal cell carcinoma

PubMed Central

Davis, Caleb F.; Ricketts, Christopher; Wang, Min; Yang, Lixing; Cherniack, Andrew D.; Shen, Hui; Buhay, Christian; Kang, Hyojin; Kim, Sang Cheol; Fahey, Catherine C.; Hacker, Kathryn E.; Bhanot, Gyan; Gordenin, Dmitry A.; Chu, Andy; Gunaratne, Preethi H.; Biehl, Michael; Seth, Sahil; Kaipparettu, Benny A.; Bristow, Christopher A.; Donehower, Lawrence A.; Wallen, Eric M.; Smith, Angela B.; Tickoo, Satish K.; Tamboli, Pheroze; Reuter, Victor; Schmidt, Laura S.; Hsieh, James J.; Choueiri, Toni K.; Hakimi, A. Ari; Chin, Lynda; Meyerson, Matthew; Kucherlapati, Raju; Park, Woong-Yang; Robertson, A. Gordon; Laird, Peter W.; Henske, Elizabeth P.; Kwiatkowski, David J.; Park, Peter J.; Morgan, Margaret; Shuch, Brian; Muzny, Donna; Wheeler, David A.; Linehan, W. Marston; Gibbs, Richard A.; Rathmell, W. Kimryn; Creighton, Chad J.

2014-01-01

Summary: We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) based on multidimensional and comprehensive characterization, including mitochondrial DNA (mtDNA) and whole genome sequencing. The result is consistent that ChRCC originates from the distal nephron compared to other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within TERT promoter region, which correlates with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT up-regulation in cancer distinct from previously-observed amplifications and point mutations. PMID:25155756
CnidBase: The Cnidarian Evolutionary Genomics Database

PubMed Central

Ryan, Joseph F.; Finnerty, John R.

2003-01-01

CnidBase, the Cnidarian Evolutionary Genomics Database, is a tool for investigating the evolutionary, developmental and ecological factors that affect gene expression and gene function in cnidarians. In turn, CnidBase will help to illuminate the role of specific genes in shaping cnidarian biodiversity in the present day and in the distant past. CnidBase highlights evolutionary changes between species within the phylum Cnidaria and structures genomic and expression data to facilitate comparisons to non-cnidarian metazoans. CnidBase aims to further the progress that has already been made in the realm of cnidarian evolutionary genomics by creating a central community resource which will help drive future research and facilitate more accurate classification and comparison of new experimental data with existing data. CnidBase is available at http://cnidbase.bu.edu/. PMID:12519972
The somatic genomic landscape of chromophobe renal cell carcinoma.

PubMed

Davis, Caleb F; Ricketts, Christopher J; Wang, Min; Yang, Lixing; Cherniack, Andrew D; Shen, Hui; Buhay, Christian; Kang, Hyojin; Kim, Sang Cheol; Fahey, Catherine C; Hacker, Kathryn E; Bhanot, Gyan; Gordenin, Dmitry A; Chu, Andy; Gunaratne, Preethi H; Biehl, Michael; Seth, Sahil; Kaipparettu, Benny A; Bristow, Christopher A; Donehower, Lawrence A; Wallen, Eric M; Smith, Angela B; Tickoo, Satish K; Tamboli, Pheroze; Reuter, Victor; Schmidt, Laura S; Hsieh, James J; Choueiri, Toni K; Hakimi, A Ari; Chin, Lynda; Meyerson, Matthew; Kucherlapati, Raju; Park, Woong-Yang; Robertson, A Gordon; Laird, Peter W; Henske, Elizabeth P; Kwiatkowski, David J; Park, Peter J; Morgan, Margaret; Shuch, Brian; Muzny, Donna; Wheeler, David A; Linehan, W Marston; Gibbs, Richard A; Rathmell, W Kimryn; Creighton, Chad J

2014-09-08

We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) on the basis of multidimensional and comprehensive characterization, including mtDNA and whole-genome sequencing. The result is consistent that ChRCC originates from the distal nephron compared with other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within TERT promoter region, which correlates with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT upregulation in cancer distinct from previously observed amplifications and point mutations. Copyright © 2014 Elsevier Inc. All rights reserved.
Genome-wide analysis of WRKY gene family in Cucumis sativus

PubMed Central

2011-01-01

Background: WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results: We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions: Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985
Genome-wide analysis of WRKY gene family in Cucumis sativus.

PubMed

Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

2011-09-28

WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.
A Novel Self-Replicating Chimeric Lentivirus-Like Particle

PubMed Central

Young, Kelly R.; Madden, Victoria J.; Johnson, Philip R.; Johnston, Robert E.

2012-01-01

Successful live attenuated vaccines mimic natural exposure to pathogens without causing disease and have been successful against several viruses. However, safety concerns prevent the development of attenuated human immunodeficiency virus (HIV) as a vaccine candidate. If a safe, replicating virus vaccine could be developed, it might have the potential to offer significant protection against HIV infection and disease. Described here is the development of a novel self-replicating chimeric virus vaccine candidate that is designed to provide natural exposure to a lentivirus-like particle and to incorporate the properties of a live attenuated virus vaccine without the inherent safety issues associated with attenuated lentiviruses. The genome from the alphavirus Venezuelan equine encephalitis virus (VEE) was modified to express SHIV89.6P genes encoding the structural proteins Gag and Env. Expression of Gag and Env from VEE RNA in primate cells led to the assembly of particles that morphologically and functionally resembled lentivirus virions and that incorporated alphavirus RNA. Infection of CD4+ cells with chimeric lentivirus-like particles was specific and productive, resulting in RNA replication, expression of Gag and Env, and generation of progeny chimeric particles. Further genome modifications designed to enhance encapsidation of the chimeric virus genome and to express an attenuated simian immunodeficiency virus (SIV) protease for particle maturation improved the ability of chimeric lentivirus-like particles to propagate in cell culture. This study provides proof of concept for the feasibility of creating chimeric virus genomes that express lentivirus structural proteins and assemble into infectious particles for presentation of lentivirus immunogens in their native and functional conformation. PMID:22013035
A novel self-replicating chimeric lentivirus-like particle.

PubMed

Jurgens, Christy K; Young, Kelly R; Madden, Victoria J; Johnson, Philip R; Johnston, Robert E

2012-01-01

Successful live attenuated vaccines mimic natural exposure to pathogens without causing disease and have been successful against several viruses. However, safety concerns prevent the development of attenuated human immunodeficiency virus (HIV) as a vaccine candidate. If a safe, replicating virus vaccine could be developed, it might have the potential to offer significant protection against HIV infection and disease. Described here is the development of a novel self-replicating chimeric virus vaccine candidate that is designed to provide natural exposure to a lentivirus-like particle and to incorporate the properties of a live attenuated virus vaccine without the inherent safety issues associated with attenuated lentiviruses. The genome from the alphavirus Venezuelan equine encephalitis virus (VEE) was modified to express SHIV89.6P genes encoding the structural proteins Gag and Env. Expression of Gag and Env from VEE RNA in primate cells led to the assembly of particles that morphologically and functionally resembled lentivirus virions and that incorporated alphavirus RNA. Infection of CD4⁺ cells with chimeric lentivirus-like particles was specific and productive, resulting in RNA replication, expression of Gag and Env, and generation of progeny chimeric particles. Further genome modifications designed to enhance encapsidation of the chimeric virus genome and to express an attenuated simian immunodeficiency virus (SIV) protease for particle maturation improved the ability of chimeric lentivirus-like particles to propagate in cell culture. This study provides proof of concept for the feasibility of creating chimeric virus genomes that express lentivirus structural proteins and assemble into infectious particles for presentation of lentivirus immunogens in their native and functional conformation.
New Implications on Genomic Adaptation Derived from the Helicobacter pylori Genome Comparison

PubMed Central

Lara-Ramírez, Edgar Eduardo; Segura-Cabrera, Aldo; Guo, Xianwu; Yu, Gongxin; García-Pérez, Carlos Armando; Rodríguez-Pérez, Mario A.

2011-01-01

Background: Helicobacter pylori has a reduced genome and lives in a tough environment for long-term persistence. It evolved with its particular characteristics for biological adaptation. Because several H. pylori genome sequences are available, comparative analysis could help to better understand genomic adaptation of this particular bacterium. Principal Findings: We analyzed nine H. pylori genomes with emphasis on microevolution from a different perspective. Inversion was an important factor in shaping the genome structure. Illegitimate recombination led not only to genomic inversion but also to inverted fragment duplication, both of which contributed to the creation of new genes and gene families; in addition, homologous recombination contributed to inversion events. Based on the information on genomic rearrangement, the first genome scaffold structure of the H. pylori last common ancestor was produced. The core genome consists of 1186 genes, of which 22 genes could particularly adapt to the human stomach niche. H. pylori contains a high proportion of pseudogenes whose genesis was principally caused by homopolynucleotide (HPN) mutations. Such mutations are reversible and facilitate the control of gene expression through the change of DNA structure. The reversible mutations and a quasi-panmictic feature could allow such genes or gene fragments to be frequently transferred within or between populations. Hence, pseudogenes could be a reservoir of adaptation materials and the HPN mutations could be favorable to H. pylori adaptation, leading to HPN accumulation on the genomes, which corresponds to a special feature of Helicobacter species: extremely high HPN composition of the genome. Conclusion: Our research demonstrated that both genome content and structure of H. pylori have been highly adapted to its particular life style. PMID:21387011
ALUminating the Path of Atherosclerosis Progression: Chaos Theory Suggests a Role for Alu Repeats in the Development of Atherosclerotic Vascular Disease.

PubMed

Hueso, Miguel; Cruzado, Josep M; Torras, Joan; Navarro, Estanislao

2018-06-12

Atherosclerosis (ATH) and coronary artery disease (CAD) are chronic inflammatory diseases with an important genetic background; they derive from the cumulative effect of multiple common risk alleles, most of which are located in genomic noncoding regions. These complex diseases behave as nonlinear dynamical systems that show a high dependence on their initial conditions; thus, long-term predictions of disease progression are unreliable. One likely possibility is that the nonlinear nature of ATH could be dependent on nonlinear correlations in the structure of the human genome. In this review, we show how chaos theory analysis has highlighted genomic regions that have shared specific structural constraints, which could have a role in ATH progression. These regions were shown to be enriched with repetitive sequences of the Alu family, genomic parasites that have colonized the human genome, which show a particular secondary structure and are involved in the regulation of gene expression. Here, we show the impact of Alu elements on the mechanisms that regulate gene expression, especially highlighting the molecular mechanisms via which the Alu elements alter the inflammatory response. We devote special attention to their relationship with the long noncoding RNA (lncRNA) antisense noncoding RNA in the INK4 locus (ANRIL), a risk factor for ATH; their role as microRNA (miRNA) sponges; and their ability to interfere with the regulatory circuitry of the nuclear factor kappa B (NF-κB) response. We aim to characterize ATH as a nonlinear dynamic system, in which small initial alterations in the expression of a number of repetitive elements are somehow amplified to reach phenotypic significance.
The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure

PubMed Central

Kagale, Sateesh; Koh, Chushin; Nixon, John; Bollina, Venkatesh; Clarke, Wayne E.; Tuteja, Reetu; Spillane, Charles; Robinson, Stephen J.; Links, Matthew G.; Clarke, Carling; Higgins, Erin E.; Huebert, Terry; Sharpe, Andrew G.; Parkin, Isobel A. P.

2014-01-01

Camelina sativa is an oilseed with desirable agronomic and oil-quality attributes for a viable industrial oil platform crop. Here we generate the first chromosome-scale high-quality reference genome sequence for C. sativa and annotate 89,418 protein-coding genes, representing a whole-genome triplication event relative to the crucifer model Arabidopsis thaliana. C. sativa represents the first crop species to be sequenced from lineage I of the Brassicaceae. The well-preserved hexaploid genome structure of C. sativa surprisingly mirrors those of economically important amphidiploid Brassica crop species from lineage II as well as wheat and cotton. The three genomes of C. sativa show no evidence of fractionation bias and limited expression-level bias, both characteristics commonly associated with polyploid evolution. The highly undifferentiated polyploid genome of C. sativa presents significant consequences for breeding and genetic manipulation of this industrial oil crop. PMID:24759634
Ligninolytic peroxidase genes in the oyster mushroom genome: heterologous expression, molecular structure, catalytic and stability properties, and lignin-degrading ability

Treesearch

Elena Fernández-Fueyo; Francisco J Ruiz-Dueñas; María Jesús Martinez; Antonio Romero; Kenneth E Hammel; Francisco Javier Medrano; Angel T. Martínez

2014-01-01

Background: The genome of Pleurotus ostreatus, an important edible mushroom and a model ligninolytic organism of interest in lignocellulose biorefineries due to its ability to delignify agricultural wastes, was sequenced with the purpose of identifying and characterizing the enzymes responsible for lignin degradation. ...
Expression of homing endonuclease gene and insertion-like element in sea anemone mitochondrial genomes: Lesson learned from Anemonia viridis.

PubMed

Chi, Sylvia Ighem; Urbarova, Ilona; Johansen, Steinar D

2018-04-30

The mitochondrial genomes of sea anemones are dynamic in structure. Invasion by genetic elements, such as self-catalytic group I introns or insertion-like sequences, contribute to sea anemone mitochondrial genome expansion and complexity. By using next generation sequencing we investigated the complete mtDNAs and corresponding transcriptomes of the temperate sea anemone Anemonia viridis and its closer tropical relative Anemonia majano. Two versions of fused homing endonuclease gene (HEG) organization were observed among the Actiniidae sea anemones; in-frame gene fusion and pseudo-gene fusion. We provided support for the pseudo-gene fusion organization in Anemonia species, resulting in a repressed HEG from the COI-884 group I intron. orfA, a putative protein-coding gene with insertion-like features, was present in both Anemonia species. Interestingly, orfA and COI expression were significantly up-regulated upon long-term environmental stress corresponding to low seawater pH conditions. This study provides new insights to the dynamics of sea anemone mitochondrial genome structure and function. Copyright © 2018 Elsevier B.V. All rights reserved.

Insights into structural variations and genome rearrangements in prokaryotic genomes.

PubMed

Periwal, Vinita; Scaria, Vinod

2015-01-01

Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved.
The mediator complex in genomic and non-genomic signaling in cancer.

PubMed

Weber, Hannah; Garabedian, Michael J

2018-05-01

Mediator is a conserved, multi-subunit macromolecular machine divided structurally into head, middle, and tail modules, along with a transiently associating kinase module. Mediator functions as an integrator of transcriptional regulatory activity by interacting with DNA-bound transcription factors and with RNA polymerase II (Pol II) to both activate and repress gene expression. Mediator has been shown to affect multiple steps in transcription, including chromatin looping between enhancers and promoters, pre-initiation complex formation, transcriptional elongation, and mRNA splicing. Individual Mediator subunits participate in regulation of gene expression by the estrogen and androgen receptors and are altered in a number of endocrine cancers, including breast and prostate cancer. In addition to its role in genomic signaling, MED12 has been implicated in non-genomic signaling by interacting with and activating TGF-beta receptor 2 in the cytoplasm. Recent structural studies have revealed extensive inter-domain interactions and complex architecture of the Mediator-Pol II complex, suggesting that Mediator is capable of reorganizing its conformation and composition to fit cellular needs. We propose that alterations in Mediator subunit expression that occur in various cancers could impact the organization and function of Mediator, resulting in changes in gene expression that promote malignancy. A better understanding of the role of Mediator in cancer could reveal new approaches to the diagnosis and treatment of Mediator-dependent endocrine cancers, especially in settings of therapy resistance. Copyright © 2017 Elsevier Inc. All rights reserved.
Selective forces and mutational biases drive stop codon usage in the human genome: a comparison with sense codon usage.

PubMed

Trotta, Edoardo

2016-05-17

The three stop codons UAA, UAG, and UGA signal the termination of mRNA translation. As a result of a mechanism that is not adequately understood, they are normally used with unequal frequencies. In this work, we showed that selective forces and mutational biases drive stop codon usage in the human genome. We found that, with respect to sense codons, stop codon usage was affected by stronger selective forces but was less influenced by neutral mutational biases. UGA is the most frequent termination codon in the human genome. However, UAA was the preferred stop codon in genes with high breadth of expression, high level of expression, AT-rich coding sequences, housekeeping functions, and in gene ontology categories with the largest deviation from expected stop codon usage. Selective forces associated with the breadth and the level of expression favoured AT-rich sequences in the mRNA region including the stop site and its proximal 3'-UTR, but acted with scarce effects on sense codons, generating two regions, upstream and downstream of the stop codon, with strongly different base composition. By favouring low levels of GC-content, selection promoted labile local secondary structures at the stop site and its proximal 3'-UTR. The compositional and structural context favoured by selection was surprisingly emphasized in the class of ribosomal proteins and was consistent with sequence elements that increase the efficiency of translational termination. Stop codons were also heterogeneously distributed among chromosomes by a mechanism that was strongly correlated with the GC-content of coding sequences. In the human genome, the nucleotide composition and the thermodynamic stability of the stop codon site and its proximal 3'-UTR are correlated with the GC-content of coding sequences and with the breadth and the level of gene expression. In highly expressed genes, stop codon usage is compositionally and structurally consistent with highly efficient translation termination signals.
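The quantities this abstract refers to, stop codon usage frequencies and the GC-content of coding sequences, can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function names `gc_content` and `stop_codon_usage` and the toy sequences are hypothetical, and the sketch assumes each input is a complete coding sequence whose last three bases are its stop codon.

```python
from collections import Counter

# DNA spellings of the three stop codons UAA, UAG and UGA.
STOP_CODONS = ("TAA", "TAG", "TGA")


def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def stop_codon_usage(coding_sequences):
    """Relative frequency of each stop codon across coding sequences.

    Assumption: every sequence ends with its stop codon. Sequences
    whose last three bases are not a stop codon are ignored.
    """
    counts = Counter()
    for cds in coding_sequences:
        codon = cds.upper().replace("U", "T")[-3:]
        if codon in STOP_CODONS:
            counts[codon] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in STOP_CODONS}


# Hypothetical toy sequences, not real genes.
toy_cds = ["ATGGCCTGA", "ATGAAATAA", "ATGCCCTGA", "ATGGGGTAG"]
usage = stop_codon_usage(toy_cds)
gc_values = [gc_content(cds) for cds in toy_cds]
```

On real data, the same per-gene values could be grouped by expression level or breadth to look for the kind of association the abstract describes; that grouping step is left out here.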
Structural features based genome-wide characterization and prediction of nucleosome organization

PubMed Central

2012-01-01

Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features such as DNA denaturation, DNA-bending stiffness, Stacking energy, Z-DNA, Propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model to predict more accurately nucleosome occupancy in vivo than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. 
Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization. The implementation of our DLaNe method based on structural features is available online. PMID:22449207
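The idea of locating nucleosomes by detecting peaks in a structural profile can be sketched generically as follows. This is not the DLaNe implementation, whose details are not given in the abstract: the smoothing window, the greedy selection rule and the function names are assumptions made for illustration, and the default spacing of 147 bp is simply the approximate length of DNA wrapped around a nucleosome core.

```python
def moving_average(values, window):
    """Smooth a profile with a centred moving average (edges use shorter windows)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def call_peaks(profile, window=3, min_spacing=147):
    """Return positions of local maxima that are at least min_spacing apart.

    Candidate peaks are taken from the smoothed profile and accepted greedily
    from highest to lowest, skipping any candidate too close to an accepted one.
    """
    smooth = moving_average(profile, window)
    candidates = [
        i
        for i in range(1, len(smooth) - 1)
        if smooth[i] > smooth[i - 1] and smooth[i] >= smooth[i + 1]
    ]
    candidates.sort(key=lambda i: smooth[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    return sorted(kept)


# Toy per-base profile (made-up values); a small spacing is used so the
# example stays short.
toy_profile = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
print(call_peaks(toy_profile, window=1, min_spacing=3))
```

A meta predictor in the spirit of the abstract could combine the peak calls obtained from several structural profiles, but how DLaNe weights them is not stated here.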
Structural and quantitative expression analyses of HERV gene family in human tissues.

PubMed

Ahn, Kung; Kim, Heui-Soo

2009-08-31

Human endogenous retroviruses (HERVs) have been implicated in the pathogenesis of several human diseases as multi-copy members in the human genome. Their gene expression profiling could provide us with important insights into the pathogenic relationship between HERVs and cancer. In this study, we have evaluated the genomic structure and quantitatively determined the expression patterns in the env gene of a variety of HERV family members located on six specific loci by the RetroTector 10 program, as well as real-time RT-PCR amplification. The env gene transcripts evidenced significant differences in the human tumor/normal adjacent tissues (colon, liver, uterus, lung and testis). As compared to the adjacent normal tissues, high levels of expression were noted in testis tumor tissues for HERV-K, in liver and lung tumor tissues for HERV-R, in liver, lung, and testis tumor tissues for HERV-H, and in colon and liver tumor tissues for HERV-P. These data warrant further studies with larger groups of patients to develop biomarkers for specific human cancers.
Life in the fast lane for protein crystallization and X-ray crystallography

NASA Technical Reports Server (NTRS)

Pusey, Marc L.; Liu, Zhi-Jie; Tempel, Wolfram; Praissman, Jeremy; Lin, Dawei; Wang, Bi-Cheng; Gavira, Jose A.; Ng, Joseph D.

2005-01-01

The common goal for structural genomic centers and consortiums is to decipher as quickly as possible the three-dimensional structures for a multitude of recombinant proteins derived from known genomic sequences. Since X-ray crystallography is the foremost method to acquire atomic resolution for macromolecules, the limiting step is obtaining protein crystals that can be useful for structure determination. High-throughput methods have been developed in recent years to clone, express, purify, crystallize and determine the three-dimensional structure of a protein gene product rapidly using automated devices, commercialized kits and consolidated protocols. However, the average number of protein structures obtained for most structural genomic groups has been very low compared to the total number of proteins purified. As more entire genomic sequences are obtained for different organisms from the three kingdoms of life, only the proteins that can be crystallized and whose structures can be obtained easily are studied. Consequently, an astonishing number of genomic proteins remain unexamined. In the era of high-throughput processes, traditional methods in molecular biology, protein chemistry and crystallization are eclipsed by automation and pipeline practices. The necessity for high-rate production of protein crystals and structures has prevented the usage of more intellectual strategies and creative approaches in experimental executions. Fundamental principles and personal experiences in protein chemistry and crystallization are minimally exploited only to obtain "low-hanging fruit" protein structures. We review the practical aspects of today's high-throughput manipulations and discuss the challenges in fast-paced protein crystallization and tools for crystallography. Structural genomic pipelines can be improved with information gained from low-throughput tactics that may help us reach the higher-bearing fruits.
Examples of recent developments in this area are reported from the efforts of the Southeast Collaboratory for Structural Genomics (SECSG).
Life in the Fast Lane for Protein Crystallization and X-Ray Crystallography

NASA Technical Reports Server (NTRS)

Pusey, Marc L.; Liu, Zhi-Jie; Tempel, Wolfram; Praissman, Jeremy; Lin, Dawei; Wang, Bi-Cheng; Gavira, Jose A.; Ng, Joseph D.

2004-01-01

The common goal for structural genomic centers and consortiums is to decipher as quickly as possible the three-dimensional structures for a multitude of recombinant proteins derived from known genomic sequences. Since X-ray crystallography is the foremost method to acquire atomic resolution for macromolecules, the limiting step is obtaining protein crystals that can be useful for structure determination. High-throughput methods have been developed in recent years to clone, express, purify, crystallize and determine the three-dimensional structure of a protein gene product rapidly using automated devices, commercialized kits and consolidated protocols. However, the average number of protein structures obtained for most structural genomic groups has been very low compared to the total number of proteins purified. As more entire genomic sequences are obtained for different organisms from the three kingdoms of life, only the proteins that can be crystallized and whose structures can be obtained easily are studied. Consequently, an astonishing number of genomic proteins remain unexamined. In the era of high-throughput processes, traditional methods in molecular biology, protein chemistry and crystallization are eclipsed by automation and pipeline practices. The necessity for high-rate production of protein crystals and structures has prevented the usage of more intellectual strategies and creative approaches in experimental executions. Fundamental principles and personal experiences in protein chemistry and crystallization are minimally exploited only to obtain "low-hanging fruit" protein structures. We review the practical aspects of today's high-throughput manipulations and discuss the challenges in fast-paced protein crystallization and tools for crystallography. Structural genomic pipelines can be improved with information gained from low-throughput tactics that may help us reach the higher-bearing fruits.
Examples of recent developments in this area are reported from the efforts of the Southeast Collaboratory for Structural Genomics (SECSG).
The identification and functional annotation of RNA structures conserved in vertebrates.

PubMed

Seemann, Stefan E; Mirza, Aashiq H; Hansen, Claus; Bang-Berthelsen, Claus H; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L; Gorodkin, Jan

2017-08-01

Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. © 2017 Seemann et al.; Published by Cold Spring Harbor Laboratory Press.
Ancient Duplications and Expression Divergence in the Globin Gene Superfamily of Vertebrates: Insights from the Elephant Shark Genome and Transcriptome

PubMed Central

Opazo, Juan C.; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F.

2015-01-01

Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. 
Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about ancestral functions of vertebrate globins. PMID:25743544
The High-Throughput Protein Sample Production Platform of the Northeast Structural Genomics Consortium

PubMed Central

Xiao, Rong; Anderson, Stephen; Aramini, James; Belote, Rachel; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John K.; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Jiang, Mei; Kornhaber, Gregory J.; Lee, Dong Yup; Locke, Jessica Y.; Ma, Li-Chung; Maglaqui, Melissa; Mao, Lei; Mitra, Saheli; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Sharma, Seema; Shastry, Ritu; Swapna, G.V.T.; Tong, Saichu N.; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.; Acton, Thomas B.

2014-01-01

We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar purification procedures for most targets and implementation of high-throughput (HTP) parallel methods. In most cases, the 6X-His-tagged proteins are sufficiently purified (> 97% homogeneity) using an HTP two-step purification protocol for most structural studies. Using this platform, the open reading frames of over 16,000 different targeted proteins (or domains) have been cloned as > 26,000 constructs. Over the past nine years, more than 16,000 of these have expressed protein, and more than 4,400 proteins (or domains) have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html). Using these samples, the NESG has deposited more than 900 new protein structures to the Protein Data Bank (PDB). The methods described here are effective in producing eukaryotic and prokaryotic protein samples in E. coli. This paper summarizes some of the updates made to the protein production pipeline in the last five years, corresponding to phase 2 of the NIGMS Protein Structure Initiative (PSI-2) project. The NESG Protein Production Platform is suitable for implementation in a large individual laboratory or by a small group of collaborating investigators. These advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are of broad value to the structural biology, functional proteomics, and structural genomics communities. PMID:20688167
Genome-wide association of 10 horticultural traits with expressed sequence tag-derived SNP markers in a collection of lettuce lines

USDA-ARS's Scientific Manuscript database

Genetic diversity, population structure, and genome-wide marker-trait association analyses were conducted on a special collection of 298 homozygous lettuce (Lactuca sativa L.) lines. Each of these lines was derived from a single plant that had been genotyped with 384 SNP markers using LSGermOPA. They...
An integrated map of structural variation in 2,504 human genomes.

PubMed

Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

2015-10-01

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
The Gene Expression Omnibus Database.

PubMed

Clough, Emily; Barrett, Tanya

2016-01-01

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome-protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/.
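One programmatic route to querying GEO, alongside the Web-based tools the chapter describes, is the NCBI E-utilities service, in which GEO DataSets is exposed as the `gds` database. The sketch below only builds a search URL and makes no network request; the helper name and the search term are hypothetical, and current rate limits and parameters should be checked against NCBI's own documentation before use.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, retmax=20):
    """Build an ESearch URL against GEO DataSets (db=gds) for a query term.

    The URL is only constructed here; fetching it is left to the caller.
    """
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Example query term chosen for illustration only.
url = build_geo_search_url("chromatin structure AND Homo sapiens[Organism]", retmax=5)
print(url)
```

The identifiers returned by such a search can then be passed to the other E-utilities calls to retrieve record summaries.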
Structure, sequence and expression of the hepatitis delta (δ) viral genome

NASA Astrophysics Data System (ADS)

Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael

1986-10-01

Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.
Genomic expression patterns of cardiac tissues from dogs with dilated cardiomyopathy.

PubMed

Oyama, Mark A; Chittur, Sridar

2005-07-01

The objective was to evaluate global genome expression patterns of left ventricular tissues from dogs with dilated cardiomyopathy (DCM). Tissues were obtained from the left ventricle of 2 Doberman Pinschers with end-stage DCM and 5 healthy control dogs. Transcriptional activities of 23,851 canine DNA sequences were determined by use of an oligonucleotide microarray. Genome expression patterns of DCM tissue were evaluated by measuring the relative amount of complementary RNA hybridization to the microarray probes and comparing it with gene expression for tissues from the 5 healthy control dogs. In total, 478 transcripts were differentially expressed (≥ 2.5-fold change). In DCM tissue, expression of 173 transcripts was upregulated and expression of 305 transcripts was downregulated, compared with expression for control tissues. Of the 478 transcripts, 167 genes could be specifically identified. These genes were grouped into 1 of 8 categories on the basis of their primary physiologic function. Grouping revealed that pathways involving cellular energy production, signaling and communication, and cell structure were generally downregulated, whereas pathways involving cellular defense and stress responses were upregulated. Many previously unreported genes that may contribute to the pathophysiologic aspects of heart disease were identified. Evaluation of global expression patterns provides a molecular portrait of heart failure, yields insights into the pathophysiologic aspects of DCM, and identifies intriguing genes and pathways for further study.
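The fold-change criterion in this abstract can be illustrated with a short Python sketch. It assumes that a transcript is called differentially expressed when the ratio of mean DCM signal to mean control signal is at least 2.5 in either direction; the study's actual normalization and statistics are not described in the abstract, and the function names and signal values below are made up.

```python
def classify_fold_change(dcm_mean, control_mean, threshold=2.5):
    """Label a transcript as 'up', 'down' or 'unchanged' from two mean signals.

    Both means are assumed to be positive, already normalized intensities.
    """
    ratio = dcm_mean / control_mean
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


def summarize(transcripts, threshold=2.5):
    """Count labels over a mapping of transcript id -> (dcm_mean, control_mean)."""
    summary = {"up": 0, "down": 0, "unchanged": 0}
    for dcm_mean, control_mean in transcripts.values():
        summary[classify_fold_change(dcm_mean, control_mean, threshold)] += 1
    return summary


# Made-up signal values for three hypothetical transcripts.
toy_transcripts = {
    "transcript_a": (10.0, 2.0),  # 5-fold higher in DCM tissue
    "transcript_b": (2.0, 10.0),  # 5-fold lower in DCM tissue
    "transcript_c": (3.0, 2.0),   # 1.5-fold, below the threshold
}
print(summarize(toy_transcripts))
```

Under this reading, the 173 upregulated and 305 downregulated transcripts reported above are the two non-"unchanged" classes, and together they make up the 478 differentially expressed transcripts.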
Comprehensive Genomic Analysis and Expression Profiling of Phospholipase C Gene Family during Abiotic Stresses and Development in Rice

PubMed Central

Singh, Amarjeet; Kanwar, Poonam; Pandey, Amita; Tyagi, Akhilesh K.; Sopory, Sudhir K.; Kapoor, Sanjay; Pandey, Girdhar K.

2013-01-01

Background Phospholipase C (PLC) is one of the major lipid hydrolysing enzymes, implicated in lipid mediated signaling. PLCs have been found to play a significant role in abiotic stress triggered signaling and developmental processes in various plant species. Genome wide identification and expression analysis have been carried out for this gene family in Arabidopsis, yet not much has been accomplished in the crop plant rice. Methodology/Principal Findings An exhaustive in-silico exploration of the rice genome using various online databases and tools resulted in the identification of nine PLC encoding genes. Based on sequence, motif and phylogenetic analysis, the rice PLC gene family could be divided into phosphatidylinositol-specific PLCs (PI-PLCs) and phosphatidylcholine-PLCs (PC-PLC or NPC) classes with four and five members, respectively. A comparative analysis revealed that PLCs are conserved in Arabidopsis (dicots) and rice (monocot) at the gene structure and protein levels but they might have evolved through a separate evolutionary path. Transcript profiling using gene chip microarray and quantitative RT-PCR showed that most of the PLC members were expressed significantly and differentially under abiotic stresses (salt, cold and drought) and during various developmental stages with condition/stage specific and overlapping expression. This finding suggested an important role of different rice PLC members in abiotic stress triggered signaling and plant development, which was also supported by the presence of relevant cis-regulatory elements in their promoters. Sub-cellular localization of a few selected PLC members in Nicotiana benthamiana and onion epidermal cells has provided a clue about their site of action and functional behaviour. Conclusion/Significance The genome wide identification, structural and expression analysis and knowledge of sub-cellular localization of the PLC gene family lay the groundwork for the functional characterization of these genes in crop plants in the near future. PMID:23638098
Comprehensive genomic analysis and expression profiling of phospholipase C gene family during abiotic stresses and development in rice.

PubMed

Singh, Amarjeet; Kanwar, Poonam; Pandey, Amita; Tyagi, Akhilesh K; Sopory, Sudhir K; Kapoor, Sanjay; Pandey, Girdhar K

2013-01-01

Phospholipase C (PLC) is one of the major lipid hydrolysing enzymes, implicated in lipid mediated signaling. PLCs have been found to play a significant role in abiotic stress triggered signaling and developmental processes in various plant species. Genome wide identification and expression analysis have been carried out for this gene family in Arabidopsis, yet not much has been accomplished in the crop plant rice. An exhaustive in-silico exploration of the rice genome using various online databases and tools resulted in the identification of nine PLC encoding genes. Based on sequence, motif and phylogenetic analysis, the rice PLC gene family could be divided into phosphatidylinositol-specific PLCs (PI-PLCs) and phosphatidylcholine-PLCs (PC-PLC or NPC) classes with four and five members, respectively. A comparative analysis revealed that PLCs are conserved in Arabidopsis (dicots) and rice (monocot) at the gene structure and protein levels but they might have evolved through a separate evolutionary path. Transcript profiling using gene chip microarray and quantitative RT-PCR showed that most of the PLC members were expressed significantly and differentially under abiotic stresses (salt, cold and drought) and during various developmental stages with condition/stage specific and overlapping expression. This finding suggested an important role of different rice PLC members in abiotic stress triggered signaling and plant development, which was also supported by the presence of relevant cis-regulatory elements in their promoters. Sub-cellular localization of a few selected PLC members in Nicotiana benthamiana and onion epidermal cells has provided a clue about their site of action and functional behaviour. The genome wide identification, structural and expression analysis and knowledge of sub-cellular localization of the PLC gene family lay the groundwork for the functional characterization of these genes in crop plants in the near future.
Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

PubMed Central

Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F

2007-01-01

Background Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. 
Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA editing, loss of stop codons, and use of trans-splicing. PMID:17897476
Genome-wide identification, characterization, and expression profile of aquaporin gene family in flax (Linum usitatissimum)

PubMed Central

Shivaraj, S. M.; Deshmukh, Rupesh K.; Rai, Rhitu; Bélanger, Richard; Agrawal, Pawan K.; Dash, Prasanta K.

2017-01-01

Membrane intrinsic proteins (MIPs) form transmembrane channels and facilitate transport of myriad substrates across the cell membrane in many organisms. The majority of plant MIPs have water-transporting ability and are commonly referred to as aquaporins (AQPs). In the present study, we identified aquaporin-coding genes in flax by genome-wide analysis and examined their structure, function and expression pattern by pan-genome exploration. Cross-genera phylogenetic analysis with known aquaporins from rice, arabidopsis, and poplar showed five subgroups of flax aquaporins representing 16 plasma membrane intrinsic proteins (PIPs), 17 tonoplast intrinsic proteins (TIPs), 13 NOD26-like intrinsic proteins (NIPs), 2 small basic intrinsic proteins (SIPs), and 3 uncharacterized intrinsic proteins (XIPs). Amongst aquaporins, PIPs contained a hydrophilic aromatic/arginine (ar/R) selectivity filter, but the TIP, NIP, SIP and XIP subfamilies mostly contained a hydrophobic ar/R selectivity filter. Analysis of RNA-seq and microarray data revealed high expression of PIPs in multiple tissues, low expression of NIPs, and seed specific expression of TIP3 in flax. Exploration of aquaporin homologs in three closely related Linum species, bienne, grandiflorum and leonii, revealed the presence of 49, 39 and 19 AQPs, respectively. The genome-wide identification of aquaporins, the first in flax, provides insight to elucidate their physiological and developmental roles in flax. PMID:28447607
Genome-wide identification, characterization, and expression profile of aquaporin gene family in flax (Linum usitatissimum).

PubMed

Shivaraj, S M; Deshmukh, Rupesh K; Rai, Rhitu; Bélanger, Richard; Agrawal, Pawan K; Dash, Prasanta K

2017-04-27

Membrane intrinsic proteins (MIPs) form transmembrane channels and facilitate transport of myriad substrates across the cell membrane in many organisms. The majority of plant MIPs have water-transporting ability and are commonly referred to as aquaporins (AQPs). In the present study, we identified aquaporin-coding genes in flax by genome-wide analysis and examined their structure, function and expression pattern by pan-genome exploration. Cross-genera phylogenetic analysis with known aquaporins from rice, arabidopsis, and poplar showed five subgroups of flax aquaporins representing 16 plasma membrane intrinsic proteins (PIPs), 17 tonoplast intrinsic proteins (TIPs), 13 NOD26-like intrinsic proteins (NIPs), 2 small basic intrinsic proteins (SIPs), and 3 uncharacterized intrinsic proteins (XIPs). Amongst aquaporins, PIPs contained a hydrophilic aromatic/arginine (ar/R) selectivity filter, but the TIP, NIP, SIP and XIP subfamilies mostly contained a hydrophobic ar/R selectivity filter. Analysis of RNA-seq and microarray data revealed high expression of PIPs in multiple tissues, low expression of NIPs, and seed specific expression of TIP3 in flax. Exploration of aquaporin homologs in three closely related Linum species, bienne, grandiflorum and leonii, revealed the presence of 49, 39 and 19 AQPs, respectively. The genome-wide identification of aquaporins, the first in flax, provides insight to elucidate their physiological and developmental roles in flax.

Gene Structures, Evolution and Transcriptional Profiling of the WRKY Gene Family in Castor Bean (Ricinus communis L.).

PubMed

Zou, Zhi; Yang, Lifu; Wang, Danhua; Huang, Qixing; Mo, Yeyong; Xie, Guishui

2016-01-01

WRKY proteins comprise one of the largest transcription factor families in plants and form key regulators of many plant processes. This study presents the characterization of 58 WRKY genes from the castor bean (Ricinus communis L., Euphorbiaceae) genome. Compared with the automatic genome annotation, one more WRKY-encoding locus was identified and 20 out of the 57 predicted gene models were manually corrected. All RcWRKY genes were shown to contain at least one intron in their coding sequences. According to the structural features of the present WRKY domains, the identified RcWRKY genes were assigned to three previously defined groups (I-III). Although castor bean underwent no recent whole-genome duplication event like physic nut (Jatropha curcas L., Euphorbiaceae), comparative genomics analysis indicated that one gene loss, one intron loss and one recent proximal duplication occurred in the RcWRKY gene family. The expression of all 58 RcWRKY genes was supported by ESTs and/or RNA sequencing reads derived from roots, leaves, flowers, seeds and endosperms. Further global expression profiles with RNA sequencing data revealed diverse expression patterns among various tissues. Results obtained from this study not only provide valuable information for future functional analysis and utilization of the castor bean WRKY genes, but also provide a useful reference to investigate the gene family expansion and evolution in Euphorbiaceous plants.
Novel Loci for Metabolic Networks and Multi-Tissue Expression Studies Reveal Genes for Atherosclerosis

PubMed Central

Inouye, Michael; Ripatti, Samuli; Kettunen, Johannes; Lyytikäinen, Leo-Pekka; Oksala, Niku; Laurila, Pirkka-Pekka; Kangas, Antti J.; Soininen, Pasi; Savolainen, Markku J.; Viikari, Jorma; Kähönen, Mika; Perola, Markus; Salomaa, Veikko; Raitakari, Olli; Lehtimäki, Terho; Taskinen, Marja-Riitta; Järvelin, Marjo-Riitta; Ala-Korpela, Mika; Palotie, Aarno; de Bakker, Paul I. W.

2012-01-01

Association testing of multiple correlated phenotypes offers better power than univariate analysis of single traits. We analyzed 6,600 individuals from two population-based cohorts with both genome-wide SNP data and serum metabolomic profiles. From the observed correlation structure of 130 metabolites measured by nuclear magnetic resonance, we identified 11 metabolic networks and performed a multivariate genome-wide association analysis. We identified 34 genomic loci at genome-wide significance, of which 7 are novel. In comparison to univariate tests, multivariate association analysis identified nearly twice as many significant associations in total. Multi-tissue gene expression studies identified variants in our top loci, SERPINA1 and AQP9, as eQTLs and showed that SERPINA1 and AQP9 expression in human blood was associated with metabolites from their corresponding metabolic networks. Finally, liver expression of AQP9 was associated with atherosclerotic lesion area in mice, and in human arterial tissue both SERPINA1 and AQP9 were shown to be upregulated (6.3-fold and 4.6-fold, respectively) in atherosclerotic plaques. Our study illustrates the power of multi-phenotype GWAS and highlights candidate genes for atherosclerosis. PMID:22916037
Solenopsis invicta virus 3: mapping of structural proteins, ribosomal frameshifting, and similarities to Acyrthosiphon pisum virus and Kelp fly virus.

PubMed

Valles, Steven M; Bell, Susanne; Firth, Andrew E

2014-01-01

Solenopsis invicta virus 3 (SINV-3) is a positive-sense single-stranded RNA virus that infects the red imported fire ant, Solenopsis invicta. We show that the second open reading frame (ORF) of the dicistronic genome is expressed via a frameshifting mechanism and that the sequences encoding the structural proteins map to both ORF2 and the 3' end of ORF1, downstream of the sequence that encodes the RNA-dependent RNA polymerase. The genome organization and structural protein expression strategy resemble those of Acyrthosiphon pisum virus (APV), an aphid virus. The capsid protein that is encoded by the 3' end of ORF1 in SINV-3 and APV is predicted to have a jelly-roll fold similar to the capsid proteins of picornaviruses and caliciviruses. The capsid-extension protein that is produced by frameshifting includes the jelly-roll fold domain encoded by ORF1 as its N-terminus, while the C-terminus encoded by the 5' half of ORF2 has no clear homology with other viral structural proteins. A third protein, encoded by the 3' half of ORF2, is associated with purified virions at sub-stoichiometric ratios. Although the structural proteins can be translated from the genomic RNA, we show that SINV-3 also produces a subgenomic RNA encoding the structural proteins. Circumstantial evidence suggests that APV may also produce such a subgenomic RNA. Both SINV-3 and APV are unclassified picorna-like viruses distantly related to members of the order Picornavirales and the family Caliciviridae. Within this grouping, features of the genome organization and capsid domain structure of SINV-3 and APV appear more similar to caliciviruses, perhaps suggesting the basis for a "Calicivirales" order.
Linear Lepidopteran ambidensovirus 1 sequences drive random integration of a reporter gene in transfected Spodoptera frugiperda cells.

PubMed

Rizk, Francine; Laverdure, Sylvain; d'Alençon, Emmanuelle; Bossin, Hervé; Dupressoir, Thierry

2018-01-01

The Lepidopteran ambidensovirus 1 isolated from Junonia coenia (hereafter JcDV) is an invertebrate parvovirus considered as a viral transduction vector as well as a potential tool for the biological control of insect pests. Previous works showed that JcDV-based circular plasmids experimentally integrate into insect cell genomic DNA. In order to approach the natural conditions of infection and possible integration, we generated linear JcDV-gfp-based molecules which were transfected into non-permissive Spodoptera frugiperda (Sf9) cultured cells. Cells were monitored for the expression of green fluorescent protein (GFP) and DNA was analyzed for integration of transduced viral sequences. Non-structural protein modulation of the VP-gene cassette promoter activity was additionally assayed. We show that linear JcDV-derived molecules are capable of long-term genomic integration and sustained transgene expression in Sf9 cells. As expected, only the deletion of both inverted terminal repeats (ITR) or the polyadenylation signals of NS and VP genes dramatically impairs the global transduction/expression efficiency. However, all the integrated viral sequences we characterized appear "scrambled" whatever the viral content of the transfected vector. Despite a strong GFP expression, we were unable to recover any full sequence of the original constructs and found rearranged viral and non-viral sequences as well. Cellular flanking sequences were identified as non-coding ones. On the other hand, the kinetics of GFP expression over time led us to investigate the apparent down-regulation by non-structural proteins of the VP-gene cassette promoter. Altogether, our results show that JcDV-derived sequences included in linear DNA molecules are able to drive efficiently the integration and expression of a foreign gene into the genome of insect cells, whatever their composition, provided that at least one ITR is present.
However, the transfected sequences were extensively rearranged with cellular DNA during or after random integration in the host cell genome. Lastly, the non-structural proteins seem to participate in the regulation of p9 promoter activity rather than to the integration of viral sequences.
A systematic evaluation of expression of HERV-W elements; influence of genomic context, viral structure and orientation

PubMed Central

2011-01-01

Background: One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human disease such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results: Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions: Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements.
Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues. PMID:21226900
Divergence of Mammalian Higher Order Chromatin Structure Is Associated with Developmental Loci

PubMed Central

Chambers, Emily V.; Bickmore, Wendy A.; Semple, Colin A.

2013-01-01

Several recent studies have examined different aspects of mammalian higher order chromatin structure (replication timing, lamina association and Hi-C inter-locus interactions) and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes.
These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution. PMID:23592965
The FlyBase database of the Drosophila genome projects and community literature

PubMed Central

2003-01-01

FlyBase (http://flybase.bio.indiana.edu/) provides an integrated view of the fundamental genomic and genetic data on the major genetic model Drosophila melanogaster and related species. FlyBase has primary responsibility for the continual reannotation of the D. melanogaster genome. The ultimate goal of the reannotation effort is to decorate the euchromatic sequence of the genome with as much biological information as is available from the community and from the major genome project centers. A complete revision of the annotations of the now-finished euchromatic genomic sequence has been completed. There are many points of entry to the genome within FlyBase, most notably through maps, gene products and ontologies, structured phenotypic and gene expression data, and anatomy. PMID:12519974
Retrotransposons as regulators of gene expression

PubMed Central

Elbarbary, Reyad A.; Lucas, Bronwyn A.; Maquat, Lynne E.

2016-01-01

Transposable elements (TEs) are both a boon and a bane to eukaryotic organisms, depending on where they integrate into the genome and how their sequences function once integrated. We focus on two types of TEs: long interspersed elements (LINEs) and short interspersed elements (SINEs). LINEs and SINEs are retrotransposons; that is, they transpose via an RNA intermediate. We discuss how LINEs and SINEs have expanded in eukaryotic genomes and contribute to genome evolution. An emerging body of evidence indicates that LINEs and SINEs function to regulate gene expression by affecting chromatin structure, gene transcription, pre-mRNA processing, or aspects of mRNA metabolism. We also describe how adenosine-to-inosine editing influences SINE function and how ongoing retrotransposition is countered by the body’s defense mechanisms. PMID:26912865
Two-component signal transduction systems of Xanthomonas spp.: a lesson from genomics.

PubMed

Qian, Wei; Han, Zhong-Ji; He, Chaozu

2008-02-01

The two-component signal transduction systems (TCSTSs), consisting of a histidine kinase sensor (HK) and a response regulator (RR), are the dominant molecular mechanisms by which prokaryotes sense and respond to environmental stimuli. Genomes of Xanthomonas generally contain a large repertoire of TCSTS genes (approximately 92 to 121 for each genome), which encode diverse structural groups of HKs and RRs. Among them, although a core set of 70 TCSTS genes (about two-thirds of the total), which accumulate point mutations at a slow rate, is shared by these genomes, the other genes, especially hybrid HKs, experienced extensive genetic recombination, including genomic rearrangement, gene duplication, addition or deletion, and fusion or fission. The recombinations potentially promote the efficiency and complexity of TCSTSs in regulating gene expression. In addition, our analysis suggests that a co-evolutionary model, rather than a selfish operon model, is the major mechanism for the maintenance and microevolution of TCSTS genes in the genomes of Xanthomonas. Genomic annotation, secondary protein structure prediction, and comparative genomic analyses of TCSTS genes reviewed here provide insights into our understanding of signal networks in these important phytopathogenic bacteria.
The Aspergillus Genome Database: multispecies curation and incorporation of RNA-Seq data to improve structural gene annotations.

PubMed

Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R

2014-01-01

The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
Genome-Wide Identification of the Alba Gene Family in Plants and Stress-Responsive Expression of the Rice Alba Genes.

PubMed

Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan

2018-03-28

Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionary relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal location of the Alba genes of rice (OsAlba) showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure-function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants.
A Complex Structural Variation on Chromosome 27 Leads to the Ectopic Expression of HOXB8 and the Muffs and Beard Phenotype in Chickens

PubMed Central

Wang, Yanqiang; Luo, Chenglong; Liu, Ranran; Qu, Hao; Shu, Dingming; Wen, Jie; Crooijmans, Richard P. M. A.; Zhao, Yiqiang; Hu, Xiaoxiang; Li, Ning

2016-01-01

Muffs and beard (Mb) is a phenotype in chickens where groups of elongated feathers gather from both sides of the face (muffs) and below the beak (beard). It is an autosomal, incomplete dominant phenotype encoded by the Muffs and beard (Mb) locus. Here we use genome-wide association (GWA) analysis, linkage analysis, Identity-by-Descent (IBD) mapping, array-CGH, genome re-sequencing and expression analysis to show that the Mb allele causing the Mb phenotype is a derived allele where a complex structural variation (SV) on GGA27 leads to an altered expression of the gene HOXB8. This Mb allele was shown to be completely associated with the Mb phenotype in nine other independent Mb chicken breeds. The Mb allele differs from the wild-type mb allele by three duplications, one in tandem and two that are translocated to that of the tandem repeat around 1.70 Mb on GGA27. The duplications contain a total of seven annotated genes, and their expression was tested during distinct stages of Mb morphogenesis. A continuous high ectopic expression of HOXB8 was found in the facial skin of Mb chickens, strongly suggesting that HOXB8 directs this regional feather development. In conclusion, our results provide an interesting example of how genomic structural rearrangements alter the regulation of genes leading to novel phenotypes. Further, it again illustrates the value of utilizing derived phenotypes in domestic animals to dissect the genetic basis of developmental traits, herein providing novel insights into the likely role of HOXB8 in feather development and differentiation. PMID:27253709
Identification, characterization and expression analysis of lineage-specific genes within sweet orange (Citrus sinensis).

PubMed

Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang

2015-11-23

With the availability of a rapidly increasing number of genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like other conserved functional genes, LSGs play important roles in biological evolution and functions. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. With the two sets of genes, gene structure and gene expression pattern were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content when compared with evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and some of them exhibited distinct temporal and spatial expression patterns. In particular, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. In addition, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.
[Research advances of genomic GYP coding MNS blood group antigens].

PubMed

Liu, Chang-Li; Zhao, Wei-Jun

2012-02-01

The MNS blood group system includes more than 40 antigens, and the M, N, S and s antigens are the most significant ones in the system. The antigenic determinants of the M and N antigens lie on the top of GPA on the surface of red blood cells, while the antigenic determinants of the S and s antigens lie on the top of GPB on the surface of red blood cells. The GYPA gene coding GPA and the GYPB gene coding GPB are located on the long arm of chromosome 4 and display 95% homologous sequence; both genes are located close to the GYPE gene, which does not express a product. These three genes form the "GYPA-GYPB-GYPE" structure called the GYP genome. This review focuses on the molecular basis of genomic GYP and the variety of the GYP genome in the expression of diverse MNS blood group antigens. The molecular basis of Miltenberger hybrid glycophorin polymorphism is specifically expounded.
A Helitron-like Transposon Superfamily from Lepidoptera Disrupts (GAAA)n Microsatellites and is Responsible for Flanking Sequence Similarity within a Microsatellite Family

USDA-ARS's Scientific Manuscript database

Transposable elements (TEs) are mobile DNA regions that alter host genome structure and gene expression. A novel 588 bp non-autonomous high copy number TE in the Ostrinia nubilalis genome has features in common with miniature inverted-repeat transposable elements (MITEs): high A+T content (62.3%),...
Network-constrained group lasso for high-dimensional multinomial classification with application to cancer subtype prediction.

PubMed

Tian, Xinyu; Wang, Xuefeng; Chen, Jun

2014-01-01

The classic multinomial logit model, commonly used in multiclass regression problems, is restricted to few predictors and does not take into account the relationships among variables. It has limited use for genomic data, where the number of genomic features far exceeds the sample size. Genomic features such as gene expressions are usually related by an underlying biological network. Efficient use of the network information is important to improve classification performance as well as biological interpretability. We proposed a multinomial logit model that is capable of addressing both the high dimensionality of predictors and the underlying network information. Group lasso was used to induce model sparsity, and a network constraint was imposed to induce smoothness of the coefficients with respect to the underlying network structure. To deal with the non-smoothness of the objective function in optimization, we developed a proximal gradient algorithm for efficient computation. The proposed model was compared to models with no prior structure information in both simulations and a problem of cancer subtype prediction with real TCGA (The Cancer Genome Atlas) gene expression data. The network-constrained model outperformed the traditional ones in both cases.
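The proximal gradient idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact form of the network constraint, so the sketch assumes a quadratic graph-Laplacian penalty over network edges, and the function names, grouping and parameters (`groups`, `edges`, `lam_group`, `lam_net`) are hypothetical. The gradient of the loss is passed in by the caller rather than derived from a multinomial likelihood.

```python
import math


def group_soft_threshold(beta, groups, threshold):
    """Proximal operator of threshold * sum_g ||beta_g||_2 (group lasso).

    Each group is shrunk toward zero as a block; a group whose Euclidean
    norm is at most `threshold` is set exactly to zero.
    """
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        scale = 0.0 if norm <= threshold else 1.0 - threshold / norm
        for j in idx:
            out[j] = scale * beta[j]
    return out


def network_penalty_gradient(beta, edges):
    """Gradient of the ASSUMED smooth penalty 0.5 * sum_{(i,j)} (beta_i - beta_j)^2."""
    grad = [0.0] * len(beta)
    for i, j in edges:
        diff = beta[i] - beta[j]
        grad[i] += diff
        grad[j] -= diff
    return grad


def proximal_gradient_step(beta, loss_grad, groups, edges, step, lam_group, lam_net):
    """One iteration: gradient step on the smooth part, then the group-lasso prox."""
    net_grad = network_penalty_gradient(beta, edges)
    moved = [
        b - step * (g + lam_net * n)
        for b, g, n in zip(beta, loss_grad, net_grad)
    ]
    return group_soft_threshold(moved, groups, step * lam_group)
```

In this sketch the non-smooth group-lasso term is handled only through its proximal operator, while the loss and the assumed network penalty are handled by an ordinary gradient step; this split is the general reason proximal gradient methods suit non-smooth objectives of this kind.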
House spider genome uncovers evolutionary shifts in the diversity and expression of black widow venom proteins associated with extreme toxicity.

PubMed

Gendreau, Kerry L; Haney, Robert A; Schwager, Evelyn E; Wierschin, Torsten; Stanke, Mario; Richards, Stephen; Garb, Jessica E

2017-02-16

Black widow spiders are infamous for their neurotoxic venom, which can cause extreme and long-lasting pain. This unusual venom is dominated by latrotoxins and latrodectins, two protein families virtually unknown outside of the black widow genus Latrodectus, that are difficult to study given the paucity of spider genomes. Using tissue-, sex- and stage-specific expression data, we analyzed the recently sequenced genome of the house spider (Parasteatoda tepidariorum), a close relative of black widows, to investigate latrotoxin and latrodectin diversity, expression and evolution. We discovered at least 47 latrotoxin genes in the house spider genome, many of which are tandem-arrayed. Latrotoxins vary extensively in predicted structural domains and expression, implying their significant functional diversification. Phylogenetic analyses show latrotoxins have substantially duplicated after the Latrodectus/Parasteatoda split and that they are also related to proteins found in endosymbiotic bacteria. Latrodectin genes are less numerous than latrotoxins, but analyses show their recruitment for venom function from neuropeptide hormone genes following duplication, inversion and domain truncation. While latrodectins and other peptides are highly expressed in house spider and black widow venom glands, latrotoxins account for a far smaller percentage of house spider venom gland expression. The house spider genome sequence provides novel insights into the evolution of venom toxins once considered unique to black widows. Our results greatly expand the size of the latrotoxin gene family, reinforce its narrow phylogenetic distribution, and provide additional evidence for the lateral transfer of latrotoxins between spiders and bacterial endosymbionts. Moreover, we strengthen the evidence for the evolution of latrodectin venom genes from the ecdysozoan Ion Transport Peptide (ITP)/Crustacean Hyperglycemic Hormone (CHH) neuropeptide superfamily. 
The lower expression of latrotoxins in house spiders relative to black widows, along with the absence of a vertebrate-targeting α-latrotoxin gene in the house spider genome, may account for the extreme potency of black widow venom.
Interaction of a common painkiller piroxicam and copper-piroxicam with chromatin causes structural alterations accompanied by modulation at the epigenomic/genomic level.

PubMed

Goswami, Sathi; Sanyal, Sulagna; Chakraborty, Payal; Das, Chandrima; Sarkar, Munna

2017-08-01

NSAIDs are the most common class of painkillers and anti-inflammatory agents. They also show other functions like chemoprevention and chemosuppression, for which they act at the protein but not at the genome level since they are mostly anions at physiological pH, which prohibits their approach to the poly-anionic DNA. Complexing the drugs with a bioactive metal obliterates their negative charge and allows them to bind to the DNA, thereby opening the possibility of genome-level interaction. To test this hypothesis, we present the interaction of a traditional NSAID, Piroxicam, and its copper complex with core histone and chromatin. Spectroscopy, DLS, and SEM studies were applied to see the effect of the interaction on the structure of histone/chromatin. This was coupled with MTT assay, immunoblot analysis, confocal microscopy, microarray analysis and qRT-PCR. The interaction of Piroxicam and its copper complex with histone/chromatin results in structural alterations. Such structural alterations can have different biological manifestations, but to test our hypothesis, we have focused only on the accompanying modulations at the epigenomic/genomic level. The complex showed alteration of key epigenetic signatures implicated in transcription in the global context, although Piroxicam caused no significant changes. We have correlated such alterations caused by the complex with the changes in global gene expression and validated the candidate gene expression alterations. Our results provide the proof of concept that the DNA binding ability of the copper complexes of a traditional NSAID opens up the possibility of modulations at the epigenomic/genomic level. Copyright © 2017 Elsevier B.V. All rights reserved.
Structure of Lmaj006129AAA, a hypothetical protein from Leishmania major

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arakaki, Tracy; Le Trong, Isolde; Structural Genomics of Pathogenic Protozoa

2006-03-01

The crystal structure of a conserved hypothetical protein from L. major, Pfam sequence family PF04543, structural genomics target ID Lmaj006129AAA, has been determined at a resolution of 1.6 Å. The gene product of structural genomics target Lmaj006129 from Leishmania major codes for a 164-residue protein of unknown function. When SeMet expression of the full-length gene product failed, several truncation variants were created with the aid of Ginzu, a domain-prediction method. 11 truncations were selected for expression, purification and crystallization based upon secondary-structure elements and disorder. The structure of one of these variants, Lmaj006129AAH, was solved by multiple-wavelength anomalous diffraction (MAD) using ELVES, an automatic protein crystal structure-determination system. This model was then successfully used as a molecular-replacement probe for the parent full-length target, Lmaj006129AAA. The final structure of Lmaj006129AAA was refined to an R value of 0.185 (R_free = 0.229) at 1.60 Å resolution. Structure and sequence comparisons based on Lmaj006129AAA suggest that proteins belonging to Pfam sequence families PF04543 and PF01878 may share a common ligand-binding motif.
Expression of exogenous DNA methyltransferases: application in molecular and cell biology.

PubMed

Dyachenko, O V; Tarlachkov, S V; Marinitch, D V; Shevchuk, T V; Buryanov, Y I

2014-02-01

DNA methyltransferases might be used as powerful tools for studies in molecular and cell biology due to their ability to recognize and modify nitrogen bases in specific sequences of the genome. Methylation of the eukaryotic genome using exogenous DNA methyltransferases appears to be a promising approach for studies on chromatin structure. Currently, the development of new methods for targeted methylation of specific genetic loci using DNA methyltransferases fused with DNA-binding proteins is especially interesting. In the present review, expression of exogenous DNA methyltransferase for purposes of in vivo analysis of the functional chromatin structure along with investigation of the functional role of DNA methylation in cell processes are discussed, as well as future prospects for application of DNA methyltransferases in epigenetic therapy and in plant selection.

Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

PubMed

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

2017-03-27

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Diseases and Molecular Diagnostics: A Step Closer to Precision Medicine.

PubMed

Dwivedi, Shailendra; Purohit, Purvi; Misra, Radhieka; Pareek, Puneet; Goel, Apul; Khattri, Sanjay; Pant, Kamlesh Kumar; Misra, Sanjeev; Sharma, Praveen

2017-10-01

The advent of molecular technologies, together with a multidisciplinary interplay of several fields, led to the development of genomics, which concentrates on the detection of pathogenic events at the genome level. Structural and functional genomics approaches have now pinpointed the technical challenge in the exploration of disease-related genes and the recognition of their structural alterations or elucidation of gene function. Various promising technologies and diagnostic applications of structural genomics are currently building a large database of disease genes, genetic alterations, etc., by mutation scanning and DNA chip technology. Functional genomics is also exploring expression genetics (hybridization-, PCR- and sequence-based technologies), two-hybrid technology, and next-generation sequencing with bioinformatics and computational biology. Advances in microarray "chip" technology have allowed the parallel analysis of gene expression patterns of thousands of genes simultaneously. Sequence information collected from the genomes of many individuals is leading to the rapid discovery of single nucleotide polymorphisms (SNPs). Further advances in genetic engineering have also revolutionized immunoassay biotechnology via engineering of antibody-encoding genes and phage display technology. Biotechnology plays an important role in the development of diagnostic assays in response to an outbreak or critical disease response need. However, there is also a need to pinpoint various obstacles and issues related to the commercialization and widespread dispersal of genetic knowledge derived from the exploitation of the biotechnology industry and the development and marketing of diagnostic services. Implementation of genetic criteria for patient selection and individual assessment of the risks and benefits of treatment emerges as a major challenge to the pharmaceutical industry.
Thus, this field is revolutionizing the current era and may open new vistas in the field of disease management.
Systematic gene tagging using CRISPR/Cas9 in human stem cells to illuminate cell organization

PubMed Central

Roberts, Brock; Haupt, Amanda; Tucker, Andrew; Grancharova, Tanya; Arakaki, Joy; Fuqua, Margaret A.; Nelson, Angelique; Hookway, Caroline; Ludmann, Susan A.; Mueller, Irina A.; Yang, Ruian; Horwitz, Rick; Rafelski, Susanne M.; Gunawardane, Ruwanthi N.

2017-01-01

We present a CRISPR/Cas9 genome-editing strategy to systematically tag endogenous proteins with fluorescent tags in human induced pluripotent stem cells (hiPSC). To date, we have generated multiple hiPSC lines with monoallelic green fluorescent protein tags labeling 10 proteins representing major cellular structures. The tagged proteins include alpha tubulin, beta actin, desmoplakin, fibrillarin, nuclear lamin B1, nonmuscle myosin heavy chain IIB, paxillin, Sec61 beta, tight junction protein ZO1, and Tom20. Our genome-editing methodology using Cas9/crRNA ribonuclear protein and donor plasmid coelectroporation, followed by fluorescence-based enrichment of edited cells, typically resulted in <0.1–4% homology-directed repair (HDR). Twenty-five percent of clones generated from each edited population were precisely edited. Furthermore, 92% (36/39) of expanded clonal lines displayed robust morphology, genomic stability, expression and localization of the tagged protein to the appropriate subcellular structure, pluripotency-marker expression, and multilineage differentiation. It is our conclusion that, if cell lines are confirmed to harbor an appropriate gene edit, pluripotency, differentiation potential, and genomic stability are typically maintained during the clonal line–generation process. The data described here reveal general trends that emerged from this systematic gene-tagging approach. Final clonal lines corresponding to each of the 10 cellular structures are now available to the research community. PMID:28814507
Multi-allelic haplotype model based on genetic partition for genomic prediction and variance component estimation using SNP markers.

PubMed

Da, Yang

2015-12-18

The amount of functional genomic information has been growing rapidly but remains largely unused in genomic selection. Genomic prediction and estimation using haplotypes in genome regions with functional elements such as all genes of the genome can be an approach to integrate functional and structural genomic information for genomic selection. Towards this goal, this article develops a new haplotype approach for genomic prediction and estimation. A multi-allelic haplotype model treating each haplotype as an 'allele' was developed for genomic prediction and estimation based on the partition of a multi-allelic genotypic value into additive and dominance values. Each additive value is expressed as a function of h - 1 additive effects, where h = number of alleles or haplotypes, and each dominance value is expressed as a function of h(h - 1)/2 dominance effects. For a sample of q individuals, the limit number of effects is 2q - 1 for additive effects and is the number of heterozygous genotypes for dominance effects. Additive values are factorized as a product between the additive model matrix and the h - 1 additive effects, and dominance values are factorized as a product between the dominance model matrix and the h(h - 1)/2 dominance effects. Genomic additive relationship matrix is defined as a function of the haplotype model matrix for additive effects, and genomic dominance relationship matrix is defined as a function of the haplotype model matrix for dominance effects. Based on these results, a mixed model implementation for genomic prediction and variance component estimation that jointly use haplotypes and single markers is established, including two computing strategies for genomic prediction and variance component estimation with identical results. 
The multi-allelic genetic partition fills a theoretical gap in genetic partition by providing general formulations for partitioning multi-allelic genotypic values and provides a haplotype method based on the quantitative genetics model towards the utilization of functional and structural genomic information for genomic prediction and estimation.
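
The effect counts quoted in the abstract above can be checked with a few lines of arithmetic. The following is only an illustrative sketch, not the author's implementation: the function names `effect_counts` and `max_additive_effects` are hypothetical, and the code assumes diploid individuals, so that q individuals carry at most 2q distinct haplotypes.

```python
from typing import Tuple


def effect_counts(h: int) -> Tuple[int, int]:
    """Return (additive, dominance) effect counts for h haplotype 'alleles'.

    Follows the counts stated in the abstract: h - 1 additive effects
    and h(h - 1)/2 dominance effects.
    """
    if h < 1:
        raise ValueError("h must be at least 1")
    additive = h - 1
    dominance = h * (h - 1) // 2  # number of unordered pairs of distinct alleles
    return additive, dominance


def max_additive_effects(q: int) -> int:
    """Upper limit on additive effects for q individuals (2q - 1).

    Assumes diploidy: q individuals carry at most 2q distinct haplotypes,
    so the abstract's h - 1 becomes 2q - 1 at the limit.
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    return 2 * q - 1


if __name__ == "__main__":
    for h in (2, 4, 10):
        a, d = effect_counts(h)
        print(f"h={h}: additive={a}, dominance={d}")
    print("limit for q=10 individuals:", max_additive_effects(10))
```

For a biallelic SNP (h = 2) this reduces to one additive and one dominance effect, which is the familiar single-marker case that the haplotype model generalizes.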
The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lang, Daniel; Ullrich, Kristian K.; Murat, Florent

Here, the draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. In order to enable analyses of genome structure and evolution we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical for most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical for mosses or bryophytes.
The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution

DOE PAGES

Lang, Daniel; Ullrich, Kristian K.; Murat, Florent; ...

2017-12-13

Here, the draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. In order to enable analyses of genome structure and evolution we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical for most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical for mosses or bryophytes.
Genome-wide analysis and expression profile of the bZIP transcription factor gene family in grapevine (Vitis vinifera)

PubMed Central

2014-01-01

Background Basic leucine zipper (bZIP) transcription factor gene family is one of the largest and most diverse families in plants. Current studies have shown that the bZIP proteins regulate numerous growth and developmental processes and biotic and abiotic stress responses. Nonetheless, knowledge concerning the specific expression patterns and evolutionary history of plant bZIP family members remains very limited. Results We identified 55 bZIP transcription factor-encoding genes in the grapevine (Vitis vinifera) genome, and divided them into 10 groups according to the phylogenetic relationship with those in Arabidopsis. The chromosome distribution and the collinearity analyses suggest that expansion of the grapevine bZIP (VvbZIP) transcription factor family was greatly contributed by the segment/chromosomal duplications, which may be associated with the grapevine genome fusion events. Nine intron/exon structural patterns within the bZIP domain and the additional conserved motifs were identified among all VvbZIP proteins, and showed a high group-specificity. The predicted specificities on DNA-binding domains indicated that some highly conserved amino acid residues exist across each major group in the tree of land plant life. The expression patterns of VvbZIP genes across the grapevine gene expression atlas, based on microarray technology, suggest that VvbZIP genes are involved in grapevine organ development, especially seed development. Expression analysis based on qRT-PCR indicated that VvbZIP genes are extensively involved in drought- and heat-responses, with possibly different mechanisms. Conclusions The genome-wide identification, chromosome organization, gene structures, evolutionary and expression analyses of grapevine bZIP genes provide an overall insight of this gene family and their potential involvement in growth, development and stress responses. 
This will facilitate further research on the bZIP gene family regarding their evolutionary history and biological functions. PMID:24725365
Genome-wide analysis of WRKY gene family in the sesame genome and identification of the WRKY genes involved in responses to abiotic stresses.

PubMed

Li, Donghua; Liu, Pan; Yu, Jingyin; Wang, Linhai; Dossa, Komivi; Zhang, Yanxin; Zhou, Rong; Wei, Xin; Zhang, Xiurong

2017-09-11

Sesame (Sesamum indicum L.) is one of the world's most important oil crops. However, it is susceptible to abiotic stresses in general, and to waterlogging and drought stresses in particular. The molecular mechanisms of abiotic stress tolerance in sesame have not yet been elucidated. The WRKY domain transcription factors play significant roles in plant growth, development, and responses to stresses. However, little is known about the number, location, structure, molecular phylogenetics, and expression of the WRKY genes in sesame. We performed a comprehensive study of the WRKY gene family in sesame and identified 71 SiWRKYs. In total, 65 of these genes were mapped to 15 linkage groups within the sesame genome. A phylogenetic analysis was performed using a related species (Arabidopsis thaliana) to investigate the evolution of the sesame WRKY genes. Tissue expression profiles of the WRKY genes demonstrated that six SiWRKY genes were highly expressed in all organs, suggesting that these genes may be important for plant growth and organ development in sesame. Analysis of the SiWRKY gene expression patterns revealed that 33 and 26 SiWRKYs respond strongly to waterlogging and drought stresses, respectively. Changes in the expression of 12 SiWRKY genes were observed at different times after the waterlogging and drought treatments had begun, demonstrating that sesame gene expression patterns vary in response to abiotic stresses. In this study, we analyzed the WRKY family of transcription factors encoded by the sesame genome. Insight was gained into the classification, evolution, and function of the SiWRKY genes, revealing their putative roles in a variety of tissues. Responses to abiotic stresses in different sesame cultivars were also investigated. The results of our study provide a better understanding of the structures and functions of sesame WRKY genes and suggest that manipulating these WRKYs could enhance resistance to waterlogging and drought.
Identification of novel non-coding small RNAs from Streptococcus pneumoniae TIGR4 using high-resolution genome tiling arrays

PubMed Central

2010-01-01

Background The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome wide mRNA profiling as well as identification of novel transcripts at a high-resolution. Results Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome. Conclusions In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence. PMID:20525227
Unstable genomes elevate transcriptome dynamics

PubMed Central

Stevens, Joshua B.; Liu, Guo; Abdallah, Batoul Y.; Horne, Steven D.; Ye, Karen J.; Bremer, Steven W.; Ye, Christine J.; Krawetz, Stephen A.; Heng, Henry H.

2015-01-01

The challenge of identifying common expression signatures in cancer is well known; however, the reason behind this is largely unclear. Traditionally, variation in expression signatures has been attributed to technological problems; however, recent evidence suggests that chromosome instability (CIN) and resultant karyotypic heterogeneity may be a large contributing factor. Using a well-defined model of immortalization, we systematically compared the pattern of genome alteration and expression dynamics during somatic evolution. Co-measurement of global gene expression and karyotypic alteration throughout the immortalization process reveals that karyotype changes influence gene expression, as major structural and numerical karyotypic alterations result in large gene expression deviation. Replicate samples from stages with stable genomes are more similar to each other than are replicate samples with karyotypic heterogeneity. Karyotypic and gene expression change during immortalization is dynamic, as each stage of progression has a unique expression pattern. This was further verified by comparing global expression in two replicates grown in one flask with known karyotypes. Replicates with higher karyotypic instability were found to be less similar than replicates with stable karyotypes. These data illustrate that the karyotype, transcriptome, and transcriptome-determined pathways are in constant flux during somatic cellular evolution (particularly during the macroevolutionary phase) and that this flux is an inextricable feature of CIN and essential for cancer formation. The findings presented here underscore the importance of understanding the evolutionary process of cancer in order to design improved treatment modalities. PMID:24122714
XTHs from Fragaria vesca: genomic structure and transcriptomic analysis in ripening fruit and other tissues.

PubMed

Opazo, María Cecilia; Lizana, Rodrigo; Stappung, Yazmina; Davis, Thomas M; Herrera, Raúl; Moya-León, María Alejandra

2017-11-07

Fragaria vesca or 'woodland strawberry' has emerged as an attractive model for the study of ripening of non-climacteric fruit. It has several advantages, such as its small genome and its diploidy. The recent availability of the complete sequence of its genome opens the possibility for further analysis and its use as a reference species. Fruit softening is a physiological event and involves many biochemical changes that take place at the final stages of fruit development; among them, the remodeling of cell walls by the action of a set of enzymes. Xyloglucan endotransglycosylase/hydrolase (XTH) is a cell wall-associated enzyme, which is encoded by a multigene family. Its action modifies the structure of xyloglucans, a diverse group of polysaccharides that crosslink with cellulose microfibrils, thereby affecting the functional structure of the cell wall. The aim of this work is to identify the XTH-encoding genes present in F. vesca and to determine their transcription levels in ripening fruit. The search resulted in the identification of 26 XTH-encoding genes, named FvXTHs. Genetic structure and phylogenetic analyses were performed, allowing the classification of FvXTH genes into three phylogenetic groups: 17 in group I/II, 2 in group IIIA and 4 in group IIIB. Two sequences were included in the ancestral group. Through a comparative analysis, characteristic structural protein domains were found in FvXTH protein sequences. In addition, expression analyses of FvXTHs by qPCR were performed in fruit at different developmental and ripening stages, as well as in other tissues. The results showed a diverse expression pattern of FvXTHs in several tissues, although most of them are highly expressed in roots. Their expression patterns are not related to their respective phylogenetic groups.
In addition, most FvXTHs are expressed in ripe fruit, and interestingly, some of them (FvXTH 18 and 20, belonging to phylogenetic group I/II, and FvXTH 25 and 26 to group IIIB) display an increasing expression pattern as the fruit ripens. A discrete group of FvXTHs (18, 20, 25 and 26) increase their expression during softening of F. vesca fruit, and could take part in the cell wall remodeling required for softening in collaboration with other cell wall degrading enzymes.
Retrotransposons as regulators of gene expression.

PubMed

Elbarbary, Reyad A; Lucas, Bronwyn A; Maquat, Lynne E

2016-02-12

Transposable elements (TEs) are both a boon and a bane to eukaryotic organisms, depending on where they integrate into the genome and how their sequences function once integrated. We focus on two types of TEs: long interspersed elements (LINEs) and short interspersed elements (SINEs). LINEs and SINEs are retrotransposons; that is, they transpose via an RNA intermediate. We discuss how LINEs and SINEs have expanded in eukaryotic genomes and contribute to genome evolution. An emerging body of evidence indicates that LINEs and SINEs function to regulate gene expression by affecting chromatin structure, gene transcription, pre-mRNA processing, or aspects of mRNA metabolism. We also describe how adenosine-to-inosine editing influences SINE function and how ongoing retrotransposition is countered by the body's defense mechanisms. Copyright © 2016, American Association for the Advancement of Science.
Protists and the Wild, Wild West of Gene Expression: New Frontiers, Lawlessness, and Misfits.

PubMed

Smith, David Roy; Keeling, Patrick J

2016-09-08

The DNA double helix has been called one of life's most elegant structures, largely because of its universality, simplicity, and symmetry. The expression of information encoded within DNA, however, can be far from simple or symmetric and is sometimes surprisingly variable, convoluted, and wantonly inefficient. Although exceptions to the rules exist in certain model systems, the true extent to which life has stretched the limits of gene expression is made clear by nonmodel systems, particularly protists (microbial eukaryotes). The nuclear and organelle genomes of protists are subject to the most tangled forms of gene expression yet identified. The complicated and extravagant picture of the underlying genetics of eukaryotic microbial life changes how we think about the flow of genetic information and the evolutionary processes shaping it. Here, we discuss the origins, diversity, and growing interest in noncanonical protist gene expression and its relationship to genomic architecture.
An orthogonal system for heterologous expression of actinobacterial lasso peptides in Streptomyces hosts.

PubMed

Mevaere, Jimmy; Goulard, Christophe; Schneider, Olha; Sekurova, Olga N; Ma, Haiyan; Zirah, Séverine; Afonso, Carlos; Rebuffat, Sylvie; Zotchev, Sergey B; Li, Yanyan

2018-05-29

Lasso peptides are ribosomally synthesized and post-translationally modified peptides produced by bacteria. They are characterized by an unusual lariat-knot structure. Targeted genome scanning revealed a wide diversity of lasso peptides encoded in actinobacterial genomes, but cloning and heterologous expression of these clusters turned out to be problematic. To circumvent this, we developed an orthogonal expression system for heterologous production of actinobacterial lasso peptides in Streptomyces hosts based on a newly-identified regulatory circuit from Actinoalloteichus fjordicus. Six lasso peptide gene clusters, mainly originating from marine Actinobacteria, were chosen for proof-of-concept studies. By varying the Streptomyces expression hosts and a small set of culture conditions, three new lasso peptides were successfully produced and characterized by tandem MS. The newly developed expression system thus sets the stage to uncover and bioengineer the chemo-diversity of actinobacterial lasso peptides. Moreover, our data provide some considerations for future bioprospecting efforts for such peptides.
Contribution of transposable elements in the plant's genome.

PubMed

Sahebi, Mahbod; Hanafi, Mohamed M; van Wijnen, Andre J; Rice, David; Rafii, M Y; Azizi, Parisa; Osman, Mohamad; Taheri, Sima; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat; Noor, Yusuf Muhammad

2018-07-30

Plants maintain extensive growth flexibility under different environmental conditions, allowing them to continuously and rapidly adapt to alterations in their environment. A large portion of many plant genomes consists of transposable elements (TEs) that create new genetic variations within plant species. Different types of mutations may be created by TEs in plants. Many TEs can avoid the host's defense mechanisms and survive alterations in transposition activity, internal sequence and target site. Thus, plant genomes are expected to utilize a variety of mechanisms to tolerate TEs that are near or within genes. TEs affect the expression of not only nearby genes but also unlinked inserted genes. TEs can create new promoters, leading to novel expression patterns or alternative coding regions to generate alternate transcripts in plant species. TEs can also provide novel cis-acting regulatory elements that act as enhancers or inserts within original enhancers that are required for transcription. Thus, the regulation of plant gene expression is strongly managed by the insertion of TEs into nearby genes. TEs can also lead to chromatin modifications and thereby affect gene expression in plants. TEs are able to generate new genes and modify existing gene structures by duplicating, mobilizing and recombining gene fragments. They can also facilitate cellular functions by sharing their transposase-coding regions. Hence, TE insertions can not only act as simple mutagens but can also alter the elementary functions of the plant genome. Here, we review recent discoveries concerning the contribution of TEs to gene expression in plant genomes and discuss the different mechanisms by which TEs can affect plant gene expression and reduce host defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
Analyses of the NAC transcription factor gene family in Gossypium raimondii Ulbr.: chromosomal location, structure, phylogeny, and expression patterns.

PubMed

Shang, Haihong; Li, Wei; Zou, Changsong; Yuan, Youlu

2013-07-01

NAC domain proteins are plant-specific transcription factors known to play diverse roles in various plant developmental processes. In the present study, we performed the first comprehensive study of the NAC gene family in Gossypium raimondii Ulbr., incorporating phylogenetic, chromosomal location, gene structure, conserved motif, and expression profiling analyses. We identified 145 NAC transcription factor (NAC-TF) genes that were phylogenetically clustered into 18 distinct subfamilies. Of these, 127 NAC-TF genes were distributed across the 13 chromosomes, 80 (55%) were preferentially retained duplicates located in both duplicated regions and six were located in triplicated chromosomal regions. The majority of NAC-TF genes showed temporal-, spatial-, and tissue-specific expression patterns based on transcriptomic and qRT-PCR analyses. However, the expression patterns of several duplicate genes were partially redundant, suggesting the occurrence of sub-functionalization during their evolution. Based on their genomic organization, we concluded that genomic duplications contributed significantly to the expansion of the NAC-TF gene family in G. raimondii. Comprehensive analysis of their expression profiles could provide novel insights into the functional divergence among members of the NAC gene family in G. raimondii. © 2013 Institute of Botany, Chinese Academy of Sciences.
Preparation of Protein Samples for NMR Structure, Function, and Small Molecule Screening Studies

PubMed Central

Acton, Thomas B.; Xiao, Rong; Anderson, Stephen; Aramini, James; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Kornhaber, Gregory; Lau, Jessica; Lee, Dong Yup; Liu, Gaohua; Maglaqui, Melissa; Ma, Lichung; Mao, Lei; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Shastry, Ritu; Swapna, G.V.T.; Tang, Yeufeng; Tong, Saichiu; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.

2014-01-01

In this chapter, we concentrate on the production of high quality protein samples for NMR studies. In particular, we provide an in-depth description of recent advances in the production of NMR samples and their synergistic use with recent advancements in NMR hardware. We describe the protein production platform of the Northeast Structural Genomics Consortium, and outline our high-throughput strategies for producing high quality protein samples for nuclear magnetic resonance (NMR) studies. Our strategy is based on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems and isotope enrichment in minimal media. We describe 96-well ligation-independent cloning and analytical expression systems, parallel preparative scale fermentation, and high-throughput purification protocols. The 6X-His affinity tag allows for a similar two-step purification procedure implemented in a parallel high-throughput fashion that routinely results in purity levels sufficient for NMR studies (> 97% homogeneity). Using this platform, the protein open reading frames of over 17,500 different targeted proteins (or domains) have been cloned as over 28,000 constructs. Nearly 5,000 of these proteins have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html), resulting in more than 950 new protein structures, including more than 400 NMR structures, deposited in the Protein Data Bank. The Northeast Structural Genomics Consortium pipeline has been effective in producing protein samples of both prokaryotic and eukaryotic origin. Although this paper describes our entire pipeline for producing isotope-enriched protein samples, it focuses on the major updates introduced during the last 5 years (Phase 2 of the National Institute of General Medical Sciences Protein Structure Initiative). 
Our advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are suitable for implementation in a large individual laboratory or by a small group of collaborating investigators for structural biology, functional proteomics, ligand screening and structural genomics research. PMID:21371586
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

PubMed

Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

2017-09-30

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species

PubMed Central

Andersen, Ethan J.; Neupane, Surendra; Benson, Benjamin V.

2017-01-01

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence. PMID:28973974
Identification and expression profiles of the WRKY transcription factor family in Ricinus communis.

PubMed

Li, Hui-Liang; Zhang, Liang-Bo; Guo, Dong; Li, Chang-Zhu; Peng, Shi-Qing

2012-07-25

In plants, WRKY proteins constitute a large family of transcription factors. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. A large number of WRKY transcription factors have been reported from Arabidopsis, rice, and other higher plants. The recent publication of the draft genome sequence of castor bean (Ricinus communis) has allowed a genome-wide search for R. communis WRKY (RcWRKY) transcription factors and the comparison of these positively identified proteins with their homologs in model plants. A total of 47 WRKY genes were identified in the castor bean genome. According to the structural features of the WRKY domain, the RcWRKY are classified into seven main phylogenetic groups. Furthermore, putative orthologs of RcWRKY proteins in Arabidopsis and rice could now be assigned. An analysis of expression profiles of RcWRKY genes indicates that 47 WRKY genes display differential expressions either in their transcript abundance or expression patterns under normal growth conditions. Copyright © 2012 Elsevier B.V. All rights reserved.

Widespread antisense transcription of Populus genome under drought.

PubMed

Yuan, Yinan; Chen, Su

2018-06-06

Antisense transcription is widespread in many genomes and plays important regulatory roles in gene expression. The objective of our study was to investigate the extent and functional relevance of antisense transcription in forest trees. We employed Populus, a model tree species, to probe the antisense transcriptional response of tree genome under drought, through stranded RNA-seq analysis. We detected nearly 48% of annotated Populus gene loci with antisense transcripts and 44% of them with co-transcription from both DNA strands. Global distribution of reads pattern across annotated gene regions uncovered that antisense transcription was enriched in untranslated regions while sense reads were predominantly mapped in coding exons. We further detected 1185 drought-responsive sense and antisense gene loci and identified a strong positive correlation between the expression of antisense and sense transcripts. Additionally, we assessed the antisense expression in introns and found a strong correlation between intronic expression and exonic expression, confirming antisense transcription of introns contributes to transcriptional activity of Populus genome under drought. Finally, we functionally characterized drought-responsive sense-antisense transcript pairs through gene ontology analysis and discovered that functional groups including transcription factors and histones were concordantly regulated at both sense and antisense transcriptional level. Overall, our study demonstrated the extensive occurrence of antisense transcripts of Populus genes under drought and provided insights into genome structure, regulation pattern and functional significance of drought-responsive antisense genes in forest trees. Datasets generated in this study serve as a foundation for future genetic analysis to improve our understanding of gene regulation by antisense transcription.
In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

PubMed

Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

2014-01-30

RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
Genome-Wide Identification and Expression Analysis of WRKY Gene Family in Capsicum annuum L.

PubMed

Diao, Wei-Ping; Snyder, John C; Wang, Shu-Bin; Liu, Jin-Bing; Pan, Bao-Gui; Guo, Guang-Jun; Wei, Ge

2016-01-01

The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators, with members regulating multiple biological processes, especially defense against biotic and abiotic stresses. However, little information is available about WRKYs in pepper (Capsicum annuum L.). The recent release of completely assembled genome sequences of pepper allowed us to perform a genome-wide investigation for pepper WRKY proteins. In the present study, a total of 71 WRKY genes were identified in the pepper genome. According to structural features of their encoded proteins, the pepper WRKY genes (CaWRKY) were classified into three main groups, with the second group further divided into five subgroups. Genome mapping analysis revealed that CaWRKY were enriched on four chromosomes, especially on chromosome 1, and 15.5% of the family members were tandemly duplicated genes. A phylogenetic tree was constructed based on WRKY domain sequences derived from pepper and Arabidopsis. The expression of 21 selected CaWRKY genes in response to seven different biotic and abiotic stresses (salt, heat shock, drought, Phytophthora capsici, SA, MeJA, and ABA) was evaluated by quantitative RT-PCR; some CaWRKYs were highly expressed and up-regulated by stress treatment. Our results will provide a platform for functional identification and molecular breeding studies of WRKY genes in pepper.
The genome of the Gulf pipefish enables understanding of evolutionary innovations.

PubMed

Small, C M; Bassham, S; Catchen, J; Amores, A; Fuiten, A M; Brown, R S; Jones, A G; Cresko, W A

2016-12-20

Evolutionary origins of derived morphologies ultimately stem from changes in protein structure, gene regulation, and gene content. A well-assembled, annotated reference genome is a central resource for pursuing these molecular phenomena underlying phenotypic evolution. We explored the genome of the Gulf pipefish (Syngnathus scovelli), which belongs to family Syngnathidae (pipefishes, seahorses, and seadragons). These fishes have dramatically derived bodies and a remarkable novelty among vertebrates, the male brood pouch. We produce a reference genome, condensed into chromosomes, for the Gulf pipefish. Gene losses and other changes have occurred in pipefish hox and dlx clusters and in the tbx and pitx gene families, candidate mechanisms for the evolution of syngnathid traits, including an elongated axis and the loss of ribs, pelvic fins, and teeth. We measure gene expression changes in pregnant versus non-pregnant brood pouch tissue and characterize the genomic organization of duplicated metalloprotease genes (patristacins) recruited into the function of this novel structure. Phylogenetic inference using ultraconserved sequences provides an alternative hypothesis for the relationship between orders Syngnathiformes and Scombriformes. Comparisons of chromosome structure among percomorphs show that chromosome number in a pipefish ancestor became reduced via chromosomal fusions. The collected findings from this first syngnathid reference genome open a window into the genomic underpinnings of highly derived morphologies, demonstrating that de novo production of high quality and useful reference genomes is within reach of even small research groups.
The Transcriptome of the Reference Potato Genome Solanum tuberosum Group Phureja Clone DM1-3 516R44

PubMed Central

Massa, Alicia N.; Childs, Kevin L.; Lin, Haining; Bryan, Glenn J.; Giuliano, Giovanni; Buell, C. Robin

2011-01-01

Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues suggesting tissue-specific gene expression, and genes with tissue or condition restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family. PMID:22046362
The Pekin duck programmed death-ligand 1: cDNA cloning, genomic structure, molecular characterization and mRNA expression analysis.

PubMed

Yao, Q; Fischer, K P; Tyrrell, D L; Gutfreund, K S

2015-04-01

Programmed death ligand-1 (PD-L1) plays an important role in the attenuation of adaptive immune responses in higher vertebrates. Here, we describe the identification of the Pekin duck PD-L1 orthologue (duPD-L1) and its gene structure. The duPD-L1 cDNA encodes a 311-amino acid protein that has an amino acid identity of 78% and 42% with chicken and human PD-L1, respectively. Mapping of the duPD-L1 cDNA with duck genomic sequences revealed an exonic structure of its coding sequence similar to those of other vertebrates but lacked a noncoding exon 1. Homology modelling of the duPD-L1 extracellular domain was compatible with the tandem IgV-like and IgC-like IgSF domain structure of human PD-L1 (PDB ID: 3BIS). Residues known to be important for receptor binding of human PD-L1 were mostly conserved in duPD-L1 within the N-terminus and the G sheet, and partially conserved within the F sheet but not within sheets C and C'. DuPD-L1 mRNA was constitutively expressed in all tissues examined with highest expression levels in lung and spleen and very low levels of expression in muscle, kidney and brain. Mitogen stimulation of duck peripheral blood mononuclear cells transiently increased duPD-L1 mRNA expression. Our observations demonstrate evolutionary conservation of the exonic structure of its coding sequence, the extracellular domain structure and residues implicated in receptor binding, but the role of the longer cytoplasmic tail in avian PD-L1 proteins remains to be determined. © 2014 John Wiley & Sons Ltd.
Epigenomic landscape modified by histone modification correlated with activation of IGF2 gene

USDA-ARS's Scientific Manuscript database

The links of histone post-translational modifications and chromatin structure to cell cycle progression, DNA replication, and overall chromosome functions are very clear. The modulation of genome expression as a consequence of chromatin structural changes is most likely a basic mechanism. The epige...
Trans-packaging of human immunodeficiency virus type 1 genome into Gag virus-like particles in Saccharomyces cerevisiae.

PubMed

Tomo, Naoki; Goto, Toshiyuki; Morikawa, Yuko

2013-03-26

Yeast is recognized as a generally safe microorganism and is utilized for the production of pharmaceutical products, including vaccines. We previously showed that expression of human immunodeficiency virus type 1 (HIV-1) Gag protein in Saccharomyces cerevisiae spheroplasts released Gag virus-like particles (VLPs) extracellularly, suggesting that the production system could be used in vaccine development. In this study, we further establish HIV-1 genome packaging into Gag VLPs in a yeast cell system. The nearly full-length HIV-1 genome containing the entire 5' long terminal repeat, U3-R-U5, did not transcribe gag mRNA in yeast. Co-expression of HIV-1 Tat, a transcription activator, did not support the transcription. When the HIV-1 promoter U3 was replaced with the promoter for the yeast glyceraldehyde-3-phosphate dehydrogenase gene, gag mRNA transcription was restored, but no Gag protein expression was observed. Co-expression of HIV-1 Rev, a factor that facilitates nuclear export of gag mRNA, did not support the protein synthesis. Progressive deletions of R-U5 and its downstream stem-loop-rich region (SL) to the gag start ATG codon restored Gag protein expression, suggesting that a highly structured noncoding RNA generated from the R-U5-SL region had an inhibitory effect on gag mRNA translation. When a plasmid containing the HIV-1 genome with the R-U5-SL region was coexpressed with an expression plasmid for Gag protein, the HIV-1 genomic RNA was transcribed and incorporated into Gag VLPs formed by Gag protein assembly, indicative of the trans-packaging of HIV-1 genomic RNA into Gag VLPs in a yeast cell system. The concentration of HIV-1 genomic RNA in Gag VLPs released from yeast was approximately 500-fold higher than that in yeast cytoplasm. The deletion of R-U5 to the gag gene resulted in the failure of HIV-1 RNA packaging into Gag VLPs, indicating that the packaging signal of HIV-1 genomic RNA present in the R-U5 to gag region functions similarly in yeast cells. 
Our data indicate that selective trans-packaging of HIV-1 genomic RNA into Gag VLPs occurs in a yeast cell system, analogous to a mammalian cell system, suggesting that yeast may provide an alternative packaging system for lentiviral RNA.
Molecular evolution and expression of archosaurian β-keratins: diversification and expansion of archosaurian β-keratins and the origin of feather β-keratins.

PubMed

Greenwold, Matthew J; Sawyer, Roger H

2013-09-01

The archosauria consist of two living groups, crocodilians and birds. Here we compare the structure, expression, and phylogeny of the beta (β)-keratins in two crocodilian genomes and two avian genomes to gain a better understanding of the evolutionary origin of the feather β-keratins. Unlike squamates such as the green anole, which has 40 β-keratins in its genome, the chicken and zebra finch genomes each have over 100 β-keratin genes, while the American alligator has 20 β-keratin genes, and the saltwater crocodile has 21 β-keratin genes. The crocodilian β-keratins are similar to those of birds and these structural proteins have a central filament domain and N- and C-termini, which contribute to the matrix material between the twisted β-sheets, which form the 2-3 nm filament. Overall the expression of alligator β-keratin genes in the integument increases during development. Phylogenetic analysis demonstrates that a crocodilian β-keratin clade forms a monophyletic group with the avian scale and feather β-keratins, suggesting that avian scale and feather β-keratins along with a subset of crocodilian β-keratins evolved from a common ancestral gene/s. Overall, our analyses support the view that the epidermal appendages of basal archosaurs used a diverse array of β-keratins, which evolved into crocodilian and avian specific clades. In birds, the scale and feather subfamilies appear to have evolved independently in the avian lineage from a subset of archosaurian claw β-keratins. The expansion of the avian specific feather β-keratin genes accompanied the diversification of birds and the evolution of feathers. Copyright © 2013 Wiley Periodicals, Inc.
Ancient Duplications and Expression Divergence in the Globin Gene Superfamily of Vertebrates: Insights from the Elephant Shark Genome and Transcriptome.

PubMed

Opazo, Juan C; Lee, Alison P; Hoffmann, Federico G; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F

2015-07-01

Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. 
Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about ancestral functions of vertebrate globins. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu

PubMed Central

Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

2015-01-01

Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat. PMID:26132381
Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu.

PubMed

Zhang, Yanlin; Luo, Guangbin; Liu, Dongcheng; Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

2015-01-01

Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat.
Patterns of expression of position-dependent integrated transgenes in mouse embryo.

PubMed Central

Bonnerot, C; Grimber, G; Briand, P; Nicolas, J F

1990-01-01

The abilities to introduce foreign DNA into the genome of mice and to visualize gene expression at the single-cell level underlie a method for defining individual elements of a genetic program. We describe the use of an Escherichia coli lacZ reporter gene fused to the promoter of the gene for hypoxanthine phosphoribosyltransferase that is expressed in all tissues. Most transgenic mice (six of seven) obtained with this construct express the lacZ gene from the hypoxanthine phosphoribosyltransferase promoter. Unexpectedly, however, the expression is temporally and spatially regulated. Each transgenic line is characterized by a specific, highly reproducible pattern of lacZ expression. These results show that, for expression, the integrated construct must be complemented by elements of the genome. These elements exert dominant developmental control on the hypoxanthine phosphoribosyltransferase promoter. The expression patterns in some transgenic mice conform to a typological marker and in others to a subtle combination of typology and topography. These observations define discrete heterogeneities of cell types and of certain structures, particularly in the nervous system and in the mesoderm. This system opens opportunities for developmental studies by providing cellular, molecular, and genetic markers of cell types, cell states, and cells from developmental compartments. Finally this method illustrates that genes transduced or transposed to a different position in the genome acquire different spatiotemporal specificities, a result that has implications for evolution. PMID:1696727
Dissecting genomic imprinting and genetic conflict from a game theory prospective. Comment on: "Epigenetic game theory: How to compute the epigenetic control of maternal-to-zygotic transition" by Qian Wang et al.

NASA Astrophysics Data System (ADS)

Cui, Yuehua; Yang, Haitao

2017-03-01

Epigenetics typically refers to changes in the structure of a chromosome that affect gene activity and expression. Genomic imprinting is a special type of epigenetic phenomenon in which the expression of an allele depends on its parental origin. When an allele inherited from the mother (or father) is imprinted (i.e., silent), it is termed maternal (or paternal) imprinting. Imprinting often results from DNA methylation and tends to cluster together in the genome [1]. It has been shown to play a key role in many genetic disorders in humans [2]. Imprinting is heritable and undergoes a reprogramming process in gametes before and after fertilization [1]. Sometimes the reprogramming process is not reversible, leading to the loss of imprinting [3]. Although efforts have been made to infer imprinted genes experimentally or computationally, the underlying molecular mechanism that leads to unbalanced allelic expression is still largely unknown.
Zinc Fingers, TALEs, and CRISPR Systems: A Comparison of Tools for Epigenome Editing.

PubMed

Waryah, Charlene Babra; Moses, Colette; Arooj, Mahira; Blancafort, Pilar

2018-01-01

The completion of genome, epigenome, and transcriptome mapping in multiple cell types has created a demand for precision biomolecular tools that allow researchers to functionally manipulate DNA, reconfigure chromatin structure, and ultimately reshape gene expression patterns. Epigenetic editing tools provide the ability to interrogate the relationship between epigenetic modifications and gene expression. Importantly, this information can be exploited to reprogram cell fate for both basic research and therapeutic applications. Three different molecular platforms for epigenetic editing have been developed: zinc finger proteins (ZFs), transcription activator-like effectors (TALEs), and the system of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) proteins. These platforms serve as custom DNA-binding domains (DBDs), which are fused to epigenetic modifying domains to manipulate epigenetic marks at specific sites in the genome. The addition and/or removal of epigenetic modifications reconfigures local chromatin structure, with the potential to provoke long-lasting changes in gene transcription. Here we summarize the molecular structure and mechanism of action of ZF, TALE, and CRISPR platforms and describe their applications for the locus-specific manipulation of the epigenome. The advantages and disadvantages of each platform will be discussed with regard to genomic specificity, potency in regulating gene expression, and reprogramming cell phenotypes, as well as ease of design, construction, and delivery. Finally, we outline potential applications for these tools in molecular biology and biomedicine and identify possible barriers to their future clinical implementation.
Exploration of sequence space as the basis of viral RNA genome segmentation.

PubMed

Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

2014-05-06

The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the relative fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. Thus, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration.
Evolutionary dynamics of 3D genome architecture following polyploidization in cotton.

PubMed

Wang, Maojun; Wang, Pengcheng; Lin, Min; Ye, Zhengxiu; Li, Guoliang; Tu, Lili; Shen, Chao; Li, Jianying; Yang, Qingyong; Zhang, Xianlong

2018-02-01

The formation of polyploids significantly increases the complexity of transcriptional regulation, which is expected to be reflected in sophisticated higher-order chromatin structures. However, knowledge of three-dimensional (3D) genome structure and its dynamics during polyploidization remains poor. Here, we characterize 3D genome architectures for diploid and tetraploid cotton, and find the existence of A/B compartments and topologically associated domains (TADs). By comparing each subgenome in tetraploids with its extant diploid progenitor, we find that genome allopolyploidization has contributed to the switching of A/B compartments and the reorganization of TADs in both subgenomes. We also show that the formation of TAD boundaries during polyploidization preferentially occurs in open chromatin, coinciding with the deposition of active chromatin modification. Furthermore, analysis of inter-subgenomic chromatin interactions has revealed the spatial proximity of homoeologous genes, possibly associated with their coordinated expression. This study advances our understanding of chromatin organization in plants and sheds new light on the relationship between 3D genome evolution and transcriptional regulation.
Genomic understanding of dinoflagellates.

PubMed

Lin, Senjie

2011-01-01

The phylum of dinoflagellates is characterized by many unusual and interesting genomic and physiological features, the imprint of which, in its immense genome, remains elusive. Much novel understanding has been achieved in the last decade on various aspects of dinoflagellate biology, but most remarkably about the structure, expression pattern and epigenetic modification of protein-coding genes in the nuclear and organellar genomes. Major findings include: 1) the great diversity of dinoflagellates, especially at the base of the dinoflagellate tree of life; 2) mini-circularization of the genomes of typical dinoflagellate plastids (with three membranes, chlorophylls a, c1 and c2, and carotenoid peridinin), the scrambled mitochondrial genome and the extensive mRNA editing occurring in both systems; 3) ubiquitous spliced leader trans-splicing of nuclear-encoded mRNA and demonstrated potential as a novel tool for studying dinoflagellate transcriptomes in mixed cultures and natural assemblages; 4) existence and expression of histones and other nucleosomal proteins; 5) a ribosomal protein set expected of typical eukaryotes; 6) genetic potential of non-photosynthetic solar energy utilization via proton-pump rhodopsin; 7) gene candidates in the toxin synthesis pathways; and 8) evidence of a highly redundant, high gene number and highly recombined genome. Despite this progress, much more work awaits genome-wide transcriptome and whole genome sequencing in order to unfold the molecular mechanisms underlying the numerous mysterious attributes of dinoflagellates. Copyright © 2011 Institut Pasteur. Published by Elsevier SAS. All rights reserved.
The X chromosome in space.

PubMed

Jégu, Teddy; Aeby, Eric; Lee, Jeannie T

2017-06-01

Extensive 3D folding is required to package a genome into the tiny nuclear space, and this packaging must be compatible with proper gene expression. Thus, in the well-hierarchized nucleus, chromosomes occupy discrete territories and adopt specific 3D organizational structures that facilitate interactions between regulatory elements for gene expression. The mammalian X chromosome exemplifies this structure-function relationship. Recent studies have shown that, upon X-chromosome inactivation, active and inactive X chromosomes localize to different subnuclear positions and adopt distinct chromosomal architectures that reflect their activity states. Here, we review the roles of long non-coding RNAs, chromosomal organizational structures and the subnuclear localization of chromosomes as they relate to X-linked gene expression.
Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle.

PubMed

Gu, Quan; Nagaraj, Shivashankar H; Hudson, Nicholas J; Dalrymple, Brian P; Reverter, Antonio

2011-01-12

Gene regulation by transcription factors (TF) is species, tissue and time specific. To better understand how the genetic code controls gene expression in bovine muscle we associated gene expression data from developing Longissimus thoracis et lumborum skeletal muscle with bovine promoter sequence information. We created a highly conserved genome-wide promoter landscape comprising 87,408 interactions relating 333 TFs with their 9,242 predicted target genes (TGs). We discovered that the complete set of predicted TGs share an average of 2.75 predicted TF binding sites (TFBSs) and that the average co-expression between a TF and its predicted TGs is higher than the average co-expression between the same TF and all genes. Conversely, pairs of TFs sharing predicted TGs showed a co-expression correlation higher than pairs of TFs not sharing TGs. Finally, we exploited the co-occurrence of predicted TFBS in the context of muscle-derived functionally-coherent modules including cell cycle, mitochondria, immune system, fat metabolism, muscle/glycolysis, and ribosome. Our findings enabled us to reverse engineer a regulatory network of core processes, and correctly identified the involvement of E2F1, GATA2 and NFKB1 in the regulation of cell cycle, fat, and muscle/glycolysis, respectively. The pivotal implication of our research is two-fold: (1) there exists a robust genome-wide expression signal between TFs and their predicted TGs in cattle muscle consistent with the extent of promoter sharing; and (2) this signal can be exploited to recover the cellular mechanisms underpinning transcription regulation of muscle structure and development in bovine. Our study represents the first genome-wide report linking tissue specific co-expression to co-regulation in a non-model vertebrate.
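The comparison described in this abstract (co-expression of a TF with its predicted targets versus with all genes) can be illustrated with a minimal sketch. The abstract does not state which correlation measure or software the authors used; Pearson correlation, the function names, and the toy profiles below are assumptions made purely for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (a zero variance would divide by zero).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_contrast(tf_profile, gene_profiles, predicted_targets):
    """Return (mean r over predicted targets, mean r over all genes)."""
    r = {gene: pearson(tf_profile, profile)
         for gene, profile in gene_profiles.items()}
    target_mean = mean([r[gene] for gene in predicted_targets])
    all_mean = mean(list(r.values()))
    return target_mean, all_mean


# Hypothetical toy data: one TF and four genes measured at four time points.
tf = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # predicted target, tracks the TF
    "geneB": [1.0, 2.1, 2.9, 4.2],  # predicted target, tracks the TF
    "geneC": [4.0, 3.0, 2.0, 1.0],  # not a target, anti-correlated
    "geneD": [1.0, 3.0, 1.0, 3.0],  # not a target, weakly correlated
}
target_mean, all_mean = coexpression_contrast(tf, genes, {"geneA", "geneB"})
print(f"targets: {target_mean:.2f}  all genes: {all_mean:.2f}")
```

In a real analysis the gene set would come from promoter-based TFBS predictions and the contrast would be tested for significance across many TFs; this sketch shows only the averaging step.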

Genome-Wide Identification and Characterization of BrrTCP Transcription Factors in Brassica rapa ssp. rapa.

PubMed

Du, Jiancan; Hu, Simin; Yu, Qin; Wang, Chongde; Yang, Yunqiang; Sun, Hang; Yang, Yongping; Sun, Xudong

2017-01-01

The teosinte branched1/cycloidea/proliferating cell factor (TCP) gene family is a plant-specific transcription factor family that participates in the control of plant development by regulating cell proliferation. However, no report is currently available about this gene family in turnips (Brassica rapa ssp. rapa). In this study, a genome-wide analysis of TCP genes was performed in turnips. Thirty-nine TCP genes were identified in the turnip genome, distributed on 10 chromosomes. Phylogenetic analysis clearly showed that the family was classified as two clades: class I and class II. Gene structure and conserved motif analysis showed that genes of the same clade have similar gene structures and conserved motifs. The expression profiles of 39 TCP genes were determined through quantitative real-time PCR. Most CIN-type BrrTCP genes were highly expressed in leaf. The members of the CYC/TB1 subclade are highly expressed in flower bud and weakly expressed in root. By contrast, the class I clade showed more widespread but less tissue-specific expression patterns. Yeast two-hybrid data show that BrrTCP proteins preferentially formed heterodimers. The function of BrrTCP2 was confirmed through ectopic expression of BrrTCP2 in wild-type and loss-of-function ortholog mutant of Arabidopsis. Overexpression of BrrTCP2 in wild-type Arabidopsis resulted in diminished leaf size. Overexpression of BrrTCP2 in triple mutants of tcp2/4/10 restored the leaf phenotype of tcp2/4/10 to the phenotype of wild type. The comprehensive analysis of the turnip TCP gene family provided the foundation to further study the roles of TCP genes in turnips.
The FlyBase database of the Drosophila genome projects and community literature

PubMed Central

2002-01-01

FlyBase (http://flybase.bio.indiana.edu/) provides an integrated view of the fundamental genomic and genetic data on the major genetic model Drosophila melanogaster and related species. Following on the success of the Drosophila genome project, FlyBase has primary responsibility for the continual reannotation of the D.melanogaster genome. The ultimate goal of the reannotation effort is to decorate the euchromatic sequence of the genome with as much biological information as is available from the community and from the major genome project centers. The current cycle of reannotation focuses on establishing a comprehensive data set of gene models (i.e. transcription units and CDSs). There are many points of entry to the genome within FlyBase, most notably through maps, gene ontologies, structured phenotypic and gene expression data, and anatomy. PMID:11752267
Genomic characterisation of the effector complement of the potato cyst nematode Globodera pallida.

PubMed

Thorpe, Peter; Mantelin, Sophie; Cock, Peter Ja; Blok, Vivian C; Coke, Mirela C; Eves-van den Akker, Sebastian; Guzeeva, Elena; Lilley, Catherine J; Smant, Geert; Reid, Adam J; Wright, Kathryn M; Urwin, Peter E; Jones, John T

2014-10-23

The potato cyst nematode Globodera pallida has biotrophic interactions with its host. The nematode induces a feeding structure - the syncytium - which it keeps alive for the duration of the life cycle and on which it depends for all nutrients required to develop to the adult stage. Interactions of G. pallida with the host are mediated by effectors, which are produced in two sets of gland cells. These effectors suppress host defences, facilitate migration and induce the formation of the syncytium. The recent completion of the G. pallida genome sequence has allowed us to identify the effector complement from this species. We identify 128 orthologues of effectors from other nematodes as well as 117 novel effector candidates. We have used in situ hybridisation to confirm gland cell expression of a subset of these effectors, demonstrating the validity of our effector identification approach. We have examined the expression profiles of all effector candidates using RNAseq; this analysis shows that the majority of effectors fall into one of three clusters of sequences showing conserved expression characteristics (invasive stage nematode only, parasitic stage only or invasive stage and adult male only). We demonstrate that further diversity in the effector pool is generated by alternative splicing. In addition, we show that effectors target a diverse range of structures in plant cells, including the peroxisome. This is the first identification of effectors from any plant pathogen that target this structure. This is the first genome scale search for effectors, combined to a life-cycle expression analysis, for any plant-parasitic nematode. We show that, like other phylogenetically unrelated plant pathogens, plant parasitic nematodes deploy hundreds of effectors in order to parasitise plants, with different effectors required for different phases of the infection process.
NCBI GEO: archive for functional genomics data sets--10 years on.

PubMed

Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra

2011-01-01

A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility.

PubMed

Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna

2016-04-07

DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. Copyright © 2016 Vrljicak et al.
Cloning of cDNA of major antigen of foot and mouth disease virus and expression in E. coli

NASA Astrophysics Data System (ADS)

Küpper, Hans; Keller, Walter; Kurz, Christina; Forss, Sonja; Schaller, Heinz

1981-02-01

Double-stranded DNA copies of the single-stranded genomic RNA of foot and mouth disease virus have been cloned into the Escherichia coli plasmid pBR322. A restriction map of the viral genome was established and aligned with the biochemical map of foot and mouth disease virus. The coding sequence for structural protein VP1, the major antigen of the virus, was identified and inserted into a plasmid vector where the expression of this sequence is under control of the phage λ PL promoter. In an appropriate host the synthesis of antigenic polypeptide can be demonstrated by radioimmunoassay.
What is bioinformatics? A proposed definition and overview of the field.

PubMed

Luscombe, N M; Greenbaum, D; Gerstein, M

2001-01-01

The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.
Comprehensive analysis and discovery of drought-related NAC transcription factors in common bean.

PubMed

Wu, Jing; Wang, Lanfen; Wang, Shumin

2016-09-07

Common bean (Phaseolus vulgaris L.) is an important warm-season food legume. Drought is the most important environmental stress factor affecting large areas of common bean via plant death or reduced global production. The NAM, ATAF1/2 and CUC2 (NAC) domain protein family are classic transcription factors (TFs) involved in a variety of abiotic stresses, particularly drought stress. However, the NAC TFs in common bean have not been characterized. In the present study, 86 putative NAC TF proteins were identified from the common bean genome database and located on 11 common bean chromosomes. The proteins were phylogenetically clustered into 8 distinct subfamilies. The gene structure and motif composition of common bean NACs were similar in each subfamily. These results suggest that NACs in the same subfamily may possess conserved functions. The expression patterns of common bean NAC genes were also characterized. The majority of NACs exhibited specific temporal and spatial expression patterns. We identified 22 drought-related NAC TFs based on transcriptome data for drought-tolerant and drought-sensitive genotypes. Quantitative real-time PCR (qRT-PCR) was performed to confirm the expression patterns of the 20 drought-related NAC genes. Based on the common bean genome sequence, we analyzed the structural characteristics, genome distribution, and expression profiles of NAC gene family members and analyzed drought-responsive NAC genes. Our results provide useful information for the functional characterization of common bean NAC genes and rich resources and opportunities for understanding common bean drought stress tolerance mechanisms.
Advances in recombinant protein expression for use in pharmaceutical research.

PubMed

Assenberg, Rene; Wan, Paul T; Geisse, Sabine; Mayr, Lorenz M

2013-06-01

Protein production for structural and biophysical studies, functional assays, biomarkers, mechanistic studies in vitro and in vivo, but also for therapeutic applications in pharma, biotech and academia has evolved into a mature discipline in recent years. Due to the increased emphasis on biopharmaceuticals, the growing demand for proteins used for structural and biophysical studies, the impact of genomics technologies on the analysis of large sets of structurally diverse proteins, and the increasing complexity of disease targets, the interest in innovative approaches for the expression, purification and characterisation of recombinant proteins has steadily increased over the years. In this review, we summarise recent developments in the field of recombinant protein expression for research use in pharma, biotech and academia. We focus mostly on the latest developments for protein expression in the most widely used expression systems: Escherichia coli (E. coli), insect cell expression using the Baculovirus Expression Vector System (BEVS) and, finally, transient and stable expression of recombinant proteins in mammalian cells. Copyright © 2013. Published by Elsevier Ltd.
Genomic analysis and temperature-dependent transcriptome profiles of the rhizosphere originating strain Pseudomonas aeruginosa M18

PubMed Central

2011-01-01

Background: Our previously published reports have described an effective biocontrol agent named Pseudomonas sp. M18 as its 16S rDNA sequence and several regulator genes share homologous sequences with those of P. aeruginosa, but there are several unusual phenotypic features. This study aims to explore its strain specific genomic features and gene expression patterns at different temperatures. Results: The complete M18 genome is composed of a single chromosome of 6,327,754 base pairs containing 5684 open reading frames. Seven genomic islands, including two novel prophages and five specific non-phage islands were identified besides the conserved P. aeruginosa core genome. Each prophage contains a putative chitinase coding gene, and the prophage II contains a capB gene encoding a putative cold stress protein. The non-phage genomic islands contain genes responsible for pyoluteorin biosynthesis, environmental substance degradation and type I and III restriction-modification systems. Compared with other P. aeruginosa strains, the smallest number (3) of insertion sequences and the largest number (3) of clustered regularly interspaced short palindromic repeats in the M18 genome may contribute to the relative genome stability. Although the M18 genome is most closely related to that of P. aeruginosa strain LESB58, the strain M18 is more susceptible to several antimicrobial agents and more easily eliminated in a mouse acute lung infection model than the strain LESB58. The whole M18 transcriptomic analysis indicated that 10.6% of the expressed genes are temperature-dependent, with 22 genes up-regulated at 28°C in three non-phage genomic islands and one prophage but none at 37°C. Conclusions: The P. aeruginosa strain M18 has evolved its specific genomic structures and temperature dependent expression patterns to meet the requirement of its fitness and competitiveness under selective pressures imposed on the strain in rhizosphere niche. PMID:21884571
Herpes simplex virus VP16, but not ICP0, is required to reduce histone occupancy and enhance histone acetylation on viral genomes in U2OS osteosarcoma cells.

PubMed

Hancock, Meaghan H; Cliffe, Anna R; Knipe, David M; Smiley, James R

2010-02-01

The herpes simplex virus (HSV) genome rapidly becomes associated with histones after injection into the host cell nucleus. The viral proteins ICP0 and VP16 are required for efficient viral gene expression and have been implicated in reducing the levels of underacetylated histones on the viral genome, raising the possibility that high levels of underacetylated histones inhibit viral gene expression. The U2OS osteosarcoma cell line is permissive for replication of ICP0 and VP16 mutants and appears to lack an innate antiviral repression mechanism present in other cell types. We therefore used chromatin immunoprecipitation to determine whether U2OS cells are competent to load histones onto HSV DNA and, if so, whether ICP0 and/or VP16 are required to reduce histone occupancy and enhance acetylation in this cell type. High levels of underacetylated histone H3 accumulated at several locations on the viral genome in the absence of VP16 activation function; in contrast, an ICP0 mutant displayed markedly reduced histone levels and enhanced acetylation, similar to wild-type HSV. These results demonstrate that U2OS cells are competent to load underacetylated histones onto HSV DNA and uncover an unexpected role for VP16 in modulating chromatin structure at viral early and late loci. One interpretation of these findings is that ICP0 and VP16 affect viral chromatin structure through separate pathways, and the pathway targeted by ICP0 is defective in U2OS cells. We also show that HSV infection results in decreased histone levels on some actively transcribed genes within the cellular genome, demonstrating that viral infection alters cellular chromatin structure.
Herpes Simplex Virus VP16, but Not ICP0, Is Required To Reduce Histone Occupancy and Enhance Histone Acetylation on Viral Genomes in U2OS Osteosarcoma Cells

PubMed Central

Hancock, Meaghan H.; Cliffe, Anna R.; Knipe, David M.; Smiley, James R.

2010-01-01

The herpes simplex virus (HSV) genome rapidly becomes associated with histones after injection into the host cell nucleus. The viral proteins ICP0 and VP16 are required for efficient viral gene expression and have been implicated in reducing the levels of underacetylated histones on the viral genome, raising the possibility that high levels of underacetylated histones inhibit viral gene expression. The U2OS osteosarcoma cell line is permissive for replication of ICP0 and VP16 mutants and appears to lack an innate antiviral repression mechanism present in other cell types. We therefore used chromatin immunoprecipitation to determine whether U2OS cells are competent to load histones onto HSV DNA and, if so, whether ICP0 and/or VP16 are required to reduce histone occupancy and enhance acetylation in this cell type. High levels of underacetylated histone H3 accumulated at several locations on the viral genome in the absence of VP16 activation function; in contrast, an ICP0 mutant displayed markedly reduced histone levels and enhanced acetylation, similar to wild-type HSV. These results demonstrate that U2OS cells are competent to load underacetylated histones onto HSV DNA and uncover an unexpected role for VP16 in modulating chromatin structure at viral early and late loci. One interpretation of these findings is that ICP0 and VP16 affect viral chromatin structure through separate pathways, and the pathway targeted by ICP0 is defective in U2OS cells. We also show that HSV infection results in decreased histone levels on some actively transcribed genes within the cellular genome, demonstrating that viral infection alters cellular chromatin structure. PMID:19939931
Transcriptome Wide Annotation of Eukaryotic RNase III Reactivity and Degradation Signals

PubMed Central

Gagnon, Jules; Lavoie, Mathieu; Catala, Mathieu; Malenfant, Francis; Elela, Sherif Abou

2015-01-01

Detection and validation of the RNA degradation signals controlling transcriptome stability are essential steps for understanding how cells regulate gene expression. Here we present complete genomic and biochemical annotations of the signals required for RNA degradation by the dsRNA specific ribonuclease III (Rnt1p) and examine its impact on transcriptome expression. Rnt1p cleavage signals are randomly distributed in the yeast genome, and encompass a wide variety of sequences, indicating that transcriptome stability is not determined by the recurrence of a fixed cleavage motif. Instead, RNA reactivity is defined by the sequence and structural context in which the cleavage sites are located. Reactive signals are often associated with transiently expressed genes, and their impact on RNA expression is linked to growth conditions. Together, the data suggest that Rnt1p reactivity is triggered by malleable RNA degradation signals that permit dynamic response to changes in growth conditions. PMID:25680180
Genome-wide Hi-C analysis reveals extensive hierarchical chromatin interactions in rice.

PubMed

Dong, Qianli; Li, Ning; Li, Xiaochong; Yuan, Zan; Xie, Dejian; Wang, Xiaofei; Li, Jianing; Yu, Yanan; Wang, Jinbin; Ding, Baoxu; Zhang, Zhibin; Li, Changping; Bian, Yao; Zhang, Ai; Wu, Ying; Liu, Bao; Gong, Lei

2018-06-01

The non-random spatial packing of chromosomes in the nucleus plays a critical role in orchestrating gene expression and genome function. Here, we present a Hi-C analysis of the chromatin interaction patterns in rice (Oryza sativa L.) at hierarchical architectural levels. We confirm that rice chromosomes occupy their own territories with certain preferential inter-chromosomal associations. Moderate compartment delimitation and extensive TADs (Topologically Associated Domains) were determined to be associated with heterogeneous genomic compositions and epigenetic marks in the rice genome. We found subtle features including chromatin loops, gene loops, and off-/near-diagonal intensive interaction regions. Gene chromatin loops associated with H3K27me3 could be positively involved in gene expression. In addition to insulated enhancing effects for neighbor gene expression, the identified rice gene loops could bi-directionally (+/-) affect the expression of looped genes themselves. Finally, web-interleaved off-diagonal IHIs/KEEs (Interactive Heterochromatic Islands or KNOT ENGAGED ELEMENTs) could trap transposable elements (TEs) via the enrichment of silencing epigenetic marks. In parallel, the near-diagonal FIREs (Frequently Interacting Regions) could positively affect the expression of involved genes. Our results suggest that the chromatin packing pattern in rice is generally similar to that in Arabidopsis thaliana but with clear differences at specific structural levels. We conclude that genomic composition, epigenetic modification, and transcriptional activity could act in combination to shape global and local chromatin packing in rice. Our results confirm recent observations in rice and A. thaliana but also provide additional insights into the patterns and features of chromatin organization in higher plants. © 2018 The Authors. The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.
The PEPR GeneChip data warehouse, and implementation of a dynamic time series query tool (SGQT) with graphical interface.

PubMed

Chen, Josephine; Zhao, Po; Massaro, Donald; Clerch, Linda B; Almon, Richard R; DuBois, Debra C; Jusko, William J; Hoffman, Eric P

2004-01-01

Publicly accessible DNA databases (genome browsers) are rapidly accelerating post-genomic research (see http://www.genome.ucsc.edu/), with integrated genomic DNA, gene structure, EST/splicing and cross-species ortholog data. DNA databases have relatively low dimensionality; the genome is a linear code that anchors all associated data. In contrast, RNA expression and protein databases need to be able to handle very high dimensional data, with time, tissue, cell type and genes as interrelated variables. The high dimensionality of microarray expression profile data and the lack of a standard experimental platform have complicated the development of web-accessible databases and analytical tools. We have designed and implemented a public resource of expression profile data containing 1024 human, mouse and rat Affymetrix GeneChip expression profiles, generated in the same laboratory, and subject to the same quality and procedural controls (Public Expression Profiling Resource; PEPR). Our Oracle-based PEPR data warehouse includes a novel time series query analysis tool (SGQT), enabling dynamic generation of graphs and spreadsheets showing the action of any transcript of interest over time. In this report, we demonstrate the utility of this tool using a 27 time point, in vivo muscle regeneration series. This data warehouse and associated analysis tools provide access to multidimensional microarray data through web-based interfaces, both for download of all types of raw data for independent analysis, and also for straightforward gene-based queries. Planned implementations of PEPR will include web-based remote entry of projects adhering to quality control and standard operating procedure (QC/SOP) criteria, and automated output of alternative probe set algorithms for each project (see http://microarray.cnmcresearch.org/pgadatatable.asp).
Genome and transcriptome sequencing in prospective metastatic triple-negative breast cancer uncovers therapeutic vulnerabilities.

PubMed

Craig, David W; O'Shaughnessy, Joyce A; Kiefer, Jeffrey A; Aldrich, Jessica; Sinari, Shripad; Moses, Tracy M; Wong, Shukmei; Dinh, Jennifer; Christoforides, Alexis; Blum, Joanne L; Aitelli, Cristi L; Osborne, Cynthia R; Izatt, Tyler; Kurdoglu, Ahmet; Baker, Angela; Koeman, Julie; Barbacioru, Catalin; Sakarya, Onur; De La Vega, Francisco M; Siddiqui, Asim; Hoang, Linh; Billings, Paul R; Salhia, Bodour; Tolcher, Anthony W; Trent, Jeffrey M; Mousses, Spyro; Von Hoff, Daniel; Carpten, John D

2013-01-01

Triple-negative breast cancer (TNBC) is characterized by the absence of expression of estrogen receptor, progesterone receptor, and HER-2. Thirty percent of patients recur after first-line treatment, and metastatic TNBC (mTNBC) has a poor prognosis with median survival of one year. Here, we present initial analyses of whole genome and transcriptome sequencing data from 14 prospective mTNBC. We have cataloged the collection of somatic genomic alterations in these advanced tumors, particularly those that may inform targeted therapies. Genes mutated in multiple tumors included TP53, LRP1B, HERC1, CDH5, RB1, and NF1. Notable genes involved in focal structural events were CTNNA1, PTEN, FBXW7, BRCA2, WT1, FGFR1, KRAS, HRAS, ARAF, BRAF, and PGCP. Homozygous deletion of CTNNA1 was detected in 2 of 6 African Americans. RNA sequencing revealed consistent overexpression of the FOXM1 gene when tumor gene expression was compared with nonmalignant breast samples. Using an outlier analysis of gene expression comparing one cancer with all the others, we detected expression patterns unique to each patient's tumor. Integrative DNA/RNA analysis provided evidence for deregulation of mutated genes, including the monoallelic expression of TP53 mutations. Finally, molecular alterations in several cancers supported targeted therapeutic intervention on clinical trials with known inhibitors, particularly for alterations in the RAS/RAF/MEK/ERK and PI3K/AKT/mTOR pathways. In conclusion, whole genome and transcriptome profiling of mTNBC have provided insights into somatic events occurring in this difficult to treat cancer. These genomic data have guided patients to investigational treatment trials and provide hypotheses for future trials in this irremediable cancer.
The PEPR GeneChip data warehouse, and implementation of a dynamic time series query tool (SGQT) with graphical interface

PubMed Central

Chen, Josephine; Zhao, Po; Massaro, Donald; Clerch, Linda B.; Almon, Richard R.; DuBois, Debra C.; Jusko, William J.; Hoffman, Eric P.

2004-01-01

Publicly accessible DNA databases (genome browsers) are rapidly accelerating post-genomic research (see http://www.genome.ucsc.edu/), with integrated genomic DNA, gene structure, EST/splicing and cross-species ortholog data. DNA databases have relatively low dimensionality; the genome is a linear code that anchors all associated data. In contrast, RNA expression and protein databases need to be able to handle very high-dimensional data, with time, tissue, cell type and genes as interrelated variables. The high dimensionality of microarray expression profile data and the lack of a standard experimental platform have complicated the development of web-accessible databases and analytical tools. We have designed and implemented a public resource of expression profile data containing 1024 human, mouse and rat Affymetrix GeneChip expression profiles, generated in the same laboratory, and subject to the same quality and procedural controls (Public Expression Profiling Resource; PEPR). Our Oracle-based PEPR data warehouse includes a novel time series query analysis tool (SGQT), enabling dynamic generation of graphs and spreadsheets showing the action of any transcript of interest over time. In this report, we demonstrate the utility of this tool using a 27 time point, in vivo muscle regeneration series. This data warehouse and associated analysis tools provide access to multidimensional microarray data through web-based interfaces, both for download of all types of raw data for independent analysis, and also for straightforward gene-based queries. Planned implementations of PEPR will include web-based remote entry of projects adhering to quality control and standard operating procedure (QC/SOP) criteria, and automated output of alternative probe set algorithms for each project (see http://microarray.cnmcresearch.org/pgadatatable.asp). PMID:14681485
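The kind of gene-based time series query described in the abstract above can be illustrated with a minimal sketch. It is not the SGQT implementation, which the abstract describes only as an Oracle-based tool; the record layout (probe set ID, time point, signal) and the function name `time_series` are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean


def time_series(records, probe_set):
    """Average replicate signals per time point for one probe set.

    `records` is an iterable of (probe_set_id, time_point, signal) tuples;
    this flat layout is a hypothetical stand-in for a warehouse table.
    """
    by_time = defaultdict(list)
    for pid, time_point, signal in records:
        if pid == probe_set:
            by_time[time_point].append(signal)
    # Sort by time point so the result can be plotted or exported directly.
    return [(t, mean(values)) for t, values in sorted(by_time.items())]


if __name__ == "__main__":
    # Made-up values, for demonstration only.
    demo = [("probe_a", 0, 10.0), ("probe_a", 0, 14.0), ("probe_a", 12, 30.0)]
    for t, avg in time_series(demo, "probe_a"):
        print(t, avg)
```

Averaging replicates per time point is one simple design choice; a real warehouse query might instead return every replicate so that variability can be plotted.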
The Plant Structure Ontology, a Unified Vocabulary of Anatomy and Morphology of a Flowering Plant

PubMed Central

Ilic, Katica; Kellogg, Elizabeth A.; Jaiswal, Pankaj; Zapata, Felipe; Stevens, Peter F.; Vincent, Leszek P.; Avraham, Shulamit; Reiser, Leonore; Pujar, Anuradha; Sachs, Martin M.; Whitman, Noah T.; McCouch, Susan R.; Schaeffer, Mary L.; Ware, Doreen H.; Stein, Lincoln D.; Rhee, Seung Y.

2007-01-01

Formal description of plant phenotypes and standardized annotation of gene expression and protein localization data require uniform terminology that accurately describes plant anatomy and morphology. This facilitates cross species comparative studies and quantitative comparison of phenotypes and expression patterns. A major drawback is variable terminology that is used to describe plant anatomy and morphology in publications and genomic databases for different species. The same terms are sometimes applied to different plant structures in different taxonomic groups. Conversely, similar structures are named by their species-specific terms. To address this problem, we created the Plant Structure Ontology (PSO), the first generic ontological representation of anatomy and morphology of a flowering plant. The PSO is intended for a broad plant research community, including bench scientists, curators in genomic databases, and bioinformaticians. The initial releases of the PSO integrated existing ontologies for Arabidopsis (Arabidopsis thaliana), maize (Zea mays), and rice (Oryza sativa); more recent versions of the ontology encompass terms relevant to Fabaceae, Solanaceae, additional cereal crops, and poplar (Populus spp.). Databases such as The Arabidopsis Information Resource, Nottingham Arabidopsis Stock Centre, Gramene, MaizeGDB, and SOL Genomics Network are using the PSO to describe expression patterns of genes and phenotypes of mutants and natural variants and are regularly contributing new annotations to the Plant Ontology database. The PSO is also used in specialized public databases, such as BRENDA, GENEVESTIGATOR, NASCArrays, and others. Over 10,000 gene annotations and phenotype descriptions from participating databases can be queried and retrieved using the Plant Ontology browser. The PSO, as well as contributed gene associations, can be obtained at www.plantontology.org. PMID:17142475
Structural imprints in vivo decode RNA regulatory mechanisms

PubMed Central

Spitale, Robert C.; Flynn, Ryan A.; Zhang, Qiangfeng Cliff; Crisalli, Pete; Lee, Byron; Jung, Jong-Wha; Kuchelmeister, Hannes Y.; Batista, Pedro J.; Torre, Eduardo A.; Kool, Eric T.; Chang, Howard Y.

2015-01-01

Visualizing the physical basis for molecular behavior inside living cells is a grand challenge in biology. RNAs are central to biological regulation, and RNA’s ability to adopt specific structures intimately controls every step of the gene expression program. However, our understanding of physiological RNA structures is limited; current in vivo RNA structure profiles view only two of four nucleotides that make up RNA. Here we present a novel biochemical approach, In Vivo Click SHAPE (icSHAPE), that enables the first global view of RNA secondary structures of all four bases in living cells. icSHAPE of mouse embryonic stem cell transcriptome versus purified RNA folded in vitro shows that the structural dynamics of RNA in the cellular environment distinguishes different classes of RNAs and regulatory elements. Structural signatures at translational start sites and ribosome pause sites are conserved from in vitro, suggesting that these RNA elements are programmed by sequence. In contrast, focal structural rearrangements in vivo reveal precise interfaces of RNA with RNA binding proteins or RNA modification sites that are consistent with atomic-resolution structural data. Such dynamic structural footprints enable accurate prediction of RNA-protein interactions and N6-methyladenosine (m6A) modification genome-wide. These results open the door for structural genomics of RNA in living cells and reveal key physiological structures controlling gene expression. PMID:25799993
Structural imprints in vivo decode RNA regulatory mechanisms.

PubMed

Spitale, Robert C; Flynn, Ryan A; Zhang, Qiangfeng Cliff; Crisalli, Pete; Lee, Byron; Jung, Jong-Wha; Kuchelmeister, Hannes Y; Batista, Pedro J; Torre, Eduardo A; Kool, Eric T; Chang, Howard Y

2015-03-26

Visualizing the physical basis for molecular behaviour inside living cells is a great challenge for biology. RNAs are central to biological regulation, and the ability of RNA to adopt specific structures intimately controls every step of the gene expression program. However, our understanding of physiological RNA structures is limited; current in vivo RNA structure profiles include only two of the four nucleotides that make up RNA. Here we present a novel biochemical approach, in vivo click selective 2'-hydroxyl acylation and profiling experiment (icSHAPE), which enables the first global view, to our knowledge, of RNA secondary structures in living cells for all four bases. icSHAPE of the mouse embryonic stem cell transcriptome versus purified RNA folded in vitro shows that the structural dynamics of RNA in the cellular environment distinguish different classes of RNAs and regulatory elements. Structural signatures at translational start sites and ribosome pause sites are conserved from in vitro conditions, suggesting that these RNA elements are programmed by sequence. In contrast, focal structural rearrangements in vivo reveal precise interfaces of RNA with RNA-binding proteins or RNA-modification sites that are consistent with atomic-resolution structural data. Such dynamic structural footprints enable accurate prediction of RNA-protein interactions and N(6)-methyladenosine (m(6)A) modification genome wide. These results open the door for structural genomics of RNA in living cells and reveal key physiological structures controlling gene expression.

Karyotype Stability and Unbiased Fractionation in the Paleo-Allotetraploid Cucurbita Genomes.

PubMed

Sun, Honghe; Wu, Shan; Zhang, Guoyu; Jiao, Chen; Guo, Shaogui; Ren, Yi; Zhang, Jie; Zhang, Haiying; Gong, Guoyi; Jia, Zhangcai; Zhang, Fan; Tian, Jiaxing; Lucas, William J; Doyle, Jeff J; Li, Haizhen; Fei, Zhangjun; Xu, Yong

2017-10-09

The Cucurbita genus contains several economically important species in the Cucurbitaceae family. Here, we report high-quality genome sequences of C. maxima and C. moschata and provide evidence supporting an allotetraploidization event in Cucurbita. We are able to partition the genome into two homoeologous subgenomes based on different genetic distances to melon, cucumber, and watermelon in the Benincaseae tribe. We estimate that the two diploid progenitors successively diverged from Benincaseae around 31 and 26 million years ago (Mya), respectively, and the allotetraploidization happened at some point between 26 Mya and 3 Mya, the estimated date when C. maxima and C. moschata diverged. The subgenomes have largely maintained the chromosome structures of their diploid progenitors. Such long-term karyotype stability after polyploidization has not been commonly observed in plant polyploids. The two subgenomes have retained similar numbers of genes, and neither subgenome is globally dominant in gene expression. Allele-specific expression analysis in the C. maxima × C. moschata interspecific F1 hybrid and their two parents indicates the predominance of trans-regulatory effects underlying expression divergence of the parents, and detects transgressive gene expression changes in the hybrid correlated with heterosis in important agronomic traits. Our study provides insights into polyploid genome evolution and valuable resources for genetic improvement of cucurbit crops. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.
Predicting effects of structural stress in a genome-reduced model bacterial metabolism

NASA Astrophysics Data System (ADS)

Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

2012-08-01

Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.
Genome-Wide Identification and Expression Analysis of WRKY Gene Family in Capsicum annuum L.

PubMed Central

Diao, Wei-Ping; Snyder, John C.; Wang, Shu-Bin; Liu, Jin-Bing; Pan, Bao-Gui; Guo, Guang-Jun; Wei, Ge

2016-01-01

The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators, with members regulating multiple biological processes, especially defense against biotic and abiotic stresses. However, little information is available about WRKYs in pepper (Capsicum annuum L.). The recent release of completely assembled genome sequences of pepper allowed us to perform a genome-wide investigation for pepper WRKY proteins. In the present study, a total of 71 WRKY genes were identified in the pepper genome. According to structural features of their encoded proteins, the pepper WRKY genes (CaWRKY) were classified into three main groups, with the second group further divided into five subgroups. Genome mapping analysis revealed that CaWRKY were enriched on four chromosomes, especially on chromosome 1, and 15.5% of the family members were tandemly duplicated genes. A phylogenetic tree was constructed based on WRKY domain sequences derived from pepper and Arabidopsis. The expression of 21 selected CaWRKY genes in response to seven different biotic and abiotic stresses (salt, heat shock, drought, Phytophthora capsici, SA, MeJA, and ABA) was evaluated by quantitative RT-PCR; some CaWRKYs were highly expressed and up-regulated by stress treatment. Our results will provide a platform for functional identification and molecular breeding studies of WRKY genes in pepper. PMID:26941768
CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

PubMed

Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

2015-12-17

Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights in the topological mechanism of human variations and diseases. Copyright © 2015 Elsevier Inc. All rights reserved.
The path to enlightenment: making sense of genomic and proteomic information.

PubMed

Maurer, Martin H

2004-05-01

Whereas genomics describes the study of the genome, mainly represented by its gene expression on the DNA or RNA level, the term proteomics denotes the study of the proteome, which is the protein complement encoded by the genome. In recent years, the number of proteomic experiments has increased tremendously. While all fields of proteomics have made major technological advances, the biggest step was seen in bioinformatics. Biological information management relies on sequence and structure databases and powerful software tools to translate experimental results into meaningful biological hypotheses and answers. In this resource article, I provide a collection of databases and software available on the Internet that are useful to interpret genomic and proteomic data. The article is a toolbox for researchers who have genomic or proteomic datasets and need to put their findings into a biological context.
Three Infectious Viral Species Lying in Wait in the Banana Genome

PubMed Central

Chabannes, Matthieu; Baurens, Franc-Christophe; Duroy, Pierre-Olivier; Bocs, Stéphanie; Vernerey, Marie-Stéphanie; Rodier-Goud, Marguerite; Barbe, Valérie; Gayral, Philippe

2013-01-01

Plant pararetroviruses integrate serendipitously into their host genomes. The banana genome harbors integrated copies of banana streak virus (BSV) named endogenous BSV (eBSV) that are able to release infectious pararetrovirus. In this investigation, we characterized integrants of three BSV species—Goldfinger (eBSGFV), Imove (eBSImV), and Obino l'Ewai (eBSOLV)—in the seedy Musa balbisiana Pisang klutuk wulung (PKW) by studying their molecular structure, genomic organization, genomic landscape, and infectious capacity. All eBSVs exhibit extensive viral genome duplications and rearrangements. eBSV segregation analysis on an F1 population of PKW combined with fluorescent in situ hybridization analysis showed that eBSImV, eBSOLV, and eBSGFV are each present at a single locus. eBSOLV and eBSGFV contain two distinct alleles, whereas eBSImV has two structurally identical alleles. Genotyping of both eBSV and viral particles expressed in the progeny demonstrated that only one allele for each species is infectious. The infectious allele of eBSImV could not be identified since the two alleles are identical. Finally, we demonstrate that eBSGFV and eBSOLV are located on chromosome 1 and eBSImV is located on chromosome 2 of the reference Musa genome published recently. The structure and evolution of eBSVs suggest sequential integration into the plant genome, and haplotype divergence analysis confirms that the three loci display differential evolution. Based on our data, we propose a model for BSV integration and eBSV evolution in the Musa balbisiana genome. The mutual benefits of this unique host-pathogen association are also discussed. PMID:23720724
Gene Expression Dynamics Inspector (GEDI): for integrative analysis of expression profiles

NASA Technical Reports Server (NTRS)

Eichler, Gabriel S.; Huang, Sui; Ingber, Donald E.

2003-01-01

Genome-wide expression profiles contain global patterns that evade visual detection in current gene clustering analysis. Here, a Gene Expression Dynamics Inspector (GEDI) is described that uses self-organizing maps to translate high-dimensional expression profiles of time courses or sample classes into animated, coherent and robust mosaic images. GEDI facilitates identification of interesting patterns of molecular activity simultaneously across gene, time and sample space without prior assumption of any structure in the data, and then permits the user to retrieve genes of interest. Important changes in genome-wide activities may be quickly identified based on 'Gestalt' recognition and hence, GEDI may be especially useful for non-specialist end users, such as physicians. AVAILABILITY: GEDI v1.0 is written in Matlab, and binary Matlab.dll files which require Matlab to run can be downloaded for free by academic institutions at http://www.chip.org/ge/gedihome.html Supplementary information: http://www.chip.org/ge/gedihome.html.
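The self-organizing map idea named in the abstract above can be sketched in a few lines of Python. This is a generic, minimal SOM, not the GEDI code (which the abstract says is written in Matlab); the function names, the grid size, the learning-rate schedule and the Gaussian neighbourhood are all illustrative assumptions. Each grid tile holds one prototype expression profile, and reading a single time point across all tiles gives one frame of a mosaic.

```python
import math
import random


def best_matching_unit(grid, profile):
    """Return the grid coordinate whose prototype is closest to `profile`."""
    return min(
        grid,
        key=lambda k: sum((a - b) ** 2 for a, b in zip(grid[k], profile)),
    )


def train_som(profiles, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a tiny self-organizing map on expression profiles.

    `profiles` is a list of equal-length lists of floats (one per gene).
    All hyperparameters here are assumed defaults, not values from GEDI.
    """
    rng = random.Random(seed)
    dim = len(profiles[0])
    # One prototype per tile, initialised as a copy of a randomly chosen profile.
    grid = {
        (r, c): list(rng.choice(profiles))
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood shrinks, floor 0.5
        for p in rng.sample(profiles, len(profiles)):
            bmu = best_matching_unit(grid, p)
            for (r, c), w in grid.items():
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                # Pull each prototype toward the profile, weighted by grid distance.
                for i in range(dim):
                    w[i] += lr * h * (p[i] - w[i])
    return grid


def mosaic(grid, rows, cols, t):
    """Return the rows x cols matrix of prototype values at time point t."""
    return [[grid[(r, c)][t] for c in range(cols)] for r in range(rows)]
```

Rendering `mosaic(grid, rows, cols, t)` for successive values of `t` would give the sequence of frames; how GEDI itself colours and animates its tiles is not specified in the abstract.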
Comparative Analysis of Syntenic Genes in Grass Genomes Reveals Accelerated Rates of Gene Structure and Coding Sequence Evolution in Polyploid Wheat

PubMed Central

Akhunov, Eduard D.; Sehgal, Sunish; Liang, Hanquan; Wang, Shichen; Akhunova, Alina R.; Kaur, Gaganpreet; Li, Wanlong; Forrest, Kerrie L.; See, Deven; Šimková, Hana; Ma, Yaqin; Hayden, Matthew J.; Luo, Mingcheng; Faris, Justin D.; Doležel, Jaroslav; Gill, Bikram S.

2013-01-01

Cycles of whole-genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied by comparing the patterns of gene structure changes, alternative splicing (AS), and codon substitution rates among wheat and model grass genomes. In orthologous gene sets, significantly more acquired and lost exonic sequences were detected in wheat than in model grasses. In wheat, 35% of these gene structure rearrangements resulted in frame-shift mutations and premature termination codons. An increased codon mutation rate in the wheat lineage compared with Brachypodium distachyon was found for 17% of orthologs. The discovery of premature termination codons in 38% of expressed genes was consistent with ongoing pseudogenization of the wheat genome. The rates of AS within the individual wheat subgenomes (21%–25%) were similar to diploid plants. However, we uncovered a high level of AS pattern divergence between the duplicated homeologous copies of genes. Our results are consistent with the accelerated accumulation of AS isoforms, nonsynonymous mutations, and gene structure rearrangements in the wheat lineage, likely due to genetic redundancy created by WGDs. Whereas these processes mostly contribute to the degeneration of a duplicated genome and its diploidization, they have the potential to facilitate the origin of new functional variations, which, upon selection in the evolutionary lineage, may play an important role in the origin of novel traits. PMID:23124323
Genome-wide identification and analysis of the SBP-box family genes in apple (Malus × domestica Borkh.).

PubMed

Li, Jun; Hou, Hongmin; Li, Xiaoqin; Xiang, Jiang; Yin, Xiangjing; Gao, Hua; Zheng, Yi; Bassett, Carole L; Wang, Xiping

2013-09-01

SQUAMOSA promoter binding protein (SBP)-box genes encode a family of plant-specific transcription factors and play many crucial roles in plant development. In this study, 27 SBP-box gene family members were identified in the apple (Malus × domestica Borkh.) genome, 15 of which were suggested to be putative targets of MdmiR156. Plant SBPs were classified into eight groups according to the phylogenetic analysis of SBP-domain proteins. Gene structure, gene chromosomal location and synteny analyses of MdSBP genes within the apple genome demonstrated that tandem and segmental duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of the SBP-box gene family in apple. Additionally, synteny analysis between apple and Arabidopsis indicated that several paired homologs of MdSBP and AtSPL genes were located in syntenic genomic regions. Tissue-specific expression analysis of MdSBP genes in apple demonstrated their diversified spatiotemporal expression patterns. Most MdmiR156-targeted MdSBP genes, which had relatively high transcript levels in stems, leaves, apical buds and some floral organs, exhibited a more differential expression pattern than most MdmiR156-nontargeted MdSBP genes. Finally, expression analysis of MdSBP genes in leaves upon various plant hormone treatments showed that many MdSBP genes were responsive to different plant hormones, indicating that MdSBP genes may be involved in responses to hormone signaling during stress or in apple development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
An extracellular disulfide bond forming protein (DsbF) from Mycobacterium tuberculosis: Structural, biochemical and gene expression analysis

PubMed Central

Chim, Nicholas; Riley, Robert; The, Juliana; Im, Soyeon; Segelke, Brent; Lekin, Tim; Yu, Minmin; Hung, Li Wei; Terwilliger, Tom; Whitelegge, Julian P.; Goulding, Celia W.

2010-01-01

Disulfide bond forming (Dsb) proteins ensure correct folding and disulfide bond formation of secreted proteins. Previously, we showed that Mycobacterium tuberculosis DsbE (Mtb DsbE, Rv2878c) aids in vitro oxidative folding of proteins. Here we present structural, biochemical and gene expression analyses of another putative Mtb secreted disulfide bond isomerase protein homologous to Mtb DsbE, Mtb DsbF (Rv1677). The X-ray crystal structure of Mtb DsbF reveals a conserved thioredoxin fold although the active-site cysteines may be modeled in both oxidized and reduced forms, in contrast to the solely reduced form in Mtb DsbE. Furthermore, the shorter loop region in Mtb DsbF results in a more solvent-exposed active site. Biochemical analyses show that, similar to Mtb DsbE, Mtb DsbF can oxidatively refold reduced, unfolded hirudin and has a comparable pKa for the active-site solvent-exposed cysteine. However, contrary to Mtb DsbE, the Mtb DsbF redox potential is more oxidizing and its reduced state is more stable. From computational genomics analysis of the M. tuberculosis genome, we identified a potential Mtb DsbF interaction partner, Rv1676, a predicted peroxiredoxin. Complex formation is supported by protein co-expression studies and inferred by gene expression profiles, whereby Mtb DsbF and Rv1676 are upregulated under similar environments. Additionally, comparison of Mtb DsbF and Mtb DsbE gene expression data indicate anticorrelated gene expression patterns, suggesting that these two proteins and their functionally linked partners constitute analogous pathways that may function under different conditions. PMID:20060836
Gene conversion events and variable degree of homogenization of rDNA loci in cultivars of Brassica napus

PubMed Central

Sochorová, Jana; Coriton, Olivier; Kuderová, Alena; Lunerová, Jana; Chèvre, Anne-Marie; Kovařík, Aleš

2017-01-01

Background and aims: Brassica napus (AACC, 2n = 38, oilseed rape) is a relatively recent allotetraploid species derived from the putative progenitor diploid species Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18). To determine the influence of intensive breeding conditions on the evolution of its genome, we analysed structure and copy number of rDNA in 21 cultivars of B. napus, representative of genetic diversity. Methods: We used next-generation sequencing genomic approaches, Southern blot hybridization, expression analysis and fluorescence in situ hybridization (FISH). Subgenome-specific sequences derived from rDNA intergenic spacers (IGS) were used as probes for identification of loci composition on chromosomes. Key Results: Most B. napus cultivars (18/21, 86%) had more A-genome than C-genome rDNA copies. Three cultivars analysed by FISH (‘Darmor’, ‘Yudal’ and ‘Asparagus kale’) harboured the same number (12 per diploid set) of loci. In B. napus ‘Darmor’, the A-genome-specific rDNA probe hybridized to all 12 rDNA loci (eight on the A-genome and four on the C-genome) while the C-genome-specific probe showed weak signals on the C-genome loci only. Deep sequencing revealed high homogeneity of arrays suggesting that the C-genome genes were largely overwritten by the A-genome variants in B. napus ‘Darmor’. In contrast, B. napus ‘Yudal’ showed a lack of gene conversion evidenced by additive inheritance of progenitor rDNA variants and highly localized hybridization signals of subgenome-specific probes on chromosomes. Brassica napus ‘Asparagus kale’ showed an intermediate pattern to ‘Darmor’ and ‘Yudal’. At the expression level, most cultivars (95%) exhibited stable A-genome nucleolar dominance while one cultivar (‘Norin 9’) showed co-dominance. Conclusions: The B. napus cultivars differ in the degree and direction of rDNA homogenization. The prevalent direction of gene conversion (towards the A-genome) correlates with the direction of expression dominance indicating that gene activity may be needed for interlocus gene conversion. PMID:27707747
Genome-wide identification of ABA receptor PYL family and expression analysis of PYLs in response to ABA and osmotic stress in Gossypium

PubMed Central

Miao, Wenwen; Sun, Lirong; Tian, Mi; Wang, Ji

2017-01-01

Abscisic acid (ABA) receptor pyrabactin resistance1/PYR1-like/regulatory components of ABA receptor (PYR1/PYL/RCAR) (named PYLs for simplicity) are core regulators of ABA signaling, and have been well studied in Arabidopsis and rice. However, knowledge is limited about the PYL family regarding genome organization, gene structure, phylogenesis, gene expression and protein interaction with downstream targets in Gossypium. A comprehensive analysis of the Gossypium PYL family was carried out, and 21, 20, 40 and 39 PYL genes were identified in the genomes from the diploid progenitors G. arboreum and G. raimondii and the tetraploids G. hirsutum and G. barbadense, respectively. Characterization of the physical properties, chromosomal locations, structures and phylogeny of these family members revealed that Gossypium PYLs were highly conserved among the surveyed cotton species. Segmental duplication might be the main force promoting the expansion of PYLs, and the majority of the PYLs underwent evolution under purifying selection in Gossypium. Additionally, the expression profiles of GhPYL genes were tissue-specific. Transcription of many GhPYL genes was inhibited by ABA treatments and induced by osmotic stress. A number of GhPYLs can interact with GhABI1A or GhABID in the presence and/or absence of ABA, as shown by the yeast two-hybrid method. PMID:29230363
Genome-wide identification of ABA receptor PYL family and expression analysis of PYLs in response to ABA and osmotic stress in Gossypium.

PubMed

Zhang, Gaofeng; Lu, Tingting; Miao, Wenwen; Sun, Lirong; Tian, Mi; Wang, Ji; Hao, Fushun

2017-01-01

Abscisic acid (ABA) receptor pyrabactin resistance1/PYR1-like/regulatory components of ABA receptor (PYR1/PYL/RCAR) (named PYLs for simplicity) are core regulators of ABA signaling, and have been well studied in Arabidopsis and rice. However, knowledge is limited about the PYL family regarding genome organization, gene structure, phylogenesis, gene expression and protein interaction with downstream targets in Gossypium. A comprehensive analysis of the Gossypium PYL family was carried out, and 21, 20, 40 and 39 PYL genes were identified in the genomes from the diploid progenitors G. arboreum and G. raimondii and the tetraploids G. hirsutum and G. barbadense, respectively. Characterization of the physical properties, chromosomal locations, structures and phylogeny of these family members revealed that Gossypium PYLs were highly conserved among the surveyed cotton species. Segmental duplication might be the main force promoting the expansion of PYLs, and the majority of the PYLs underwent evolution under purifying selection in Gossypium. Additionally, the expression profiles of GhPYL genes were tissue-specific. Transcription of many GhPYL genes was inhibited by ABA treatments and induced by osmotic stress. A number of GhPYLs can interact with GhABI1A or GhABID in the presence and/or absence of ABA, as shown by the yeast two-hybrid method.
How gene order is influenced by the biophysics of transcription regulation

PubMed Central

Kolesov, Grigory; Wunderlich, Zeba; Laikova, Olga N.; Gelfand, Mikhail S.; Mirny, Leonid A.

2007-01-01

What are the forces that shape the structure of prokaryotic genomes: the order of genes, their proximity, and their orientation? Coregulation and coordinated horizontal gene transfer are believed to promote the proximity of functionally related genes and the formation of operons. However, forces that influence the structure of the genome beyond the level of a single operon remain unknown. Here, we show that the biophysical mechanism by which regulatory proteins search for their sites on DNA can impose constraints on genome structure. Using simulations, we demonstrate that rapid and reliable gene regulation requires that the transcription factor (TF) gene be close to the site on DNA the TF has to bind, thus promoting the colocalization of TF genes and their targets on the genome. We use parameters that have been measured in recent experiments to estimate the relevant length and time scales of this process and demonstrate that the search for a cognate site may be prohibitively slow if a TF has a low copy number and is not colocalized. We also analyze TFs and their sites in a number of bacterial genomes, confirm that they are colocalized significantly more often than expected, and show that this observation cannot be attributed to the pressure for coregulation or formation of selfish gene clusters, thus supporting the role of the biophysical constraint in shaping the structure of prokaryotic genomes. Our results demonstrate how spatial organization can influence timing and noise in gene expression. PMID:17709750
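The colocalization test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a circular chromosome and a simple null model in which each TF gene is placed uniformly at random relative to its binding site. The function names and the numbers in the usage note are hypothetical and are not taken from the paper, whose actual statistical procedure may differ.

```python
from math import comb

def circular_distance(pos_a: int, pos_b: int, genome_len: int) -> int:
    """Shortest distance (bp) between two positions on a circular chromosome."""
    d = abs(pos_a - pos_b) % genome_len
    return min(d, genome_len - d)

def p_within(window: int, genome_len: int) -> float:
    """Approximate chance that a uniformly placed gene lands within
    `window` bp on either side of a site (continuous approximation)."""
    return min(1.0, 2 * window / genome_len)

def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    colocalized pairs out of n under the random-placement null model."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

For instance, `binomial_tail(12, 40, p_within(5_000, 4_600_000))` gives the probability, under this null model, that at least 12 of 40 hypothetical TF-site pairs lie within 5 kb of each other on a 4.6 Mb genome.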
Whole-Genome Analysis of a Novel Fish Reovirus (MsReV) Discloses Aquareovirus Genomic Structure Relationship with Host in Saline Environments.

PubMed

Chen, Zhong-Yuan; Gao, Xiao-Chan; Zhang, Qi-Ya

2015-08-03

Aquareoviruses are serious pathogens of aquatic animals. Here, genome characterization and functional gene analysis of a novel aquareovirus, largemouth bass Micropterus salmoides reovirus (MsReV), was described. It comprises 11 dsRNA segments (S1-S11) covering 24,024 bp, and encodes 12 putative proteins including the inclusion forming-related protein NS87 and the fusion-associated small transmembrane (FAST) protein NS22. The function of NS22 was confirmed by expression in fish cells. Subsequently, MsReV was compared with two representative aquareoviruses, saltwater fish turbot Scophthalmus maximus reovirus (SMReV) and freshwater fish grass carp reovirus strain 109 (GCReV-109). MsReV NS87 and NS22 genes have the same structure and function with those of SMReV, whereas GCReV-109 is either missing the coiled-coil region in NS79 or the gene-encoding NS22. Significant similarities are also revealed among equivalent genome segments between MsReV and SMReV, but a difference is found between MsReV and GCReV-109. Furthermore, phylogenetic analysis showed that 13 aquareoviruses could be divided into freshwater and saline environments subgroups, and MsReV was closely related to SMReV in saline environments. Consequently, these viruses from hosts in saline environments have more genomic structural similarities than the viruses from hosts in freshwater. This is the first study of the relationships between aquareovirus genomic structure and their host environments.
Structure and Function of the Splice Variants of TMPRSS2-ERG, a Prevalent Genomic Alteration in Prostate Cancer

DTIC Science & Technology

2011-09-01

the ETS family of transcription factors showing diverse expression patterns in human tissues (Turner and Watson, 2008). ERG, similar to other...and adult mouse tissues. Most striking of these observations was highly selective and abundant expression of erg protein in endothelial cells of...mouse tissues. We for the first time clarified that endogenous ERG was not expressed in normal mouse prostate epithelium (Mohamed et al., 2010
Distinct expression patterns of glycoprotein hormone-alpha2 and -beta5 in a basal chordate suggest independent developmental functions.

PubMed

Dos Santos, Sandra; Bardet, Claire; Bertrand, Stephanie; Escriva, Hector; Habert, Damien; Querat, Bruno

2009-08-01

The vertebrate glycoprotein hormones (GpHs), gonadotropins and thyrotropin, are heterodimers composed of a common alpha- and specific beta-subunit. The recombinant heterodimer of two additional, structurally related proteins identified in vertebrate and protostome genomes, the glycoproteins-alpha2 (GPA2) and -beta5 (GPB5), was shown to activate the thyrotropin receptor and was therefore named thyrostimulin. However, differences in tissue distribution and expression levels of these proteins suggested that they might act as nonassociated factors, prompting further investigation on these proteins. In this study we show that GPA2 and GPB5 appeared with the emergence of bilateria and were maintained in most groups. These genes are tightly associated at the genomic level, an association, however, lost in tetrapods. Our structural and genomic environment comparison reinforces the hypothesis of their phylogenetic relationships with GpH-alpha and -beta. In contrast, the glycosylation status of GPA2 and GPB5 is highly variable, further questioning heterodimer secretory efficiency and activity. As a first step toward understanding their function, we investigated the spatiotemporal expression of GPA2 and GPB5 genes at different developmental stages in a basal chordate, the amphioxus. Expression of GPB5 was essentially ubiquitous with an anteroposterior gradient in embryos. GPA2 embryonic and larvae expression was restricted to specific areas and, interestingly, partially overlapped that of a GpH receptor-related gene. In conclusion, we speculate that GPA2 and GPB5 have nondispensable and coordinated functions related to a novelty that appeared with bilateria. These proteins would be active during embryonic development in a manner that does not require their heterodimerization.
Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance

PubMed Central

Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S

2009-01-01

In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455
SINEs.

PubMed

Kramerov, Dmitri A; Vassetzky, Nikita S

2011-01-01

Short interspersed elements (SINEs) are mobile genetic elements that invade the genomes of many eukaryotes. Since their discovery about 30 years ago, many gaps in our understanding of the biology and function of SINEs have been filled. This review summarizes the past and recent advances in the studies of SINEs. The structure and origin of SINEs as well as the processes involved in their amplification, transcription, RNA processing, reverse transcription, and integration of a SINE copy into the genome are considered. Then we focus on the significance of SINEs for the host genomes. While these genomic parasites can be deleterious to the cell, the long-term being in the genome has made SINEs a valuable source of genetic variation providing regulatory elements for gene expression, alternative splice sites, polyadenylation signals, and even functional RNA genes. Copyright © 2011 John Wiley & Sons, Ltd.
Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

PubMed Central

2012-01-01

Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may possibly signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures, and the expression profiling were analysed to predict the possible biological functions of the rice OsGELP genes. Conclusions Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. With combination of the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members in the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
New Era of Studying RNA Secondary Structure and Its Influence on Gene Regulation in Plants.

PubMed

Yang, Xiaofei; Yang, Minglei; Deng, Hongjing; Ding, Yiliang

2018-01-01

The dynamic structure of RNA plays a central role in post-transcriptional regulation of gene expression such as RNA maturation, degradation, and translation. With the rise of next-generation sequencing, the study of RNA structure has been transformed from in vitro low-throughput RNA structure probing methods to in vivo high-throughput RNA structure profiling. The development of these methods enables incremental studies on the function of RNA structure to be performed, revealing new insights of novel regulatory mechanisms of RNA structure in plants. Genome-wide scale RNA structure profiling allows us to investigate general RNA structural features over 10s of 1000s of mRNAs and to compare RNA structuromes between plant species. Here, we provide a comprehensive and up-to-date overview of: (i) RNA structure probing methods; (ii) the biological functions of RNA structure; (iii) genome-wide RNA structural features corresponding to their regulatory mechanisms; and (iv) RNA structurome evolution in plants.
Ancestral genomic duplication of the insulin gene in tilapia: An analysis of possible implications for clinical islet xenotransplantation using donor islets from transgenic tilapia expressing a humanized insulin gene.

PubMed

Hrytsenko, Olga; Pohajdak, Bill; Wright, James R

2016-07-03

Tilapia, a teleost fish, have multiple large anatomically discrete islets which are easy to harvest, and when transplanted into diabetic murine recipients, provide normoglycemia and mammalian-like glucose tolerance profiles. Tilapia insulin differs structurally from human insulin which could preclude their use as islet donors for xenotransplantation. Therefore, we produced transgenic tilapia with islets expressing a humanized insulin gene. It is now known that fish genomes may possess an ancestral duplication and so tilapia may have a second insulin gene. Therefore, we cloned, sequenced, and characterized the tilapia insulin 2 transcript and found that its expression is negligible in islets, is not islet-specific, and would not likely need to be silenced in our transgenic fish.
Ancestral genomic duplication of the insulin gene in tilapia: An analysis of possible implications for clinical islet xenotransplantation using donor islets from transgenic tilapia expressing a humanized insulin gene

PubMed Central

Hrytsenko, Olga; Pohajdak, Bill; Wright, James R.

2016-01-01

Tilapia, a teleost fish, have multiple large anatomically discrete islets which are easy to harvest, and when transplanted into diabetic murine recipients, provide normoglycemia and mammalian-like glucose tolerance profiles. Tilapia insulin differs structurally from human insulin which could preclude their use as islet donors for xenotransplantation. Therefore, we produced transgenic tilapia with islets expressing a humanized insulin gene. It is now known that fish genomes may possess an ancestral duplication and so tilapia may have a second insulin gene. Therefore, we cloned, sequenced, and characterized the tilapia insulin 2 transcript and found that its expression is negligible in islets, is not islet-specific, and would not likely need to be silenced in our transgenic fish. PMID:27222321
The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial ‘mobilome’

PubMed Central

Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

2009-01-01

Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element ‘mobilome’. PMID:19840100
The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial 'mobilome'.

PubMed

Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

2009-11-01

Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The approximately 108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element 'mobilome'.
Genome-wide Analyses of the Structural Gene Families Involved in the Legume-specific 5-Deoxyisoflavonoid Biosynthesis of Lotus japonicus

PubMed Central

Shimada, Norimoto; Sato, Shusei; Akashi, Tomoyoshi; Nakamura, Yasukazu; Tabata, Satoshi; Ayabe, Shin-ichi; Aoki, Toshio

2007-01-01

A model legume Lotus japonicus (Regel) K. Larsen is one of the subjects of genome sequencing and functional genomics programs. In the course of targeted approaches to the legume genomics, we analyzed the genes encoding enzymes involved in the biosynthesis of the legume-specific 5-deoxyisoflavonoid of L. japonicus, which produces isoflavan phytoalexins on elicitor treatment. The paralogous biosynthetic genes were assigned as comprehensively as possible by biochemical experiments, similarity searches, comparison of the gene structures, and phylogenetic analyses. Among the 10 biosynthetic genes investigated, six comprise multigene families, and in many cases they form gene clusters in the chromosomes. Semi-quantitative reverse transcriptase–PCR analyses showed coordinate up-regulation of most of the genes during phytoalexin induction and complex accumulation patterns of the transcripts in different organs. Some paralogous genes exhibited similar expression specificities, suggesting their genetic redundancy. The molecular evolution of the biosynthetic genes is discussed. The results presented here provide reliable annotations of the genes and genetic markers for comparative and functional genomics of leguminous plants. PMID:17452423
Was it worth it? Patients' perspectives on the perceived value of genomic-based individualized medicine.

PubMed

Halverson, Colin Me; Clift, Kristin E; McCormick, Jennifer B

2016-04-01

The value of genomic sequencing is often understood in terms of its ability to affect diagnosis or treatment. In these terms, successes occur only in a minority of cases. This paper presents views from patients who had exome sequencing done clinically to explore how they perceive the utility of genomic medicine. The authors used semi-structured, qualitative interviews in order to study patients' attitudes toward genomic sequencing in oncology and rare-disease settings. Participants from 37 cases were interviewed. In terms of the testing's key values, and regardless of having received what clinicians described as meaningful results, participants expressed four qualities that are separate from traditional views of clinical utility: Participants felt they had been empowered over their own health. They felt they had contributed altruistically to the progress of genomic technology in medicine. They felt their suffering had been legitimated. They also felt a sense of closure, having done everything they could. Patients expressed overwhelmingly positive attitudes toward sequencing. Their rationale was not solely based on the results' clinical utility. It is important for clinicians to understand this non-medical reasoning as it pertains to patient decision-making and informed consent.
Sooty mangabey genome sequence provides insight into AIDS resistance in a natural SIV host.

PubMed

Palesch, David; Bosinger, Steven E; Tharp, Gregory K; Vanderford, Thomas H; Paiardini, Mirko; Chahroudi, Ann; Johnson, Zachary P; Kirchhoff, Frank; Hahn, Beatrice H; Norgren, Robert B; Patel, Nirav B; Sodora, Donald L; Dawoud, Reem A; Stewart, Caro-Beth; Seepo, Sara M; Harris, R Alan; Liu, Yue; Raveendran, Muthuswamy; Han, Yi; English, Adam; Thomas, Gregg W C; Hahn, Matthew W; Pipes, Lenore; Mason, Christopher E; Muzny, Donna M; Gibbs, Richard A; Sauter, Daniel; Worley, Kim; Rogers, Jeffrey; Silvestri, Guido

2018-01-03

In contrast to infections with human immunodeficiency virus (HIV) in humans and simian immunodeficiency virus (SIV) in macaques, SIV infection of a natural host, sooty mangabeys (Cercocebus atys), is non-pathogenic despite high viraemia. Here we sequenced and assembled the genome of a captive sooty mangabey. We conducted genome-wide comparative analyses of transcript assemblies from C. atys and AIDS-susceptible species, such as humans and macaques, to identify candidates for host genetic factors that influence susceptibility. We identified several immune-related genes in the genome of C. atys that show substantial sequence divergence from macaques or humans. One of these sequence divergences, a C-terminal frameshift in the toll-like receptor-4 (TLR4) gene of C. atys, is associated with a blunted in vitro response to TLR-4 ligands. In addition, we found a major structural change in exons 3-4 of the immune-regulatory protein intercellular adhesion molecule 2 (ICAM-2); expression of this variant leads to reduced cell surface expression of ICAM-2. These data provide a resource for comparative genomic studies of HIV and/or SIV pathogenesis and may help to elucidate the mechanisms by which SIV-infected sooty mangabeys avoid AIDS.
Direct Capture Technologies for Genomics-Guided Discovery of Natural Products.

PubMed

Chan, Andrew N; Santa Maria, Kevin C; Li, Bo

2016-01-01

Microbes are important producers of natural products, which have played key roles in understanding biology and treating disease. However, the full potential of microbes to produce natural products has yet to be realized; the overwhelming majority of natural product gene clusters encoded in microbial genomes remain "cryptic", and have not been expressed or characterized. In contrast to the fast-growing number of genomic sequences and bioinformatic tools, methods to connect these genes to natural product molecules are still limited, creating a bottleneck in genome-mining efforts to discover novel natural products. Here we review developing technologies that leverage the power of homologous recombination to directly capture natural product gene clusters and express them in model hosts for isolation and structural characterization. Although direct capture is still in its early stages of development, it has been successfully utilized in several different classes of natural products. These early successes will be reviewed, and the methods will be compared and contrasted with existing traditional technologies. Lastly, we will discuss the opportunities for the development of direct capture in other organisms, and possibilities to integrate direct capture with emerging genome-editing techniques to accelerate future study of natural products.
Sooty mangabey genome sequence provides insight into AIDS resistance in a natural SIV host

PubMed Central

Palesch, David; Bosinger, Steven E.; Tharp, Gregory K.; Vanderford, Thomas H.; Paiardini, Mirko; Chahroudi, Ann; Johnson, Zachary P.; Kirchhoff, Frank; Hahn, Beatrice H.; Norgren, Robert B.; Patel, Nirav B.; Sodora, Donald L.; Dawoud, Reem A.; Stewart, Caro-Beth; Seepo, Sara M.; Harris, R. Alan; Liu, Yue; Raveendran, Muthuswamy; Han, Yi; English, Adam; Thomas, Gregg W. C.; Hahn, Matthew W.; Pipes, Lenore; Mason, Christopher E.; Muzny, Donna M.; Gibbs, Richard A.; Sauter, Daniel; Worley, Kim; Rogers, Jeffrey; Silvestri, Guido

2018-01-01

In contrast to infections with human immunodeficiency virus (HIV) in humans and simian immunodeficiency virus (SIV) in macaques, SIV infection of a natural host, sooty mangabeys (Cercocebus atys), is non-pathogenic despite high viraemia. Here we sequenced and assembled the genome of a captive sooty mangabey. We conducted genome-wide comparative analyses of transcript assemblies from C. atys and AIDS-susceptible species, such as humans and macaques, to identify candidates for host genetic factors that influence susceptibility. We identified several immune-related genes in the genome of C. atys that show substantial sequence divergence from macaques or humans. One of these sequence divergences, a C-terminal frameshift in the toll-like receptor-4 (TLR4) gene of C. atys, is associated with a blunted in vitro response to TLR-4 ligands. In addition, we found a major structural change in exons 3–4 of the immune-regulatory protein intercellular adhesion molecule 2 (ICAM-2); expression of this variant leads to reduced cell surface expression of ICAM-2. These data provide a resource for comparative genomic studies of HIV and/or SIV pathogenesis and may help to elucidate the mechanisms by which SIV-infected sooty mangabeys avoid AIDS. PMID:29300007
Analysis of genomic regions of Trichoderma harzianum IOC-3844 related to biomass degradation.

PubMed

Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

2015-01-01

Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes.
Analysis of Genomic Regions of Trichoderma harzianum IOC-3844 Related to Biomass Degradation

PubMed Central

Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

2015-01-01

Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes. PMID:25836973
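The library figures in the abstract above (5,760 clones, 90 kbp average insert) can be turned into a rough coverage estimate. The sketch below uses the standard Clarke-Carbon relation, P = 1 - (1 - I/G)^N, for the probability that a given locus is present in a library. The genome size is not stated in the abstract, so the 40 Mbp used in the usage note is a hypothetical placeholder, and the function names are illustrative only.

```python
def total_insert_mbp(n_clones: int, mean_insert_kbp: float) -> float:
    """Total cloned DNA in the library, in Mbp."""
    return n_clones * mean_insert_kbp / 1000

def fold_coverage(n_clones: int, mean_insert_kbp: float, genome_mbp: float) -> float:
    """Library size expressed in genome equivalents."""
    return total_insert_mbp(n_clones, mean_insert_kbp) / genome_mbp

def prob_locus_covered(n_clones: int, mean_insert_kbp: float, genome_mbp: float) -> float:
    """Clarke-Carbon estimate that any given locus is in at least one clone.

    Assumes inserts are sampled uniformly and independently from the genome.
    """
    fraction = mean_insert_kbp / (genome_mbp * 1000)  # insert size / genome size
    return 1 - (1 - fraction) ** n_clones
```

With the reported figures, `total_insert_mbp(5760, 90)` is 518.4 Mbp of cloned DNA. Calling `fold_coverage(5760, 90, genome_mbp=40)` then expresses that as genome equivalents under the assumed (not reported) 40 Mbp genome size.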
Structural Genomics of Bacterial Virulence Factors

DTIC Science & Technology

2005-05-01

is deficient to mammals and unique to bacteria, the enzymes involved in the pathway may be useful for antibiotic design. Recent genome sequence...the SARS S1 spike protein with a high affinity antibody (80R)" (Sui et al., 2004). Both the S1 protein and antibody have been expressed and purified in... Streptococcus group are now in preparation. Key Research Accomplishments * Development of the VirFact database of virulence
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Chromatin organization and global regulation of Hox gene clusters

PubMed Central

Montavon, Thomas; Duboule, Denis

2013-01-01

During development, a properly coordinated expression of Hox genes, within their different genomic clusters is critical for patterning the body plans of many animals with a bilateral symmetry. The fascinating correspondence between the topological organization of Hox clusters and their transcriptional activation in space and time has served as a paradigm for understanding the relationships between genome structure and function. Here, we review some recent observations, which revealed highly dynamic changes in the structure of chromatin at Hox clusters, in parallel with their activation during embryonic development. We discuss the relevance of these findings for our understanding of large-scale gene regulation. PMID:23650639
Conserved structure and expression of hsp70 paralogs in teleost fishes.

PubMed

Metzger, David C H; Hemmer-Hansen, Jakob; Schulte, Patricia M

2016-06-01

The cytosolic 70 kDa heat shock proteins (Hsp70s) are widely used as biomarkers of environmental stress in ecological and toxicological studies in fish. Here we analyze teleost genome sequences to show that two genes encoding inducible hsp70s (hsp70-1 and hsp70-2) are likely present in all teleost fish. Phylogenetic and synteny analyses indicate that hsp70-1 and hsp70-2 are distinct paralogs that originated prior to the diversification of the teleosts. The promoters of both genes contain a TATA box and conserved heat shock elements (HSEs), but unlike mammalian HSP70s, both genes contain an intron in the 5' UTR. The hsp70-2 gene has undergone tandem duplication in several species. In addition, many other teleost genome assemblies have multiple copies of hsp70-2 present on separate, small, genomic scaffolds. To verify that these represent poorly assembled tandem duplicates, we cloned the genomic region surrounding hsp70-2 in Fundulus heteroclitus and showed that the hsp70-2 gene copies that are on separate scaffolds in the genome assembly are arranged as tandem duplicates. Real-time quantitative PCR of F. heteroclitus genomic DNA indicates that four copies of the hsp70-2 gene are likely present in the F. heteroclitus genome. Comparison of expression patterns in F. heteroclitus and Gasterosteus aculeatus demonstrates that hsp70-2 has a higher fold increase than hsp70-1 following heat shock in gill but not in muscle tissue, revealing a conserved difference in expression patterns between isoforms and tissues. These data indicate that ecological and toxicological studies using hsp70 as a biomarker in teleosts should take this complexity into account. Copyright © 2016 Elsevier Inc. All rights reserved.
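The copy-number estimate from real-time quantitative PCR on genomic DNA can be illustrated with a minimal sketch. It assumes the common comparative-Ct approach with a single-copy reference locus and perfect doubling per cycle; the abstract does not say which quantification method the authors used, and the function name and Ct values below are hypothetical.

```python
def relative_copy_number(ct_target, ct_reference, efficiency=2.0, reference_copies=1):
    """Estimate target gene copies relative to a reference locus.

    Assumption (not from the abstract): both amplicons amplify with the
    same per-cycle efficiency, so each cycle of Ct difference corresponds
    to an `efficiency`-fold difference in starting template.
    """
    delta_ct = ct_reference - ct_target  # lower Ct means more starting template
    return reference_copies * efficiency ** delta_ct


# Invented Ct values for illustration: the target crosses the threshold
# two cycles earlier than the single-copy reference.
estimate = relative_copy_number(ct_target=22.0, ct_reference=24.0)
print(estimate)
```

In practice the amplification efficiency of each primer pair would be measured from a standard curve rather than assumed to be exactly 2.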
Genome-wide Identification and Expression Analysis of the CDPK Gene Family in Grape, Vitis spp.

PubMed

Zhang, Kai; Han, Yong-Tao; Zhao, Feng-Li; Hu, Yang; Gao, Yu-Rong; Ma, Yan-Fei; Zheng, Yi; Wang, Yue-Jin; Wen, Ying-Qiang

2015-06-30

Calcium-dependent protein kinases (CDPKs) play vital roles in plant growth and development, biotic and abiotic stress responses, and hormone signaling. Little is known about the CDPK gene family in grapevine. In this study, we performed a genome-wide analysis of the 12X grape genome (Vitis vinifera) and identified nineteen CDPK genes. Comparison of the structures of grape CDPK genes allowed us to examine their functional conservation and differentiation. Segmentally duplicated grape CDPK genes showed high structural conservation and contributed to gene family expansion. Additional comparisons between grape and Arabidopsis thaliana demonstrated that several grape CDPK genes occurred in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grapevine and Arabidopsis. Phylogenetic analysis divided the grape CDPK genes into four groups. Furthermore, we examined the expression of the corresponding nineteen homologous CDPK genes in the Chinese wild grape (Vitis pseudoreticulata) under various conditions, including biotic stress, abiotic stress, and hormone treatments. The expression profiles derived from reverse transcription and quantitative PCR suggested that a large number of VpCDPKs responded to various stimuli at the transcriptional level, indicating their versatile roles in the responses to biotic and abiotic stresses. Moreover, we examined the subcellular localization of VpCDPKs by transiently expressing six VpCDPK-GFP fusion proteins in Arabidopsis mesophyll protoplasts; this revealed high variability consistent with potential functional differences. Taken as a whole, our data provide significant insights into the evolution and function of grape CDPKs and a framework for future investigation of grape CDPK genes.
Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell

PubMed Central

Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A.; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

2016-01-01

The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. PMID:26601937
NCBI GEO: archive for functional genomics data sets—10 years on

PubMed Central

Barrett, Tanya; Troup, Dennis B.; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Muertter, Rolf N.; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra

2011-01-01

A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20 000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/. PMID:21097893
Insights into the noncoding RNome of nitrogen-fixing endosymbiotic α-proteobacteria.

PubMed

Jiménez-Zurdo, José I; Valverde, Claudio; Becker, Anke

2013-02-01

Symbiotic chronic infection of legumes by rhizobia involves transition of invading bacteria from a free-living environment in soil to an intracellular state as differentiated nitrogen-fixing bacteroids within the nodules elicited in the host plant. The adaptive flexibility demanded by this complex lifestyle is likely facilitated by the large set of regulatory proteins encoded by rhizobial genomes. However, proteins are not the only relevant players in the regulation of gene expression in bacteria. Large-scale high-throughput analysis of prokaryotic genomes is evidencing the expression of an unexpected plethora of small untranslated transcripts (sRNAs) with housekeeping or regulatory roles. sRNAs mostly act in response to environmental cues as post-transcriptional regulators of gene expression through protein-assisted base-pairing interactions with target mRNAs. Riboregulation contributes to fine-tune a wide range of bacterial processes which, in intracellular animal pathogens, largely compromise virulence traits. Here, we summarize the incipient knowledge about the noncoding RNome structure of nitrogen-fixing endosymbiotic bacteria as inferred from genome-wide searches for sRNA genes in the alfalfa partner Sinorhizobium meliloti and further comparative genomics analysis. The biology of relevant S. meliloti RNA chaperones (e.g., Hfq) is also reviewed as a first global indicator of the impact of riboregulation in the establishment of the symbiotic interaction.

Constructing a 'Chromonome' of Yellowtail (Seriola quinqueradiata) for Comparative Analysis of Chromosomal Rearrangements

PubMed Central

Kawase, Junya; Aoki, Jun-ya; Araki, Kazuo

2018-01-01

To investigate chromosome evolution in fish species, we newly mapped 181 markers that allowed us to construct a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map with 1,713 DNA markers, which was far denser than a previous map, and we anchored the de novo assembled sequences onto the RH physical map. Finally, we mapped a total of 13,977 expressed sequence tags (ESTs) on a genome sequence assembly aligned with the physical map. Using the high-density physical map and anchored genome sequences, we accurately compared the yellowtail genome structure with the genome structures of five model fishes to identify characteristics of the yellowtail genome. Between yellowtail and Japanese medaka (Oryzias latipes), almost all regions of the chromosomes were conserved and some blocks comprising several markers were translocated. Using the genome information of the spotted gar (Lepisosteus oculatus) as a reference, we further documented syntenic relationships and chromosomal rearrangements that occurred during evolution in four other acanthopterygian species (Japanese medaka, zebrafish, spotted green pufferfish and three-spined stickleback). The evolutionary chromosome translocation frequency was 1.5-2-times higher in yellowtail than in medaka, pufferfish, and stickleback. PMID:29290830
Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules.

PubMed

Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P

2013-03-21

Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. 
Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group, we provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso. Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term.

PubMed

Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard

2014-09-01

To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.
The detailed 3D multi-loop aggregate/rosette chromatin architecture and functional dynamic organization of the human and mouse genomes.

PubMed

Knoch, Tobias A; Wachsmuth, Malte; Kepper, Nick; Lesnussa, Michael; Abuseiris, Anis; Ali Imam, A M; Kolovos, Petros; Zuin, Jessica; Kockx, Christel E M; Brouwer, Rutger W W; van de Werken, Harmen J G; van IJcken, Wilfred F J; Wendt, Kerstin S; Grosveld, Frank G

2016-01-01

The dynamic three-dimensional chromatin architecture of genomes and its co-evolutionary connection to its function (the storage, expression, and replication of genetic information) is still one of the central issues in biology. Here, we describe the much debated 3D architecture of the human and mouse genomes from the nucleosomal to the megabase pair level by a novel approach combining selective high-throughput high-resolution chromosomal interaction capture (T2C), polymer simulations, and scaling analysis of the 3D architecture and the DNA sequence. The genome is compacted into a chromatin quasi-fibre with ~5 ± 1 nucleosomes/11 nm, folded into stable ~30-100 kbp loops forming stable loop aggregates/rosettes connected by similar sized linkers. Minor but significant variations in the architecture are seen between cell types and functional states. The architecture and the DNA sequence show very similar fine-structured multi-scaling behaviour confirming their co-evolution and the above. This architecture, its dynamics, and accessibility, balance stability and flexibility ensuring genome integrity and variation enabling gene expression/regulation by self-organization of (in)active units already in proximity. Our results agree with the heuristics of the field and allow "architectural sequencing" at a genome mechanics level to understand the inseparable systems genomic properties.
Genome-wide organization and expression profiling of the R2R3-MYB transcription factor family in pineapple (Ananas comosus).

PubMed

Liu, Chaoyang; Xie, Tao; Chen, Chenjie; Luan, Aiping; Long, Jianmei; Li, Chuhao; Ding, Yaqi; He, Yehua

2017-07-01

The MYB proteins comprise one of the largest families of plant transcription factors, which are involved in various plant physiological and biochemical processes. Pineapple (Ananas comosus) is one of the three most important tropical fruits worldwide. The completion of pineapple genome sequencing provides a great opportunity to investigate the organization and evolutionary traits of pineapple MYB genes at the genome-wide level. In the present study, a total of 94 pineapple R2R3-MYB genes were identified and further phylogenetically classified into 26 subfamilies, as supported by the conserved gene structures and motif composition. Collinearity analysis indicated that segmental duplication events played a crucial role in the expansion of the pineapple MYB gene family. Further comparative phylogenetic analysis suggested that there have been functional divergences of the MYB gene family during plant evolution. RNA-seq data from different tissues and developmental stages revealed distinct temporal and spatial expression profiles of the AcMYB genes. Further quantitative expression analysis showed the specific expression patterns of the selected putative stress-related AcMYB genes in response to distinct abiotic stress and hormonal treatments. The comprehensive expression analysis of the pineapple MYB genes, especially the tissue-preferential and stress-responsive genes, could provide valuable clues for further functional characterization. In this work, we systematically identified AcMYB genes by analyzing the pineapple genome sequence using a set of bioinformatics approaches. Our findings provide a global insight into the organization, phylogeny and expression patterns of the pineapple R2R3-MYB genes, and hence contribute to the greater understanding of their biological roles in pineapple.
Genome-Wide Identification, Phylogenetic and Expression Analyses of the Ubiquitin-Conjugating Enzyme Gene Family in Maize.

PubMed

Jue, Dengwei; Sang, Xuelian; Lu, Shengqiao; Dong, Chen; Zhao, Qiufang; Chen, Hongliang; Jia, Liqiang

2015-01-01

Ubiquitination is a post-translational modification where ubiquitin is attached to a substrate. Ubiquitin-conjugating enzymes (E2s) play a major role in the ubiquitin transfer pathway, as well as a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays). In the present study, a total of 75 putative ZmUBC genes have been identified and located in the maize genome. Phylogenetic analysis revealed that ZmUBC proteins could be divided into 15 subfamilies, which include 13 ubiquitin-conjugating enzymes (ZmE2s) and two independent ubiquitin-conjugating enzyme variant (UEV) groups. The predicted ZmUBC genes were distributed across 10 chromosomes at different densities. In addition, analysis of exon-intron junctions and sequence motifs in each candidate gene has revealed high levels of conservation within and between phylogenetic groups. Tissue expression analysis indicated that most ZmUBC genes were expressed in at least one of the tissues, indicating that these are involved in various physiological and developmental processes in maize. Moreover, expression profile analyses of ZmUBC genes under different stress treatments (4°C, 20% PEG6000, and 200 mM NaCl) and various expression patterns indicated that these may play crucial roles in the response of plants to stress. Genome-wide identification, chromosome organization, gene structure, evolutionary and expression analyses of ZmUBC genes have facilitated the characterization of this gene family and indicated its potential involvement in growth, development, and stress responses. This study provides valuable information for better understanding the classification and putative functions of the UBC-encoding genes of maize.
Genome-Wide Identification, Phylogenetic and Expression Analyses of the Ubiquitin-Conjugating Enzyme Gene Family in Maize

PubMed Central

Jue, Dengwei; Sang, Xuelian; Lu, Shengqiao; Dong, Chen; Zhao, Qiufang; Chen, Hongliang; Jia, Liqiang

2015-01-01

Background: Ubiquitination is a post-translational modification where ubiquitin is attached to a substrate. Ubiquitin-conjugating enzymes (E2s) play a major role in the ubiquitin transfer pathway, as well as a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays). Methodology/Principal Findings: In the present study, a total of 75 putative ZmUBC genes have been identified and located in the maize genome. Phylogenetic analysis revealed that ZmUBC proteins could be divided into 15 subfamilies, which include 13 ubiquitin-conjugating enzymes (ZmE2s) and two independent ubiquitin-conjugating enzyme variant (UEV) groups. The predicted ZmUBC genes were distributed across 10 chromosomes at different densities. In addition, analysis of exon-intron junctions and sequence motifs in each candidate gene has revealed high levels of conservation within and between phylogenetic groups. Tissue expression analysis indicated that most ZmUBC genes were expressed in at least one of the tissues, indicating that these are involved in various physiological and developmental processes in maize. Moreover, expression profile analyses of ZmUBC genes under different stress treatments (4°C, 20% PEG6000, and 200 mM NaCl) and various expression patterns indicated that these may play crucial roles in the response of plants to stress. Conclusions: Genome-wide identification, chromosome organization, gene structure, evolutionary and expression analyses of ZmUBC genes have facilitated the characterization of this gene family and indicated its potential involvement in growth, development, and stress responses. This study provides valuable information for better understanding the classification and putative functions of the UBC-encoding genes of maize. PMID:26606743
The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

PubMed

Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

2015-01-01

A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
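Since the abstract defines a TTS as a stretch of DNA composed of polypurines, a candidate search can be sketched as a scan for uninterrupted purine runs. This is only an illustration and not the TTSMI pipeline: the function name, the minimum length of 15 nt, and the choice to report pyrimidine runs as purine runs on the opposite strand are all assumptions made here.

```python
import re


def find_polypurine_tracts(sequence, min_length=15):
    """Return sorted (start, end, strand) tuples for uninterrupted purine runs.

    Coordinates are 0-based, end-exclusive, on the given (forward) strand.
    A run of pyrimidines (C/T) on the forward strand is reported with
    strand "-" because its complement is a purine run on the reverse strand.
    `min_length` is an illustrative threshold, not a value from the abstract.
    """
    seq = sequence.upper()
    hits = []
    for strand, base_class in (("+", "[AG]"), ("-", "[CT]")):
        pattern = f"{base_class}{{{min_length},}}"  # e.g. "[AG]{15,}"
        for match in re.finditer(pattern, seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


# Toy sequence with a short threshold so the example stays readable.
print(find_polypurine_tracts("ACGTAGGAGATGCA", min_length=5))
```

A real catalog would additionally score the stability and specificity of each site, as the search criteria listed in the abstract suggest; that scoring is not attempted here.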
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background: Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results: Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions: We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
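The dyad symmetry noted for the DNA encoding these domains can be made concrete with a short sketch: a sequence with perfect dyad symmetry reads the same as its own reverse complement. The scoring function below is a hypothetical illustration, not the analysis used in the paper, and the example sequences are unrelated to PARCELs.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def dyad_symmetry_score(seq):
    """Fraction of positions at which a sequence equals its reverse complement.

    A score of 1.0 means a perfect DNA palindrome. For odd-length sequences
    the central base can never match its own complement, so the maximum
    score is below 1.0.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    rc = reverse_complement(seq)
    matches = sum(a == b for a, b in zip(seq, rc))
    return matches / len(seq)


print(dyad_symmetry_score("GAATTC"))  # EcoRI site, a perfect palindrome
print(dyad_symmetry_score("AAAAAA"))  # no dyad symmetry
```

Sliding such a score along a coding sequence would be one simple way to look for the imperfect palindromes that a repeat element might carry.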
CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

PubMed

Lee, Mikyung; Kim, Yangseok

2009-12-16

Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, published software remains limited in its ability to perform comprehensive integrated analysis because of constraints in the implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold-based method or the SW-ARRAY algorithm and investigates whether the detected alteration is phenotype-specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying the following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. 
In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square test. By successive operations of two modules, users can clarify how gene expression levels are affected by the phenotype specific genomic alterations. As CHESS was developed in both Java application and web environments, it can be run on a web browser or a local machine. It also supports all experimental platforms if a properly formatted text file is provided to include the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.
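The correlation step described in this abstract can be illustrated with a minimal sketch. The Python below is not CHESS code (CHESS itself is a Java program). It only shows, on made-up values, how a Pearson and a Spearman correlation between CGH log2 ratios and expression levels might be computed, together with a simple threshold-based gain/loss call. The function names, the ±0.25 thresholds, and the data are assumptions for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def call_alteration(log2_ratio, gain=0.25, loss=-0.25):
    """Threshold-based call; the cut-offs are hypothetical, not CHESS defaults."""
    if log2_ratio >= gain:
        return "gain"
    if log2_ratio <= loss:
        return "loss"
    return "neutral"

# Hypothetical per-probe data: CGH log2 ratios and expression log2 ratios.
cgh_ratio = [-0.8, -0.3, 0.0, 0.4, 0.9]
expression = [-1.1, -0.2, 0.1, 0.3, 1.5]

print("Pearson :", round(pearson(cgh_ratio, expression), 3))
print("Spearman:", round(spearman(cgh_ratio, expression), 3))
print("Calls   :", [call_alteration(r) for r in cgh_ratio])
```

Spearman correlation is computed on ranks, so it responds to any monotonic relationship, whereas Pearson correlation measures linear association. This may be one reason a tool would offer both.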
High-throughput Cloning and Expression of Integral Membrane Proteins in Escherichia coli

PubMed Central

Bruni, Renato

2014-01-01

Recently, several structural genomics centers have been established and a remarkable number of three-dimensional structures of soluble proteins have been solved. For membrane proteins, the number of structures solved has been significantly trailing those for their soluble counterparts, not least because over-expression and purification of membrane proteins is a much more arduous process. By using high throughput technologies, a large number of membrane protein targets can be screened simultaneously and a greater number of expression and purification conditions can be employed, leading to a higher probability of successfully determining the structure of membrane proteins. This unit describes the cloning, expression and screening of membrane proteins using high throughput methodologies developed in our laboratory. Basic Protocol 1 deals with the cloning of inserts into expression vectors by ligation-independent cloning. Basic Protocol 2 describes the expression and purification of the target proteins on a miniscale. Lastly, for the targets that express at the miniscale, Basic Protocols 3 and 4 outline the methods employed for the expression and purification of targets at the midi-scale, as well as a procedure for detergent screening and identification of detergent(s) in which the target protein is stable. PMID:24510647
The Tc1/mariner transposable element family shapes genetic variation and gene expression in the protist Trichomonas vaginalis

PubMed Central

2014-01-01

Background Trichomonas vaginalis is the most prevalent non-viral sexually transmitted parasite. Although the protist is presumed to reproduce asexually, 60% of its haploid genome contains transposable elements (TEs), known contributors to genome variability. The availability of a draft genome sequence and our collection of >200 global isolates of T. vaginalis facilitate the study and analysis of TE population dynamics and their contribution to genomic variability in this protist. Results We present here a pilot study of a subset of class II Tc1/mariner TEs that belong to the T. vaginalis Tvmar1 family. We report the genetic structure of 19 Tvmar1 loci, their ability to encode a full-length transposase protein, and their insertion frequencies in 94 global isolates from seven regions of the world. While most of the Tvmar1 elements studied exhibited low insertion frequencies, two of the 19 loci (locus 1 and locus 9) show high insertion frequencies of 1.00 and 0.96, respectively. The genetic structuring of the global populations identified by principal component analysis (PCA) of the Tvmar1 loci is in general agreement with published data based on genotyping, showing that Tvmar1 polymorphisms are a robust indicator of T. vaginalis genetic history. Analysis of expression of 22 genes flanking 13 Tvmar1 loci indicated significantly altered expression of six of the genes next to five Tvmar1 insertions, suggesting that the insertions have functional implications for T. vaginalis gene expression. Conclusions Our study is the first in T. vaginalis to describe Tvmar1 population dynamics and its contribution to genetic variability of the parasite. We show that a majority of our studied Tvmar1 insertion loci exist at very low frequencies in the global population, and insertions are variable between geographical isolates. In addition, we observe that low frequency insertion is related to reduced or abolished expression of flanking genes. 
While low insertion frequencies might be expected, we identified two Tvmar1 insertion loci that are fixed across global populations. This observation indicates that Tvmar1 insertion may have differing impacts and fitness costs in the host genome and may play varying roles in the adaptive evolution of T. vaginalis. PMID:24834134
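The insertion frequencies quoted in this abstract are proportions of genotyped isolates that carry an insertion at a given locus. A minimal sketch of that arithmetic follows, assuming presence/absence genotype calls per isolate. The locus names, the calls, and the function name are invented for illustration and are not the study's data.

```python
def insertion_frequency(calls):
    """Fraction of genotyped isolates carrying the insertion.

    `calls` holds True (insertion present), False (absent) or None
    (isolate could not be genotyped at this locus and is skipped).
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        raise ValueError("no genotyped isolates for this locus")
    return sum(scored) / len(scored)

# Hypothetical presence/absence calls for three loci across eight isolates.
genotypes = {
    "locus_A": [True] * 8,                      # present in every isolate
    "locus_B": [True, False, False, False, False, False, False, None],
    "locus_C": [False] * 7 + [True],
}

for locus, calls in genotypes.items():
    print(locus, round(insertion_frequency(calls), 2))
```

Isolates that could not be scored are left out of the denominator here. That is one possible convention, not necessarily the one used in the study.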
Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics.

PubMed

Li, Sanshu; Breaker, Ronald R

2017-10-13

With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria, likewise, should be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies. We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements. Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. 
Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs. Although initial examinations of several motifs provide evidence for their likely functions, other motifs will require more in-depth analysis to reveal their functions.
‘Someday it will be the norm’: physician perspectives on the utility of genome sequencing for patient care in the MedSeq Project

PubMed Central

Vassy, Jason L; Christensen, Kurt D; Slashinski, Melody J; Lautenbach, Denise M; Raghavan, Sridharan; Robinson, Jill Oliver; Blumenthal-Barby, Jennifer; Feuerman, Lindsay Zausmer; Lehmann, Lisa Soleymani; Murray, Michael F; Green, Robert C; McGuire, Amy L

2015-01-01

Aim To describe practicing physicians’ perceived clinical utility of genome sequencing. Materials & methods We conducted a mixed-methods analysis of data from 18 primary care physicians and cardiologists in a study of the clinical integration of whole-genome sequencing. Physicians underwent brief genomics continuing medical education before completing surveys and semi-structured interviews. Results Physicians described sequencing as currently lacking clinical utility because of its uncertain interpretation and limited impact on clinical decision-making, but they expressed the idea that its clinical integration was inevitable. Potential clinical uses for sequencing included complementing other clinical information, risk stratification, motivating patient behavior change and pharmacogenetics. Conclusion Physicians given genomics continuing medical education use the language of both evidence-based and personalized medicine in describing the utility of genome-wide testing in patient care. PMID:25642274
Genome Evolution Due to Allopolyploidization in Wheat

PubMed Central

Feldman, Moshe; Levy, Avraham A.

2012-01-01

The wheat group has evolved through allopolyploidization, namely, through hybridization among species from the plant genera Aegilops and Triticum followed by genome doubling. This speciation process has been associated with ecogeographical expansion and with domestication. In the past few decades, we have searched for explanations for this impressive success. Our studies attempted to probe the bases for the wide genetic variation characterizing these species, which accounts for their great adaptability and colonizing ability. Central to our work was the investigation of how allopolyploidization alters genome structure and expression. We found in wheat that allopolyploidy accelerated genome evolution in two ways: (1) it triggered rapid genome alterations through the instantaneous generation of a variety of cardinal genetic and epigenetic changes (which we termed “revolutionary” changes), and (2) it facilitated sporadic genomic changes throughout the species’ evolution (i.e., evolutionary changes), which are not attainable at the diploid level. Our major findings in natural and synthetic allopolyploid wheat indicate that these alterations have led to the cytological and genetic diploidization of the allopolyploids. These genetic and epigenetic changes reflect the dynamic structural and functional plasticity of the allopolyploid wheat genome. The significance of this plasticity for the successful establishment of wheat allopolyploids, in nature and under domestication, is discussed. PMID:23135324
Comparative Transcriptomic Analysis of Two Brassica napus Near-Isogenic Lines Reveals a Network of Genes That Influences Seed Oil Accumulation.

PubMed

Wang, Jingxue; Singh, Sanjay K; Du, Chunfang; Li, Chen; Fan, Jianchun; Pattanaik, Sitakanta; Yuan, Ling

2016-01-01

Rapeseed (Brassica napus) is an important oil seed crop, providing more than 13% of the world's supply of edible oils. An in-depth knowledge of the gene network involved in biosynthesis and accumulation of seed oil is critical for the improvement of B. napus. Using available genomic and transcriptomic resources, we identified 1,750 acyl-lipid metabolism (ALM) genes that are distributed over 19 chromosomes in the B. napus genome. B. rapa and B. oleracea, two diploid progenitors of B. napus, contributed almost equally to the ALM genes. Genome collinearity analysis demonstrated that the majority of the ALM genes have arisen due to genome duplication or segmental duplication events. In addition, we profiled the expression patterns of the ALM genes in four different developmental stages. Furthermore, we developed two B. napus near isogenic lines (NILs). The high oil NIL, YC13-559, accumulates significantly higher (∼10%) seed oil compared to the other, YC13-554. Comparative gene expression analysis revealed upregulation of lipid biosynthesis-related regulatory genes in YC13-559, including SHOOTMERISTEMLESS, LEAFY COTYLEDON 1 (LEC1), LEC2, FUSCA3, ABSCISIC ACID INSENSITIVE 3 (ABI3), ABI4, ABI5, and WRINKLED1, as well as structural genes, such as ACETYL-CoA CARBOXYLASE, ACYL-CoA DIACYLGLYCEROL ACYLTRANSFERASE, and LONG-CHAIN ACYL-CoA SYNTHETASES. We observed that several genes related to the phytohormones, gibberellins, jasmonate, and indole acetic acid, were differentially expressed in the NILs. Our findings provide a broad account of the numbers, distribution, and expression profiles of acyl-lipid metabolism genes, as well as gene networks that potentially control oil accumulation in B. napus seeds. The upregulation of key regulatory and structural genes related to lipid biosynthesis likely plays a major role for the increased seed oil in YC13-559.
Activation of the alpha-globin gene expression correlates with dramatic upregulation of nearby non-globin genes and changes in local and large-scale chromatin spatial structure.

PubMed

Ulianov, Sergey V; Galitsyna, Aleksandra A; Flyamer, Ilya M; Golov, Arkadiy K; Khrameeva, Ekaterina E; Imakaev, Maxim V; Abdennur, Nezar A; Gelfand, Mikhail S; Gavrilov, Alexey A; Razin, Sergey V

2017-07-11

In homeotherms, the alpha-globin gene clusters are located within permanently open genome regions enriched in housekeeping genes. Terminal erythroid differentiation results in dramatic upregulation of alpha-globin genes making their expression comparable to the rRNA transcriptional output. Little is known about the influence of the erythroid-specific alpha-globin gene transcription outburst on adjacent, widely expressed genes and large-scale chromatin organization. Here, we have analyzed the total transcription output, the overall chromatin contact profile, and CTCF binding within the 2.7 Mb segment of chicken chromosome 14 harboring the alpha-globin gene cluster in cultured lymphoid cells and cultured erythroid cells before and after induction of terminal erythroid differentiation. We found that, similarly to mammalian genomes, the chicken genome is organized into TADs and compartments. Full activation of the alpha-globin gene transcription in differentiated erythroid cells is correlated with upregulation of several adjacent housekeeping genes and the emergence of abundant intergenic transcription. An extended chromosome region encompassing the alpha-globin cluster becomes significantly decompacted in differentiated erythroid cells, and depleted in CTCF binding and CTCF-anchored chromatin loops, while the sub-TAD harboring the alpha-globin gene cluster and the upstream major regulatory element (MRE) becomes highly enriched with chromatin interactions as compared to lymphoid and proliferating erythroid cells. The alpha-globin gene domain and the neighboring loci reside within the A-like chromatin compartment in both lymphoid and erythroid cells and become further segregated from the upstream gene desert upon terminal erythroid differentiation.
Our findings demonstrate that the effects of tissue-specific transcription activation are not restricted to the host genomic locus but affect the overall chromatin structure and transcriptional output of the encompassing topologically associating domain.
Genome resequencing and transcriptome profiling reveal structural diversity and expression patterns of constitutive disease resistance genes in Huanglongbing-tolerant Poncirus trifoliata and its hybrids

PubMed Central

Rawat, Nidhi; Kumar, Brajendra; Albrecht, Ute; Du, Dongliang; Huang, Ming; Yu, Qibin; Zhang, Yi; Duan, Yong-Ping; Bowman, Kim D; Gmitter, Fred G; Deng, Zhanao

2017-01-01

Huanglongbing (HLB) is the most destructive bacterial disease of citrus worldwide. While most citrus varieties are susceptible to HLB, Poncirus trifoliata, a close relative of Citrus, and some of its hybrids with Citrus are tolerant to HLB. No specific HLB tolerance genes have been identified in P. trifoliata, but recent studies have shown that constitutive disease resistance (CDR) genes were expressed at much higher levels in HLB-tolerant Poncirus hybrids and the expression of CDR genes was modulated by Candidatus Liberibacter asiaticus (CLas), the pathogen of HLB. The current study was undertaken to mine and characterize the CDR gene family in Citrus and Poncirus and to understand its association with HLB tolerance in Poncirus. We identified 17 CDR genes in two citrus genomes, deduced their structures, and investigated their phylogenetic relationships. We revealed that the expansion of the CDR family in Citrus seems to be due to segmental and tandem duplication events. Through genome resequencing and transcriptome sequencing, we identified eight CDR genes in the Poncirus genome (PtCDR1-PtCDR8). The number of SNPs was the highest in PtCDR2 and the lowest in PtCDR7. Most of the deletion and insertion events were observed in the UTR regions of Citrus and Poncirus CDR genes. PtCDR2 and PtCDR8 were abundant in the leaf transcriptomes of two HLB-tolerant Poncirus genotypes and were also upregulated in HLB-tolerant Poncirus hybrids as revealed by real-time PCR analysis. These two CDR genes seem to be good candidate genes for future studies of their role in citrus-CLas interactions. PMID:29152310
The membrane skeleton in Paramecium: Molecular characterization of a novel epiplasmin family and preliminary GFP expression results.

PubMed

Pomel, Sébastien; Diogon, Marie; Bouchard, Philippe; Pradel, Lydie; Ravet, Viviane; Coffe, Gérard; Viguès, Bernard

2006-02-01

Previous attempts to identify the membrane skeleton of Paramecium cells have revealed a protein pattern that is both complex and specific. The most prominent structural elements, epiplasmic scales, are centered around ciliary units and are closely apposed to the cytoplasmic side of the inner alveolar membrane. We sought to characterize epiplasmic scale proteins (epiplasmins) at the molecular level. PCR approaches enabled the cloning and sequencing of two closely related genes by amplifications of sequences from a macronuclear genomic library. Using these two genes (EPI-1 and EPI-2), we have contributed to the annotation of the Paramecium tetraurelia macronuclear genome and identified 39 additional (paralogous) sequences. Two orthologous sequences were found in the Tetrahymena thermophila genome. Structural analysis of the 43 sequences indicates that the hallmark of this new multigenic family is a 79 aa domain flanked by two Q-, P- and V-rich stretches of sequence that are much more variable in amino-acid composition. Such features clearly distinguish members of the multigenic family from epiplasmic proteins previously sequenced in other ciliates. The expression of Green Fluorescent Protein (GFP)-tagged epiplasmin showed significant labeling of epiplasmic scales as well as oral structures. We expect that the GFP construct described herein will prove to be a useful tool for comparative subcellular localization of different putative epiplasmins in Paramecium.
Differential nuclear scaffold/matrix attachment marks expressed genes.

PubMed

Linnemann, Amelia K; Platts, Adrian E; Krawetz, Stephen A

2009-02-15

It is well established that nuclear architecture plays a key role in poising regions of the genome for transcription. This may be achieved using scaffold/matrix attachment regions (S/MARs) that establish loop domains. However, the relationship between changes in the physical structure of the genome as mediated by attachment to the nuclear scaffold/matrix and gene expression is not clearly understood. To define the role of S/MARs in organizing our genome and to resolve the often contradictory loci-specific studies, we have surveyed the S/MARs in HeLa S3 cells on human chromosomes 14-18 by array comparative genomic hybridization. Comparison of LIS (lithium 3,5-diiodosalicylate) extraction to identify SARs and 2 M NaCl extraction to identify MARs revealed that approximately one-half of the sites were in common. The results presented in this study suggest that SARs 5' of a gene are associated with transcript presence whereas MARs contained within a gene are associated with silenced genes. The varied functions of the S/MARs as revealed by the different extraction methods highlight their unique functional contribution.

Differential nuclear scaffold/matrix attachment marks expressed genes

PubMed Central

Linnemann, Amelia K.; Platts, Adrian E.; Krawetz, Stephen A.

2009-01-01

It is well established that nuclear architecture plays a key role in poising regions of the genome for transcription. This may be achieved using scaffold/matrix attachment regions (S/MARs) that establish loop domains. However, the relationship between changes in the physical structure of the genome as mediated by attachment to the nuclear scaffold/matrix and gene expression is not clearly understood. To define the role of S/MARs in organizing our genome and to resolve the often contradictory loci-specific studies, we have surveyed the S/MARs in HeLa S3 cells on human chromosomes 14–18 by array comparative genomic hybridization. Comparison of LIS (lithium 3,5-diiodosalicylate) extraction to identify SARs and 2 M NaCl extraction to identify MARs revealed that approximately one-half of the sites were in common. The results presented in this study suggest that SARs 5′ of a gene are associated with transcript presence whereas MARs contained within a gene are associated with silenced genes. The varied functions of the S/MARs as revealed by the different extraction methods highlight their unique functional contribution. PMID:19017725
Rewiring the severe acute respiratory syndrome coronavirus (SARS-CoV) transcription circuit: Engineering a recombination-resistant genome

NASA Astrophysics Data System (ADS)

Yount, Boyd; Roberts, Rhonda S.; Lindesmith, Lisa; Baric, Ralph S.

2006-08-01

Live virus vaccines provide significant protection against many detrimental human and animal diseases, but reversion to virulence by mutation and recombination has reduced their appeal. Using severe acute respiratory syndrome coronavirus as a model, we engineered a different transcription regulatory circuit and isolated recombinant viruses. The transcription network allowed for efficient expression of the viral transcripts and proteins, and the recombinant viruses replicated to WT levels. Recombinant genomes were then constructed that contained mixtures of the WT and mutant regulatory circuits, reflecting recombinant viruses that might occur in nature. Although viable viruses could readily be isolated from WT and recombinant genomes containing homogeneous transcription circuits, chimeras that contained mixed regulatory networks were invariantly lethal, because viable chimeric viruses were not isolated. Mechanistically, mixed regulatory circuits promoted inefficient subgenomic transcription from inappropriate start sites, resulting in truncated ORFs and effectively minimizing viral structural protein expression. Engineering regulatory transcription circuits of intercommunicating alleles successfully introduces genetic traps into a viral genome that are lethal in RNA recombinant progeny viruses. regulation | systems biology | vaccine design
Cloning and characterization of a Candida albicans maltase gene involved in sucrose utilization.

PubMed Central

Geber, A; Williamson, P R; Rex, J H; Sweeney, E C; Bennett, J E

1992-01-01

In order to isolate the structural gene involved in sucrose utilization, we screened a sucrose-induced Candida albicans cDNA library for clones expressing alpha-glucosidase activity. The C. albicans maltase structural gene (CAMAL2) was isolated. No other clones expressing alpha-glucosidase activity were detected. A genomic CAMAL2 clone was obtained by screening a size-selected genomic library with the cDNA clone. DNA sequence analysis reveals that CAMAL2 encodes a 570-amino-acid protein which shares 50% identity with the maltase structural gene (MAL62) of Saccharomyces carlsbergensis. The substrate specificity of the recombinant protein purified from Escherichia coli identifies the enzyme as a maltase. Northern (RNA) analysis reveals that transcription of CAMAL2 is induced by maltose and sucrose and repressed by glucose. These results suggest that assimilation of sucrose in C. albicans relies on an inducible maltase enzyme. The family of genes controlling sucrose utilization in C. albicans shares similarities with the MAL gene family of Saccharomyces cerevisiae and provides a model system for studying gene regulation in this pathogenic yeast. PMID:1400249
Overview on Sobemoviruses and a Proposal for the Creation of the Family Sobemoviridae

PubMed Central

Sõmera, Merike; Sarmiento, Cecilia; Truve, Erkki

2015-01-01

The genus Sobemovirus, unassigned to any family, consists of viruses with single-stranded plus-oriented single-component RNA genomes and small icosahedral particles. Currently, 14 species within the genus have been recognized by the International Committee on Taxonomy of Viruses (ICTV) but several new species are to be recognized in the near future. Sobemovirus genomes are compact with a conserved structure of open reading frames and with short untranslated regions. Several sobemoviruses are important pathogens. Moreover, over the last decade sobemoviruses have become important model systems to study plant virus evolution. In the current review we give an overview of the structure and expression of sobemovirus genomes, processing and functions of individual proteins, particle structure, pathology and phylogenesis of sobemoviruses as well as of satellite RNAs present together with these viruses. Based on a phylogenetic analysis we propose that a new family Sobemoviridae should be recognized including the genera Sobemovirus and Polemovirus. Finally, we outline the future perspectives and needs for the research focusing on sobemoviruses. PMID:26083319
Design of microarray experiments for genetical genomics studies.

PubMed

Bueno Filho, Júlio S S; Gilmour, Steven G; Rosa, Guilherme J M

2006-10-01

Microarray experiments have been used recently in genetical genomics studies, as an additional tool to understand the genetic mechanisms governing variation in complex traits, such as for estimating heritabilities of mRNA transcript abundances, for mapping expression quantitative trait loci, and for inferring regulatory networks controlling gene expression. Several articles on the design of microarray experiments discuss situations in which treatment effects are assumed fixed and without any structure. In the case of two-color microarray platforms, several authors have studied reference and circular designs. Here, we discuss the optimal design of microarray experiments whose goals refer to specific genetic questions. Some examples are used to illustrate the choice of a design for comparing fixed, structured treatments, such as genotypic groups. Experiments targeting single genes or chromosomic regions (such as with transgene research) or multiple epistatic loci (such as within a selective phenotyping context) are discussed. In addition, microarray experiments in which treatments refer to families or to subjects (within family structures or complex pedigrees) are presented. In these cases treatments are more appropriately considered to be random effects, with specific covariance structures, in which the genetic goals relate to the estimation of genetic variances and the heritability of transcriptional abundances.
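For orientation, the quantity targeted in the random-effects setting can be written out for the simplest case. The model below is an illustrative assumption and is not taken from the article: a one-way random model in which y_ij is the transcript abundance of replicate j of genotype i, g_i is a random genetic effect, and e_ij is residual error. More complex pedigrees would replace the independence assumption on g_i with a covariance structure derived from relatedness.

```latex
y_{ij} = \mu + g_i + e_{ij}, \qquad
g_i \sim N(0, \sigma_g^2), \quad
e_{ij} \sim N(0, \sigma_e^2)

h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
```

Under these assumptions, h^2 is the share of total variance in transcript abundance attributable to the genetic effect, so a design that estimates the two variance components precisely also estimates heritability precisely.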
Homoeolog-specific transcriptional bias in allopolyploid wheat

PubMed Central

2010-01-01

Background Interaction between parental genomes is accompanied by global changes in gene expression which, eventually, contribute to growth vigor and the broader phenotypic diversity of allopolyploid species. In order to gain a better understanding of the effects of allopolyploidization on the regulation of diverged gene networks, we performed a genome-wide analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat created by the hybridization of a tetraploid derivative of hexaploid wheat with the diploid ancestor of the wheat D genome Ae. tauschii. Results Affymetrix wheat genome arrays were used for both the discovery of divergent homoeolog-specific mutations and analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat. More than 34,000 detectable parent-specific features (PSF) distributed across the wheat genome were used to assess AB genome (could not differentiate A and B genome contributions) and D genome parental expression in the allopolyploid transcriptome. In the re-synthesized polyploid, 81% of PSFs detected mid-parent levels of gene expression, and only 19% of PSFs showed evidence of non-additive expression. Non-additive expression in both AB and D genomes was strongly biased toward up-regulation of parental type of gene expression with only 6% and 11% of genes, respectively, being down-regulated. Of all the non-additive gene expression, 84% can be explained by differences in the parental genotypes used to make the allopolyploid. Homoeolog-specific co-regulation of several functional gene categories was found, particularly genes involved in photosynthesis and protein biosynthesis in wheat. Conclusions Here, we have demonstrated that the establishment of interactions between the diverged regulatory networks in allopolyploids is accompanied by massive homoeolog-specific up- and down-regulation of gene expression.
This study provides insights into interactions between homoeologous genomes and their role in growth vigor, development, and fertility of allopolyploid species. PMID:20849627
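The distinction between mid-parent (additive) and non-additive expression used in this abstract can be made concrete with a small sketch. The code below compares a feature's expression in the polyploid against the average of the two parental values on a log2 scale. The fixed tolerance, the function name, and the numbers are assumptions for illustration; the study itself relied on array-based statistical analysis, which this sketch does not reproduce.

```python
def classify_expression(parent_ab, parent_d, polyploid, tolerance=0.5):
    """Classify a feature relative to its mid-parent value (log2 units).

    The mid-parent value is the mean of the two parental measurements.
    `tolerance` is a hypothetical cut-off standing in for a proper test.
    """
    mid_parent = (parent_ab + parent_d) / 2
    deviation = polyploid - mid_parent
    if abs(deviation) <= tolerance:
        return "additive"
    return "non-additive up" if deviation > 0 else "non-additive down"

# Hypothetical log2 intensities: (AB parent, D parent, re-synthesized polyploid).
features = {
    "feature_1": (8.0, 10.0, 9.2),   # close to the mid-parent value of 9.0
    "feature_2": (8.0, 10.0, 10.0),  # above the mid-parent value
    "feature_3": (8.0, 10.0, 7.5),   # below the mid-parent value
}

for name, (ab, d, poly) in features.items():
    print(name, classify_expression(ab, d, poly))
```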
Genome-wide gene expression profiling of low-dose, long-term exposure of human osteosarcoma cells to bisphenol A and its analogs bisphenols AF and S.

PubMed

Fic, A; Mlakar, S Jurković; Juvan, P; Mlakar, V; Marc, J; Dolenc, M Sollner; Broberg, K; Mašič, L Peterlin

2015-08-01

The bisphenols AF (BPAF) and S (BPS) are structural analogs of the endocrine disruptor bisphenol A (BPA), and are used in common products as a replacement for BPA. To elucidate genome-wide gene expression responses, estrogen-dependent osteosarcoma cells were cultured with 10 nM BPA, BPAF, or BPS, for 8 h and 3 months. Genome-wide gene expression was analyzed using the Illumina Expression BeadChip. Three months exposure had significant effects on gene expression, particularly for BPS, followed by BPAF and BPA, according to the number of differentially expressed genes (1980, 778, 60, respectively), the magnitude of changes in gene expression, and the number of enriched biological processes (800, 415, 33, respectively) and pathways (77, 52, 6, respectively). 'Embryonic skeletal system development' was the most enriched bone-related process, which was affected only by BPAF and BPS. Interestingly, all three bisphenols showed highest down-regulation of genes related to the cardiovascular system (e.g., NPPB, NPR3, TXNIP). BPA only and BPA/BPAF/BPS also affected genes related to the immune system and fetal development, respectively. For BPAF and BPS, the 'isoprenoid biosynthetic process' was enriched (up-regulated genes: HMGCS1, PDSS1, ACAT2, RCE1, DHDDS). Compared to BPA, BPAF and BPS had more effects on gene expression after long-term exposure. These findings stress the need for careful toxicological characterization of BPA analogs in the future. Copyright © 2015 Elsevier Ltd. All rights reserved.
Genome-Wide Identification and Expression Profiling of Cytokinin Oxidase/Dehydrogenase (CKX) Genes Reveal Likely Roles in Pod Development and Stress Responses in Oilseed Rape (Brassica napus L.).

PubMed

Liu, Pu; Zhang, Chao; Ma, Jin-Qi; Zhang, Li-Yuan; Yang, Bo; Tang, Xin-Yu; Huang, Ling; Zhou, Xin-Tong; Lu, Kun; Li, Jia-Na

2018-03-16

Cytokinin oxidase/dehydrogenases (CKXs) play a critical role in the irreversible degradation of cytokinins, thereby regulating plant growth and development. Brassica napus is one of the most widely cultivated oilseed crops worldwide. With the completion of whole-genome sequencing of B. napus, genome-wide identification and expression analysis of the BnCKX gene family has become technically feasible. In this study, we identified 23 BnCKX genes and analyzed their phylogenetic relationships, gene structures, conserved motifs, protein subcellular localizations, and other properties. We also analyzed the expression of the 23 BnCKX genes in the B. napus cultivar Zhong Shuang 11 ('ZS11') by quantitative reverse-transcription polymerase chain reaction (qRT-PCR), revealing their diverse expression patterns. We selected four BnCKX genes based on the results of RNA-sequencing and qRT-PCR and compared their expression in cultivated varieties with extremely long versus short siliques. The expression levels of BnCKX5-1, 5-2, 6-1, and 7-1 significantly differed between the two lines and changed during pod development, suggesting they might play roles in determining silique length and in pod development. Finally, we investigated the effects of treatment with the synthetic cytokinin 6-benzylaminopurine (6-BA) and the auxin indole-3-acetic acid (IAA) on the expression of the four selected BnCKX genes. Our results suggest that regulating BnCKX expression is a promising way to enhance the harvest index and stress resistance in plants.
NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data.

PubMed

Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

2016-01-01

The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data.
The elephant shark methylome reveals conservation of epigenetic regulation across jawed vertebrates

PubMed Central

Peat, Julian R.; Ortega-Recalde, Oscar; Kardailsky, Olga; Hore, Timothy A.

2017-01-01

Background: Methylation of CG dinucleotides constitutes a critical system of epigenetic memory in bony vertebrates, where it modulates gene expression and suppresses transposon activity. The genomes of studied vertebrates are pervasively hypermethylated, with the exception of regulatory elements such as transcription start sites (TSSs), where the presence of methylation is associated with gene silencing. This system is not found in the sparsely methylated genomes of invertebrates, and establishing how it arose during early vertebrate evolution is impeded by a paucity of epigenetic data from basal vertebrates. Methods: We perform whole-genome bisulfite sequencing to generate the first genome-wide methylation profiles of a cartilaginous fish, the elephant shark Callorhinchus milii. Employing these to determine the elephant shark methylome structure and its relationship with expression, we compare this with higher vertebrates and an invertebrate chordate using published methylation and transcriptome data. Results: Like higher vertebrates, the majority of elephant shark CG sites are highly methylated, and methylation is abundant across the genome rather than patterned in the mosaic configuration of invertebrates. This global hypermethylation includes transposable elements and the bodies of genes at all expression levels. Significantly, we document an inverse relationship between TSS methylation and expression in the elephant shark, supporting the presence of the repressive regulatory architecture shared by higher vertebrates. Conclusions: Our demonstration that methylation patterns in a cartilaginous fish are characteristic of higher vertebrates implies the conservation of this epigenetic modification system across jawed vertebrates separated by 465 million years of evolution. In addition, these findings position the elephant shark as a valuable model to explore the evolutionary history and function of vertebrate methylation. PMID:28580133
The elephant shark methylome reveals conservation of epigenetic regulation across jawed vertebrates.

PubMed

Peat, Julian R; Ortega-Recalde, Oscar; Kardailsky, Olga; Hore, Timothy A

2017-01-01

Methylation of CG dinucleotides constitutes a critical system of epigenetic memory in bony vertebrates, where it modulates gene expression and suppresses transposon activity. The genomes of studied vertebrates are pervasively hypermethylated, with the exception of regulatory elements such as transcription start sites (TSSs), where the presence of methylation is associated with gene silencing. This system is not found in the sparsely methylated genomes of invertebrates, and establishing how it arose during early vertebrate evolution is impeded by a paucity of epigenetic data from basal vertebrates. We perform whole-genome bisulfite sequencing to generate the first genome-wide methylation profiles of a cartilaginous fish, the elephant shark Callorhinchus milii. Employing these to determine the elephant shark methylome structure and its relationship with expression, we compare this with higher vertebrates and an invertebrate chordate using published methylation and transcriptome data. Like higher vertebrates, the majority of elephant shark CG sites are highly methylated, and methylation is abundant across the genome rather than patterned in the mosaic configuration of invertebrates. This global hypermethylation includes transposable elements and the bodies of genes at all expression levels. Significantly, we document an inverse relationship between TSS methylation and expression in the elephant shark, supporting the presence of the repressive regulatory architecture shared by higher vertebrates. Our demonstration that methylation patterns in a cartilaginous fish are characteristic of higher vertebrates implies the conservation of this epigenetic modification system across jawed vertebrates separated by 465 million years of evolution. In addition, these findings position the elephant shark as a valuable model to explore the evolutionary history and function of vertebrate methylation.
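The inverse relationship between TSS methylation and expression reported in this record is the kind of trend often summarized with a rank correlation. The sketch below is illustrative only: the abstract does not say which statistic the authors used, Spearman's rho is a swapped-in choice, and the five data points are hypothetical values invented for the example, not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene values (NOT from the paper): fraction of methylated
# CG sites at the TSS, and an expression level in arbitrary units.
tss_methylation = [0.05, 0.10, 0.40, 0.75, 0.90]
expression = [120.0, 80.0, 15.0, 3.0, 0.5]

rho = spearman(tss_methylation, expression)
print(f"Spearman rho = {rho:.3f}")
```

Because the toy expression values fall strictly as methylation rises, rho comes out very close to -1; real data would give a weaker, noisier negative value.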
Comprehensive Genome-Wide Survey, Genomic Constitution and Expression Profiling of the NAC Transcription Factor Family in Foxtail Millet (Setaria italica L.)

PubMed Central

Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B., Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj

2013-01-01

The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress tolerant crop, Setaria italica (foxtail millet) that is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants. PMID:23691254
Comprehensive genome-wide survey, genomic constitution and expression profiling of the NAC transcription factor family in foxtail millet (Setaria italica L.).

PubMed

Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B, Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj

2013-01-01

The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress tolerant crop, Setaria italica (foxtail millet) that is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants.
Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

PubMed

Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

2016-03-01

Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploidy strain and a large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while aneuploidy response stress may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characters of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains.
Marsupials and monotremes possess a novel family of MHC class I genes that is lost from the eutherian lineage.

PubMed

Papenfuss, Anthony T; Feng, Zhi-Ping; Krasnec, Katina; Deakin, Janine E; Baker, Michelle L; Miller, Robert D

2015-07-22

Major histocompatibility complex (MHC) class I genes are found in the genomes of all jawed vertebrates. The evolution of this gene family is closely tied to the evolution of the vertebrate genome. Family members are frequently found in four paralogous regions, which were formed in two rounds of genome duplication in the early vertebrates, but in some species class Is have been subject to additional duplication or translocation, creating additional clusters. The gene family is traditionally grouped into two subtypes: classical MHC class I genes that are usually MHC-linked, highly polymorphic, expressed in a broad range of tissues and present endogenously-derived peptides to cytotoxic T-cells; and non-classical MHC class I genes generally have lower polymorphism, may have tissue-specific expression and have evolved to perform immune-related or non-immune functions. As immune genes can evolve rapidly and are subject to different selection pressure, we hypothesised that there may be divergent, as yet unannotated or uncharacterised class I genes. Application of a novel method of sensitive genome searching of available vertebrate genome sequences revealed a new, extensive sub-family of divergent MHC class I genes, denoted as UT, which has not previously been characterized. These class I genes are found in both American and Australian marsupials, and in monotremes, at an evolutionary chromosomal breakpoint, but are not present in non-mammalian genomes and have been lost from the eutherian lineage. We show that UT family members are expressed in the thymus of the gray short-tailed opossum and in other immune tissues of several Australian marsupials. Structural homology modelling shows that the proteins encoded by this family are predicted to have an open, though short, antigen-binding groove. We have identified a novel sub-family of putatively non-classical MHC class I genes that are specific to marsupials and monotremes. 
This family was present in the ancestral mammal and is found in extant marsupials and monotremes, but has been lost from the eutherian lineage. The function of this family is as yet unknown; however, the predicted structure of its members may be consistent with presentation of antigens to T-cells.
Evolution of genome size and complexity in the rhabdoviridae.

PubMed

Walker, Peter J; Firth, Cadhla; Widen, Steven G; Blasdell, Kim R; Guzman, Hilda; Wood, Thomas G; Paradkar, Prasad N; Holmes, Edward C; Tesh, Robert B; Vasilakis, Nikos

2015-02-01

RNA viruses exhibit substantial structural, ecological and genomic diversity. However, genome size in RNA viruses is likely limited by a high mutation rate, resulting in the evolution of various mechanisms to increase complexity while minimising genome expansion. Here we conduct a large-scale analysis of the genome sequences of 99 animal rhabdoviruses, including 45 genomes which we determined de novo, to identify patterns of genome expansion and the evolution of genome complexity. All but seven of the rhabdoviruses clustered into 17 well-supported monophyletic groups, of which eight corresponded to established genera, seven were assigned as new genera, and two were taxonomically ambiguous. We show that the acquisition and loss of new genes appears to have been a central theme of rhabdovirus evolution, and has been associated with the appearance of alternative, overlapping and consecutive ORFs within the major structural protein genes, and the insertion and loss of additional ORFs in each gene junction in a clade-specific manner. Changes in the lengths of gene junctions accounted for as much as 48.5% of the variation in genome size from the smallest to the largest genome, and the frequency with which new ORFs were observed increased in the 3' to 5' direction along the genome. We also identify several new families of accessory genes encoded in these regions, and show that non-canonical expression strategies involving TURBS-like termination-reinitiation, ribosomal frame-shifts and leaky ribosomal scanning appear to be common. We conclude that rhabdoviruses have an unusual capacity for genomic plasticity that may be linked to their discontinuous transcription strategy from the negative-sense single-stranded RNA genome, and propose a model that accounts for the regular occurrence of genome expansion and contraction throughout the evolution of the Rhabdoviridae.
Evolution of Genome Size and Complexity in the Rhabdoviridae

PubMed Central

Walker, Peter J.; Firth, Cadhla; Widen, Steven G.; Blasdell, Kim R.; Guzman, Hilda; Wood, Thomas G.; Paradkar, Prasad N.; Holmes, Edward C.; Tesh, Robert B.; Vasilakis, Nikos

2015-01-01

RNA viruses exhibit substantial structural, ecological and genomic diversity. However, genome size in RNA viruses is likely limited by a high mutation rate, resulting in the evolution of various mechanisms to increase complexity while minimising genome expansion. Here we conduct a large-scale analysis of the genome sequences of 99 animal rhabdoviruses, including 45 genomes which we determined de novo, to identify patterns of genome expansion and the evolution of genome complexity. All but seven of the rhabdoviruses clustered into 17 well-supported monophyletic groups, of which eight corresponded to established genera, seven were assigned as new genera, and two were taxonomically ambiguous. We show that the acquisition and loss of new genes appears to have been a central theme of rhabdovirus evolution, and has been associated with the appearance of alternative, overlapping and consecutive ORFs within the major structural protein genes, and the insertion and loss of additional ORFs in each gene junction in a clade-specific manner. Changes in the lengths of gene junctions accounted for as much as 48.5% of the variation in genome size from the smallest to the largest genome, and the frequency with which new ORFs were observed increased in the 3’ to 5’ direction along the genome. We also identify several new families of accessory genes encoded in these regions, and show that non-canonical expression strategies involving TURBS-like termination-reinitiation, ribosomal frame-shifts and leaky ribosomal scanning appear to be common. We conclude that rhabdoviruses have an unusual capacity for genomic plasticity that may be linked to their discontinuous transcription strategy from the negative-sense single-stranded RNA genome, and propose a model that accounts for the regular occurrence of genome expansion and contraction throughout the evolution of the Rhabdoviridae. PMID:25679389
The Burmese python genome reveals the molecular basis for extreme adaptation in snakes

PubMed Central

Castoe, Todd A.; de Koning, A. P. Jason; Hall, Kathryn T.; Card, Daren C.; Schield, Drew R.; Fujita, Matthew K.; Ruggiero, Robert P.; Degner, Jack F.; Daza, Juan M.; Gu, Wanjun; Reyes-Velasco, Jacobo; Shaney, Kyle J.; Castoe, Jill M.; Fox, Samuel E.; Poole, Alex W.; Polanco, Daniel; Dobry, Jason; Vandewege, Michael W.; Li, Qing; Schott, Ryan K.; Kapusta, Aurélie; Minx, Patrick; Feschotte, Cédric; Uetz, Peter; Ray, David A.; Hoffmann, Federico G.; Bogden, Robert; Smith, Eric N.; Chang, Belinda S. W.; Vonk, Freek J.; Casewell, Nicholas R.; Henkel, Christiaan V.; Richardson, Michael K.; Mackessy, Stephen P.; Bronikowski, Anne M.; Yandell, Mark; Warren, Wesley C.; Secor, Stephen M.; Pollock, David D.

2013-01-01

Snakes possess many extreme morphological and physiological adaptations. Identification of the molecular basis of these traits can provide novel understanding for vertebrate biology and medicine. Here, we study snake biology using the genome sequence of the Burmese python (Python molurus bivittatus), a model of extreme physiological and metabolic adaptation. We compare the python and king cobra genomes along with genomic samples from other snakes and perform transcriptome analysis to gain insights into the extreme phenotypes of the python. We discovered rapid and massive transcriptional responses in multiple organ systems that occur on feeding and coordinate major changes in organ size and function. Intriguingly, the homologs of these genes in humans are associated with metabolism, development, and pathology. We also found that many snake metabolic genes have undergone positive selection, which together with the rapid evolution of mitochondrial proteins, provides evidence for extensive adaptive redesign of snake metabolic pathways. Additional evidence for molecular adaptation and gene family expansions and contractions is associated with major physiological and phenotypic adaptations in snakes; genes involved are related to cell cycle, development, lungs, eyes, heart, intestine, and skeletal structure, including GRB2-associated binding protein 1, SSH, WNT16, and bone morphogenetic protein 7. Finally, changes in repetitive DNA content, guanine-cytosine isochore structure, and nucleotide substitution rates indicate major shifts in the structure and evolution of snake genomes compared with other amniotes. Phenotypic and physiological novelty in snakes seems to be driven by system-wide coordination of protein adaptation, gene expression, and changes in the structure of the genome. PMID:24297902
The Burmese python genome reveals the molecular basis for extreme adaptation in snakes.

PubMed

Castoe, Todd A; de Koning, A P Jason; Hall, Kathryn T; Card, Daren C; Schield, Drew R; Fujita, Matthew K; Ruggiero, Robert P; Degner, Jack F; Daza, Juan M; Gu, Wanjun; Reyes-Velasco, Jacobo; Shaney, Kyle J; Castoe, Jill M; Fox, Samuel E; Poole, Alex W; Polanco, Daniel; Dobry, Jason; Vandewege, Michael W; Li, Qing; Schott, Ryan K; Kapusta, Aurélie; Minx, Patrick; Feschotte, Cédric; Uetz, Peter; Ray, David A; Hoffmann, Federico G; Bogden, Robert; Smith, Eric N; Chang, Belinda S W; Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Richardson, Michael K; Mackessy, Stephen P; Bronikowski, Anne M; Yandell, Mark; Warren, Wesley C; Secor, Stephen M; Pollock, David D

2013-12-17

Snakes possess many extreme morphological and physiological adaptations. Identification of the molecular basis of these traits can provide novel understanding for vertebrate biology and medicine. Here, we study snake biology using the genome sequence of the Burmese python (Python molurus bivittatus), a model of extreme physiological and metabolic adaptation. We compare the python and king cobra genomes along with genomic samples from other snakes and perform transcriptome analysis to gain insights into the extreme phenotypes of the python. We discovered rapid and massive transcriptional responses in multiple organ systems that occur on feeding and coordinate major changes in organ size and function. Intriguingly, the homologs of these genes in humans are associated with metabolism, development, and pathology. We also found that many snake metabolic genes have undergone positive selection, which together with the rapid evolution of mitochondrial proteins, provides evidence for extensive adaptive redesign of snake metabolic pathways. Additional evidence for molecular adaptation and gene family expansions and contractions is associated with major physiological and phenotypic adaptations in snakes; genes involved are related to cell cycle, development, lungs, eyes, heart, intestine, and skeletal structure, including GRB2-associated binding protein 1, SSH, WNT16, and bone morphogenetic protein 7. Finally, changes in repetitive DNA content, guanine-cytosine isochore structure, and nucleotide substitution rates indicate major shifts in the structure and evolution of snake genomes compared with other amniotes. Phenotypic and physiological novelty in snakes seems to be driven by system-wide coordination of protein adaptation, gene expression, and changes in the structure of the genome.
MicroRNAs form triplexes with double stranded DNA at sequence-specific binding sites; a eukaryotic mechanism via which microRNAs could directly alter gene expression

DOE PAGES

Paugh, Steven W.; Coss, David R.; Bao, Ju; ...

2016-02-04

MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.

MicroRNAs form triplexes with double stranded DNA at sequence-specific binding sites; a eukaryotic mechanism via which microRNAs could directly alter gene expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paugh, Steven W.; Coss, David R.; Bao, Ju

MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.
LINE1 family member is negative regulator of HLA-G expression.

PubMed

Ikeno, Masashi; Suzuki, Nobutaka; Kamiya, Megumi; Takahashi, Yuji; Kudoh, Jun; Okazaki, Tsuneko

2012-11-01

Class Ia molecules of human leucocyte antigen (HLA-A, -B and -C) are widely expressed and play a central role in the immune system by presenting peptides derived from the lumen of the endoplasmic reticulum. In contrast, class Ib molecules such as HLA-G serve novel functions. The distribution of HLA-G is mostly limited to foetal trophoblastic tissues and some tumour tissues. The mechanism required for the tissue-specific regulation of the HLA-G gene has not been well understood. Here, we investigated the genomic regulation of HLA-G by manipulating one copy of a genomic DNA fragment on a human artificial chromosome. We identified a potential negative regulator of gene expression in a sequence upstream of HLA-G that overlapped with the long interspersed element (LINE1); silencing of HLA-G involved a DNA secondary structure generated in LINE1. The presence of a LINE1 gene silencer may explain the limited expression of HLA-G compared with other class I genes.
The Gene Expression Omnibus database

PubMed Central

Clough, Emily; Barrett, Tanya

2016-01-01

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome–protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/. PMID:27008011
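As a rough illustration of querying GEO programmatically, the sketch below builds a search URL for the NCBI E-utilities `esearch` endpoint against the GEO DataSets Entrez database (`db=gds`). This route is an assumption made for the example: the abstract describes Web-based tools and does not say which query methods the chapter covers. The helper name `build_geo_search_url` and the search term are hypothetical, and the code only constructs the URL without contacting NCBI.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (an assumed access route, not one
# named in the abstract).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, retmax=20):
    """Build an esearch URL for the GEO DataSets database (db=gds).

    `term` is a free-text Entrez query; `retmax` caps the number of
    record IDs returned. Hypothetical helper for illustration only.
    """
    params = {
        "db": "gds",        # Entrez database name for GEO DataSets
        "term": term,
        "retmax": retmax,
        "retmode": "json",  # request a JSON response instead of XML
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_geo_search_url("chromatin structure")
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return matching record identifiers; NCBI's usage guidelines for request rates apply to any automated querying.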
Condensin-driven remodelling of X chromosome topology during dosage compensation

NASA Astrophysics Data System (ADS)

Crane, Emily; Bian, Qian; McCord, Rachel Patton; Lajoie, Bryan R.; Wheeler, Bayly S.; Ralston, Edward J.; Uzawa, Satoru; Dekker, Job; Meyer, Barbara J.

2015-07-01

The three-dimensional organization of a genome plays a critical role in regulating gene expression, yet little is known about the machinery and mechanisms that determine higher-order chromosome structure. Here we perform genome-wide chromosome conformation capture analysis, fluorescent in situ hybridization (FISH), and RNA-seq to obtain comprehensive three-dimensional (3D) maps of the Caenorhabditis elegans genome and to dissect X chromosome dosage compensation, which balances gene expression between XX hermaphrodites and XO males. The dosage compensation complex (DCC), a condensin complex, binds to both hermaphrodite X chromosomes via sequence-specific recruitment elements on X (rex sites) to reduce chromosome-wide gene expression by half. Most DCC condensin subunits also act in other condensin complexes to control the compaction and resolution of all mitotic and meiotic chromosomes. By comparing chromosome structure in wild-type and DCC-defective embryos, we show that the DCC remodels hermaphrodite X chromosomes into a sex-specific spatial conformation distinct from autosomes. Dosage-compensated X chromosomes consist of self-interacting domains (~1 Mb) resembling mammalian topologically associating domains (TADs). TADs on X chromosomes have stronger boundaries and more regular spacing than on autosomes. Many TAD boundaries on X chromosomes coincide with the highest-affinity rex sites and become diminished or lost in DCC-defective mutants, thereby converting the topology of X to a conformation resembling autosomes. rex sites engage in DCC-dependent long-range interactions, with the most frequent interactions occurring between rex sites at DCC-dependent TAD boundaries. These results imply that the DCC reshapes the topology of X chromosomes by forming new TAD boundaries and reinforcing weak boundaries through interactions between its highest-affinity binding sites. 
As this model predicts, deletion of an endogenous rex site at a DCC-dependent TAD boundary using CRISPR/Cas9 greatly diminished the boundary. Thus, the DCC imposes a distinct higher-order structure onto X chromosomes while regulating gene expression chromosome-wide.
Condensin-Driven Remodeling of X-Chromosome Topology during Dosage Compensation

PubMed Central

Crane, Emily; Bian, Qian; McCord, Rachel Patton; Lajoie, Bryan R.; Wheeler, Bayly S.; Ralston, Edward J.; Uzawa, Satoru; Dekker, Job; Meyer, Barbara J.

2015-01-01

The three-dimensional organization of a genome plays a critical role in regulating gene expression, yet little is known about the machinery and mechanisms that determine higher-order chromosome structure. Here we perform genome-wide chromosome conformation capture analysis, FISH, and RNA-seq to obtain comprehensive 3D maps of the Caenorhabditis elegans genome and to dissect X-chromosome dosage compensation, which balances gene expression between XX hermaphrodites and XO males. The dosage compensation complex (DCC), a condensin complex, binds to both hermaphrodite X chromosomes via sequence-specific recruitment elements on X (rex sites) to reduce chromosome-wide gene expression by half. Most DCC condensin subunits also act in other condensin complexes to control the compaction and resolution of all mitotic and meiotic chromosomes. By comparing chromosome structure in wild-type and DCC-defective embryos, we show that the DCC remodels hermaphrodite X chromosomes into a sex-specific spatial conformation distinct from autosomes. Dosage-compensated X chromosomes consist of self-interacting domains (~1 Mb) resembling mammalian Topologically Associating Domains (TADs). TADs on X have stronger boundaries and more regular spacing than on autosomes. Many TAD boundaries on X coincide with the highest-affinity rex sites and become diminished or lost in DCC-defective mutants, thereby converting the topology of X to a conformation resembling autosomes. rex sites engage in DCC-dependent long-range interactions, with the most frequent interactions occurring between rex sites at DCC-dependent TAD boundaries. These results imply that the DCC reshapes the topology of X by forming new TAD boundaries and reinforcing weak boundaries through interactions between its highest-affinity binding sites. As this model predicts, deletion of an endogenous rex site at a DCC-dependent TAD boundary using CRISPR/Cas9 greatly diminished the boundary. 
Thus, the DCC imposes a distinct higher-order structure onto X while regulating gene expression chromosome-wide. PMID:26030525
Co-evolution of plant LTR-retrotransposons and their host genomes.

PubMed

Zhao, Meixia; Ma, Jianxin

2013-07-01

Transposable elements (TEs), particularly long terminal repeat retrotransposons (LTR-RTs), are the most abundant DNA components in all plant species that have been investigated, and are largely responsible for plant genome size variation. Although plant genomes have experienced periodic proliferation and/or recent bursts of LTR-retrotransposons, the majority of LTR-RTs are inactivated by DNA methylation and small RNA-mediated silencing mechanisms, and/or have been deleted or truncated by unequal homologous recombination and illegitimate recombination, which act as suppression mechanisms that counteract genome expansion caused by LTR-RT amplification. LTR-RT DNA is generally enriched in pericentromeric regions of the host genomes, which appears to be the outcome of preferential insertion of LTR-RTs in these regions and of less effective selection purging LTR-RT DNA from these regions than from chromosomal arms. The potential functions of various TEs in their host genomes remain unclear; nevertheless, LTR-RTs have been recognized to play important roles in maintaining chromatin structure and centromere function and in regulating gene expression in their host genomes.
X Chromosome Crossover Formation and Genome Stability in Caenorhabditis elegans Are Independently Regulated by xnd-1

PubMed Central

McClendon, T. Brooke; Mainpal, Rana; Amrit, Francis R. G.; Krause, Michael W.; Ghazi, Arjumand; Yanowitz, Judith L.

2016-01-01

The germ line efficiently combats numerous genotoxic insults to ensure the high fidelity propagation of unaltered genomic information across generations. Yet, germ cells in most metazoans also intentionally create double-strand breaks (DSBs) to promote DNA exchange between parental chromosomes, a process known as crossing over. Homologous recombination is employed in the repair of both genotoxic lesions and programmed DSBs, and many of the core DNA repair proteins function in both processes. In addition, DNA repair efficiency and crossover (CO) distribution are both influenced by local and global differences in chromatin structure, yet the interplay between chromatin structure, genome integrity, and meiotic fidelity is still poorly understood. We have used the xnd-1 mutant of Caenorhabditis elegans to explore the relationship between genome integrity and crossover formation. Known for its role in ensuring X chromosome CO formation and germ line development, we show that xnd-1 also regulates genome stability. xnd-1 mutants exhibited a mortal germ line, high embryonic lethality, high incidence of males, and sensitivity to ionizing radiation. We discovered that a hypomorphic allele of mys-1 suppressed these genome instability phenotypes of xnd-1, but did not suppress the CO defects, suggesting it serves as a separation-of-function allele. mys-1 encodes a histone acetyltransferase, whose homolog Tip60 acetylates H2AK5, a histone mark associated with transcriptional activation that is increased in xnd-1 mutant germ lines, raising the possibility that thresholds of H2AK5ac may differentially influence distinct germ line repair events. We also show that xnd-1 regulated him-5 transcriptionally, independently of mys-1, and that ectopic expression of him-5 suppressed the CO defects of xnd-1. Our work provides xnd-1 as a model in which to study the link between chromatin factors, gene expression, and genome stability. PMID:27678523
Detection of gene expression changes at chromosomal rearrangement breakpoints in evolution

PubMed Central

2012-01-01

Background: We study the relation between genome rearrangements, breakpoints and gene expression. Genome rearrangement research has been concerned with the creation of breakpoints and their position in the chromosome, but the functional consequences of individual breakpoints remain virtually unknown, and there are no direct genome-wide studies of breakpoints from this point of view. A question arises of what the biological consequences of breakpoint creation are, rather than just their structural aspects. The question is whether proximity to the site of a breakpoint event changes the activity of a gene. Results: We investigate this by comparing the distribution of distances to the nearest breakpoint of genes that are differentially expressed with the distribution of the same distances for the entire gene complement. We study this in data on whole blood tissue in human versus macaque, and in cerebral cortex tissue in human versus chimpanzee. We find in both data sets that the distribution of distances to the nearest breakpoint of "changed expression genes" differs little from this distance calculated for the rest of the gene complement. In focusing on the changed expression genes closest to the breakpoints, however, we discover that several of these have previously been implicated in the literature as being connected to the evolutionary divergence of humans from other primates. Conclusions: We conjecture that chromosomal rearrangements occasionally interrupt the regulatory configurations of genes close to the breakpoint, leading to changes in expression. PMID:22536904
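The comparison described in this abstract (distances to the nearest breakpoint for differentially expressed genes versus the whole gene complement) can be illustrated with a minimal Python sketch. The abstract does not say which statistic the authors used; the two-sample Kolmogorov-Smirnov statistic below, the function names, and the toy coordinates are all assumptions made for illustration, on a single hypothetical chromosome.

```python
from bisect import bisect_left, bisect_right


def nearest_breakpoint_distance(position, sorted_breakpoints):
    """Distance from a gene position to the closest breakpoint.

    Assumes sorted_breakpoints is a non-empty, ascending list of
    coordinates on the same chromosome as the gene.
    """
    i = bisect_left(sorted_breakpoints, position)
    candidates = []
    if i < len(sorted_breakpoints):
        candidates.append(sorted_breakpoints[i] - position)  # breakpoint at or to the right
    if i > 0:
        candidates.append(position - sorted_breakpoints[i - 1])  # breakpoint to the left
    return min(candidates)


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples.

    This is an assumed choice of summary statistic, not the one
    named in the abstract.
    """
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of values less than or equal to x.
        return bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))


# Toy data (hypothetical coordinates).
breakpoints = [100, 500, 900]
all_genes = [50, 120, 300, 480, 700, 950]
changed_genes = [120, 480]  # hypothetical differentially expressed subset

all_distances = [nearest_breakpoint_distance(g, breakpoints) for g in all_genes]
changed_distances = [nearest_breakpoint_distance(g, breakpoints) for g in changed_genes]
gap = ks_statistic(changed_distances, all_distances)
```

A small gap would correspond to the abstract's finding that the two distributions differ little; a real analysis would also need a significance test and per-chromosome handling.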
Identification, characterization and expression analysis of pigeonpea miRNAs in response to Fusarium wilt.

PubMed

Hussain, Khalid; Mungikar, Kanak; Kulkarni, Abhijeet; Kamble, Avinash

2018-05-05

Upon confrontation with unfavourable conditions, plants invoke a very complex set of biochemical and physiological reactions and alter gene expression patterns to cope with them. MicroRNAs (miRNAs), a class of small non-coding RNAs, contribute extensively to the regulation of gene expression through translational inhibition or degradation of their target mRNAs under such conditions. Therefore, identification of miRNAs and their targets is important for understanding the regulatory networks triggered during stress. Structure- and sequence-similarity-based in silico prediction of miRNAs in the Cajanus cajan L. (pigeonpea) draft genome sequence has been carried out earlier. These annotations also appear in related GenBank genome sequence entries. However, there are no reports available on context-dependent miRNA expression and their targets in pigeonpea. Therefore, in the present study we addressed these questions computationally, using pigeonpea EST sequence information. We identified five novel pigeonpea miRNA precursors, their mature forms and targets. Interestingly, only one of these miRNAs (miR169i-3p) had been identified earlier in the draft genome sequence. We then validated the expression of these miRNAs experimentally. It was also observed that these miRNAs show differential expression patterns in response to Fusarium inoculation, indicating their biotic stress-responsive nature. Overall, these results will help towards a better understanding of the regulatory network of defense during pigeonpea-pathogen interactions and the role of miRNAs in the process. Copyright © 2018 Elsevier B.V. All rights reserved.
Hidden genetic variation in the germline genome of Tetrahymena thermophila.

PubMed

Dimond, K L; Zufall, R A

2016-06-01

Genome architecture varies greatly among eukaryotes. This diversity may profoundly affect the origin and maintenance of genetic variation within a population. Ciliates are microbial eukaryotes with unusual genome features, such as the separation of germline and somatic genomes within a single cell and amitotic division. These features have previously been proposed to increase the rate of molecular evolution in these species. Here, we assessed the fitness effects of genetic variation in the two genomes of natural isolates of the ciliate Tetrahymena thermophila. We find more extensive genetic variation in fitness in the transcriptionally silent germline genome than in the expressed somatic genome. Surprisingly, this variation is not primarily deleterious, but has both beneficial and deleterious effects. We conclude that Tetrahymena genome architecture allows for the maintenance of genetic variation that would otherwise be eliminated by selection. We consider the effect of selection on the two genomes and the impacts of reproductive strategies and the mechanism of sex determination on the structure of this variation. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
Genome-wide characterization of pectin methyl esterase genes reveals members differentially expressed in tolerant and susceptible wheats in response to Fusarium graminearum.

PubMed

Zega, Alessandra; D'Ovidio, Renato

2016-11-01

Pectin methyl esterase (PME) genes code for enzymes that are involved in structural modifications of the plant cell wall during plant growth and development. They are also involved in plant-pathogen interaction. PME genes belong to a multigene family and in this study we report the first comprehensive analysis of the PME gene family in bread wheat (Triticum aestivum L.). Like in other species, the members of the TaPME family are dispersed throughout the genome and their encoded products retain the typical structural features of PMEs. qRT-PCR analysis showed variation in the expression pattern of TaPME genes in different tissues and revealed that these genes are mainly expressed in flowering spikes. In our attempt to identify putative TaPME genes involved in wheat defense, we revealed a strong variation in the expression of the TaPME following Fusarium graminearum infection, the causal agent of Fusarium head blight (FHB). Particularly interesting was the finding that the expression profile of some PME genes was markedly different between the FHB-resistant wheat cultivar Sumai3 and the FHB-susceptible cultivar Bobwhite, suggesting a possible involvement of these PME genes in FHB resistance. Moreover, the expression analysis of the TaPME genes during F. graminearum progression within the spike revealed those genes that responded more promptly to pathogen invasion. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
A mobile threat to genome stability: The impact of non-LTR retrotransposons upon the human genome

PubMed Central

Konkel, Miriam K.; Batzer, Mark A.

2010-01-01

It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families – long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements – mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development. PMID:20307669
Recent molecular genetic studies and methodological issues in suicide research.

PubMed

Tsai, Shih-Jen; Hong, Chen-Jee; Liou, Ying-Jay

2011-06-01

Suicide behavior (SB) spans a spectrum ranging from suicidal ideation to suicide attempts and completed suicide. Strong evidence suggests a genetic susceptibility to SB, including familial heritability and common occurrence in twins. This review addresses recent molecular genetic studies in SB that include case-control association, genome gene-expression microarray, and genome-wide association (GWA). This work also reviews epigenetics in SB and pharmacogenetic studies of antidepressant-induced suicide. SB fulfills criteria for a complex genetic phenotype in which environmental factors interact with multiple genes to influence susceptibility. So far, case-control association approaches are still the mainstream in SB genetic studies, although whole genome gene-expression microarray and GWA studies have begun to emerge in recent years. Genetic association studies have suggested several genes (e.g., serotonin transporter, tryptophan hydroxylase 2, and brain-derived neurotrophic factor) related to SB, but not all reports support these findings. The case-control approach, while useful, is limited by present knowledge of disease pathophysiology. Genome-wide studies of gene expression and genetic variation are not constrained by our limited knowledge. However, the explanatory power and path to clinical translation of risk estimates for common variants reported in genome-wide association studies remain unclear because of the presence of rare and structural genetic variation. As whole genome sequencing becomes increasingly widespread, available genomic information will no longer be the limiting factor in applying genetics to clinical medicine. These approaches provide exciting new avenues to identify new candidate genes for SB genetic studies. The other limitation of genetic association is the lack of a consistent definition of the SB phenotype among studies, an inconsistency that hampers the comparability of the studies and data pooling. In summary, SB involves multiple genes interacting with non-genetic factors. A better understanding of the SB genes, gained by combining whole genome approaches with case-control association studies, may potentially lead to developing effective screening, prevention, and management of SB. Copyright © 2010 Elsevier Inc. All rights reserved.
Influence of sequence and size of DNA on packaging efficiency of parvovirus MVM-based vectors.

PubMed

Brandenburger, A; Coessens, E; El Bakkouri, K; Velu, T

1999-05-01

We have derived a vector from the autonomous parvovirus MVM(p), which expresses human IL-2 specifically in transformed cells (Russell et al., J. Virol. 1992;66:2821-2828). Testing the therapeutic potential of these vectors in vivo requires high-titer stocks. Stocks with a titer of 10^9 can be obtained after concentration and purification (Avalosse et al., J. Virol. Methods 1996;62:179-183), but this method requires large culture volumes and cannot easily be scaled up. We wanted to increase the production of recombinant virus at the initial transfection step. Poor vector titers could be due to inadequate genome amplification or to inefficient packaging. Here we show that intracellular amplification of MVM vector genomes is not the limiting factor for vector production. Several vector genomes of different size and/or structure were amplified to an equal extent. Their amplification was also equivalent to that of a cotransfected wild-type genome. We did not observe any interference between vector and wild-type genomes at the level of DNA amplification. Despite equivalent genome amplification, vector titers varied greatly between the different genomes, presumably owing to differences in packaging efficiency. Genomes with a size close to 100% that of wild type were packaged most efficiently with loss of efficiency at lower and higher sizes. However, certain genomes of identical size showed different packaging efficiencies, illustrating the importance of the DNA sequence, and probably its structure.
Methods for understanding microbial community structures and functions in microbial fuel cells: a review.

PubMed

Zhi, Wei; Ge, Zheng; He, Zhen; Zhang, Husen

2014-11-01

Microbial fuel cells (MFCs) employ microorganisms to recover electric energy from organic matter. However, fundamental knowledge of electrochemically active bacteria is still required to maximize MFCs power output for practical applications. This review presents microbiological and electrochemical techniques to help researchers choose the appropriate methods for the MFCs study. Pre-genomic and genomic techniques such as 16S rRNA based phylogeny and metagenomics have provided important information in the structure and genetic potential of electrode-colonizing microbial communities. Post-genomic techniques such as metatranscriptomics allow functional characterizations of electrode biofilm communities by quantifying gene expression levels. Isotope-assisted phylogenetic analysis can further link taxonomic information to microbial metabolisms. A combination of electrochemical, phylogenetic, metagenomic, and post-metagenomic techniques offers opportunities to a better understanding of the extracellular electron transfer process, which in turn can lead to process optimization for power output. Copyright © 2014 Elsevier Ltd. All rights reserved.
Identification and characterization of a class of MALAT1-like genomic loci

DOE PAGES

Zhang, Bin; Mao, Yuntao S.; Diermeier, Sarah D.; ...

2017-05-23

The MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) gene encodes a noncoding RNA that is processed into a long nuclear retained transcript (MALAT1) and a small cytoplasmic tRNA-like transcript (mascRNA). Using an RNA sequence- and structure-based covariance model, we identified more than 130 genomic loci in vertebrate genomes containing the MALAT1 3' end triple-helix structure and its immediate downstream tRNA-like structure, including 44 in the green lizard Anolis carolinensis. Structural and computational analyses revealed a co-occurrence of components of the 3' end module. MALAT1-like genes in Anolis carolinensis are highly expressed in adult testis, thus we named them testis-abundant long noncoding RNAs (tancRNAs). MALAT1-like loci also produce multiple small RNA species, including PIWI-interacting RNAs (piRNAs), from the antisense strand. The 3' ends of tancRNAs serve as potential targets for the PIWI-piRNA complex. Furthermore, we have identified an evolutionarily conserved class of long noncoding RNAs (lncRNAs) with similar structural constraints, post-transcriptional processing, and subcellular localization and a distinct function in spermatocytes.
The History of Bordetella pertussis Genome Evolution Includes Structural Rearrangement

PubMed Central

Peng, Yanhui; Loparev, Vladimir; Batra, Dhwani; Bowden, Katherine E.; Burroughs, Mark; Cassiday, Pamela K.; Davis, Jamie K.; Johnson, Taccara; Juieng, Phalasy; Knipe, Kristen; Mathis, Marsenia H.; Pruitt, Andrea M.; Rowe, Lori; Sheth, Mili; Tondella, M. Lucia; Williams, Margaret M.

2017-01-01

ABSTRACT Despite high pertussis vaccine coverage, reported cases of whooping cough (pertussis) have increased over the last decade in the United States and other developed countries. Although Bordetella pertussis is well known for its limited gene sequence variation, recent advances in long-read sequencing technology have begun to reveal genomic structural heterogeneity among otherwise indistinguishable isolates, even within geographically or temporally defined epidemics. We have compared rearrangements among complete genome assemblies from 257 B. pertussis isolates to examine the potential evolution of the chromosomal structure in a pathogen with minimal gene nucleotide sequence diversity. Discrete changes in gene order were identified that differentiated genomes from vaccine reference strains and clinical isolates of various genotypes, frequently along phylogenetic boundaries defined by single nucleotide polymorphisms. The observed rearrangements were primarily large inversions centered on the replication origin or terminus and flanked by IS481, a mobile genetic element with >240 copies per genome and previously suspected to mediate rearrangements and deletions by homologous recombination. These data illustrate that structural genome evolution in B. pertussis is not limited to reduction but also includes rearrangement. Therefore, although genomes of clinical isolates are structurally diverse, specific changes in gene order are conserved, perhaps due to positive selection, providing novel information for investigating disease resurgence and molecular epidemiology. IMPORTANCE Whooping cough, primarily caused by Bordetella pertussis, has resurged in the United States even though the coverage with pertussis-containing vaccines remains high. The rise in reported cases has included increased disease rates among all vaccinated age groups, provoking questions about the pathogen's evolution. The chromosome of B. pertussis includes a large number of repetitive mobile genetic elements that obstruct genome analysis. However, these mobile elements facilitate large rearrangements that alter the order and orientation of essential protein-encoding genes, which otherwise exhibit little nucleotide sequence diversity. By comparing the complete genome assemblies from 257 isolates, we show that specific rearrangements have been conserved throughout recent evolutionary history, perhaps by eliciting changes in gene expression, which may also provide useful information for molecular epidemiology. PMID:28167525
Complex genomic rearrangement in CCS-LacZ transgenic mice.

PubMed

Stroud, Dina Myers; Darrow, Bruce J; Kim, Sang Do; Zhang, Jie; Jongbloed, Monique R M; Rentschler, Stacey; Moskowitz, Ivan P G; Seidman, Jonathan; Fishman, Glenn I

2007-02-01

The cardiac conduction system (CCS)-lacZ insertional mouse mutant strain genetically labels the developing and mature CCS. This pattern of expression is presumed to reflect the site of transgene integration rather than regulatory elements within the transgene proper. We sought to characterize the genomic structure of the integration locus and identify nearby gene(s) that might potentially confer the observed CCS-specific transcription. We found rearrangement of chromosome 7 between regions D1 and E1 with altered transcription of multiple genes in the D1 region. Several lines of evidence suggested that regulatory elements from at least one gene, Slco3A1, influenced CCS-restricted reporter gene expression. In embryonic hearts, Slco3A1 was expressed in a spatial pattern similar to the CCS-lacZ transgene and was similarly neuregulin-responsive. At later stages, however, expression patterns of the transgene and Slco3A1 diverged, suggesting that the Slco3A1 locus may be necessary, but not sufficient to confer CCS-specific transgene expression in the CCS-lacZ line. (c) 2007 Wiley-Liss, Inc.
Genome-wide analysis of TCP family in tobacco.

PubMed

Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H

2016-05-23

The TCP family is a transcription factor family, members of which are extensively involved in plant growth and development as well as in signal transduction in response to many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed for predicting and analyzing the gene structure, gene expression, phylogenetic relationships, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP also demonstrated that the majority of these genes play important roles in all the tissues, while some special genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.

Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast

PubMed Central

Jeffares, Daniel C.; Jolly, Clemency; Hoti, Mimoza; Speed, Doug; Shaw, Liam; Rallis, Charalampos; Balloux, Francois; Dessimoz, Christophe; Bähler, Jürg; Sedlazeck, Fritz J.

2017-01-01

Large structural variations (SVs) within genomes are more challenging to identify than smaller genetic variants but may substantially contribute to phenotypic diversity and evolution. We analyse the effects of SVs on gene expression, quantitative traits and intrinsic reproductive isolation in the yeast Schizosaccharomyces pombe. We establish a high-quality curated catalogue of SVs in the genomes of a worldwide library of S. pombe strains, including duplications, deletions, inversions and translocations. We show that copy number variants (CNVs) show a variety of genetic signals consistent with rapid turnover. These transient CNVs produce stoichiometric effects on gene expression both within and outside the duplicated regions. CNVs make substantial contributions to quantitative traits, most notably intracellular amino acid concentrations, growth under stress and sugar utilization in winemaking, whereas rearrangements are strongly associated with reproductive isolation. Collectively, these findings have broad implications for evolution and for our understanding of quantitative traits including complex human diseases. PMID:28117401
Unusual DNA Structures Associated With Germline Genetic Activity in Caenorhabditis elegans

PubMed Central

Fire, Andrew; Alcazar, Rosa; Tan, Frederick

2006-01-01

We describe a surprising long-range periodicity that underlies a substantial fraction of C. elegans genomic sequence. Extended segments (up to several hundred nucleotides) of the C. elegans genome show a strong bias toward occurrence of AA/TT dinucleotides along one face of the helix while little or no such constraint is evident on the opposite helical face. Segments with this characteristic periodicity are highly overrepresented in intron sequences and are associated with a large fraction of genes with known germline expression in C. elegans. In addition to altering the path and flexibility of DNA in vitro, sequences of this character have been shown by others to constrain DNA::nucleosome interactions, potentially producing a structure that could resist the assembly of highly ordered (phased) nucleosome arrays that have been proposed as a precursor to heterochromatin. We propose a number of ways that the periodic occurrence of An/Tn clusters could reflect evolution and function of genes that express in the germ cell lineage of C. elegans. PMID:16648589
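A bias toward AA/TT dinucleotides on one face of the helix implies that these dinucleotides recur at spacings near multiples of the helical repeat. One simple way to look for such a signal is to histogram the distances between AA/TT occurrences. The Python sketch below is an illustration only, not the authors' method; the function names, the 30-nt window, and the toy sequence with an exact 10-nt period are assumptions.

```python
from collections import Counter


def aa_tt_positions(seq):
    """Start positions of every AA or TT dinucleotide (overlaps included)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in ("AA", "TT")]


def spacing_histogram(seq, max_distance=30):
    """Count pairs of AA/TT dinucleotides separated by each distance up to max_distance."""
    positions = aa_tt_positions(seq)
    counts = Counter()
    for idx, p in enumerate(positions):
        for q in positions[idx + 1:]:
            d = q - p
            if d > max_distance:
                break  # positions are ascending, so later ones are farther still
            counts[d] += 1
    return counts


# Toy sequence: one AA every 10 nt, separated by GC-only filler (hypothetical).
toy = ("AA" + "GCGCGCGC") * 5
hist = spacing_histogram(toy)
```

In the toy sequence all counts fall at distances 10, 20 and 30; in real genomic sequence one would instead look for modest enrichment around multiples of roughly 10 nt against a background of other spacings.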
Genome-Wide Analysis of the Sucrose Synthase Gene Family in Grape (Vitis vinifera): Structure, Evolution, and Expression Profiles

PubMed Central

Zhu, Xudong; Wang, Mengqi; Li, Xiaopeng; Jiu, Songtao; Wang, Chen; Fang, Jinggui

2017-01-01

Sucrose synthase (SS) is widely considered as the key enzyme involved in the plant sugar metabolism that is critical to plant growth and development, especially quality of the fruit. The members of SS gene family have been identified and characterized in multiple plant genomes. However, detailed information about this gene family is lacking in grapevine (Vitis vinifera L.). In this study, we performed a systematic analysis of the grape (V. vinifera) genome and reported that there are five SS genes (VvSS1–5) in the grape genome. Comparison of the structures of grape SS genes showed high structural conservation of grape SS genes, resulting from the selection pressures during the evolutionary process. The segmental duplication of grape SS genes contributed to this gene family expansion. The syntenic analyses between grape and soybean (Glycine max) demonstrated that these genes located in corresponding syntenic blocks arose before the divergence of grape and soybean. Phylogenetic analysis revealed distinct evolutionary paths for the grape SS genes. VvSS1/VvSS5, VvSS2/VvSS3 and VvSS4 originated from three ancient SS genes, which were generated by duplication events before the split of monocots and eudicots. Bioinformatics analysis of publicly available microarray data, which was validated by quantitative real-time reverse transcription PCR (qRT-PCR), revealed distinct temporal and spatial expression patterns of VvSS genes in various tissues, organs and developmental stages, as well as in response to biotic and abiotic stresses. Taken together, our results will be beneficial for further investigations into the functions of SS gene in the processes of grape resistance to environmental stresses. PMID:28350372
Systematic gene tagging using CRISPR/Cas9 in human stem cells to illuminate cell organization.

PubMed

Roberts, Brock; Haupt, Amanda; Tucker, Andrew; Grancharova, Tanya; Arakaki, Joy; Fuqua, Margaret A; Nelson, Angelique; Hookway, Caroline; Ludmann, Susan A; Mueller, Irina A; Yang, Ruian; Horwitz, Rick; Rafelski, Susanne M; Gunawardane, Ruwanthi N

2017-10-15

We present a CRISPR/Cas9 genome-editing strategy to systematically tag endogenous proteins with fluorescent tags in human induced pluripotent stem cells (hiPSC). To date, we have generated multiple hiPSC lines with monoallelic green fluorescent protein tags labeling 10 proteins representing major cellular structures. The tagged proteins include alpha tubulin, beta actin, desmoplakin, fibrillarin, nuclear lamin B1, nonmuscle myosin heavy chain IIB, paxillin, Sec61 beta, tight junction protein ZO1, and Tom20. Our genome-editing methodology using Cas9/crRNA ribonuclear protein and donor plasmid coelectroporation, followed by fluorescence-based enrichment of edited cells, typically resulted in <0.1-4% homology-directed repair (HDR). Twenty-five percent of clones generated from each edited population were precisely edited. Furthermore, 92% (36/39) of expanded clonal lines displayed robust morphology, genomic stability, expression and localization of the tagged protein to the appropriate subcellular structure, pluripotency-marker expression, and multilineage differentiation. It is our conclusion that, if cell lines are confirmed to harbor an appropriate gene edit, pluripotency, differentiation potential, and genomic stability are typically maintained during the clonal line-generation process. The data described here reveal general trends that emerged from this systematic gene-tagging approach. Final clonal lines corresponding to each of the 10 cellular structures are now available to the research community. © 2017 Roberts, Haupt, et al. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Molecular and phylogenetic characterization of the homoeologous EPSP Synthase genes of allohexaploid wheat, Triticum aestivum (L.).

PubMed

Aramrak, Attawan; Kidwell, Kimberlee K; Steber, Camille M; Burke, Ian C

2015-10-23

5-Enolpyruvylshikimate-3-phosphate synthase (EPSPS) is the sixth and penultimate enzyme in the shikimate biosynthesis pathway, and is the target of the herbicide glyphosate. The EPSPS genes of allohexaploid wheat (Triticum aestivum, AABBDD) have not been well characterized. Herein, the three homoeologous copies of the allohexaploid wheat EPSPS gene were cloned and characterized. Genomic and coding DNA sequences of EPSPS from the three related genomes of allohexaploid wheat were isolated using PCR and inverse PCR approaches from soft white spring "Louise". Development of genome-specific primers allowed the mapping and expression analysis of TaEPSPS-7A1, TaEPSPS-7D1, and TaEPSPS-4A1 on chromosomes 7A, 7D, and 4A, respectively. Sequence alignments of cDNA sequences from wheat and wheat relatives served as a basis for phylogenetic analysis. The three genomic copies of wheat EPSPS differed by insertion/deletion and single nucleotide polymorphisms (SNPs), largely in intron sequences. RT-PCR analysis and cDNA cloning revealed that EPSPS is expressed from all three genomic copies. However, TaEPSPS-4A1 is expressed at much lower levels than TaEPSPS-7A1 and TaEPSPS-7D1 in wheat seedlings. Phylogenetic analysis of 1190-bp cDNA clones from wheat and wheat relatives revealed that: 1) TaEPSPS-7A1 is most similar to EPSPS from the tetraploid AB genome donor, T. turgidum (99.7 % identity); 2) TaEPSPS-7D1 most resembles EPSPS from the diploid D genome donor, Aegilops tauschii (100 % identity); and 3) TaEPSPS-4A1 resembles EPSPS from the diploid B genome relative, Ae. speltoides (97.7 % identity). Thus, EPSPS sequences in allohexaploid wheat are preserved from the two most recent ancestors. The wheat EPSPS genes are more closely related to Lolium multiflorum and Brachypodium distachyon than to Oryza sativa (rice).
The three related EPSPS homoeologues of wheat exhibited conservation of the exon/intron structure and of coding region sequence, but contained significant sequence variation within intron regions. The genome-specific primers developed will enable future characterization of natural and induced variation in EPSPS sequence and expression. This can be useful in investigating new causes of glyphosate herbicide resistance.
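The percent-identity figures quoted in this record can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length, with "-" marking gaps; the function name and the choice to skip gapped columns are illustrative assumptions, not the authors' method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Hypothetical helper for illustration; real tools differ in how they
    treat gaps and ambiguous bases.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a base.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy alignment: 3 of 4 ungapped columns match.
    print(percent_identity("ACGT", "ACGA"))
```

Under this definition, and if all 1190 positions of a clone were compared, 99.7 % identity would correspond to roughly three or four differing bases.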
Genome-Wide Identification, Evolutionary Expansion, and Expression Profile of Homeodomain-Leucine Zipper Gene Family in Poplar (Populus trichocarpa)

PubMed Central

Hu, Ruibo; Chi, Xiaoyuan; Chai, Guohua; Kong, Yingzhen; He, Guo; Wang, Xiaoyu; Shi, Dachuan; Zhang, Dongyuan; Zhou, Gongke

2012-01-01

Background Homeodomain-leucine zipper (HD-ZIP) proteins are plant-specific transcriptional factors known to play crucial roles in plant development. Although sequence phylogeny analysis of Populus HD-ZIPs was carried out in a previous study, no systematic analysis incorporating genome organization, gene structure, and expression compendium has been conducted in model tree species Populus thus far. Principal Findings In this study, a comprehensive analysis of Populus HD-ZIP gene family was performed. Sixty-three full-length HD-ZIP genes were found in Populus genome. These Populus HD-ZIP genes were phylogenetically clustered into four distinct subfamilies (HD-ZIP I–IV) and predominately distributed across 17 linkage groups (LG). Fifty genes from 25 Populus paralogous pairs were located in the duplicated blocks of Populus genome and then preferentially retained during the sequential evolutionary courses. Genomic organization analyses indicated that purifying selection has played a pivotal role in the retention and maintenance of Populus HD-ZIP gene family. Microarray analysis has shown that 21 Populus paralogous pairs have been differentially expressed across different tissues and under various stresses, with five paralogous pairs showing nearly identical expression patterns, 13 paralogous pairs being partially redundant and three paralogous pairs diversifying significantly. Quantitative real-time RT-PCR (qRT-PCR) analysis performed on 16 selected Populus HD-ZIP genes in different tissues and under both drought and salinity stresses confirms their tissue-specific and stress-inducible expression patterns. Conclusions Genomic organizations indicated that segmental duplications contributed significantly to the expansion of Populus HD-ZIP gene family. Exon/intron organization and conserved motif composition of Populus HD-ZIPs are highly conservative in the same subfamily, suggesting the members in the same subfamilies may also have conservative functionalities. 
Microarray and qRT-PCR analyses showed that 89% (56 out of 63) of Populus HD-ZIPs were duplicate genes that might have been retained by substantial subfunctionalization. Taken together, these observations may lay the foundation for future functional analysis of Populus HD-ZIP genes to unravel their biological roles. PMID:22359569
The genome- and transcriptome-wide analysis of innate immunity in the brown planthopper, Nilaparvata lugens

PubMed Central

2013-01-01

Background The brown planthopper (Nilaparvata lugens) is one of the most serious rice plant pests in Asia. N. lugens causes extensive rice damage by sucking rice phloem sap, which results in stunted plant growth and the transmission of plant viruses. Despite the importance of this insect pest, little is known about the immunological mechanisms occurring in this hemimetabolous insect species. Results In this study, we performed a genome- and transcriptome-wide analysis aiming at the immune-related genes. The transcriptome datasets include the N. lugens intestine, the developmental stage, wing formation, and sex-specific expression information that provided useful gene expression sequence data for the genome-wide analysis. As a result, we identified a large number of genes encoding N. lugens pattern recognition proteins, modulation proteins in the prophenoloxidase (proPO) activating cascade, immune effectors, and the signal transduction molecules involved in the immune pathways, including the Toll, Immune deficiency (Imd) and Janus kinase signal transducers and activators of transcription (JAK-STAT) pathways. The genome scale analysis revealed detailed information of the gene structure, distribution and transcription orientations in scaffolds. A comparison of the genome-available hemimetabolous and metabolous insect species indicates the differences in the immune-related gene constitution. We investigated the gene expression profiles with regard to how they responded to bacterial infections and tissue, as well as development and sex expression specificity. Conclusions The genome- and transcriptome-wide analysis of immune-related genes including pattern recognition and modulation molecules, immune effectors, and the signal transduction molecules involved in the immune pathways is an important step in determining the overall architecture and functional network of the immune components in N. lugens.
Our findings provide the comprehensive gene sequence resource and expression profiles of the immune-related genes of N. lugens, which could facilitate the understanding of the innate immune mechanisms in the hemimetabolous insect species. These data give insight into clarifying the potential functional roles of the immune-related genes involved in the biological processes of development, reproduction, and virus transmission in N. lugens. PMID:23497397
Genome-Wide Posttranscriptional Dysregulation by MicroRNAs in Human Asthma as Revealed by Frac-seq.

PubMed

Martinez-Nunez, Rocio T; Rupani, Hitasha; Platé, Manuela; Niranjan, Mahesan; Chambers, Rachel C; Howarth, Peter H; Sanchez-Elsner, Tilman

2018-05-16

MicroRNAs are small noncoding RNAs that inhibit gene expression posttranscriptionally, implicated in virtually all biological processes. Although the effect of individual microRNAs is generally studied, the genome-wide role of multiple microRNAs is less investigated. We assessed paired genome-wide expression of microRNAs with total (cytoplasmic) and translational (polyribosome-bound) mRNA levels employing subcellular fractionation and RNA sequencing (Frac-seq) in human primary bronchoepithelium from healthy controls and severe asthmatics. Severe asthma is a chronic inflammatory disease of the airways characterized by poor response to therapy. We found genes (i.e., isoforms of a gene) and mRNA isoforms differentially expressed in asthma, with novel inflammatory and structural pathophysiological mechanisms related to bronchoepithelium disclosed solely by polyribosome-bound mRNAs (e.g., IL1A and LTB genes or ITGA6 and ITGA2 alternatively spliced isoforms). Gene expression (i.e., isoforms of a gene) and mRNA expression analysis revealed different molecular candidates and biological pathways, with differentially expressed polyribosome-bound and total mRNAs also showing little overlap. We reveal a hub of six dysregulated microRNAs accounting for ∼90% of all microRNA targeting, displaying preference for polyribosome-bound mRNAs. Transfection of this hub in bronchial epithelial cells from healthy donors mimicked asthma characteristics. Our work demonstrates extensive posttranscriptional gene dysregulation in human asthma, in which microRNAs play a central role, illustrating the feasibility and importance of assessing posttranscriptional gene expression when investigating human disease. Copyright © 2018 by The American Association of Immunologists, Inc.
Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell.

PubMed

Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

2016-03-01

The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Transcriptional regulation of FoxO3 gene by glucocorticoids in murine myotubes

PubMed Central

Kuo, Taiyi; Liu, Patty H.; Chen, Tzu-Chieh; Lee, Rebecca A.; New, Jenny; Zhang, Danyun; Lei, Cassandra; Chau, Andy; Tang, Yicheng; Cheung, Edna

2016-01-01

Glucocorticoids and FoxO3 exert similar metabolic effects in skeletal muscle. FoxO3 gene expression was increased by dexamethasone (Dex), a synthetic glucocorticoid, both in vitro and in vivo. In C2C12 myotubes the increased expression is due to, at least in part, the elevated rate of FoxO3 gene transcription. In the mouse FoxO3 gene, we identified three glucocorticoid receptor (GR) binding regions (GBRs): one being upstream of the transcription start site, −17kbGBR; and two in introns, +45kbGBR and +71kbGBR. Together, these three GBRs contain four 15-bp glucocorticoid response elements (GREs). Micrococcal nuclease (MNase) assay revealed that Dex treatment increased the sensitivity to MNase in the GRE of +45kbGBR and +71kbGBR upon 30- and 60-min Dex treatment, respectively. Conversely, Dex treatment did not affect the chromatin structure near the −17kbGBR, in which the GRE is located in the linker region. Dex treatment also increased histone H3 and/or H4 acetylation in genomic regions near all three GBRs. Moreover, using chromatin conformation capture (3C) assay, we showed that Dex treatment increased the interaction between the −17kbGBR and two genomic regions: one located around +500 bp and the other around +73 kb. Finally, the transcriptional coregulator p300 was recruited to all three GBRs upon Dex treatment. The reduction of p300 expression decreased FoxO3 gene expression and Dex-stimulated interaction between distinct genomic regions of FoxO3 gene identified by 3C. Overall, our results demonstrate that glucocorticoids activated FoxO3 gene transcription through multiple GREs by chromatin structural change and DNA looping. PMID:26758684
Comparative genomic and proteomic analyses of Clostridium acetobutylicum Rh8 and its parent strain DSM 1731 revealed new understandings on butanol tolerance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bao, Guanhui; University of Chinese Academy of Sciences, Beijing; Dong, Hongjun

Highlights: • Genomes of a butanol-tolerant strain and its parent strain were deciphered. • Comparative genomic and proteomic analyses were applied to understand butanol tolerance. • None of the differentially expressed proteins has mutations in its corresponding gene. • Mutations in ribosomal genes might be responsible for the global proteomic differences. - Abstract: Clostridium acetobutylicum strain Rh8 is a butanol-tolerant mutant which can tolerate up to 19 g/L butanol, 46% higher than that of its parent strain DSM 1731. We previously performed comparative cytoplasm- and membrane-proteomic analyses to understand the mechanism underlying the improved butanol tolerance of strain Rh8. In this work, we further extended this comparison to the genomic level. Compared with the genome of the parent strain DSM 1731, two insertion sites, four deletion sites, and 67 single nucleotide variations (SNVs) are distributed throughout the genome of strain Rh8. Among the 67 SNVs, 16 SNVs are located in the predicted promoters and intergenic regions, while 29 SNVs are located in the coding sequence, affecting a total of 21 proteins involved in transport, cell structure, DNA replication, and protein translation. The remaining 22 SNVs are located in the ribosomal genes, affecting a total of 12 rRNA genes in different operons. Analysis of previous comparative proteomic data indicated that none of the differentially expressed proteins has mutations in its corresponding gene. Analysis with the Rchange algorithm indicated that the mutations in the ribosomal genes might change the thermodynamic characteristics of the ribosomal RNA and thus affect the translation strength of these proteins. Taken together, the improved butanol tolerance of C. acetobutylicum strain Rh8 might be acquired through regulating the translational process to achieve different expression strength of genes involved in butanol tolerance.
Genome-wide comparative analysis of DNA methylation between soybean cytoplasmic male-sterile line NJCMS5A and its maintainer NJCMS5B.

PubMed

Li, Yanwei; Ding, Xianlong; Wang, Xuan; He, Tingting; Zhang, Hao; Yang, Longshu; Wang, Tanliu; Chen, Linfeng; Gai, Junyi; Yang, Shouping

2017-08-10

DNA methylation is an important epigenetic modification. It can regulate the expression of many key genes without changing the primary structure of the genomic DNA, and plays a vital role in the growth and development of the organism. The genome-wide DNA methylation profile of the cytoplasmic male sterile (CMS) line in soybean has not been reported so far. In this study, genome-wide comparative analysis of DNA methylation between soybean CMS line NJCMS5A and its maintainer NJCMS5B was conducted by whole-genome bisulfite sequencing. The analysis identified 3527 differentially methylated regions (DMRs) and 485 differentially methylated genes (DMGs), including 353 high-credibility methylated genes, 56 methylated genes encoding unknown proteins and 76 novel methylated genes with no known function. Among them, 25 DMRs were further examined by bisulfite treatment, which validated the reliability of the genome-wide DNA methylation data, and for 9 DMRs the relationship between DNA methylation and gene expression was confirmed by qRT-PCR. Finally, 8 key DMGs possibly associated with soybean CMS were identified. Genome-wide DNA methylation profile of the soybean CMS line NJCMS5A and its maintainer NJCMS5B was obtained for the first time. Several specific DMGs which participated in pollen and flower development were further identified to be probably associated with soybean CMS. This study will contribute to further understanding of the molecular mechanism behind soybean CMS.
A rapidly evolving secretome builds and patterns a sea shell

PubMed Central

Jackson, Daniel J; McDougall, Carmel; Green, Kathryn; Simpson, Fiona; Wörheide, Gert; Degnan, Bernard M

2006-01-01

Background Instructions to fabricate mineralized structures with distinct nanoscale architectures, such as seashells and coral and vertebrate skeletons, are encoded in the genomes of a wide variety of animals. In mollusks, the mantle is responsible for the extracellular production of the shell, directing the ordered biomineralization of CaCO3 and the deposition of architectural and color patterns. The evolutionary origins of the ability to synthesize calcified structures across various metazoan taxa remain obscure, with only a small number of protein families identified from molluskan shells. The recent sequencing of a wide range of metazoan genomes coupled with the analysis of gene expression in non-model animals has allowed us to investigate the evolution and process of biomineralization in gastropod mollusks. Results Here we show that over 25% of the genes expressed in the mantle of the vetigastropod Haliotis asinina encode secreted proteins, indicating that hundreds of proteins are likely to be contributing to shell fabrication and patterning. Almost 85% of the secretome encodes novel proteins; remarkably, only 19% of these have identifiable homologues in the full genome of the patellogastropod Lottia scutum. The spatial expression profiles of mantle genes that belong to the secretome are restricted to discrete mantle zones, with each zone responsible for the fabrication of one of the structural layers of the shell. Patterned expression of a subset of genes along the length of the mantle is indicative of roles in shell ornamentation. For example, Has-sometsuke maps precisely to pigmentation patterns in the shell, providing the first case of a gene product to be involved in molluskan shell pigmentation. We also describe the expression of two novel genes involved in nacre (mother of pearl) deposition.
Conclusion The unexpected complexity and evolvability of this secretome and the modular design of the molluskan mantle enables diversification of shell strength and design, and as such must contribute to the variety of adaptive architectures and colors found in mollusk shells. The composition of this novel mantle-specific secretome suggests that there are significant molecular differences in the ways in which gastropods synthesize their shells. PMID:17121673
Analysis of Strand-Specific RNA-Seq Data Using Machine Learning Reveals the Structures of Transcription Units in Clostridium thermocellum

DOE PAGES

Chou, Wen-Chi; Ma, Qin; Yang, Shihui; ...

2015-03-12

The identification of transcription units (TUs) encoded in a bacterial genome is essential to elucidation of transcriptional regulation of the organism. To gain a detailed understanding of the dynamically composed TU structures, we have used four strand-specific RNA-seq (ssRNA-seq) datasets collected under two experimental conditions to derive the genomic TU organization of Clostridium thermocellum using a machine-learning approach. Our method accurately predicted the genomic boundaries of individual TUs based on two sets of parameters measuring the RNA-seq expression patterns across the genome: expression-level continuity and variance. A total of 2590 distinct TUs are predicted based on the four RNA-seq datasets. Moreover, among the predicted TUs, 44% have multiple genes. We assessed our prediction method on an independent set of RNA-seq data with longer reads. The evaluation confirmed the high quality of the predicted TUs. Functional enrichment analyses on a selected subset of the predicted TUs revealed interesting biology. To demonstrate the generality of the prediction method, we have also applied the method to RNA-seq data collected on Escherichia coli and achieved high prediction accuracies. The TU prediction program named SeqTU is publicly available at https://code.google.com/p/seqtu/. We expect that the predicted TUs can serve as the baseline information for studying transcriptional and post-transcriptional regulation in C. thermocellum and other bacteria.
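The two kinds of parameters named in this abstract, expression-level continuity and variance, can be illustrated with a small sketch. This is not the SeqTU implementation: the function name, the half-open coordinate convention, and the specific definitions of both features are assumptions chosen for clarity.

```python
from statistics import mean, pvariance


def gap_features(coverage, gene_a, gene_b):
    """Return (continuity, variance) for two adjacent same-strand genes.

    coverage: list of per-base read depths on one strand (hypothetical input).
    gene_a, gene_b: (start, end) half-open intervals, gene_a upstream of gene_b.
    """
    a_start, a_end = gene_a
    b_start, b_end = gene_b
    if not (0 <= a_start < a_end <= b_start < b_end <= len(coverage)):
        raise ValueError("genes must be ordered, non-overlapping and in range")

    gap = coverage[a_end:b_start]
    flank = mean(coverage[a_start:a_end] + coverage[b_start:b_end])

    # Continuity (assumed definition): lowest depth in the intergenic gap
    # relative to the mean depth of the two genes, capped at 1.0.
    if flank == 0:
        continuity = 0.0
    else:
        lowest = min(gap) if gap else flank
        continuity = min(1.0, lowest / flank)

    # Variance (assumed definition): population variance of depth across
    # the whole span from the start of gene_a to the end of gene_b.
    variance = pvariance(coverage[a_start:b_end])
    return continuity, variance


if __name__ == "__main__":
    depth = [10] * 4 + [9] * 2 + [10] * 4
    print(gap_features(depth, (0, 4), (6, 10)))
```

High continuity with low variance would suggest that two genes are read through as one transcript; features of this kind could then be passed to a classifier, which is the role the machine-learning step plays in the abstract.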
Genome-wide identification of WRKY transcription factors in kiwifruit (Actinidia spp.) and analysis of WRKY expression in responses to biotic and abiotic stresses.

PubMed

Jing, Zhaobin; Liu, Zhande

2018-04-01

As one of the largest transcriptional factor families in plants, WRKY transcription factors play important roles in various biotic and abiotic stress responses. To date, WRKY genes in kiwifruit (Actinidia spp.) remain poorly understood. In our study, a total of 97 AcWRKY genes have been identified in the kiwifruit genome. An overview of these AcWRKY genes is presented, including the phylogenetic relationships, exon-intron structures, synteny and expression profiles. The 97 AcWRKY genes were divided into three groups based on the conserved WRKY domain. Synteny analysis indicated that segmental duplication events contributed to the expansion of the kiwifruit AcWRKY family. In addition, the synteny analysis between kiwifruit and Arabidopsis suggested that some of the AcWRKY genes were derived from common ancestors before the divergence of these two species. Conserved motifs outside the AcWRKY domain may reflect their functional conservation. Genome-wide segmental and tandem duplications were found, which may contribute to the expansion of AcWRKY genes. Furthermore, the analysis of selected AcWRKY genes showed a variety of expression patterns in five different organs as well as during biotic and abiotic stresses. The genome-wide identification and characterization of kiwifruit WRKY transcription factors provides insight into the evolutionary history and is a useful resource for further functional analyses of kiwifruit.
Construction of Pseudomolecule Sequences of the aus Rice Cultivar Kasalath for Comparative Genomics of Asian Cultivated Rice

PubMed Central

Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong

2014-01-01

Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372
3D chromosome rendering from Hi-C data using virtual reality

NASA Astrophysics Data System (ADS)

Zhu, Yixin; Selvaraj, Siddarth; Weber, Philip; Fang, Jennifer; Schulze, Jürgen P.; Ren, Bing

2015-01-01

Most genome browsers display DNA linearly, using single-dimensional depictions that are useful to examine certain epigenetic mechanisms such as DNA methylation. However, these representations are insufficient to visualize intrachromosomal interactions and relationships between distal genome features. Relationships between DNA regions may be difficult to decipher or missed entirely if those regions are distant in one dimension but could be spatially proximal when mapped to three-dimensional space. For example, the visualization of enhancers folding over genes is only fully expressed in three-dimensional space. Thus, to accurately understand DNA behavior during gene expression, a means to model chromosomes is essential. Using coordinates generated from Hi-C interaction frequency data, we have created interactive 3D models of whole chromosome structures and their respective domains. We have also rendered information on genomic features such as genes, CTCF binding sites, and enhancers. The goal of this article is to present the procedure, findings, and conclusions of our models and renderings.
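The point that regions far apart in one dimension can be close in three can be made concrete with a short sketch. It assumes a chromosome model given as one (x, y, z) coordinate per consecutive genomic bin; the function name, the bin size, and the toy coordinates are hypothetical and are not taken from the article.

```python
import math


def linear_vs_spatial(bins, i, j, bin_size_bp):
    """Compare linear and 3D separation of two genomic bins.

    bins: list of (x, y, z) coordinates, one per consecutive bin
          (hypothetical output of a Hi-C based 3D reconstruction).
    Returns (linear distance in bp, Euclidean distance in model units).
    """
    linear_bp = abs(j - i) * bin_size_bp
    spatial = math.dist(bins[i], bins[j])  # requires Python 3.8+
    return linear_bp, spatial


if __name__ == "__main__":
    # Toy model: four bins traced around a unit square, so the first and
    # last bins are three bins apart on the sequence but adjacent in space.
    model = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
    print(linear_vs_spatial(model, 0, 3, 1_000_000))
```

In the toy model the first and last bins are 3 Mb apart on the linear sequence yet only one model unit apart in space, which is the kind of relationship a linear genome browser cannot show.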
Chromatin Configuration Determines Cell Responses to Hormone Stimuli | Center for Cancer Research

Cancer.gov

Ever since selective gene expression was established as the central driver of cell behavior, researchers have been working to understand the forces that control gene transcription. Aberrant gene expression can cause or promote many diseases, including cancer, and alterations in gene expression are the goal of many therapeutic agents. Recent work has focused on the potential role of chromatin structure as a contributor to gene regulation. Chromatin can exist in a tightly packed/inaccessible or loose/accessible configuration depending on the interactions between DNA and its associated proteins. Patterns of chromatin structure can differ between cell types and can also change within cells in response to certain signals. Cancer researchers are particularly interested in the role of chromatin in gene regulation because many of the genomic regions found to be associated with cancer risk are in open chromatin structures.
Parasitism and the retrotransposon life cycle in plants: a hitchhiker's guide to the genome.

PubMed

Sabot, F; Schulman, A H

2006-12-01

LTR (long terminal repeat) retrotransposons are the main components of higher plant genomic DNA. They have shaped their host genomes through insertional mutagenesis and by effects on genome size, gene expression and recombination. These Class I transposable elements are closely related to retroviruses such as HIV in their structure and presumptive life cycle. However, the retrotransposon life cycle has been closely investigated in few systems. For retroviruses and retrotransposons, individual defective copies can parasitize the activity of functional ones. However, some LTR retrotransposon groups as a whole, such as large retrotransposon derivatives and terminal repeats in miniature, are non-autonomous even though their genomic insertion patterns remain polymorphic between organismal accessions. Here, we examine what is known of the retrotransposon life cycle in plants, and in that context discuss the role of parasitism and complementation between and within retrotransposon groups.
NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data

PubMed Central

Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

2016-01-01

The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data. PMID:26848255

openSputnik--a database to ESTablish comparative plant genomics using unsaturated sequence collections.

PubMed

Rudd, Stephen

2005-01-01

The public expressed sequence tag collections are continually being enriched with high-quality sequences that represent an ever-expanding range of taxonomically diverse plant species. While these sequence collections provide biased insight into the populations of expressed genes available within individual species and their associated tissues, the information is conceivably of wider relevance in a comparative context. When we consider the available expressed sequence tag (EST) collections of summer 2004, most of the major plant taxonomic clades are at least superficially represented. Investigation of the five million available plant ESTs provides a wealth of information that has applications in modelling the routes of plant genome evolution and the identification of lineage-specific genes and gene families. Over four million ESTs from over 50 distinct plant species have been collated within an EST analysis pipeline called openSputnik. The ESTs were resolved down into approximately one million unigene sequences. These have been annotated using orthology-based annotation transfer from reference plant genomes and using a variety of contemporary bioinformatics methods to assign peptide, structural and functional attributes. The openSputnik database is available at http://sputnik.btk.fi.
Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome

PubMed Central

Lan, Tianying; Renner, Tanya; Ibarra-Laclette, Enrique; Farr, Kimberly M.; Chang, Tien-Hao; Cervantes-Pérez, Sergio Alan; Zheng, Chunfang; Sankoff, David; Tang, Haibao; Purbojati, Rikky W.; Putra, Alexander; Drautz-Moses, Daniela I.; Schuster, Stephan C.; Herrera-Estrella, Luis; Albert, Victor A.

2017-01-01

Utricularia gibba, the humped bladderwort, is a carnivorous plant that retains a tiny nuclear genome despite at least two rounds of whole genome duplication (WGD) since common ancestry with grapevine and other species. We used a third-generation genome assembly with several complete chromosomes to reconstruct the two most recent lineage-specific ancestral genomes that led to the modern U. gibba genome structure. Patterns of subgenome dominance in the most recent WGD, both architectural and transcriptional, are suggestive of allopolyploidization, which may have generated genomic novelty and led to instantaneous speciation. Syntenic duplicates retained in polyploid blocks are enriched for transcription factor functions, whereas gene copies derived from ongoing tandem duplication events are enriched in metabolic functions potentially important for a carnivorous plant. Among these are tandem arrays of cysteine protease genes with trap-specific expression that evolved within a protein family known to be useful in the digestion of animal prey. Further enriched functions among tandem duplicates (also with trap-enhanced expression) include peptide transport (intercellular movement of broken-down prey proteins), ATPase activities (bladder-trap acidification and transmembrane nutrient transport), hydrolase and chitinase activities (breakdown of prey polysaccharides), and cell-wall dynamic components possibly associated with active bladder movements. Whereas independently polyploid Arabidopsis syntenic gene duplicates are similarly enriched for transcriptional regulatory activities, Arabidopsis tandems are distinct from those of U. gibba, while still metabolic and likely reflecting unique adaptations of that species. Taken together, these findings highlight the special importance of tandem duplications in the adaptive landscapes of a carnivorous plant genome. PMID:28507139
Mitochondrial genome-maintaining activity of mouse mitochondrial transcription factor A and its transcript isoform in Saccharomyces cerevisiae.

PubMed

Yoon, Young Geol; Koob, Michael D; Yoo, Young Hyun

2011-09-15

Mitochondrial transcription factor A (Tfam) binds to and organizes the mitochondrial DNA (mtDNA) genome into a mitochondrial nucleoid (mt-nucleoid) structure, which is necessary for mtDNA transcription and maintenance. Here, we demonstrate the mtDNA-organizing activity of mouse Tfam and its transcript isoform (Tfam(iso)), which has a smaller high-mobility group (HMG)-box1 domain, using a yeast model system that contains a deletion of the yeast homolog of mouse Tfam protein, Abf2p. When the mouse Tfam genes were introduced into the ABF2 locus of the yeast genome, the corresponding mouse proteins, Tfam and Tfam(iso), could functionally replace the yeast Abf2p and support mtDNA maintenance and mitochondrial biogenesis in yeast. Growth properties, mtDNA content and mitochondrial protein levels of genes encoded in the mtDNA were comparable in the strains expressing mouse proteins and the wild-type yeast strain, indicating that the proteins have robust mtDNA-maintaining and -expressing function in yeast mitochondria. These results imply that the mtDNA-organizing activities of the mouse mt-nucleoid proteins are structurally and evolutionarily conserved, so that they can maintain the mtDNA of distantly related and distinctly different species, such as yeast. Copyright © 2011 Elsevier B.V. All rights reserved.
Widespread occurrence of organelle genome-encoded 5S rRNAs including permuted molecules

PubMed Central

Valach, Matus; Burger, Gertraud; Gray, Michael W.; Lang, B. Franz

2014-01-01

5S Ribosomal RNA (5S rRNA) is a universal component of ribosomes, and the corresponding gene is easily identified in archaeal, bacterial and nuclear genome sequences. However, organelle gene homologs (rrn5) appear to be absent from most mitochondrial and several chloroplast genomes. Here, we re-examine the distribution of organelle rrn5 by building mitochondrion- and plastid-specific covariance models (CMs) with which we screened organelle genome sequences. We not only recover all organelle rrn5 genes annotated in GenBank records, but also identify more than 50 previously unrecognized homologs in mitochondrial genomes of various stramenopiles, red algae, cryptomonads, malawimonads and apusozoans, and surprisingly, in the apicoplast (highly derived plastid) genomes of the coccidian pathogens Toxoplasma gondii and Eimeria tenella. Comparative modeling of RNA secondary structure reveals that mitochondrial 5S rRNAs from brown algae adopt a permuted triskelion shape that has not been seen elsewhere. Expression of the newly predicted rrn5 genes is confirmed experimentally in 10 instances, based on our own and published RNA-Seq data. This study establishes that particularly mitochondrial 5S rRNA has a much broader taxonomic distribution and a much larger structural variability than previously thought. The newly developed CMs will be made available via the Rfam database and the MFannot organelle genome annotator. PMID:25429974
Widespread occurrence of organelle genome-encoded 5S rRNAs including permuted molecules.

PubMed

Valach, Matus; Burger, Gertraud; Gray, Michael W; Lang, B Franz

2014-12-16

5S Ribosomal RNA (5S rRNA) is a universal component of ribosomes, and the corresponding gene is easily identified in archaeal, bacterial and nuclear genome sequences. However, organelle gene homologs (rrn5) appear to be absent from most mitochondrial and several chloroplast genomes. Here, we re-examine the distribution of organelle rrn5 by building mitochondrion- and plastid-specific covariance models (CMs) with which we screened organelle genome sequences. We not only recover all organelle rrn5 genes annotated in GenBank records, but also identify more than 50 previously unrecognized homologs in mitochondrial genomes of various stramenopiles, red algae, cryptomonads, malawimonads and apusozoans, and surprisingly, in the apicoplast (highly derived plastid) genomes of the coccidian pathogens Toxoplasma gondii and Eimeria tenella. Comparative modeling of RNA secondary structure reveals that mitochondrial 5S rRNAs from brown algae adopt a permuted triskelion shape that has not been seen elsewhere. Expression of the newly predicted rrn5 genes is confirmed experimentally in 10 instances, based on our own and published RNA-Seq data. This study establishes that particularly mitochondrial 5S rRNA has a much broader taxonomic distribution and a much larger structural variability than previously thought. The newly developed CMs will be made available via the Rfam database and the MFannot organelle genome annotator. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

PubMed

Cao, Yinhe; Tung, Wen-Wen; Gao, J B

2004-01-01

With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important classes of structure in a DNA sequence is repeat-related. Repeats often have to be masked before protein coding regions along a DNA sequence are identified or redundant expressed sequence tags (ESTs) are sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale on a PC.
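The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a "recurrence time" is the distance between one occurrence of a word of length k and its next occurrence in the sequence; the function name `recurrence_times` and the toy input are hypothetical. A histogram of these distances would be expected to peak at the period of any repeat present.

```python
from collections import Counter


def recurrence_times(seq, k):
    """Histogram of recurrence times for all words of length k.

    For every position i, the word seq[i:i+k] is looked up; if the same
    word was seen before, the distance to its most recent previous
    occurrence is counted. (Illustrative definition, assumed here.)
    """
    last_seen = {}    # word -> start index of its most recent occurrence
    hist = Counter()  # recurrence time T -> number of recurrences at T
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            hist[i - last_seen[word]] += 1
        last_seen[word] = i
    return hist


# Toy input: a perfect period-3 repeat, so every recurrence has T = 3.
print(recurrence_times("ACGACGACGACG", 3))
```

A single pass with a dictionary keeps the cost linear in the sequence length, which is in the spirit of the efficiency claim in the abstract, although the published implementation may differ.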
Genomic structure of rat 3alpha-hydroxysteroid/dihydrodiol dehydrogenase (3alpha-HSD/DD, AKR1C9).

PubMed

Lin, H K; Hung, C F; Moore, M; Penning, T M

1999-11-01

Rat liver 3alpha-hydroxysteroid/dihydrodiol dehydrogenase (3alpha-HSD/DD) is a member of the aldo-keto reductase (AKR) superfamily. It is involved in the inactivation of steroid hormones and the metabolic activation of polycyclic aromatic hydrocarbons (PAH) by converting trans-dihydrodiols into reactive and redox-active o-quinones. The structure of the 5'-flanking region of the gene and factors involved in the constitutive and regulated expression of this gene have been reported [H.-K. Lin, T.M. Penning, Cloning, sequencing, and functional analysis of the 5'-flanking region of the rat 3alpha-hydroxysteroid/dihydrodiol dehydrogenase gene, Cancer Res. 55 (1995) 4105-4113]. We now describe the complete genomic structure of the rat type 1 3alpha-HSD/DD gene. Charon 4A and P1 genomic clones contained at least three rat genes (type 1, type 2 and type 3 3alpha-HSD/DD), each of which encoded the same open reading frame (ORF) but differed in exon-intron organization. 5'-RACE confirmed that the type 1 3alpha-HSD/DD gene encodes the dominant transcript in rat liver and that it was the regulation of this gene that was previously studied. The rat type 1 3alpha-HSD/DD gene is 30 kb in length and consists of nine exons and eight introns. Exon 9 encodes +931 to 966 bp of the ORF and the 1292 bp 3'-UTR implicated in mRNA stability. This genomic structure is nearly identical to that of the homologous human genes, type 1 3alpha-HSD (chlordecone reductase/DD4, AKR1C4), type 2 3alpha-HSD (AKR1C3) and type 3 3alpha-HSD (bile-acid binding protein, AKR1C2). Three different cDNAs containing identical ORFs for 3alpha-HSD have been reported, suggesting that all three genes may be expressed in rat liver. Using 5' primers corresponding to the 5'-UTRs of the three different cDNAs, only one PCR fragment was obtained, and it corresponded to the type 1 3alpha-HSD/DD gene. These data suggested that the type 2 and type 3 3alpha-HSD/DD genes are not abundantly expressed in rat liver. It is unknown whether the type 2 and type 3 3alpha-HSD/DD genes represent pseudo-genes or whether they represent genes that are differentially expressed in other rat tissues.
Genetic and epigenetic alteration among three homoeologous genes of a class E MADS box gene in hexaploid wheat.

PubMed

Shitsukawa, Naoki; Tahira, Chikako; Kassai, Ken-Ichiro; Hirabayashi, Chizuru; Shimizu, Tomoaki; Takumi, Shigeo; Mochida, Keiichi; Kawaura, Kanako; Ogihara, Yasunari; Murai, Koji

2007-06-01

Bread wheat (Triticum aestivum) is a hexaploid species with A, B, and D ancestral genomes. Most bread wheat genes are present in the genome as triplicated homoeologous genes (homoeologs) derived from the ancestral species. Here, we report that both genetic and epigenetic alterations have occurred in the homoeologs of a wheat class E MADS box gene. Two class E genes are identified in wheat, wheat SEPALLATA (WSEP) and wheat LEAFY HULL STERILE1 (WLHS1), which are homologs of Os MADS45 and Os MADS1 in rice (Oryza sativa), respectively. The three wheat homoeologs of WSEP showed similar genomic structures and expression profiles. By contrast, the three homoeologs of WLHS1 showed genetic and epigenetic alterations. The A genome WLHS1 homoeolog (WLHS1-A) had a structural alteration that contained a large novel sequence in place of the K domain sequence. A yeast two-hybrid analysis and a transgenic experiment indicated that the WLHS1-A protein had no apparent function. The B and D genome homoeologs, WLHS1-B and WLHS1-D, respectively, had an intact MADS box gene structure, but WLHS1-B was predominantly silenced by cytosine methylation. Consequently, of the three WLHS1 homoeologs, only WLHS1-D functions in hexaploid wheat. This is a situation where three homoeologs are differentially regulated by genetic and epigenetic mechanisms.
Structural, functional and evolutionary relationships between homing endonucleases and proteins from their host organisms

PubMed Central

Taylor, Gregory K.; Stoddard, Barry L.

2012-01-01

Homing endonucleases (HEs) are highly specific DNA-cleaving enzymes that are encoded by invasive DNA elements (usually mobile introns or inteins) within the genomes of phage, bacteria, archaea, protista and eukaryotic organelles. Six unique structural HE families, which collectively span four distinct nuclease catalytic motifs, have been characterized to date. Members of each family display structural homology and functional relationships to a wide variety of proteins from various organisms. The biological functions of those proteins are highly disparate and include non-specific DNA-degradation enzymes, restriction endonucleases, DNA-repair enzymes, resolvases, intron splicing factors and transcription factors. These relationships suggest that modern day HEs share common ancestors with proteins involved in genome fidelity, maintenance and gene expression. This review summarizes the results of structural studies of HEs and corresponding proteins from host organisms that have illustrated the manner in which these factors are related. PMID:22406833
Genome-Wide Expression Profiling of Complex Regional Pain Syndrome

PubMed Central

Jin, Eun-Heui; Zhang, Enji; Ko, Youngkwon; Sim, Woo Seog; Moon, Dong Eon; Yoon, Keon Jung; Hong, Jang Hee; Lee, Won Hyung

2013-01-01

Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not been reported yet. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p<0.05). Most of those genes were associated with signal transduction, developmental processes, cell structure and motility, and immunity and defense. The expression levels of major histocompatibility complex class I A subtype (HLA-A29.1), matrix metalloproteinase 9 (MMP9), alanine aminopeptidase N (ANPEP), l-histidine decarboxylase (HDC), granulocyte colony-stimulating factor 3 receptor (G-CSF3R), and signal transducer and activator of transcription 3 (STAT3) genes selected from the microarray were confirmed in 24 CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene that, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls with the highest relative fold change (4.0±1.23 times and p = 1.4×10^-4). The up-regulation of the MMP9 gene in the blood may be related to the pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of the differential gene expression in CRPS, may help in the understanding of the pathophysiology of CRPS pain progression. PMID:24244504
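The selection rule quoted in the abstract (1.5-fold change and p<0.05) can be sketched as a simple filter. This is an illustration under stated assumptions, not the study's pipeline: the function name, the record layout, and the gene names and numbers are all made up, intensities are assumed to be positive and on a linear scale, and the p-values are assumed to have been computed beforehand by whatever test the authors used.

```python
def differentially_expressed(records, min_fold=1.5, max_p=0.05):
    """Return names of genes passing both the fold-change and p-value cut-offs.

    records: iterable of (gene, patient_mean, control_mean, p_value),
    with strictly positive linear-scale means (assumed layout).
    """
    selected = []
    for gene, patient_mean, control_mean, p_value in records:
        ratio = patient_mean / control_mean
        # Treat up- and down-regulation symmetrically: 0.5 counts as 2-fold.
        fold = max(ratio, 1.0 / ratio)
        if fold >= min_fold and p_value < max_p:
            selected.append(gene)
    return selected


# Hypothetical values, for illustration only (not data from the study).
example = [
    ("GENE_A", 400.0, 100.0, 0.001),  # 4-fold up, significant
    ("GENE_B", 120.0, 100.0, 0.001),  # 1.2-fold, below the fold cut-off
    ("GENE_C", 50.0, 100.0, 0.2),     # 2-fold down, not significant
    ("GENE_D", 50.0, 100.0, 0.01),    # 2-fold down, significant
]
print(differentially_expressed(example))  # ['GENE_A', 'GENE_D']
```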
Genomics and evolutionary aspect of calcium signaling event in calmodulin and calmodulin-like proteins in plants.

PubMed

Mohanta, Tapan Kumar; Kumar, Pradeep; Bae, Hanhong

2017-02-03

Ca2+ is a versatile second messenger that operates in a wide range of cellular processes that impact nearly every aspect of life. Ca2+ regulates gene expression and biotic and abiotic stress responses in organisms ranging from unicellular algae to multicellular higher plants through cascades of calcium signaling processes. In this study, we deciphered the genomic and evolutionary aspects of calcium signaling by calmodulin (CaM) and calmodulin-like (CML) proteins. We studied the CaM and CML gene families of 41 different species across the plant lineages. Genomic analysis showed that plants encode more calmodulin-like proteins than calmodulins. Further analyses showed that the majority of CMLs were intronless, while CaMs were intron-rich. Multiple sequence alignment showed that the EF-hand domain of CaM contains four conserved D-x-D motifs, one in each EF-hand, while CMLs contain only one D-x-D-x-D motif, in the fourth EF-hand. Phylogenetic analysis revealed that the CMLs evolved earlier than CaM and later diversified. Gene expression analysis demonstrated that different CaM and CML genes were expressed differentially in different tissues in a spatio-temporal manner. In this study we provide a detailed genome-wide identification and characterization of the CaM and CML protein families, their phylogenetic relationships, and their domain structure. Expression studies of CaM and CML genes were conducted in Glycine max and Phaseolus vulgaris. Our study provides a strong foundation for future functional research on the CaM and CML gene families in the plant kingdom.
Reprogramming somatic cells into iPS cells activates LINE-1 retroelement mobility

PubMed Central

Wissing, Silke; Muñoz-Lopez, Martin; Macia, Angela; Yang, Zhiyuan; Montano, Mauricio; Collins, William; Garcia-Perez, Jose Luis; Moran, John V.; Greene, Warner C.

2012-01-01

Long interspersed element-1 (LINE-1 or L1) retrotransposons account for nearly 17% of human genomic DNA and represent a major evolutionary force that has reshaped the structure and function of the human genome. However, questions remain concerning both the frequency and the developmental timing of L1 retrotransposition in vivo and whether the mobility of these retroelements commonly results in insertional and post-insertional mechanisms of genomic injury. Cells exhibiting high rates of L1 retrotransposition might be especially at risk for such injury. We assessed L1 mRNA expression and L1 retrotransposition in two biologically relevant cell types, human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs), as well as in control parental human dermal fibroblasts (HDFs). Full-length L1 mRNA and the L1 open reading frame 1-encoded protein (ORF1p) were readily detected in hESCs and iPSCs, but not in HDFs. Sequencing analysis proved the expression of human-specific L1 element mRNAs in iPSCs. Bisulfite sequencing revealed that the increased L1 expression observed in iPSCs correlates with an overall decrease in CpG methylation in the L1 promoter region. Finally, retrotransposition of an engineered human L1 element was ∼10-fold more efficient in iPSCs than in parental HDFs. These findings indicate that somatic cell reprogramming is associated with marked increases in L1 expression and perhaps increases in endogenous L1 retrotransposition, which could potentially impact the genomic integrity of the resultant iPSCs. PMID:21989055
Integrating genome-wide genetic variations and monocyte expression data reveals trans-regulated gene modules in humans.

PubMed

Rotival, Maxime; Zeller, Tanja; Wild, Philipp S; Maouche, Seraya; Szymczak, Silke; Schillert, Arne; Castagné, Raphaele; Deiseroth, Arne; Proust, Carole; Brocheton, Jessy; Godefroy, Tiphaine; Perret, Claire; Germain, Marine; Eleftheriadis, Medea; Sinning, Christoph R; Schnabel, Renate B; Lubos, Edith; Lackner, Karl J; Rossmann, Heidi; Münzel, Thomas; Rendon, Augusto; Erdmann, Jeanette; Deloukas, Panos; Hengstenberg, Christian; Diemert, Patrick; Montalescot, Gilles; Ouwehand, Willem H; Samani, Nilesh J; Schunkert, Heribert; Tregouet, David-Alexandre; Ziegler, Andreas; Goodall, Alison H; Cambien, François; Tiret, Laurence; Blankenberg, Stefan

2011-12-01

One major expectation from the transcriptome in humans is to characterize the biological basis of associations identified by genome-wide association studies. So far, few cis expression quantitative trait loci (eQTLs) have been reliably related to disease susceptibility. Trans-regulating mechanisms may play a more prominent role in disease susceptibility. We analyzed 12,808 genes detected in at least 5% of circulating monocyte samples from a population-based sample of 1,490 unrelated European subjects. We applied a method for extracting expression patterns, independent component analysis, to identify sets of co-regulated genes. These patterns were then related to 675,350 SNPs to identify major trans-acting regulators. We detected three genomic regions significantly associated with co-regulated gene modules. Association of these loci with multiple expression traits was replicated in Cardiogenics, an independent study in which expression profiles of monocytes were available in 758 subjects. The locus 12q13 (lead SNP rs11171739), previously identified as a type 1 diabetes locus, was associated with a pattern including two cis eQTLs, RPS26 and SUOX, and 5 trans eQTLs, one of which (MADCAM1) is a potential candidate for mediating T1D susceptibility. The locus 12q24 (lead SNP rs653178), which has demonstrated extensive disease pleiotropy, including type 1 diabetes, hypertension, and celiac disease, was associated with a pattern strongly correlating with blood pressure level. The strongest trans eQTL in this pattern was CRIP1, a known marker of cellular proliferation in cancer. The locus 12q15 (lead SNP rs11177644) was associated with a pattern driven by two cis eQTLs, LYZ and YEATS4, and including 34 trans eQTLs, several of them tumor-related genes. This study shows that a method exploiting the structure of co-expressions among genes can help identify genomic regions involved in trans regulation of sets of genes and can provide clues for understanding the mechanisms linking genome-wide association loci to disease.
Trichostatin A effects on gene expression in the protozoan parasite Entamoeba histolytica

PubMed Central

Ehrenkaufer, Gretchen M; Eichinger, Daniel J; Singh, Upinder

2007-01-01

Background: Histone modification regulates chromatin structure and influences gene expression associated with diverse biological functions including cellular differentiation, cancer, maintenance of genome architecture, and pathogen virulence. In Entamoeba, a deep-branching eukaryote, short chain fatty acids (SCFA) affect histone acetylation and parasite development. Additionally, a number of active histone modifying enzymes have been identified in the parasite genome. However, the overall extent of gene regulation tied to histone acetylation is not known. Results: In order to identify the genome-wide effects of histone acetylation in regulating E. histolytica gene expression, we used whole-genome expression profiling of parasites treated with SCFA and Trichostatin A (TSA). Despite significant changes in histone acetylation patterns, exposure of parasites to SCFA resulted in minimal transcriptional changes (11 out of 9,435 genes transcriptionally regulated). In contrast, exposure to TSA, a more specific inhibitor of histone deacetylases, significantly affected transcription of 163 genes (122 genes upregulated and 41 genes downregulated). Genes modulated by TSA were not regulated by treatment with 5-Azacytidine, an inhibitor of DNA-methyltransferase, indicating that in E. histolytica the crosstalk between DNA methylation and histone modification is not substantial. However, the set of genes regulated by TSA overlapped substantially with genes regulated during parasite development: 73/122 genes upregulated by TSA exposure were upregulated in E. histolytica cysts (p-value = 6 × 10^-53) and 15/41 genes downregulated by TSA exposure were downregulated in E. histolytica cysts (p-value = 3 × 10^-7). Conclusion: This work represents the first genome-wide analysis of histone acetylation and its effects on gene expression in E. histolytica. The data indicate that SCFAs, despite their ability to influence histone acetylation, have minimal effects on gene transcription in cultured parasites. In contrast, the effect of TSA on E. histolytica gene expression is more substantial and includes genes involved in the encystation pathway. These observations will allow further dissection of the effects of histone acetylation and the genetic pathways regulating stage conversion in this pathogenic parasite. PMID:17612405
Role of the putative structural protein Sed1p in mitochondrial genome maintenance.

PubMed

Phadnis, Naina; Ayres Sia, Elaine

2004-09-24

The nuclear gene MIP1 encodes the mitochondrial DNA polymerase responsible for replicating the mitochondrial genome in Saccharomyces cerevisiae. A number of other factors involved in replicating and segregating the mitochondrial genome are yet to be identified. Here, we report that a bacterial two-hybrid screen using the mitochondrial polymerase, Mip1p, as bait identified the yeast protein Sed1p. Sed1p is a cell surface protein highly expressed in the stationary phase. We find that several modified forms of Sed1p are expressed and the largest of these forms interacts with the mitochondrial polymerase in vitro. Deletion of SED1 causes a 3.5-fold increase in the rate of mitochondrial DNA point mutations as well as a 4.3-fold increase in the rate of loss of respiration. In contrast, we see no change in the rate of nuclear point mutations indicating the specific role of Sed1p function in mitochondrial genome stability. Indirect immunofluorescence analysis of Sed1p localization shows that Sed1p is targeted to the mitochondria. Moreover, Sed1p is detected in purified mitochondrial fractions and the localization to the mitochondria of the largest modified form is insensitive to the action of proteinase K. Deletion of the sed1 gene results in a reduction in the quantity of Mip1p and also affects the levels of a mitochondrially-expressed protein, Cox3p. Our results point towards a role for Sed1p in mitochondrial genome maintenance.
Genome-wide identification, phylogeny and expressional profiles of mitogen activated protein kinase kinase kinase (MAPKKK) gene family in bread wheat (Triticum aestivum L.).

PubMed

Wang, Meng; Yue, Hong; Feng, Kewei; Deng, Pingchuan; Song, Weining; Nie, Xiaojun

2016-08-22

Mitogen-activated protein kinase kinase kinases (MAPKKKs) are important components of MAPK cascades, which play a crucial role in plant growth and development as well as in response to diverse stresses. Although this family has been systematically studied in many plant species, little is known about MAPKKK genes in wheat (Triticum aestivum L.), especially those involved in the regulatory network of stress processes. In this study, we identified 155 wheat MAPKKK genes through a genome-wide search based on the latest available wheat genome information, of which 29 belonged to the MEKK, 11 to the ZIK and 115 to the Raf subfamily. Then, the chromosome localization, gene structure, conserved protein motifs and phylogenetic relationships as well as the regulatory network of these TaMAPKKKs were systematically investigated, and the results supported the prediction. Furthermore, a total of 11 homologous groups between the A, B and D sub-genomes and 24 duplication pairs among them were detected, which contributed to the expansion of the wheat MAPKKK gene family. Finally, the expression profiles of these MAPKKKs during development and under different abiotic stresses were investigated using RNA-seq data. Additionally, 10 tissue-specific and 4 salt-responsive TaMAPKKK genes were selected to validate their expression levels through qRT-PCR analysis. This study reports for the first time the genome organization, evolutionary features and expression profiles of the wheat MAPKKK gene family, which lays the foundation for further functional analysis of wheat MAPKKK genes and contributes to a better understanding of the roles and regulatory mechanisms of MAPKKKs in wheat.
Analysis of the Genome Structure of the Nonpathogenic Probiotic Escherichia coli Strain Nissle 1917

PubMed Central

Grozdanov, Lubomir; Raasch, Carsten; Schulze, Jürgen; Sonnenborn, Ulrich; Gottschalk, Gerhard; Hacker, Jörg; Dobrindt, Ulrich

2004-01-01

Nonpathogenic Escherichia coli strain Nissle 1917 (O6:K5:H1) is used as a probiotic agent in medicine, mainly for the treatment of various gastroenterological diseases. To gain insight on the genetic level into its properties of colonization and commensalism, this strain's genome structure has been analyzed by three approaches: (i) sequence context screening of tRNA genes as a potential indication of chromosomal integration of horizontally acquired DNA, (ii) sequence analysis of 280 kb of genomic islands (GEIs) coding for important fitness factors, and (iii) comparison of Nissle 1917 genome content with that of other E. coli strains by DNA-DNA hybridization. PCR-based screening of 324 nonpathogenic and pathogenic E. coli isolates of different origins revealed that some chromosomal regions are frequently detectable in nonpathogenic E. coli and also among extraintestinal and intestinal pathogenic strains. Many known fitness factor determinants of strain Nissle 1917 are localized on four GEIs which have been partially sequenced and analyzed. Comparison of these data with the available knowledge of the genome structure of E. coli K-12 strain MG1655 and of uropathogenic E. coli O6 strains CFT073 and 536 revealed structural similarities on the genomic level, especially between the E. coli O6 strains. The lack of defined virulence factors (i.e., alpha-hemolysin, P-fimbrial adhesins, and the semirough lipopolysaccharide phenotype) combined with the expression of fitness factors such as microcins, different iron uptake systems, adhesins, and proteases, which may support its survival and successful colonization of the human gut, most likely contributes to the probiotic character of E. coli strain Nissle 1917. PMID:15292145
Epigenetics and Epigenomics of Plants.

PubMed

Yadav, Chandra Bhan; Pandey, Garima; Muthamilarasan, Mehanathan; Prasad, Manoj

2018-01-23

The genetic material DNA in association with histone proteins forms the complex structure called chromatin, which is prone to undergo modification through certain epigenetic mechanisms including cytosine DNA methylation, histone modifications, and small RNA-mediated methylation. Alterations in chromatin structure lead to inaccessibility of genomic DNA to various regulatory proteins such as transcription factors, which eventually modulates gene expression. Advancements in high-throughput sequencing technologies have provided the opportunity to study the epigenetic mechanisms at genome-wide levels. Epigenomic studies using high-throughput technologies will widen the understanding of mechanisms as well as functions of regulatory pathways in plant genomes, which will further help in manipulating these pathways using genetic and biochemical approaches. This technology could be a potential research tool for displaying the systematic associations of genetic and epigenetic variations, especially in terms of cytosine methylation onto the genomic region in a specific cell or tissue. A comprehensive study of plant populations to correlate genotype to epigenotype and to phenotype, and also the study of methyl quantitative trait loci (QTL) or epiGWAS, is possible by using high-throughput sequencing methods, which will further accelerate molecular breeding programs for crop improvement.
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride

PubMed Central

Matroudi, S.; Zamani, M.R.; Motallebi, M.

2008-01-01

In this study, Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate, the genomic and cDNA clones encoding chit33 were isolated and sequenced. Comparison of the genomic and cDNA sequences to define the gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pEt26b(+) expression vector and expressed in E. coli. PMID:24031242
Into the Fourth Dimension: Dysregulation of Genome Architecture in Aging and Alzheimer's Disease.

PubMed

Winick-Ng, Warren; Rylett, R Jane

2018-01-01

Alzheimer's disease (AD) is a progressive neurodegenerative disease characterized by synapse dysfunction and cognitive impairment. Understanding the development and progression of AD is challenging, as the disease is highly complex and multifactorial. Both environmental and genetic factors play a role in AD pathogenesis, highlighted by observations of complex DNA modifications at the single gene level, and by new evidence that also implicates changes in genome architecture in AD patients. The four-dimensional structure of chromatin in space and time is essential for context-dependent regulation of gene expression in post-mitotic neurons. Dysregulation of epigenetic processes has been observed in the aging brain and in patients with AD, though there is not yet agreement on the impact of these changes on transcription. New evidence shows that proteins involved in genome organization have altered expression and localization in the AD brain, suggesting that the genomic landscape may play a critical role in the development of AD. This review discusses the role of the chromatin organizers and epigenetic modifiers in post-mitotic cells, the aging brain, and in the development and progression of AD. How these new insights can be used to help determine disease risk and inform treatment strategies will also be discussed.

Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence.

PubMed

Savage, Jeanne E; Jansen, Philip R; Stringer, Sven; Watanabe, Kyoko; Bryois, Julien; de Leeuw, Christiaan A; Nagel, Mats; Awasthi, Swapnil; Barr, Peter B; Coleman, Jonathan R I; Grasby, Katrina L; Hammerschlag, Anke R; Kaminski, Jakob A; Karlsson, Robert; Krapohl, Eva; Lam, Max; Nygaard, Marianne; Reynolds, Chandra A; Trampush, Joey W; Young, Hannah; Zabaneh, Delilah; Hägg, Sara; Hansell, Narelle K; Karlsson, Ida K; Linnarsson, Sten; Montgomery, Grant W; Muñoz-Manchado, Ana B; Quinlan, Erin B; Schumann, Gunter; Skene, Nathan G; Webb, Bradley T; White, Tonya; Arking, Dan E; Avramopoulos, Dimitrios; Bilder, Robert M; Bitsios, Panos; Burdick, Katherine E; Cannon, Tyrone D; Chiba-Falek, Ornit; Christoforou, Andrea; Cirulli, Elizabeth T; Congdon, Eliza; Corvin, Aiden; Davies, Gail; Deary, Ian J; DeRosse, Pamela; Dickinson, Dwight; Djurovic, Srdjan; Donohoe, Gary; Conley, Emily Drabant; Eriksson, Johan G; Espeseth, Thomas; Freimer, Nelson A; Giakoumaki, Stella; Giegling, Ina; Gill, Michael; Glahn, David C; Hariri, Ahmad R; Hatzimanolis, Alex; Keller, Matthew C; Knowles, Emma; Koltai, Deborah; Konte, Bettina; Lahti, Jari; Le Hellard, Stephanie; Lencz, Todd; Liewald, David C; London, Edythe; Lundervold, Astri J; Malhotra, Anil K; Melle, Ingrid; Morris, Derek; Need, Anna C; Ollier, William; Palotie, Aarno; Payton, Antony; Pendleton, Neil; Poldrack, Russell A; Räikkönen, Katri; Reinvang, Ivar; Roussos, Panos; Rujescu, Dan; Sabb, Fred W; Scult, Matthew A; Smeland, Olav B; Smyrnis, Nikolaos; Starr, John M; Steen, Vidar M; Stefanis, Nikos C; Straub, Richard E; Sundet, Kjetil; Tiemeier, Henning; Voineskos, Aristotle N; Weinberger, Daniel R; Widen, Elisabeth; Yu, Jin; Abecasis, Goncalo; Andreassen, Ole A; Breen, Gerome; Christiansen, Lene; Debrabant, Birgit; Dick, Danielle M; Heinz, Andreas; Hjerling-Leffler, Jens; Ikram, M Arfan; Kendler, Kenneth S; Martin, Nicholas G; Medland, Sarah E; Pedersen, Nancy L; Plomin, Robert; Polderman, Tinca J C; Ripke, Stephan; van der Sluis, Sophie; Sullivan, Patrick F; Vrieze, Scott I; Wright, Margaret J; Posthuma, Danielle

2018-06-25

Intelligence is highly heritable and a major determinant of human health and well-being. Recent genome-wide meta-analyses have identified 24 genomic loci linked to variation in intelligence, but much about its genetic underpinnings remains to be discovered. Here, we present a large-scale genetic association study of intelligence (n = 269,867), identifying 205 associated genomic loci (190 new) and 1,016 genes (939 new) via positional mapping, expression quantitative trait locus (eQTL) mapping, chromatin interaction mapping, and gene-based association analysis. We find enrichment of genetic effects in conserved and coding regions and associations with 146 nonsynonymous exonic variants. Associated genes are strongly expressed in the brain, specifically in striatal medium spiny neurons and hippocampal pyramidal neurons. Gene set analyses implicate pathways related to nervous system development and synaptic structure. We confirm previous strong genetic correlations with multiple health-related outcomes, and Mendelian randomization analysis results suggest protective effects of intelligence for Alzheimer's disease and ADHD and bidirectional causation with pleiotropic effects for schizophrenia. These results are a major step forward in understanding the neurobiology of cognitive function as well as genetically related neurological and psychiatric disorders.
Genome-Wide Identification and Structural Analysis of bZIP Transcription Factor Genes in Brassica napus.

PubMed

Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Lu, Kun; Xu, Xinfu; Wang, Rui; Li, Jiana; Qu, Cunmin

2017-10-24

The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus.
Genome-Wide Identification and Structural Analysis of bZIP Transcription Factor Genes in Brassica napus

PubMed Central

Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Xu, Xinfu; Wang, Rui; Li, Jiana

2017-01-01

The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus. PMID:29064393
Nutrigenomics and nutrigenetics.

PubMed

Farhud, Dd; Zarif Yeganeh, M; Zarif Yeganeh, M

2010-01-01

Nutrients are able to interact with molecular mechanisms and modulate physiological functions in the body. Nutritional Genomics focuses on the interaction between bioactive food components and the genome, and includes Nutrigenetics and Nutrigenomics. The influence of nutrients on gene expression is called Nutrigenomics, while the heterogeneous response of gene variants to nutrients, dietary components, and developing nutraceuticals is called Nutrigenetics. Genetic variation is known to affect food tolerances among human subpopulations and may also influence dietary requirements, raising the possibility of individualizing nutritional intake for optimal health and disease prevention on the basis of an individual's genome. Nutrigenomics provides a genetic understanding of how common dietary components affect the balance between health and disease by altering the expression and/or structure of an individual's genetic makeup. Nutrigenetics describes how the genetic profile affects the body's response to bioactive food components by influencing their absorption, metabolism, and site of action. In this way, by considering different aspects of gene-nutrient interaction and designing an appropriate diet for every specific genotype that optimizes individual health, diagnosis, and nutritional treatment of genome instability, we could prevent and control the conversion of a healthy phenotype to disease.
Nutrigenomics and Nutrigenetics

PubMed Central

Farhud, DD; Zarif Yeganeh, M; Zarif Yeganeh, M

2010-01-01

Nutrients are able to interact with molecular mechanisms and modulate physiological functions in the body. Nutritional Genomics focuses on the interaction between bioactive food components and the genome, and includes Nutrigenetics and Nutrigenomics. The influence of nutrients on gene expression is called Nutrigenomics, while the heterogeneous response of gene variants to nutrients, dietary components, and developing nutraceuticals is called Nutrigenetics. Genetic variation is known to affect food tolerances among human subpopulations and may also influence dietary requirements, raising the possibility of individualizing nutritional intake for optimal health and disease prevention on the basis of an individual’s genome. Nutrigenomics provides a genetic understanding of how common dietary components affect the balance between health and disease by altering the expression and/or structure of an individual’s genetic makeup. Nutrigenetics describes how the genetic profile affects the body’s response to bioactive food components by influencing their absorption, metabolism, and site of action. In this way, by considering different aspects of gene–nutrient interaction and designing an appropriate diet for every specific genotype that optimizes individual health, diagnosis, and nutritional treatment of genome instability, we could prevent and control the conversion of a healthy phenotype to disease. PMID:23113033
Genome-Wide Identification of the Alba Gene Family in Plants and Stress-Responsive Expression of the Rice Alba Genes

PubMed Central

Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan

2018-01-01

Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionary relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal location of the Alba genes of rice (OsAlba), showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure–function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants. PMID:29597290
A mobile threat to genome stability: The impact of non-LTR retrotransposons upon the human genome.

PubMed

Konkel, Miriam K; Batzer, Mark A

2010-08-01

It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families - long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements - mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development. Copyright © 2010 Elsevier Ltd. All rights reserved.
Suppression of HPV E6 and E7 expression by BAF53 depletion in cervical cancer cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Kiwon; Lee, Ah-Young; Kwon, Yunhee Kim

Highlights: Integration of HPV into host genome critical for activation of E6 and E7 oncogenes. BAF53 is essential for higher-order chromatin structure. BAF53 knockdown suppresses E6 and E7 from HPV integrants, but not from episomal HPVs. BAF53 knockdown decreases H3K9Ac and H4K12Ac on P105 promoter of integrated HPV 18. BAF53 knockdown restores the p53-dependent signaling pathway in HeLa and SiHa cells. Abstract: Deregulation of the expression of human papillomavirus (HPV) oncogenes E6 and E7 plays a pivotal role in cervical carcinogenesis because the E6 and E7 proteins neutralize p53 and Rb tumor suppressor pathways, respectively. In approximately 90% of all cervical carcinomas, HPVs are found to be integrated into the host genome. Following integration, the core-enhancer element and P105 promoter that control expression of E6 and E7 adopt a chromatin structure that is different from that of episomal HPV, and this has been proposed to contribute to activation of E6 and E7 expression. However, the molecular basis underlying this chromatin structural change remains unknown. Previously, BAF53 has been shown to be essential for the integrity of higher-order chromatin structure and interchromosomal interactions. Here, we examined whether BAF53 is required for activated expression of E6 and E7 genes. We found that BAF53 knockdown led to suppression of expression of E6 and E7 genes from HPV integrants in cervical carcinoma cell lines HeLa and SiHa. Conversely, expression of transiently transfected HPV18-LCR-Luciferase was not suppressed by BAF53 knockdown. The level of the active histone marks H3K9Ac and H4K12Ac on the P105 promoter of integrated HPV 18 was decreased in BAF53 knockdown cells. BAF53 knockdown restored the p53-dependent signaling pathway in HeLa and SiHa cells. These results suggest that activated expression of the E6 and E7 genes of integrated HPV is dependent on BAF53-dependent higher-order chromatin structure or nuclear motor activity.
An orthologous transcriptional signature differentiates responses towards closely related chemicals in Arabidopsis thaliana and Brassica napus

EPA Science Inventory

Herbicides are structurally diverse chemicals that inhibit plant-specific targets, however their off-target and potentially differentiating side-effects are less well defined. In this study, genome-wide expression profiling based on Affymetrix AtH1 arrays was used to identify dis...
Acute Toluene Exposure alters expression of genes associated with synaptic structure and function

EPA Science Inventory

Toluene (TOL), a volatile organic compound, is a ubiquitous air pollutant of interest to EPA regulatory programs. Whereas its acute functional effects are well described, several potential modes of action in the CNS have been proposed. Therefore, the genomic response to acute TOL...
Primary structural variation in Anaplasma marginale Msp2 efficiently generates immune escape variants

USDA-ARS's Scientific Manuscript database

Antigenic variation allows microbial pathogens to evade immune clearance and establish persistent infection. Anaplasma marginale utilizes gene conversion of a repertoire of silent msp2 alleles into a single active expression site to encode unique Msp2 variants. As the genomic complement of msp2 alle...
Genomic Characterization of Variable Surface Antigens Reveals a Telomere Position Effect as a Prerequisite for RNA Interference-Mediated Silencing in Paramecium tetraurelia

PubMed Central

Baranasic, Damir; Oppermann, Timo; Cheaib, Miriam; Cullum, John; Schmidt, Helmut

2014-01-01

Antigenic or phenotypic variation is a widespread phenomenon of expression of variable surface protein coats on eukaryotic microbes. To clarify the mechanism behind mutually exclusive gene expression, we characterized the genetic properties of the surface antigen multigene family in the ciliate Paramecium tetraurelia and the epigenetic factors controlling expression and silencing. Genome analysis indicated that the multigene family consists of intrachromosomal and subtelomeric genes; both classes apparently derive from different gene duplication events: whole-genome and intrachromosomal duplication. Expression analysis provides evidence for telomere position effects, because only subtelomeric genes follow mutually exclusive transcription. Microarray analysis of cultures deficient in Rdr3, an RNA-dependent RNA polymerase, in comparison to serotype-pure wild-type cultures, shows cotranscription of a subset of subtelomeric genes, indicating that the telomere position effect is due to a selective occurrence of Rdr3-mediated silencing in subtelomeric regions. We present a model of surface antigen evolution by intrachromosomal gene duplication involving the maintenance of positive selection of structurally relevant regions. Further analysis of chromosome heterogeneity shows that alternative telomere addition regions clearly affect transcription of closely related genes. Consequently, chromosome fragmentation appears to be of crucial importance for surface antigen expression and evolution. Our data suggest that RNAi-mediated control of this genetic network by trans-acting RNAs allows rapid epigenetic adaptation by phenotypic variation in combination with long-term genetic adaptation by Darwinian evolution of antigen genes. PMID:25389173
Genome engineering and gene expression control for bacterial strain development.

PubMed

Song, Chan Woo; Lee, Joungmin; Lee, Sang Yup

2015-01-01

In recent years, a number of techniques and tools have been developed for genome engineering and gene expression control to achieve desired phenotypes of various bacteria. Here we review and discuss the recent advances in bacterial genome manipulation and gene expression control techniques, and their actual uses with accompanying examples. Genome engineering has been commonly performed based on homologous recombination. During such genome manipulation, the counterselection systems employing SacB or nucleases have mainly been used for the efficient selection of desired engineered strains. The recombineering technology enables simple and more rapid manipulation of the bacterial genome. The group II intron-mediated genome engineering technology is another option for some bacteria that are difficult to be engineered by homologous recombination. Due to the increasing demands on high-throughput screening of bacterial strains having the desired phenotypes, several multiplex genome engineering techniques have recently been developed and validated in some bacteria. Another approach to achieve desired bacterial phenotypes is the repression of target gene expression without the modification of genome sequences. This can be performed by expressing antisense RNA, small regulatory RNA, or CRISPR RNA to repress target gene expression at the transcriptional or translational level. All of these techniques allow efficient and rapid development and screening of bacterial strains having desired phenotypes, and more advanced techniques are expected to be seen. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
GenPlay Multi-Genome, a tool to compare and analyze multiple human genomes in a graphical interface.

PubMed

Lajugie, Julien; Fourel, Nicolas; Bouhassira, Eric E

2015-01-01

Parallel visualization of multiple individual human genomes is a complex endeavor that is rapidly gaining importance with the increasing number of personal, phased and cancer genomes that are being generated. It requires the display of variants such as SNPs, indels and structural variants that are unique to specific genomes and the introduction of multiple overlapping gaps in the reference sequence. Here, we describe GenPlay Multi-Genome, an application specifically written to visualize and analyze multiple human genomes in parallel. GenPlay Multi-Genome is ideally suited for the comparison of allele-specific expression and functional genomic data obtained from multiple phased genomes in a graphical interface with access to multiple-track operation. It also allows the analysis of data that have been aligned to custom genomes rather than to a standard reference and can be used as a Variant Call Format (VCF) file browser and as a tool to compare different genome assemblies, such as hg19 and hg38. GenPlay is available under the GNU General Public License (GPL-3) from http://genplay.einstein.yu.edu. The source code is available at https://github.com/JulienLajugie/GenPlay. © The Author 2014. Published by Oxford University Press. All rights reserved.
Genomic Organization, Phylogenetic and Expression Analysis of the B-BOX Gene Family in Tomato

PubMed Central

Chu, Zhuannan; Wang, Xin; Li, Ying; Yu, Huiyang; Li, Jinhua; Lu, Yongen; Li, Hanxia; Ouyang, Bo

2016-01-01

The B-BOX (BBX) proteins are a class of zinc-finger transcription factors possessing one or two B-BOX domains and, in some cases, an additional CCT (CO, CO-like and TOC1) motif; they play important roles in regulating plant growth, development and stress response. Nevertheless, no systematic study of BBX genes has been undertaken in tomato (Solanum lycopersicum). Here we present the results of a genome-wide analysis of the 29 BBX genes in this important vegetable species. Their structures, conserved domains, phylogenetic relationships, subcellular localizations, and promoter cis-regulatory elements were analyzed; their tissue expression profiles and expression patterns under various hormones and stress treatments were also investigated in detail. Tomato BBX genes can be divided into five subfamilies, and twelve of them were found to be segmentally duplicated. Real-time quantitative PCR analysis showed that most BBX genes exhibited different temporal and spatial expression patterns. The expression of most BBX genes can be induced by drought, polyethylene glycol-6000 or heat stress. Some BBX genes were induced strongly by phytohormones such as abscisic acid, gibberellic acid, or ethephon. The majority of tomato BBX proteins were predicted to be located in nuclei, and the transient expression assay using Arabidopsis mesophyll protoplasts demonstrated that all seven BBX members tested (SlBBX5, 7, 15, 17, 20, 22, and 24) were localized in the nucleus. Our analysis of tomato BBX genes on the genome scale provides valuable information for future functional characterization of specific genes in this family. PMID:27807440
A look at the possible mechanism and potential of magneto therapy.

PubMed

Jacobson, J I

1991-03-07

A testable theoretical model for the mechanism of magneto-therapy is presented. The theory delineated is the equation mc^2 = Bvl coulomb, which sets in dual resonance gravitational and electromagnetic potentials. This proposed unification of Einstein's gravity and Maxwell's electromagnetism is designated Jacobson's resonance and is a general expression of Zeeman and cyclotron resonance. The application of this theory involves the utilization of exogenously sourced very weak magnetic fields on the order of magnitude 10^-8 gauss to reorient the atomic crystal lattice structures of genomic magnetic domains. Examples of genomic magnetic domains are homeoboxes and oncogenes and associated structures like peptide hormone trophic factors. Various phenomena, such as solitons, phonons, cyclotron resonance, the piezoelectric effect, the fractional quantum Hall effect, string theory, and biologically closed electric circuits, are also analyzed in terms of how they may relate to biological systems. The potential of magneto-therapy in the treatment of various genomic and associated disorders is explored. The ultimate question "Can an oncogene be electromagnetically induced into becoming a structurally homologous normal gene?" is posed.
Transcriptomics and molecular evolutionary rate analysis of the bladderwort (Utricularia), a carnivorous plant with a minimal genome

PubMed Central

2011-01-01

Background: The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Results: Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. Conclusion: The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey digestion that were previously thought to be encoded by bacteria. Supporting physiological data, global gene expression analysis shows that traps significantly over-express genes involved in respiration and that phosphate uptake might occur mainly in traps, whereas nitrogen uptake could in part take place in vegetative parts. Expression of DNA repair and ROS detoxification enzymes may be indicative of a response to increased respiration. Finally, evidence from the bladderwort transcriptome, direct measurement of ROS in situ, and cross-species comparisons of organellar genomes and multiple nuclear genes supports the hypothesis that increased nucleotide substitution rates throughout the plant may be due to the mutagenic action of amplified ROS production. PMID:21639913
MicroRNAs Form Triplexes with Double Stranded DNA at Sequence-Specific Binding Sites; a Eukaryotic Mechanism via which microRNAs Could Directly Alter Gene Expression

PubMed Central

Grace, Christy R.; Ferreira, Antonio M.; Waddell, M. Brett; Ridout, Granger; Naeve, Deanna; Leuze, Michael; LoCascio, Philip F.; Panetta, John C.; Wilkinson, Mark R.; Pui, Ching-Hon; Naeve, Clayton W.; Uberbacher, Edward C.; Bonten, Erik J.; Evans, William E.

2016-01-01

MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3-fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription. PMID:26844769
Genomic Organization, Transcriptomic Analysis, and Functional Characterization of Avian α- and β-Keratins in Diverse Feather Forms

PubMed Central

Fan, Wen-Lang; Yan, Jie; Chen, Chih-Kuan; Lai, Yu-Ting; Wu, Siao-Man; Mao, Chi-Tang; Chen, Jun-Jie; Lu, Mei-Yeh Jade; Ho, Meng-Ru; Widelitz, Randall B.; Chen, Chih-Feng; Chuong, Cheng-Ming; Li, Wen-Hsiung

2014-01-01

Feathers are hallmark avian integument appendages, although they were also present on theropods. They are composed of flexible corneous materials made of α- and β-keratins, but their genomic organization and their functional roles in feathers have not been well studied. First, we made an exhaustive search of α- and β-keratin genes in the new chicken genome assembly (Galgal4). Then, using transcriptomic analysis, we studied α- and β-keratin gene expression patterns in five types of feather epidermis. The expression patterns of β-keratin genes were different in different feather types, whereas those of α-keratin genes were less variable. In addition, we obtained extensive α- and β-keratin mRNA in situ hybridization data, showing that α-keratins and β-keratins are preferentially expressed in different parts of the feather components. Together, our data suggest that feather morphological and structural diversity can largely be attributed to differential combinations of α- and β-keratin genes in different intrafeather regions and/or feather types from different body parts. The expression profiles provide new insights into the evolutionary origin and diversification of feathers. Finally, functional analysis using mutant chicken keratin forms based on those found in the human α-keratin mutation database led to abnormal phenotypes. This demonstrates that the chicken can be a convenient model for studying the molecular biology of human keratin-based diseases. PMID:25152353
Nucleocapsid protein-dependent assembly of the RNA packaging signal of Middle East respiratory syndrome coronavirus.

PubMed

Hsin, Wei-Chen; Chang, Chan-Hua; Chang, Chi-You; Peng, Wei-Hao; Chien, Chung-Liang; Chang, Ming-Fu; Chang, Shin C

2018-05-24

Middle East respiratory syndrome coronavirus (MERS-CoV) consists of a positive-sense, single-stranded RNA genome and four structural proteins: the spike, envelope, membrane, and nucleocapsid protein. The assembly of the viral genome into virus particles involves viral structural proteins and is believed to be mediated through recognition of specific sequences and RNA structures of the viral genome. A culture system for the production of MERS coronavirus-like particles (MERS VLPs) was determined and established by electron microscopy and the detection of coexpressed viral structural proteins. Using the VLP system, a 258-nucleotide RNA fragment, which spans nucleotides 19,712 to 19,969 of the MERS-CoV genome (designated PS258(19712-19969)ME), was identified to function as a packaging signal. Assembly of the RNA packaging signal into MERS VLPs is dependent on the viral nucleocapsid protein. In addition, a 45-nucleotide stable stem-loop substructure of the PS258(19712-19969)ME interacted with both the N-terminal domain and the C-terminal domain of the viral nucleocapsid protein. Furthermore, a functional SARS-CoV RNA packaging signal failed to assemble into the MERS VLPs, which indicated virus-specific assembly of the RNA genome. A MERS-CoV RNA packaging signal was identified by the detection of GFP expression following an incubation of MERS VLPs carrying the heterologous mRNA GFP-PS258(19712-19969)ME with virus permissive Huh7 cells. The MERS VLP system could help us in understanding virus infection and morphogenesis.

Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation.

PubMed

Keates, Tracy; Cooper, Christopher D O; Savitsky, Pavel; Allerston, Charles K; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

2012-06-15

The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. Copyright © 2011 Elsevier B.V. All rights reserved.
Multifaceted biological insights from a draft genome sequence of the tobacco hornworm moth, Manduca sexta

PubMed Central

Kanost, Michael R.; Arrese, Estela L.; Cao, Xiaolong; Chen, Yun-Ru; Chellapilla, Sanjay; Goldsmith, Marian R; Grosse-Wilde, Ewald; Heckel, David G.; Herndon, Nicolae; Jiang, Haobo; Papanicolaou, Alexie; Qu, Jiaxin; Soulages, Jose L.; Vogel, Heiko; Walters, James; Waterhouse, Robert M.; Ahn, Seung-Joon; Almeida, Francisca C.; An, Chunju; Aqrawi, Peshtewani; Bretschneider, Anne; Bryant, William B.; Bucks, Sascha; Chao, Hsu; Chevignon, Germain; Christen, Jayne M.; Clarke, David F.; Dittmer, Neal T.; Ferguson, Laura C.F.; Garavelou, Spyridoula; Gordon, Karl H.J.; Gunaratna, Ramesh T.; Han, Yi; Hauser, Frank; He, Yan; Heidel-Fischer, Hanna; Hirsh, Ariana; Hu, Yingxia; Jiang, Hongbo; Kalra, Divya; Klinner, Christian; König, Christopher; Kovar, Christie; Kroll, Ashley R.; Kuwar, Suyog S.; Lee, Sandy L.; Lehman, Rüdiger; Li, Kai; Li, Zhaofei; Liang, Hanquan; Lovelace, Shanna; Lu, Zhiqiang; Mansfield, Jennifer H.; McCulloch, Kyle J.; Mathew, Tittu; Morton, Brian; Muzny, Donna M.; Neunemann, David; Ongeri, Fiona; Pauchet, Yannick; Pu, Ling-Ling; Pyrousis, Ioannis; Rao, Xiang-Jun; Redding, Amanda; Roesel, Charles; Sanchez-Gracia, Alejandro; Schaack, Sarah; Shukla, Aditi; Tetreau, Guillaume; Wang, Yang; Xiong, Guang-Hua; Traut, Walther; Walsh, Tom K.; Worley, Kim C.; Wu, Di; Wu, Wenbi; Wu, Yuan-Qing; Zhang, Xiufeng; Zou, Zhen; Zucker, Hannah; Briscoe, Adriana D.; Burmester, Thorsten; Clem, Rollie J.; Feyereisen, René; Grimmelikhuijzen, Cornelis J.P; Hamodrakas, Stavros J.; Hansson, Bill S.; Huguet, Elisabeth; Jermiin, Lars S.; Lan, Que; Lehman, Herman K.; Lorenzen, Marce; Merzendorfer, Hans; Michalopoulos, Ioannis; Morton, David B.; Muthukrishnan, Subbaratnam; Oakeshott, John G.; Palmer, Will; Park, Yoonseong; Passarelli, A. Lorena; Rozas, Julio; Schwartz, Lawrence M.; Smith, Wendy; Southgate, Agnes; Vilcinskas, Andreas; Vogt, Richard; Wang, Ping; Werren, John; Yu, Xiao-Qiang; Zhou, Jing-Jiang; Brown, Susan J.; Scherer, Steven E.; Richards, Stephen; Blissard, Gary W.

2016-01-01

Manduca sexta, known as the tobacco hornworm or Carolina sphinx moth, is a lepidopteran insect that is used extensively as a model system for research in insect biochemistry, physiology, neurobiology, development, and immunity. One important benefit of this species as an experimental model is its extremely large size, reaching more than 10 g in the larval stage. M. sexta larvae feed on solanaceous plants and thus must tolerate a substantial challenge from plant allelochemicals, including nicotine. We report the sequence and annotation of the M. sexta genome, and a survey of gene expression in various tissues and developmental stages. The Msex_1.0 genome assembly resulted in a total genome size of 419.4 Mbp. Repetitive sequences accounted for 25.8% of the assembled genome. The official gene set is comprised of 15,451 protein-coding genes, of which 2498 were manually curated. Extensive RNA-seq data from many tissues and developmental stages were used to improve gene models and for insights into gene expression patterns. Genome wide synteny analysis indicated a high level of macrosynteny in the Lepidoptera. Annotation and analyses were carried out for gene families involved in a wide spectrum of biological processes, including apoptosis, vacuole sorting, growth and development, structures of exoskeleton, egg shells, and muscle, vision, chemosensation, ion channels, signal transduction, neuropeptide signaling, neurotransmitter synthesis and transport, nicotine tolerance, lipid metabolism, and immunity. This genome sequence, annotation, and analysis provide an important new resource from a well-studied model insect species and will facilitate further biochemical and mechanistic experimental studies of many biological systems in insects. PMID:27522922
A Transcriptome Map of Actinobacillus pleuropneumoniae at Single-Nucleotide Resolution Using Deep RNA-Seq

PubMed Central

Su, Zhipeng; Zhu, Jiawen; Xu, Zhuofei; Xiao, Ran; Zhou, Rui; Li, Lu; Chen, Huanchun

2016-01-01

Actinobacillus pleuropneumoniae is the pathogen of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs were never reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures). The transcriptional units described in this study provide a foundation for future studies concerning the gene functions and the transcriptional regulatory architectures of this pathogen. PMID:27018591
Detecting the Population Structure and Scanning for Signatures of Selection in Horses (Equus caballus) From Whole-Genome Sequencing Data

PubMed Central

Zhang, Cheng; Ni, Pan; Ahmad, Hafiz Ishfaq; Gemingguli, M; Baizilaitibei, A; Gulibaheti, D; Fang, Yaping; Wang, Haiyang; Asif, Akhtar Rasool; Xiao, Changyi; Chen, Jianhai; Ma, Yunlong; Liu, Xiangdong; Du, Xiaoyong; Zhao, Shuhong

2018-01-01

Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

PubMed

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is built on a web-based Linux-Apache-MySQL-PHP framework and offers an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, or divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
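The gene-pair categories named in the abstract above (co-directional, convergent, divergent) can be illustrated with a minimal sketch. It assumes the common convention that, for two neighbours ordered by coordinate, same-strand genes are co-directional, a "+" gene followed by a "-" gene is convergent, and a "-" gene followed by a "+" gene is divergent. The tuple layout and function names are hypothetical and are not taken from LCGbase.

```python
def classify_pair(left_strand, right_strand):
    """Classify two adjacent genes by strand, left gene first by coordinate."""
    for strand in (left_strand, right_strand):
        if strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {strand!r}")
    if left_strand == right_strand:
        return "co-directional"
    if left_strand == "+":      # "+" then "-": transcripts point toward each other
        return "convergent"
    return "divergent"          # "-" then "+": transcripts point away from each other


def adjacent_pairs(genes):
    """genes: iterable of (name, chromosome, start, strand) tuples.

    Returns (left_name, right_name, category) for neighbours on the same
    chromosome. Hypothetical data layout, for illustration only.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left[1] != right[1]:
            continue  # neighbours in the sorted list, but on different chromosomes
        pairs.append((left[0], right[0], classify_pair(left[3], right[3])))
    return pairs


if __name__ == "__main__":
    toy = [("a", "chr1", 100, "+"), ("b", "chr1", 500, "-"),
           ("c", "chr1", 900, "+"), ("d", "chr1", 1500, "+"),
           ("e", "chr2", 50, "-")]
    for pair in adjacent_pairs(toy):
        print(pair)
```

A distance cutoff between transcription start sites, which the abstract describes as a user-adjustable parameter, could be added as a further filter inside `adjacent_pairs`.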
Analysis of the complete nucleotide sequence and functional organization of the genome of Streptococcus pneumoniae bacteriophage Cp-1.

PubMed

Martín, A C; López, R; García, P

1996-06-01

Cp-1, a bacteriophage infecting Streptococcus pneumoniae, has a linear double-stranded DNA genome, with a terminal protein covalently linked to its 5' ends, that replicates by the protein-priming mechanism. We describe here the complete DNA sequence and transcriptional map of the Cp-1 genome. These analyses have led to the firm assignment of 10 genes and the localization of 19 additional open reading frames in the 19,345-bp Cp-1 DNA. Striking similarities and differences between some of these proteins and those of the Bacillus subtilis phage phi 29, a system that also replicates its DNA by the protein-priming mechanism, have been revealed. The genes coding for structural proteins and assembly factors are located in the central part of the Cp-1 genome. Several proteins corresponding to the predicted gene products were identified by in vitro and in vivo expression of the cloned genes. Mature major head protein from the virion particles results from hydrolysis of the primary gene product at the His-49 residue, whereas the phage gene is expressed in Escherichia coli without modification. We have also identified two open reading frames coding for proteins that show high degrees of similarity to the N- and C-terminal regions, respectively, of the single tail protein identified in phi 29. Sequencing and primer extension analysis suggest transcription of a small RNA showing a secondary structure similar to that of the prohead RNA required for the ATP-dependent packaging of phi 29 DNA. On the basis of its temporal expression, transcription of the Cp-1 genome takes place in two stages, early and late. Combined Northern (RNA) blot and primer extension experiments allowed us to map the 5' initiation sites of the transcripts, and we found that only three genes were transcribed from right to left. These analyses reveal that there are also noticeable differences between Cp-1 and phi 29 in transcriptional organization. Considered together, the observations reported here provide new tangible evidence on phylogenetic relationships between B. subtilis and S. pneumoniae.
Genome-wide identification, phylogeny, and expression analysis of the SWEET gene family in tomato.

PubMed

Feng, Chao-Yang; Han, Jia-Xuan; Han, Xiao-Xue; Jiang, Jing

2015-12-01

The SWEET (Sugars Will Eventually Be Exported Transporters) gene family encodes membrane-embedded sugar transporters containing seven transmembrane helices harboring two MtN3 and saliva domains. SWEETs play important roles in diverse biological processes, including plant growth, development, and response to environmental stimuli. Here, we conducted an exhaustive search of the tomato genome, leading to the identification of 29 SWEET genes. We analyzed the structures, conserved domains, and phylogenetic relationships of these protein-coding genes in detail. We also analyzed the transcript levels of SWEET genes in various tissues, organs, and developmental stages to obtain information about their functions. Furthermore, we investigated the expression patterns of the SWEET genes in response to exogenous sugar and adverse environmental stress (high and low temperatures). Some family members exhibited tissue-specific expression, whereas others were more ubiquitously expressed. Numerous stress-responsive candidate genes were obtained. The results of this study provide insights into the characteristics of the SWEET genes in tomato and may serve as a basis for further functional studies of such genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Glycosyltransferase Gene Expression Profiles Classify Cancer Types and Propose Prognostic Subtypes

NASA Astrophysics Data System (ADS)

Ashkani, Jahanshah; Naidoo, Kevin J.

2016-05-01

Aberrant glycosylation in tumours stems from altered glycosyltransferase (GT) gene expression, but can the expression profiles of these signature genes be used to classify cancer types and lead to cancer subtype discovery? The differential structural changes to cellular glycan structures are predominantly regulated by the expression patterns of GT genes and are a hallmark of neoplastic cell metamorphoses. We found that the expression of 210 GT genes taken from 1893 cancer patient samples in The Cancer Genome Atlas (TCGA) microarray data is able to classify six cancers: breast, ovarian, glioblastoma, kidney, colon and lung. The GT gene expression profiles are used to develop cancer classifiers and propose subtypes. The subclassification of breast cancer solid tumour samples illustrates the discovery of subgroups from GT genes that match well against basal-like and HER2-enriched subtypes and correlate with clinical, mutation and survival data. This cancer type glycosyltransferase gene signature finding provides foundational evidence for the centrality of glycosylation in cancer.
Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term

PubMed Central

Romero, Roberto; Tarca, Adi; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S.; Kalita, Cynthia A.; Cai, Juan; Yeo, Lami; Lipovich, Leonard

2014-01-01

Objective: The mechanisms responsible for normal and abnormal parturition are poorly understood. Myometrial activation leading to regular uterine contractions is a key component of labor. Dysfunctional labor (arrest of dilatation and/or descent) is a leading indication for cesarean delivery. Compelling evidence suggests that most of these disorders are functional in nature, and not the result of cephalopelvic disproportion. The methodology and the datasets afforded by the post-genomic era provide novel opportunities to understand and target gene functions in these disorders. In 2012, the ENCODE Consortium elucidated the extraordinary abundance and functional complexity of long non-coding RNA genes in the human genome. The purpose of the study was to identify differentially expressed long non-coding RNA genes in human myometrium in women in spontaneous labor at term. Materials and Methods: Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n=19) and women in spontaneous labor at term (n=20). RNA was extracted and profiled using an Illumina® microarray platform. The analysis of the protein coding genes from this study has been previously reported. Here, we have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. Results: Upon considering more than 18,498 distinct lncRNA genes compiled nonredundantly from public experimental data sources, and interrogating 2,634 that matched Illumina microarray probes, we identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an independent experimental method. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site that lacked evolutionary conservation beyond primates. Conclusions: We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known, as well as novel, roles in pregnancy in the myometrium of women in spontaneous labor at term. PMID:24168098
Comparative study of four interleukin 17 cytokines of tongue sole Cynoglossus semilaevis: Genomic structure, expression pattern, and promoter activity.

PubMed

Chi, Heng; Sun, Li

2015-11-01

The interleukin (IL)-17 cytokine family participates in the regulation of many cellular functions. In the present study, we analyzed the genomic structure, expression, and promoter activity of four IL-17 members from the teleost fish tongue sole (Cynoglossus semilaevis), i.e. CsIL-17C, CsIL-17D, CsIL-17F, and IL-17F like (IL-17Fl). We found that CsIL-17C, CsIL-17D, CsIL-17F, and CsIL-17Fl share 21.2%-28.6% overall sequence identities among themselves and 31.5%-71.2% overall sequence identities with their counterparts in other teleosts. All four CsIL-17 members possess an IL-17 domain and four conserved cysteine residues. Phylogenetic analysis classified the four CsIL-17 members into three clusters. Under normal physiological conditions, the four CsIL-17 members were expressed in multiple tissues, especially non-immune tissues. Bacterial infection upregulated the expression of all four CsIL-17 members, while viral infection upregulated the expression of CsIL-17D and CsIL-17Fl but downregulated the expression of CsIL-17C and CsIL-17F. The 1.2 kb 5'-flanking regions of the four CsIL-17 genes exhibited apparent promoter activity and contain a number of putative transcription factor-binding sites. Furthermore, the promoter activities of CsIL-17C, CsIL-17D, and CsIL-17F, but not CsIL-17Fl, were modulated to significant extents by lipopolysaccharide, PolyI:C, and PMA. This study provides the first evidence that in teleosts, different IL-17 members differ in expression pattern and promoter activity. Copyright © 2015 Elsevier Ltd. All rights reserved.
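The sequence identities quoted in the abstract above are percentages computed over aligned sequences. A minimal sketch of one such calculation follows. It assumes the two sequences are already aligned to equal length with "-" as the gap character, and that identity is counted over columns where neither sequence has a gap. Other denominators are in use, and the abstract does not say which one the authors chose. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence are ignored (one of several
    conventions; an assumption made for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 4 identical residues out of 5 ungapped columns.
    print(percent_identity("ACDE-G", "ACDFHG"))
```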
Peripheral blood gene expression signature differentiates children with autism from unaffected siblings

PubMed Central

Kong, SW; Shimizu-Motohashi, Y; Campbell, MG; Lee, IH; Collins, CD; Brewster, SJ; Holm, IA; Rappaport, L

2013-01-01

Autism spectrum disorder (ASD) is one of the most prevalent neurodevelopmental disorders with high heritability, yet a majority of genetic contribution to pathophysiology is not known. Siblings of individuals with ASD are at increased risk for ASD and autistic traits, but the genetic contribution for simplex families is estimated to be less when compared to multiplex families. To explore the genomic (dis-) similarity between proband and unaffected sibling in simplex families, we used genome-wide gene expression profiles of blood from 20 proband-unaffected sibling pairs and 18 unrelated control individuals. The global gene expression profiles of unaffected siblings were more similar to those from probands as they shared genetic and environmental background. One hundred eighty nine genes were significantly differentially expressed between proband-sib pairs (nominal p-value < 0.01) after controlling for age, sex, and family effects. Probands and siblings were distinguished into two groups by cluster analysis with these genes. Overall, unaffected siblings were equally distant from the centroid of probands and from that of unrelated controls with the differentially expressed genes. Interestingly, 5 of 20 siblings had gene expression profiles that were more similar to unrelated controls than to their matched probands. In summary, we found a set of genes that distinguished probands from the unaffected siblings, and a subgroup of unaffected siblings who were more similar to probands. The pathways that characterized probands compared to siblings using peripheral blood gene expression profiles were the up-regulation of ribosomal, spliceosomal, and mitochondrial pathways, and the down-regulation of neuroreceptor-ligand, immune response and calcium signaling pathways. 
Further integrative study with structural genetic variations such as de novo mutations, rare variants, and copy number variations would clarify whether these transcriptomic changes are structural or environmental in origin. PMID:23625158
Repetitive sequences: the hidden diversity of heterochromatin in prochilodontid fish

PubMed Central

Terencio, Maria L.; Schneider, Carlos H.; Gross, Maria C.; do Carmo, Edson Junior; Nogaroto, Viviane; de Almeida, Mara Cristina; Artoni, Roberto Ferreira; Vicari, Marcelo R.; Feldberg, Eliana

2015-01-01

The structure and organization of repetitive elements in fish genomes are still relatively poorly understood, although most of these elements are believed to be located in heterochromatic regions. Repetitive elements are considered essential in evolutionary processes as hotspots for mutations and chromosomal rearrangements, among other functions, thus providing new genomic alternatives and regulatory sites for gene expression. The present study sought to characterize repetitive DNA sequences in the genomes of Semaprochilodus insignis (Jardine & Schomburgk, 1841) and Semaprochilodus taeniurus (Valenciennes, 1817) and identify regions of conserved syntenic blocks in this genome fraction of three species of Prochilodontidae (Semaprochilodus insignis, Semaprochilodus taeniurus, and Prochilodus lineatus (Valenciennes, 1836)) by cross-FISH using Cot-1 DNA (renaturation kinetics) probes. We found that the repetitive fractions of the genomes of Semaprochilodus insignis and Semaprochilodus taeniurus have significant amounts of conserved syntenic blocks in hybridization sites, but with low degrees of similarity between them and the genome of Prochilodus lineatus, especially in relation to B chromosomes. The cloning and sequencing of the repetitive genomic elements of Semaprochilodus insignis and Semaprochilodus taeniurus using Cot-1 DNA identified 48 fragments that displayed high similarity with repetitive sequences deposited in public DNA databases and classified as microsatellites, transposons, and retrotransposons. The repetitive fractions of the Semaprochilodus insignis and Semaprochilodus taeniurus genomes exhibited high degrees of conserved syntenic blocks in terms of both the structures and locations of hybridization sites, but a low degree of similarity with the syntenic blocks of the Prochilodus lineatus genome. Future comparative analyses of other Prochilodontidae species will be needed to advance our understanding of the organization and evolution of the genomes in this group of fish. PMID:26752156
Global transgenerational gene expression dynamics in two newly synthesized allohexaploid wheat (Triticum aestivum) lines

PubMed Central

2012-01-01

Background Alteration in gene expression resulting from allopolyploidization is a prominent feature in plants, but its spectrum and extent are not fully known. Common wheat (Triticum aestivum) was formed via allohexaploidization about 10,000 years ago, and became the most important crop plant. To gain further insights into the genome-wide transcriptional dynamics associated with the onset of common wheat formation, we conducted microarray-based genome-wide gene expression analysis on two newly synthesized allohexaploid wheat lines with chromosomal stability and a genome constitution analogous to that of the present-day common wheat. Results Multi-color GISH (genomic in situ hybridization) was used to identify individual plants from two nascent allohexaploid wheat lines between Triticum turgidum (2n = 4x = 28; genome BBAA) and Aegilops tauschii (2n = 2x = 14; genome DD), which had a stable chromosomal constitution analogous to that of common wheat (2n = 6x = 42; genome BBAADD). Genome-wide analysis of gene expression was performed for these allohexaploid lines along with their parental plants from T. turgidum and Ae. tauschii, using the Affymetrix Gene Chip Wheat Genome-Array. Comparison with the parental plants coupled with inclusion of empirical mid-parent values (MPVs) revealed that whereas the great majority of genes showed the expected parental additivity, two major patterns of alteration in gene expression in the allohexaploid lines were identified: parental dominance expression and non-additive expression. Genes involved in each of the two altered expression patterns could be classified into three distinct groups, stochastic, heritable and persistent, based on their transgenerational heritability and inter-line conservation. Strikingly, whereas both altered patterns of gene expression showed a propensity of inheritance, identity of the involved genes was highly stochastic, consistent with the involvement of diverse Gene Ontology (GO) terms. 
Nonetheless, those genes showing non-additive expression exhibited a significant enrichment for vesicle-function. Conclusions Our results show that two patterns of global alteration in gene expression are conditioned by allohexaploidization in wheat, that is, parental dominance expression and non-additive expression. Both altered patterns of gene expression but not the identity of the genes involved are likely to play functional roles in stabilization and establishment of the newly formed allohexaploid plants, and hence, relevant to speciation and evolution of T. aestivum. PMID:22277161
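The three expression categories named in this abstract (additivity, parental dominance, non-additive expression) can be illustrated with a minimal sketch. It assumes log2-scale values, an in-silico mid-parent value computed as the plain mean of the two parents (the study itself used empirical MPVs), and a hypothetical tolerance `tol`; the function name and the mutually exclusive category rules are simplifications for illustration, not the authors' procedure.

```python
def classify_expression(parent_a, parent_b, hybrid, tol=0.2):
    """Classify one gene's expression in a hybrid relative to its parents.

    Values are assumed to be on a log2 scale; `tol` is a hypothetical
    tolerance, not a threshold taken from the study.
    """
    mpv = (parent_a + parent_b) / 2.0  # in-silico mid-parent value (assumption)
    if abs(hybrid - mpv) <= tol:
        return "additive"
    # The hybrid departs from the MPV: check whether it matches one parent.
    if abs(hybrid - parent_a) <= tol or abs(hybrid - parent_b) <= tol:
        return "parental dominance"
    return "non-additive"


# Hypothetical values for three genes: (parent A, parent B, hybrid).
for values in [(8.0, 6.0, 7.1), (8.0, 6.0, 7.9), (8.0, 6.0, 9.5)]:
    print(values, classify_expression(*values))
```

In practice such calls would rest on replicated measurements and a statistical test rather than a fixed tolerance.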
The Rice B-Box Zinc Finger Gene Family: Genomic Identification, Characterization, Expression Profiling and Diurnal Analysis

PubMed Central

Huang, Jianyan; Zhao, Xiaobo; Weng, Xiaoyu; Wang, Lei; Xie, Weibo

2012-01-01

Background The B-box (BBX) -containing proteins are a class of zinc finger proteins that contain one or two B-box domains and play important roles in plant growth and development. The Arabidopsis BBX gene family has recently been re-identified and renamed. However, there has not been a genome-wide survey of the rice BBX (OsBBX) gene family until now. Methodology/Principal Findings In this study, we identified 30 rice BBX genes through a comprehensive bioinformatics analysis. Each gene was assigned a uniform nomenclature. We described the chromosome localizations, gene structures, protein domains, phylogenetic relationship, whole life-cycle expression profile and diurnal expression patterns of the OsBBX family members. Based on the phylogeny and domain constitution, the OsBBX gene family was classified into five subfamilies. The gene duplication analysis revealed that only chromosomal segmental duplication contributed to the expansion of the OsBBX gene family. The expression profile of the OsBBX genes was analyzed by Affymetrix GeneChip microarrays throughout the entire life-cycle of rice cultivar Zhenshan 97 (ZS97). In addition, microarray analysis was performed to obtain the expression patterns of these genes under light/dark conditions and after three phytohormone treatments. This analysis revealed that the expression patterns of the OsBBX genes could be classified into eight groups. Eight genes were regulated under the light/dark treatments, and eleven genes showed differential expression under at least one phytohormone treatment. Moreover, we verified the diurnal expression of the OsBBX genes using the data obtained from the Diurnal Project and qPCR analysis, and the results indicated that many of these genes had a diurnal expression pattern. Conclusions/Significance The combination of the genome-wide identification and the expression and diurnal analysis of the OsBBX gene family should facilitate additional functional studies of the OsBBX genes. PMID:23118960
Genomic data assimilation for estimating hybrid functional Petri net from time-course gene expression data.

PubMed

Nagasaki, Masao; Yamaguchi, Rui; Yoshida, Ryo; Imoto, Seiya; Doi, Atsushi; Tamada, Yoshinori; Matsuno, Hiroshi; Miyano, Satoru; Higuchi, Tomoyuki

2006-01-01

We propose an automatic construction method of the hybrid functional Petri net as a simulation model of biological pathways. The problems we consider are how we choose the values of parameters and how we set the network structure. Usually, we tune these unknown factors empirically so that the simulation results are consistent with biological knowledge. Obviously, this approach is limited in the size of the network it can handle. To extend the capability of the simulation model, we propose the use of a data assimilation approach that was originally established in the field of geophysical simulation science. We provide a genomic data assimilation framework that establishes a link between our simulation model and observed data, such as microarray gene expression data, by using a nonlinear state space model. A key idea of our genomic data assimilation is that the unknown parameters in the simulation model are converted into parameters of the state space model and the estimates are obtained as the maximum a posteriori estimators. In the parameter estimation process, the simulation model is used to generate the system model in the state space model. Such a formulation enables us to handle both the model construction and the parameter tuning within a framework of Bayesian statistical inference. In particular, the Bayesian approach provides us a way of controlling overfitting during parameter estimation, which is essential for constructing a reliable biological pathway. We demonstrate the effectiveness of our approach using synthetic data. As a result, parameter estimation using genomic data assimilation works very well and the network structure is suitably selected.
Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fernandez-Fueyo, Elena; Ruiz-Duenas, Francisco J.; Ferreira, Patricia

Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2+. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium.
Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis

PubMed Central

Fernandez-Fueyo, Elena; Ruiz-Dueñas, Francisco J.; Ferreira, Patricia; Floudas, Dimitrios; Hibbett, David S.; Canessa, Paulo; Larrondo, Luis F.; James, Tim Y.; Seelenfreund, Daniela; Lobos, Sergio; Polanco, Rubén; Tello, Mario; Honda, Yoichi; Watanabe, Takahito; Watanabe, Takashi; Ryu, Jae San; Kubicek, Christian P.; Schmoll, Monika; Gaskell, Jill; Hammel, Kenneth E.; St. John, Franz J.; Vanden Wymelenberg, Amber; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit S.; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Lavín, José L.; Oguiza, José A.; Perez, Gumer; Pisabarro, Antonio G.; Ramirez, Lucia; Santoyo, Francisco; Master, Emma; Coutinho, Pedro M.; Henrissat, Bernard; Lombard, Vincent; Magnuson, Jon Karl; Kües, Ursula; Hori, Chiaki; Igarashi, Kiyohiko; Samejima, Masahiro; Held, Benjamin W.; Barry, Kerrie W.; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Riley, Robert; Salamov, Asaf A.; Hoffmeister, Dirk; Schwenk, Daniel; Hadar, Yitzhak; Yarden, Oded; de Vries, Ronald P.; Wiebenga, Ad; Stenlid, Jan; Eastwood, Daniel; Grigoriev, Igor V.; Berka, Randy M.; Blanchette, Robert A.; Kersten, Phil; Martinez, Angel T.; Vicuna, Rafael; Cullen, Dan

2012-01-01

Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2+. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium. PMID:22434909
Genome-wide identification and characterization of WRKY gene family in Salix suchowensis.

PubMed

Bi, Changwei; Xu, Yiqing; Ye, Qiaolin; Yin, Tongming; Ye, Ning

2016-01-01

WRKY proteins are zinc finger transcription factors that were first identified in plants. They specifically interact with the W-box, which is found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there had been no large-scale study of WRKY genes in willow. The whole-genome sequencing of Salix suchowensis gave us the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Based on their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I-III), with five subgroups (IIa-IIe) in group II. Through multiple sequence alignment and manual search, we found three variations of the WRKYGQK heptapeptide (WRKYGRK, WKKYGQK and WRKYGKK) and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon-intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help complete the functional and annotation information for WRKY gene families in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants.
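The heptapeptide search described in this abstract can be sketched as a simple pattern scan. The four motifs below are the canonical WRKYGQK and the three variants the abstract names; the function name and the toy sequence are hypothetical, and a real analysis would work on aligned domain sequences rather than a bare string match.

```python
import re

# Canonical heptapeptide plus the three variants named in the abstract.
WRKY_MOTIFS = ("WRKYGQK", "WRKYGRK", "WKKYGQK", "WRKYGKK")
MOTIF_RE = re.compile("|".join(WRKY_MOTIFS))


def find_wrky_motifs(protein_seq):
    """Return (0-based position, motif) pairs for each heptapeptide hit."""
    return [(m.start(), m.group()) for m in MOTIF_RE.finditer(protein_seq.upper())]


# Hypothetical toy sequence, not a real SsWRKY protein.
toy = "MSDDGYNWRKYGQKAVKNSPHPRSYYRCTWKKYGQKHV"
print(find_wrky_motifs(toy))
```

A scan like this only flags candidate domains; group assignment (I, IIa-IIe, III) also depends on the zinc finger motif and the phylogeny.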
Genome-wide identification and characterization of WRKY gene family in Salix suchowensis

PubMed Central

Ye, Qiaolin; Yin, Tongming

2016-01-01

WRKY proteins are zinc finger transcription factors that were first identified in plants. They specifically interact with the W-box, which is found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there had been no large-scale study of WRKY genes in willow. The whole-genome sequencing of Salix suchowensis gave us the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Based on their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I–III), with five subgroups (IIa–IIe) in group II. Through multiple sequence alignment and manual search, we found three variations of the WRKYGQK heptapeptide (WRKYGRK, WKKYGQK and WRKYGKK) and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon–intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help complete the functional and annotation information for WRKY gene families in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants. PMID:27651997

Genome-wide analysis of carotenoid cleavage oxygenase genes and their responses to various phytohormones and abiotic stresses in apple (Malus domestica).

PubMed

Chen, Hongfei; Zuo, Xiya; Shao, Hongxia; Fan, Sheng; Ma, Juanjuan; Zhang, Dong; Zhao, Caiping; Yan, Xiangyan; Liu, Xiaojie; Han, Mingyu

2018-02-01

Carotenoid cleavage oxygenases (CCOs) are able to cleave carotenoids to produce apocarotenoids and their derivatives, which are important for plant growth and development. In this study, 21 apple CCO genes were identified and divided into six groups based on their phylogenetic relationships. We further characterized the apple CCO genes in terms of chromosomal distribution, structure and the presence of cis-elements in the promoter. We also predicted the cellular localization of the encoded proteins. An analysis of the synteny within the apple genome revealed that tandem, segmental, and whole-genome duplication events likely contributed to the expansion of the apple carotenoid oxygenase gene family. An additional integrated synteny analysis identified orthologous carotenoid oxygenase genes between apple and Arabidopsis thaliana, which served as references for the functional analysis of the apple CCO genes. The net photosynthetic rate, transpiration rate, and stomatal conductance of leaves decreased, while leaf stomatal density increased under drought and saline conditions. Tissue-specific gene expression analyses revealed diverse spatiotemporal expression patterns. Finally, hormone and abiotic stress treatments indicated that many apple CCO genes are responsive to various phytohormones as well as drought and salinity stresses. The genome-wide identification of apple CCO genes and the analyses of their expression patterns described herein may provide a solid foundation for future studies examining the regulation and functions of this gene family. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Murine Hyperglycemic Vasculopathy and Cardiomyopathy: Whole-Genome Gene Expression Analysis Predicts Cellular Targets and Regulatory Networks Influenced by Mannose Binding Lectin

PubMed Central

Zou, Chenhui; La Bonte, Laura R.; Pavlov, Vasile I.; Stahl, Gregory L.

2012-01-01

Hyperglycemia, in the absence of type 1 or 2 diabetes, is an independent risk factor for cardiovascular disease. We have previously demonstrated a central role for mannose binding lectin (MBL)-mediated cardiac dysfunction in acute hyperglycemic mice. In this study, we applied whole-genome microarray data analysis to investigate MBL’s role in systematic gene expression changes. The data predict possible intracellular events taking place in multiple cellular compartments such as enhanced insulin signaling pathway sensitivity, promoted mitochondrial respiratory function, improved cellular energy expenditure and protein quality control, improved cytoskeleton structure, and facilitated intracellular trafficking, all of which may contribute to the organismal health of MBL null mice against acute hyperglycemia. Our data show a tight association between gene expression profile and tissue function which might be a very useful tool in predicting cellular targets and regulatory networks connected with in vivo observations, providing clues for further mechanistic studies. PMID:22375142
Genome-wide identification and expression analysis of TCP transcription factors in Gossypium raimondii.

PubMed

Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong

2014-10-16

Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile roles in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP-encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence.
Genome-wide identification and expression analysis of TCP transcription factors in Gossypium raimondii

PubMed Central

Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C.; Zhang, Baohong

2014-01-01

Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile roles in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP-encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence. PMID:25322260
HMGN proteins modulate chromatin regulatory sites and gene expression during activation of naïve B cells

PubMed Central

Zhang, Shaofei; Zhu, Iris; Deng, Tao; Furusawa, Takashi; Rochman, Mark; Vacchio, Melanie S.; Bosselut, Remy; Yamane, Arito; Casellas, Rafael; Landsman, David; Bustin, Michael

2016-01-01

The activation of naïve B lymphocyte involves rapid and major changes in chromatin organization and gene expression; however, the complete repertoire of nuclear factors affecting these genomic changes is not known. We report that HMGN proteins, which bind to nucleosomes and affect chromatin structure and function, co-localize with, and maintain the intensity of DNase I hypersensitive sites genome wide, in resting but not in activated B cells. Transcription analyses of resting and activated B cells from wild-type and Hmgn−/− mice, show that loss of HMGNs dampens the magnitude of the transcriptional response and alters the pattern of gene expression during the course of B-cell activation; defense response genes are most affected at the onset of activation. Our study provides insights into the biological function of the ubiquitous HMGN chromatin binding proteins and into epigenetic processes that affect the fidelity of the transcriptional response during the activation of B cell lymphocytes. PMID:27112571
Organizational differences between cytoplasmic male sterile and male fertile Brassica mitochondrial genomes are confined to a single transposed locus.

PubMed Central

L'Homme, Y; Brown, G G

1993-01-01

Comparison of the physical maps of male fertile (cam) and male sterile (pol) mitochondrial genomes of Brassica napus indicates that structural differences between the two mtDNAs are confined to a region immediately upstream of the atp6 gene. Relative to cam mtDNA, pol mtDNA possesses a 4.5 kb segment at this locus that includes a chimeric gene that is cotranscribed with atp6 and lacks an approximately 1 kb region located upstream of the cam atp6 gene. The 4.5 kb pol segment is present and similarly organized in the mitochondrial genome of the common nap B. napus cytoplasm; however, the nap and pol DNA regions flanking this segment are different and the nap sequences are not expressed. The 4.5 kb CMS-associated pol segment has thus apparently undergone transposition during the evolution of the nap and pol cytoplasms and has been lost in the cam genome subsequent to the pol-cam divergence. This 4.5 kb segment comprises the single DNA region that is expressed differently in fertile, pol CMS and fertility restored pol cytoplasm plants. The finding that this locus is part of the single mtDNA region organized differently in the fertile and male sterile mitochondrial genomes provides strong support for the view that it specifies the pol CMS trait. PMID:8388101
Global Genetic Response in a Cancer Cell: Self-Organized Coherent Expression Dynamics

PubMed Central

Tsuchiya, Masa; Hashimoto, Midori; Takenaka, Yoshiko; Motoike, Ikuko N.; Yoshikawa, Kenichi

2014-01-01

Understanding the basic mechanism of the spatio-temporal self-control of genome-wide gene expression engaged with the complex epigenetic molecular assembly is one of the major challenges in current biological science. In this study, the genome-wide dynamical profile of gene expression was analyzed for MCF-7 breast cancer cells induced by two distinct ErbB receptor ligands: epidermal growth factor (EGF) and heregulin (HRG), which drive cell proliferation and differentiation, respectively. We focused on elucidating how global genetic responses emerge and on deciphering the underlying principle for dynamic self-control of genome-wide gene expression. The whole mRNA expression was classified into about a hundred groups according to the root mean square fluctuation (rmsf). These expression groups showed characteristic time-dependent correlations, indicating the existence of collective behaviors on the ensemble of genes with respect to mRNA expression and also to temporal changes in expression. All-or-none responses were observed for HRG and EGF (biphasic statistics) at around 10–20 min. The emergence of time-dependent collective behaviors of expression occurred through bifurcation of a coherent expression state (CES). In the ensemble of mRNA expression, the self-organized CESs reveal distinct characteristic expression domains for biphasic statistics, which notably exhibit the presence of criticality in the expression profile as a route for genomic transition. In time-dependent changes in the expression domains, the dynamics of CES reveals that the temporal development of the characteristic domains is characterized as an autonomous bistable switch, which exhibits dynamic criticality (the temporal development of criticality) in the genome-wide coherent expression dynamics. It is expected that elucidation of the biophysical origin of such critical behavior will shed light on the underlying mechanism of the control of the whole genome. PMID:24831017
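The grouping step mentioned in this abstract (classifying expression into groups by rmsf) can be sketched as follows. This is a minimal sketch that assumes rmsf is computed per gene as the root mean square deviation from its own temporal mean, and that groups are formed by ranking genes and cutting the ranking into equal-sized chunks; the function names, the equal-size rule and the toy data are assumptions, not details taken from the study.

```python
import math


def rmsf(series):
    """Root mean square fluctuation of one gene's expression over time."""
    mean = sum(series) / len(series)
    return math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))


def group_by_rmsf(expression, n_groups):
    """Rank genes by rmsf (highest first) and cut the ranking into chunks.

    `expression` maps a gene name to its list of time-point values.
    Returns at most `n_groups` lists; the last one may be shorter.
    """
    ranked = sorted(expression, key=lambda g: rmsf(expression[g]), reverse=True)
    size = math.ceil(len(ranked) / n_groups)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


# Hypothetical toy time courses for four genes.
toy_expression = {
    "g1": [1.0, 1.0, 1.0, 1.0],
    "g2": [0.0, 2.0, 0.0, 2.0],
    "g3": [0.0, 4.0, 0.0, 4.0],
    "g4": [1.0, 1.2, 1.0, 1.2],
}
print(group_by_rmsf(toy_expression, 2))
```

With the group memberships in hand, group-level averages can then be compared across time points to look for the collective behaviors the abstract describes.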
Spermatogenesis in mammals: proteomic insights.

PubMed

Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles

2012-08-01

Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the mid piece, elongation of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. 
Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.
Robust one-Tube Ω-PCR Strategy Accelerates Precise Sequence Modification of Plasmids for Functional Genomics

PubMed Central

Chen, Letian; Wang, Fengpin; Wang, Xiaoyu; Liu, Yao-Guang

2013-01-01

Functional genomics requires vector construction for protein expression and functional characterization of target genes; therefore, a simple, flexible and low-cost molecular manipulation strategy will be highly advantageous for genomics approaches. Here, we describe a Ω-PCR strategy that enables multiple types of sequence modification, including precise insertion, deletion and substitution, in any position of a circular plasmid. Ω-PCR is based on an overlap extension site-directed mutagenesis technique, and is named for its characteristic Ω-shaped secondary structure during PCR. Ω-PCR can be performed either in two steps, or in one tube in combination with exonuclease I treatment. These strategies have wide applications for protein engineering, gene function analysis and in vitro gene splicing. PMID:23335613
Genome-wide Identification and analysis of the stress-resistance function of the TPS (Trehalose-6-Phosphate Synthase) gene family in cotton.

PubMed

Mu, Min; Lu, Xu-Ke; Wang, Jun-Juan; Wang, De-Long; Yin, Zu-Jun; Wang, Shuai; Fan, Wei-Li; Ye, Wu-Wei

2016-03-18

Trehalose (α-D-glucopyranosyl α-D-glucopyranoside) is a nonreducing disaccharide and is widely distributed in bacteria, fungi, algae, plants and invertebrates. In this study, stress-related trehalose-6-phosphate synthase (TPS) genes were identified in cotton, and the genetic structure and molecular evolution of TPSs were analyzed with bioinformatics methods, which could lay a foundation for further research on TPS functions in cotton. The genome information of Gossypium raimondii (group D), G. arboreum L. (group A), and G. hirsutum L. (group AD) was used in the study. Fifty-three TPSs were identified comprising 15 genes in group D, 14 in group A, and 24 in group AD. Bioinformatics methods were used to analyze the genetic structure and molecular evolution of TPSs. Real-time PCR analysis was performed to investigate the expression patterns of gene family members. All TPS family members in cotton can be divided into two subfamilies: Class I and Class II. The similarity of the TPS sequence is high within the same species and close within their family relatives. The genetic structures of the two TPS subfamilies differ, with more introns and a more complicated gene structure in Class I. There is a TPS domain (Glyco_transf_20) at the N-terminus in all TPS family members and a TPP domain (Trehalose_PPase) at the C-terminus in all except GrTPS6, GhTPS4, and GhTPS9. All Class II members contain a UDP-forming domain. The responses to environmental stresses showed that stresses could induce the expression of TPSs but the expression patterns vary with different stresses. The distribution of TPSs varies with different species but is relatively uniform on chromosomes. Genetic structure varies with different gene members, and expression levels vary with different stresses and exhibit tissue specificity. Significantly more genes are upregulated in upland cotton TM-1 than in G. raimondii and G. arboreum L. Shixiya 1.
A transcriptomics investigation into pine reproductive organ development.

PubMed

Niu, Shihui; Yuan, Huwei; Sun, Xinrui; Porth, Ilga; Li, Yue; El-Kassaby, Yousry A; Li, Wei

2016-02-01

The development of reproductive structures in gymnosperms is still poorly studied because of a lack of genomic information and useful genetic tools. The hermaphroditic reproductive structure derived from unisexual gymnosperms is an even less studied aspect of seed plant evolution. To extend our understanding of the molecular mechanism of hermaphroditism and the determination of sexual identity of conifer reproductive structures in general, unisexual and bisexual cones from Pinus tabuliformis were profiled for gene expression using 60K microarrays. Expression patterns of genes during progression of sexual cone development were analysed using RNA-seq. The results showed that, overall, the transcriptomes of male structures in bisexual cones were more similar to those of female cones. However, the expression of several MADS-box genes in the bisexual cones was similar to that of male cones at the more juvenile developmental stage, while despite these expression shifts, male structures of bisexual cones and normal male cones were histologically indistinguishable and cone development was continuous. This study represents a starting point for in-depth analysis of the molecular regulation of cone development and also the origin of hermaphroditism in pine. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Construction of a rice glycoside hydrolase phylogenomic database and identification of targets for biofuel research

PubMed Central

Sharma, Rita; Cao, Peijian; Jung, Ki-Hong; Sharma, Manoj K.; Ronald, Pamela C.

2013-01-01

Glycoside hydrolases (GH) catalyze the hydrolysis of glycosidic bonds in cell wall polymers and can have major effects on cell wall architecture. Taking advantage of the massive datasets available in public databases, we have constructed a rice phylogenomic database of GHs (http://ricephylogenomics.ucdavis.edu/cellwalls/gh/). This database integrates multiple data types including the structural features, orthologous relationships, mutant availability, and gene expression patterns for each GH family in a phylogenomic context. The rice genome encodes 437 GH genes classified into 34 families. Based on pairwise comparison with eight dicot and four monocot genomes, we identified 138 GH genes that are highly diverged between monocots and dicots, 57 of which have diverged further in rice as compared with four monocot genomes scanned in this study. Chromosomal localization and expression analysis suggest a role for both whole-genome and localized gene duplications in expansion and diversification of GH families in rice. We examined the meta-profiles of expression patterns of GH genes in twenty different anatomical tissues of rice. Transcripts of 51 genes exhibit tissue or developmental stage-preferential expression, whereas seventeen other genes preferentially accumulate in actively growing tissues. When queried in RiceNet, a probabilistic functional gene network that facilitates functional gene predictions, nine out of seventeen genes form a regulatory network with the well-characterized genes involved in biosynthesis of cell wall polymers including cellulose synthase and cellulose synthase-like genes of rice. Two-thirds of the GH genes in rice are upregulated in response to biotic and abiotic stress treatments, indicating a role in stress adaptation. Our analyses identify potential GH targets for cell wall modification. PMID:23986771
Genome-wide identification and characterisation of F-box family in maize.

PubMed

Jia, Fengjuan; Wu, Bingjiang; Li, Hui; Huang, Jinguang; Zheng, Chengchao

2013-11-01

F-box-containing proteins, as the key components of the protein degradation machinery, are widely distributed in higher plants and are considered as one of the largest known families of regulatory proteins. The F-box protein family plays a crucial role in plant growth and development and in response to biotic and abiotic stresses. However, systematic analysis of the F-box family in maize (Zea mays) has not been reported yet. In this paper, we identified and characterised the maize F-box genes on a genome-wide scale, including phylogenetic analysis, chromosome distribution, gene structure, promoter analysis and gene expression profiles. A total of 359 F-box genes were identified and divided into 15 subgroups by phylogenetic analysis. The F-box domain was relatively conserved, whereas additional motifs outside the F-box domain may indicate the functional diversification of maize F-box genes. These genes were unevenly distributed across the ten maize chromosomes, suggesting that they expanded in the maize genome because of tandem and segmental duplication events. The expression profiles suggested that the maize F-box genes had temporal and spatial expression patterns. Putative cis-acting regulatory DNA elements involved in abiotic stresses were observed in maize F-box gene promoters. The gene expression profiles under abiotic stresses also suggested that some genes participated in stress responsive pathways. Furthermore, ten genes were chosen for quantitative real-time PCR analysis under drought stress and the results were consistent with the microarray data. This study has produced a comparative genomics analysis of the maize ZmFBX gene family that can be used in further studies to uncover their roles in maize growth and development.
Genome-wide survey and expression analysis of F-box genes in chickpea.

PubMed

Gupta, Shefali; Garg, Vanika; Kant, Chandra; Bhatia, Sabhyata

2015-02-13

The F-box genes constitute one of the largest gene families in plants involved in degradation of cellular proteins. F-box proteins can recognize a wide array of substrates and regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses and senescence, among others. However, little is known about the F-box genes in the important legume crop, chickpea. The available draft genome sequence of chickpea allowed us to conduct a genome-wide survey of the F-box gene family in chickpea. A total of 285 F-box genes were identified in chickpea which were classified based on their C-terminal domain structures into 10 subfamilies. Thirteen putative novel motifs were also identified in F-box proteins with no known functional domain at their C-termini. The F-box genes were physically mapped on the 8 chickpea chromosomes and duplication events were investigated which revealed that the F-box gene family expanded largely due to tandem duplications. Phylogenetic analysis classified the chickpea F-box genes into 9 clusters. Also, maximum syntenic relationship was observed with soybean followed by Medicago truncatula, Lotus japonicus and Arabidopsis. Digital expression analysis of F-box genes in various chickpea tissues as well as under abiotic stress conditions utilizing the available chickpea transcriptome data revealed differential expression patterns with several F-box genes specifically expressing in each tissue, few of which were validated by using quantitative real-time PCR. The genome-wide analysis of chickpea F-box genes provides new opportunities for characterization of candidate F-box genes and elucidation of their function in growth, development and stress responses for utilization in chickpea improvement.
A Genome-Wide Analysis of the LBD (LATERAL ORGAN BOUNDARIES Domain) Gene Family in Malus domestica with a Functional Characterization of MdLBD11

PubMed Central

Su, Ling; Liu, Xin; Hao, Yujin

2013-01-01

The plant-specific LBD (LATERAL ORGAN BOUNDARIES domain) genes belong to a major family of transcription factor that encode a zinc finger-like domain. It has been shown that LBD genes play crucial roles in the growth and development of Arabidopsis and other plant species. However, no detailed information concerning this family is available for apple. In the present study, we analyzed the apple (Malus domestica) genome and identified 58 LBD genes. This gene family was tested for its phylogenetic relationships with homologous genes in the Arabidopsis genome, as well as its location in the genome, structure and expression. We also transformed one MdLBD gene into Arabidopsis to evaluate its function. Like Arabidopsis, apple LBD genes also have a conserved CX2CX6CX3C zinc finger-like domain in the N terminus and can be divided into two classes. The expression profile indicated that apple LBD genes exhibited a variety of expression patterns, suggesting that they have diverse functions. At the same time, the expression analysis implied that members of this apple gene family were responsive to hormones and stress and that they may participate in hormone-mediated plant organogenesis, which was demonstrated with the overexpression of the apple LBD gene MdLBD11, resulting in an abnormal phenotype. This phenotype included upward curling leaves, delayed flowering, downward-pointing flowers, siliques and other abnormal traits. Based on these data, we concluded that the MdLBD genes may play an important role in apple growth and development as in Arabidopsis and other species. PMID:23468909
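The CX2CX6CX3C pattern named in the abstract can be read as a spacing rule: four cysteines separated by exactly 2, 6, and 3 residues of any kind. The abstract does not say how the authors searched for the domain, so the following is only a minimal Python sketch of that spacing rule. The function name `find_cx2cx6cx3c` and the toy sequence are hypothetical and are not taken from the study.

```python
import re

# C, any 2 residues, C, any 6 residues, C, any 3 residues, C.
# Wrapping the pattern in a lookahead lets overlapping sites be reported too.
ZINC_FINGER_LIKE = re.compile(r"(?=(C.{2}C.{6}C.{3}C))")

def find_cx2cx6cx3c(protein):
    """Return (0-based start, matched text) for every CX2CX6CX3C site."""
    return [(m.start(), m.group(1))
            for m in ZINC_FINGER_LIKE.finditer(protein.upper())]

# Toy sequence invented for illustration only (an assumption, not study data).
toy = "MSSACAACKFLRRKCTPECVFAPYF"
hits = find_cx2cx6cx3c(toy)
```

A spacing match like this only flags candidates. Confirming a real LBD domain would normally rely on a profile-based domain search rather than a bare regular expression.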
Genome-wide identification and analysis of the aldehyde dehydrogenase (ALDH) gene superfamily in apple (Malus × domestica Borkh.).

PubMed

Li, Xiaoqin; Guo, Rongrong; Li, Jun; Singer, Stacy D; Zhang, Yucheng; Yin, Xiangjing; Zheng, Yi; Fan, Chonghui; Wang, Xiping

2013-10-01

Aldehyde dehydrogenases (ALDHs) represent a protein superfamily encoding NAD(P)(+)-dependent enzymes that oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes. In plants, they are involved in many biological processes and play a role in the response to environmental stress. In this study, a total of 39 ALDH genes from ten families were identified in the apple (Malus × domestica Borkh.) genome. Synteny analysis of the apple ALDH (MdALDH) genes indicated that segmental and tandem duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of these gene families in apple. Moreover, synteny analysis between apple and Arabidopsis demonstrated that several MdALDH genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes appeared before the divergence of lineages that led to apple and Arabidopsis. In addition, phylogenetic analysis, as well as comparisons of exon-intron and protein structures, provided further insight into both their evolutionary relationships and their putative functions. Tissue-specific expression analysis of the MdALDH genes demonstrated diverse spatiotemporal expression patterns, while their expression profiles under abiotic stress and various hormone treatments indicated that many MdALDH genes were responsive to high salinity and drought, as well as different plant hormones. This genome-wide identification, as well as characterization of evolutionary relationships and expression profiles, of the apple MdALDH genes will not only be useful for the further analysis of ALDH genes and their roles in stress response, but may also aid in the future improvement of apple stress tolerance. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
VitisExpDB: a database resource for grape functional genomics.

PubMed

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-02-28

The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of approximately 20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
VitisExpDB: A database resource for grape functional genomics

PubMed Central

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-01-01

Background: The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description: VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of ~20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion: The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website. PMID:18307813
A genome-wide analysis of the LBD (LATERAL ORGAN BOUNDARIES domain) gene family in Malus domestica with a functional characterization of MdLBD11.

PubMed

Wang, Xiaofei; Zhang, Shizhong; Su, Ling; Liu, Xin; Hao, Yujin

2013-01-01

The plant-specific LBD (LATERAL ORGAN BOUNDARIES domain) genes belong to a major family of transcription factor that encode a zinc finger-like domain. It has been shown that LBD genes play crucial roles in the growth and development of Arabidopsis and other plant species. However, no detailed information concerning this family is available for apple. In the present study, we analyzed the apple (Malus domestica) genome and identified 58 LBD genes. This gene family was tested for its phylogenetic relationships with homologous genes in the Arabidopsis genome, as well as its location in the genome, structure and expression. We also transformed one MdLBD gene into Arabidopsis to evaluate its function. Like Arabidopsis, apple LBD genes also have a conserved CX2CX6CX3C zinc finger-like domain in the N terminus and can be divided into two classes. The expression profile indicated that apple LBD genes exhibited a variety of expression patterns, suggesting that they have diverse functions. At the same time, the expression analysis implied that members of this apple gene family were responsive to hormones and stress and that they may participate in hormone-mediated plant organogenesis, which was demonstrated with the overexpression of the apple LBD gene MdLBD11, resulting in an abnormal phenotype. This phenotype included upward curling leaves, delayed flowering, downward-pointing flowers, siliques and other abnormal traits. Based on these data, we concluded that the MdLBD genes may play an important role in apple growth and development as in Arabidopsis and other species.
A Genome-Wide Scan of Selective Sweeps and Association Mapping of Fruit Traits Using Microsatellite Markers in Watermelon

PubMed Central

Reddy, Umesh K.; Abburi, Lavanya; Abburi, Venkata Lakshmi; Saminathan, Thangasamy; Cantrell, Robert; Vajja, Venkata Gopinath; Reddy, Rishi; Tomason, Yan R.; Levi, Amnon; Wehner, Todd C.; Nimmakayala, Padma

2015-01-01

Our genetic diversity study uses microsatellites of known map position to estimate genome level population structure and linkage disequilibrium, and to identify genomic regions that have undergone selection during watermelon domestication and improvement. Thirty regions that showed evidence of selective sweep were scanned for the presence of candidate genes using the watermelon genome browser (www.icugi.org). We localized selective sweeps in intergenic regions, close to the promoters, and within the exons and introns of various genes. This study provided evidence of convergent evolution for the presence of diverse ecotypes with special reference to American and European ecotypes. Our search for location of linked markers in the whole-genome draft sequence revealed that BVWS00358, a GA repeat microsatellite, is the GAGA type transcription factor located in the 5′ untranslated regions of a structure and insertion element that expresses a Cys2His2 Zinc finger motif, with presumed biological processes related to chitin response and transcriptional regulation. In addition, BVWS01708, an ATT repeat microsatellite, is located in the promoter of a DTW domain-containing protein (Cla002761), and association mapping links 2 other simple sequence repeats to fruit length and rind thickness. PMID:25425675

Universal and idiosyncratic characteristic lengths in bacterial genomes

NASA Astrophysics Data System (ADS)

Junier, Ivan; Frémont, Paul; Rivoire, Olivier

2018-05-01

In condensed matter physics, simplified descriptions are obtained by coarse-graining the features of a system at a certain characteristic length, defined as the typical length beyond which some properties are no longer correlated. From a physics standpoint, in vitro DNA has thus a characteristic length of 300 base pairs (bp), the Kuhn length of the molecule beyond which correlations in its orientations are typically lost. From a biology standpoint, in vivo DNA has a characteristic length of 1000 bp, the typical length of genes. Since bacteria live in very different physico-chemical conditions and since their genomes lack translational invariance, whether larger, universal characteristic lengths exist is a non-trivial question. Here, we examine this problem by leveraging the large number of fully sequenced genomes available in public databases. By analyzing GC content correlations and the evolutionary conservation of gene contexts (synteny) in hundreds of bacterial chromosomes, we conclude that a fundamental characteristic length around 10–20 kb can be defined. This characteristic length reflects elementary structures involved in the coordination of gene expression, which are present all along the genome of nearly all bacteria. Technically, reaching this conclusion required us to implement methods that are insensitive to the presence of large idiosyncratic genomic features, which may co-exist along these fundamental universal structures.
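The idea of a characteristic length as the distance beyond which a property stops being correlated can be illustrated with GC content. The Python sketch below computes GC fraction in fixed windows and the autocorrelation of that profile at a given lag. It is an illustration only: the window size, the function names, and the toy sequence are assumptions, and the abstract does not describe the authors' actual estimators, which were designed to be insensitive to large idiosyncratic features.

```python
def gc_profile(seq, window):
    """GC fraction in consecutive, non-overlapping windows of `window` bases."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, window)
    ]

def autocorrelation(values, lag):
    """Autocorrelation of `values` at a lag measured in windows."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0 or lag >= n:
        return 0.0
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag)) / (n - lag)
    return cov / var

# Toy sequence (hypothetical): GC-rich and AT-rich blocks alternate every 40 bases.
toy = ("GC" * 20 + "AT" * 20) * 10
profile = gc_profile(toy, window=20)
```

On real chromosomes one would scan a range of lags and look for the distance at which the correlation decays. The toy sequence is strictly periodic, so its correlation oscillates instead of decaying.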
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation.

PubMed

Mourad, Raphaël; Cuvier, Olivier

2016-05-01

Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1.
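The central idea of the abstract, reading the sign of each coefficient in a multiple logistic regression as a positive or negative influence on domain borders, can be sketched in a few lines of Python. The counts below are invented, the names are hypothetical, and plain gradient ascent stands in for whatever fitting routine the authors used, which the abstract does not specify.

```python
import math

def fit_logistic(rows, n_iter=5000, lr=0.5):
    """Fit a logistic regression by gradient ascent on grouped counts.

    rows: list of (features, n_border, n_total), one per feature combination.
    Returns [intercept, coef_1, coef_2, ...].
    """
    n_feat = len(rows[0][0])
    beta = [0.0] * (n_feat + 1)
    total = sum(n for _, _, n in rows)
    for _ in range(n_iter):
        grad = [0.0] * (n_feat + 1)
        for x, k, n in rows:
            xs = [1.0] + list(x)
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xs))))
            # Gradient of the binomial log-likelihood: (observed - expected) * x.
            for j, v in enumerate(xs):
                grad[j] += (k - n * p) * v
        beta = [b + lr * g / total for b, g in zip(beta, grad)]
    return beta

# Hypothetical counts, not data from the study:
# (feature A present, feature B present) -> borders among 100 genomic bins.
toy = [((0, 0), 20, 100), ((1, 0), 60, 100), ((0, 1), 5, 100), ((1, 1), 25, 100)]
intercept, coef_a, coef_b = fit_logistic(toy)
```

In this toy setting `coef_a` comes out positive and `coef_b` negative, which is the sense in which a feature would be called a positive or negative driver. A statistical interaction of the kind the abstract mentions could be represented by adding the product of the two features as a third column.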
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation

PubMed Central

Mourad, Raphaël; Cuvier, Olivier

2016-01-01

Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1. PMID:27203237
Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

PubMed Central

Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

2008-01-01

While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490
DNA binding by FOXP3 domain-swapped dimer suggests mechanisms of long-range chromosomal interactions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Y.; Chen, C.; Zhang, Z.

2015-01-07

FOXP3 is a lineage-specific transcription factor that is required for regulatory T cell development and function. In this study, we determined the crystal structure of the FOXP3 forkhead domain bound to DNA. The structure reveals that FOXP3 can form a stable domain-swapped dimer to bridge DNA in the absence of cofactors, suggesting that FOXP3 may play a role in long-range gene interactions. To test this hypothesis, we used circular chromosome conformation capture coupled with high throughput sequencing (4C-seq) to analyze FOXP3-dependent genomic contacts around a known FOXP3-bound locus, Ptpn22. Our studies reveal that FOXP3 induces significant changes in the chromatin contacts between the Ptpn22 locus and other Foxp3-regulated genes, reflecting a mechanism by which FOXP3 reorganizes the genome architecture to coordinate the expression of its target genes. Our results suggest that FOXP3 mediates long-range chromatin interactions as part of its mechanisms to regulate specific gene expression in regulatory T cells.
Uncovering the Salt Response of Soybean by Unraveling Its Wild and Cultivated Functional Genomes Using Tag Sequencing

PubMed Central

Ali, Zulfiqar; Zhang, Da Yong; Xu, Zhao Long; Xu, Ling; Yi, Jin Xin; He, Xiao Lan; Huang, Yi Hong; Liu, Xiao Qing; Khan, Asif Ali; Trethowan, Richard M.; Ma, Hong Xiang

2012-01-01

Soil salinity has very adverse effects on growth and yield of crop plants. Several salt tolerant wild accessions and cultivars are reported in soybean. Functional genomes of salt tolerant Glycine soja and a salt sensitive genotype of Glycine max were investigated to understand the mechanism of salt tolerance in soybean. For this purpose, four libraries were constructed for Tag sequencing on Illumina platform. We identified around 490 salt responsive genes, which included a number of transcription factors, signaling proteins, translation factors and structural genes like transporters, multidrug resistance proteins, antiporters, chaperones, aquaporins etc. The gene expression levels and ratio of up/down-regulated genes were greater in tolerant plants. Translation related genes remained stable or showed slightly higher expression in tolerant plants under salinity stress. Further analyses of sequenced data and the annotations for gene ontology and pathways indicated that soybean adapts to salt stress through ABA biosynthesis and regulation of translation and signal transduction of structural genes. Manipulation of these pathways may mitigate the effect of salt stress thus enhancing salt tolerance. PMID:23209559
De Novo Assembly and Phasing of Dikaryotic Genomes from Two Isolates of Puccinia coronata f. sp. avenae, the Causal Agent of Oat Crown Rust

PubMed Central

Miller, Marisa E.; Zhang, Ying; Omidvar, Vahid; Sperschneider, Jana; Raley, Castle; Palmer, Jonathan M.; Garnica, Diana; Upadhyaya, Narayana; Rathjen, John; Taylor, Jennifer M.; Park, Robert F.; Dodds, Peter N.; Hirsch, Cory D.

2018-01-01

Oat crown rust, caused by the fungus Puccinia coronata f. sp. avenae, is a devastating disease that impacts worldwide oat production. For much of its life cycle, P. coronata f. sp. avenae is dikaryotic, with two separate haploid nuclei that may vary in virulence genotype, highlighting the importance of understanding haplotype diversity in this species. We generated highly contiguous de novo genome assemblies of two P. coronata f. sp. avenae isolates, 12SD80 and 12NC29, from long-read sequences. In total, we assembled 603 primary contigs for 12SD80, for a total assembly length of 99.16 Mbp, and 777 primary contigs for 12NC29, for a total length of 105.25 Mbp; approximately 52% of each genome was assembled into alternate haplotypes. This revealed structural variation between haplotypes in each isolate equivalent to more than 2% of the genome size, in addition to about 260,000 and 380,000 heterozygous single-nucleotide polymorphisms in 12SD80 and 12NC29, respectively. Transcript-based annotation identified 26,796 and 28,801 coding sequences for isolates 12SD80 and 12NC29, respectively, including about 7,000 allele pairs in haplotype-phased regions. Furthermore, expression profiling revealed clusters of coexpressed secreted effector candidates, and the majority of orthologous effectors between isolates showed conservation of expression patterns. However, a small subset of orthologs showed divergence in expression, which may contribute to differences in virulence between 12SD80 and 12NC29. This study provides the first haplotype-phased reference genome for a dikaryotic rust fungus as a foundation for future studies into virulence mechanisms in P. coronata f. sp. avenae. PMID:29463655
Structure and expression strategy of the genome of Culex pipiens densovirus, a mosquito densovirus with an ambisense organization.

PubMed

Baquerizo-Audiot, Elizabeth; Abd-Alla, Adly; Jousset, Françoise-Xavière; Cousserans, François; Tijssen, Peter; Bergoin, Max

2009-07-01

The genome of all densoviruses (DNVs) so far isolated from mosquitoes or mosquito cell lines consists of a 4-kb single-stranded DNA molecule with a monosense organization (genus Brevidensovirus, subfamily Densovirinae). We previously reported the isolation of a Culex pipiens DNV (CpDNV) that differs significantly from brevidensoviruses by (i) having an approximately 6-kb genome, (ii) lacking sequence homology, and (iii) lacking antigenic cross-reactivity with Brevidensovirus capsid polypeptides. We report here the sequence organization and transcription map of this virus. The cloned genome of CpDNV is 5,759 nucleotides (nt) long, and it possesses an inverted terminal repeat (ITR) of 285 nt and an ambisense organization of its genes. The nonstructural (NS) proteins NS-1, NS-2, and NS-3 are located in the 5' half of one strand and are organized into five open reading frames (ORFs) due to the split of both NS-1 and NS-2 into two ORFs. The ORF encoding capsid polypeptides is located in the 5' half of the complementary strand. The expression of NS proteins is controlled by two promoters, P7 and P17, driving the transcription of a 2.4-kb mRNA encoding NS-3 and of a 1.8-kb mRNA encoding NS-1 and NS-2, respectively. Both NS mRNA species have a 53-nt sequence spliced out. Capsid proteins are translated from an unspliced 2.3-kb mRNA driven by the P88 promoter. CpDNV thus appears as a new type of mosquito DNV, and based on the overall organization and expression modalities of its genome, it may represent the prototype of a new genus of DNV.
Methods to Monitor DNA Repair Defects and Genomic Instability in the Context of a Disrupted Nuclear Lamina.

PubMed

Gonzalo, Susana; Kreienkamp, Ray

2016-01-01

The organization of the genome within the nuclear space is viewed as an additional level of regulation of genome function, as well as a means to ensure genome integrity. Structural proteins associated with the nuclear envelope, in particular lamins (A- and B-type) and lamin-associated proteins, play an important role in genome organization. Interestingly, there is a whole body of evidence that links disruptions of the nuclear lamina with DNA repair defects and genomic instability. Here, we describe a few standard techniques that have been successfully utilized to identify mechanisms behind DNA repair defects and genomic instability in cells with an altered nuclear lamina. In particular, we describe protocols to monitor changes in the expression of DNA repair factors (Western blot) and their recruitment to sites of DNA damage (immunofluorescence); kinetics of DNA double-strand break repair after ionizing radiation (neutral comet assays); frequency of chromosomal aberrations (FISH, fluorescence in situ hybridization); and alterations in telomere homeostasis (Quantitative-FISH). These techniques have allowed us to shed some light onto molecular mechanisms by which alterations in A-type lamins induce genomic instability, which could contribute to the pathophysiology of aging and aging-related diseases.
PGSB PlantsDB: updates to the database framework for comparative plant genome research.

PubMed

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X

2016-01-04

PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome-wide identification, isolation and expression analysis of auxin response factor (ARF) gene family in sweet orange (Citrus sinensis)

PubMed Central

Li, Si-Bei; OuYang, Wei-Zhi; Hou, Xiao-Jin; Xie, Liang-Liang; Hu, Chun-Gen; Zhang, Jin-Zhi

2015-01-01

Auxin response factors (ARFs) are an important family of proteins in auxin-mediated response, with key roles in various physiological and biochemical processes. To date, a genome-wide overview of the ARF gene family in citrus has not been available. A systematic analysis of this gene family in citrus was begun by carrying out a genome-wide search for the homologs of ARFs. A total of 19 nonredundant ARF genes (CiARF) were found and validated from the sweet orange. A comprehensive overview of the CiARFs was undertaken, including the gene structures, phylogenetic analysis, chromosome locations, conserved motifs of proteins, and cis-elements in promoters of CiARF. Furthermore, expression profiling using real-time PCR revealed that many CiARF genes were expressed, albeit with different patterns depending on tissue type and/or developmental stage. Comprehensive expression analysis of these genes was also performed under two hormone treatments using real-time PCR. Indole-3-acetic acid (IAA) and N-1-naphthylphthalamic acid (NPA) treatment experiments revealed differential up-regulation and down-regulation, respectively, of the 19 citrus ARF genes in the callus of sweet orange. Our comprehensive analysis of ARF genes further elucidates the roles of CiARF family members during citrus growth and development. PMID:25870601
K-bZIP Mediated SUMO-2/3 Specific Modification on the KSHV Genome Negatively Regulates Lytic Gene Expression and Viral Reactivation

PubMed Central

Yang, Wan-Shan; Hsu, Hung-Wei; Campbell, Mel; Cheng, Chia-Yang; Chang, Pei-Ching

2015-01-01

SUMOylation is associated with epigenetic regulation of chromatin structure and transcription. Epigenetic modifications of herpesviral genomes accompany the transcriptional switch of latent and lytic genes during the virus life cycle. Here, we report a genome-wide comparison of SUMO paralog modification on the KSHV genome. Using chromatin immunoprecipitation in conjunction with high-throughput sequencing, our study revealed highly distinct landscape changes of SUMO paralog genomic modifications associated with KSHV reactivation. A rapid and widespread deposition of SUMO-2/3, compared with SUMO-1, modification across the KSHV genome upon reactivation was observed. Interestingly, SUMO-2/3 enrichment was inversely correlated with H3K9me3 mark after reactivation, indicating that SUMO-2/3 may be responsible for regulating the expression of viral genes located in low heterochromatin regions during viral reactivation. RNA-sequencing analysis showed that the SUMO-2/3 enrichment pattern positively correlated with KSHV gene expression profiles. Activation of KSHV lytic genes located in regions with high SUMO-2/3 enrichment was enhanced by SUMO-2/3 knockdown. These findings suggest that SUMO-2/3 viral chromatin modification contributes to the diminution of viral gene expression during reactivation. Our previous study identified a SUMO-2/3-specific viral E3 ligase, K-bZIP, suggesting a potential role of this enzyme in regulating SUMO-2/3 enrichment and viral gene repression. Consistent with this prediction, higher K-bZIP binding on SUMO-2/3 enrichment region during reactivation was observed. Moreover, a K-bZIP SUMO E3 ligase dead mutant, K-bZIP-L75A, in the viral context, showed no SUMO-2/3 enrichment on viral chromatin and higher expression of viral genes located in SUMO-2/3 enriched regions during reactivation. Importantly, virus production significantly increased in both SUMO-2/3 knockdown and KSHV K-bZIP-L75A mutant cells. 
These results indicate that SUMO-2/3 modification of viral chromatin may function to counteract KSHV reactivation. As induction of herpesvirus reactivation may activate cellular antiviral regimes, our results suggest that development of inhibitors specific to the viral SUMO E3 ligase may be an avenue for antiviral therapy. PMID:26197391
Breakpoint Features of Genomic Rearrangements in Neuroblastoma with Unbalanced Translocations and Chromothripsis

PubMed Central

Daveau, Romain; Combaret, Valérie; Pierre-Eugène, Cécile; Cazes, Alex; Louis-Brennetot, Caroline; Schleiermacher, Gudrun; Ferrand, Sandrine; Pierron, Gaëlle; Lermine, Alban; Frio, Thomas Rio; Raynal, Virginie; Vassal, Gilles; Barillot, Emmanuel; Delattre, Olivier; Janoueix-Lerosey, Isabelle

2013-01-01

Neuroblastoma is a pediatric cancer of the peripheral nervous system in which structural chromosome aberrations are emblematic of aggressive tumors. In this study, we performed an in-depth analysis of somatic rearrangements in two neuroblastoma cell lines and two primary tumors using paired-end sequencing of mate-pair libraries and RNA-seq. The cell lines presented with typical genetic alterations of neuroblastoma and the two tumors belong to the group of neuroblastoma exhibiting a profile of chromothripsis. Inter- and intra-chromosomal rearrangements were identified in the four samples, allowing in particular characterization of unbalanced translocations at high resolution. Using complementary experiments, we further characterized 51 rearrangements at base-pair resolution, which revealed 59 DNA junctions. In a subset of cases, complex rearrangements were observed with templated insertion of fragments of nearby sequences. Although we did not identify any particular known motifs in the local environment of the breakpoints, we documented frequent microhomologies at the junctions in both chromothripsis and non-chromothripsis associated breakpoints. RNA-seq experiments confirmed expression of several predicted chimeric genes and genes with disrupted exon structure including ALK, NBAS, FHIT, PTPRD and ODZ4. Our study therefore indicates that both non-homologous end joining-mediated repair and replicative processes may account for genomic rearrangements in neuroblastoma. RNA-seq analysis allows the identification of the subset of abnormal transcripts expressed from genomic rearrangements that may be involved in neuroblastoma oncogenesis. PMID:23991058
Parallel computation of genome-scale RNA secondary structure to detect structural constraints on human genome.

PubMed

Kawaguchi, Risa; Kiryu, Hisanori

2016-05-06

RNA secondary structure around splice sites is known to assist normal splicing by promoting spliceosome recognition. However, analyzing the structural properties of entire intronic regions or pre-mRNA sequences has been difficult hitherto, owing to serious experimental and computational limitations, such as low read coverage and numerical problems. Our novel software, "ParasoR", is designed to run on a computer cluster and enables the exact computation of various structural features of long RNA sequences under the constraint of maximal base-pairing distance. ParasoR divides dynamic programming (DP) matrices into smaller pieces, such that each piece can be computed by a separate computer node without losing the connectivity information between the pieces. ParasoR directly computes the ratios of DP variables to avoid the reduction of numerical precision caused by the cancellation of a large number of Boltzmann factors. The structural preferences of mRNAs computed by ParasoR show a high concordance with those determined by high-throughput sequencing analyses. Using ParasoR, we investigated the global structural preferences of transcribed regions in the human genome. A genome-wide folding simulation indicated that transcribed regions are significantly more structural than intergenic regions after removing repeat sequences and k-mer frequency bias. In particular, we observed a highly significant preference for base pairing over entire intronic regions as compared to their antisense sequences, as well as to intergenic regions. A comparison between pre-mRNAs and mRNAs showed that coding regions become more accessible after splicing, indicating constraints for translational efficiency. Such changes are correlated with gene expression levels, as well as GC content, and are enriched among genes associated with cytoskeleton and kinase functions. 
We have shown that ParasoR is very useful for analyzing the structural properties of long RNA sequences such as mRNAs, pre-mRNAs, and long non-coding RNAs whose lengths can be more than a million bases in the human genome. In our analyses, transcribed regions including introns are indicated to be subject to various types of structural constraints that cannot be explained from simple sequence composition biases. ParasoR is freely available at https://github.com/carushi/ParasoR .
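The constraint of maximal base-pairing distance mentioned in this abstract can be illustrated with a much simpler dynamic program than the one ParasoR uses. The following is a minimal sketch, assuming a Nussinov-style base-pair maximization rather than ParasoR's energy-based computation; the function name `max_pairs` and the parameters `max_span` and `min_loop` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: Nussinov-style base-pair maximization with a
# maximal base-pairing distance. This is NOT ParasoR's algorithm, which works
# with an energy model and partition-function-like DP variables.

# Canonical and wobble pairs allowed in this toy model (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, max_span, min_loop=3):
    """Return the maximum number of nested base pairs in seq, allowing a pair
    (k, j) only if min_loop < j - k <= max_span."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the best score for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for length in range(min_loop + 2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            score = best[i][j - 1]  # case: position j stays unpaired
            # case: j pairs with k; the span limit bounds how far back k can be
            lowest_k = max(i, j - max_span)
            for k in range(lowest_k, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    hairpin = "GGGAAACCC"
    for span in (4, 6, 8):
        print(span, max_pairs(hairpin, max_span=span))
```

Because the inner loop never looks further back than `max_span` positions, the work per position is bounded by the span limit rather than by the sequence length, which is the property that makes very long transcripts tractable in span-limited folding in general.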
Expansion of the CRISPR-Cas9 genome targeting space through the use of H1 promoter-expressed guide RNAs.

PubMed

Ranganathan, Vinod; Wahlin, Karl; Maruotti, Julien; Zack, Donald J

2014-08-08

The repurposed CRISPR-Cas9 system has recently emerged as a revolutionary genome-editing tool. Here we report a modification in the expression of the guide RNA (gRNA) required for targeting that greatly expands the targetable genome. gRNA expression through the commonly used U6 promoter requires a guanosine nucleotide to initiate transcription, thus constraining genomic-targeting sites to GN19NGG. We demonstrate the ability to modify endogenous genes using H1 promoter-expressed gRNAs, which can be used to target both AN19NGG and GN19NGG genomic sites. AN19NGG sites occur ~15% more frequently than GN19NGG sites in the human genome and the increase in targeting space is also enriched at human genes and disease loci. Together, our results enhance the versatility of the CRISPR technology by more than doubling the number of targetable sites within the human genome and other eukaryotic species.
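The GN19NGG and AN19NGG site definitions in this abstract translate directly into a pattern search. The following is a minimal sketch, assuming a plain DNA string over A, C, G and T as input; the helper names `revcomp` and `count_sites` are hypothetical and this is not the authors' analysis pipeline.

```python
import re

# Illustrative sketch only: count candidate target sites of the form
# <first_base>N19NGG on both strands of a DNA sequence.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(seq, first_base):
    """Count sites matching first_base + N19 + NGG (23 nt) on both strands.

    A lookahead is used so that overlapping sites are all counted.
    """
    seq = seq.upper()
    pattern = re.compile(rf"(?=({first_base}[ACGT]{{19}}[ACGT]GG))")
    return sum(len(pattern.findall(strand)) for strand in (seq, revcomp(seq)))


if __name__ == "__main__":
    example = "G" + "A" * 19 + "TGG"  # a single GN19NGG site, made up for the demo
    print("GN19NGG:", count_sites(example, "G"))
    print("AN19NGG:", count_sites(example, "A"))
```

Running `count_sites` with `"G"` and then `"A"` over the same sequence gives the two counts whose ratio the abstract reports for the human genome; the example string here is invented purely to exercise the function.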
Purification and Characterization of Recombinant Darbepoetin Alfa from Leishmania tarentolae.

PubMed

Kianmehr, Anvarsadat; Mahrooz, Abdolkarim; Oladnabi, Morteza; Safdari, Yaghoub; Ansari, Javad; Veisi, Kamal; Evazalipour, Mehdi; Shahbazmohammadi, Hamid; Omidinia, Eskandar

2016-09-01

Darbepoetin alfa is a biopharmaceutical glycoprotein that stimulates erythropoiesis and is used to treat anemia associated with renal failure and cancer chemotherapy. We herein describe the structural characterization of recombinant darbepoetin alfa produced by the Leishmania tarentolae T7-TR host. The DNA expression cassette was integrated into the L. tarentolae genome through homologous recombination. Transformed clones were selected by antibiotic resistance, diagnostic PCRs, and protein expression analysis. The structure of recombinant darbepoetin alfa was analyzed by isoelectric focusing, ultraviolet-visible spectrum, and circular dichroism (CD) spectroscopy. Expression analysis showed the presence of a protein band at 40 kDa, and its expression level was 51.2 mg/ml of culture medium. Darbepoetin alfa had 5 isoforms with varying degrees of sialylation. The UV absorption and CD spectra were analogous to those of the original drug (Aranesp), which confirmed that the produced protein was darbepoetin alfa. Potency test results revealed that the purified protein was biologically active. In brief, the structural and biological characteristics of the expressed darbepoetin alfa were very similar to those of Aranesp, which is normally expressed in CHO cells. Our data also suggest that the produced protein has the potential to be developed for clinical use.
Genome-wide specificity of DNA binding, gene regulation, and chromatin remodeling by TALE- and CRISPR/Cas9-based transcriptional activators

PubMed Central

Polstein, Lauren R.; Perez-Pinera, Pablo; Kocak, D. Dewran; Vockley, Christopher M.; Bledsoe, Peggy; Song, Lingyun; Safi, Alexias; Crawford, Gregory E.; Reddy, Timothy E.; Gersbach, Charles A.

2015-01-01

Genome engineering technologies based on the CRISPR/Cas9 and TALE systems are enabling new approaches in science and biotechnology. However, the specificity of these tools in complex genomes and the role of chromatin structure in determining DNA binding are not well understood. We analyzed the genome-wide effects of TALE- and CRISPR-based transcriptional activators in human cells using ChIP-seq to assess DNA-binding specificity and RNA-seq to measure the specificity of perturbing the transcriptome. Additionally, DNase-seq was used to assess genome-wide chromatin remodeling that occurs as a result of their action. Our results show that these transcription factors are highly specific in both DNA binding and gene regulation and are able to open targeted regions of closed chromatin independent of gene activation. Collectively, these results underscore the potential for these technologies to make precise changes to gene expression for gene and cell therapies or fundamental studies of gene function. PMID:26025803
Expression profiles and functional associations of endogenous androgen receptor and caveolin-1 in prostate cancer cell lines.

PubMed

Bennett, Nigel C; Hooper, John D; Johnson, David W; Gobe, Glenda C

2014-05-01

In prostate cancer (PCa) patients, the protein target for androgen deprivation and blockade therapies is androgen receptor (AR). AR interacts with many proteins that function to either co-activate or co-repress its activity. Caveolin-1 (Cav-1) is not found in normal prostatic epithelium, but is found in PCa, and may be an AR co-regulator protein. We investigated cell line-specific signatures and associations of endogenous AR and Cav-1 in six PCa cell lines of known androgen sensitivity: LNCaP (androgen sensitive); 22Rv1 (androgen responsive); PC3, DU145, and ALVA41 (androgen non-reliant); and RWPE1 (non-malignant). Protein and mRNA expression profiles were compared and electron microscopy used to identify cells with caveolar structures. For cell lines expressing both AR and Cav-1, knockdown techniques using small interfering RNA against AR or Cav-1 were used to test whether diminished expression of one affected the other. Co-sedimentation of AR and Cav-1 was used to test their association. A reporter assay for AR genomic activity was utilized following Cav-1 knockdown. AR-expressing LNCaP and 22Rv1 cells had low endogenous Cav-1 mRNA and protein. Cell lines that expressed little or no AR (DU145, PC3, ALVA41, and RWPE1) expressed high endogenous levels of Cav-1. AR knockdown in LNCaP cells had little effect on Cav-1, but Cav-1 knockdown inhibited AR expression and genomic activity. These data show endogenous AR and Cav-1 mRNA and protein expression is inversely related in PCa cells, with Cav-1 acting on the androgen/AR signaling axis possibly as an AR co-activator, demonstrated by diminished AR genomic activity following Cav-1 knockdown. © 2013 Wiley Periodicals, Inc.
Resolving Heart Regeneration by Replacement Histone Profiling.

PubMed

Goldman, Joseph Aaron; Kuzu, Guray; Lee, Nutishia; Karasik, Jaclyn; Gemberling, Matthew; Foglia, Matthew J; Karra, Ravi; Dickson, Amy L; Sun, Fei; Tolstorukov, Michael Y; Poss, Kenneth D

2017-02-27

Chromatin regulation is a principal mechanism governing animal development, yet it is unclear to what extent structural changes in chromatin underlie tissue regeneration. Non-mammalian vertebrates such as zebrafish activate cardiomyocyte (CM) division after tissue damage to regenerate lost heart muscle. Here, we generated transgenic zebrafish expressing a biotinylatable H3.3 histone variant in CMs and derived cell-type-specific profiles of histone replacement. We identified an emerging program of putative enhancers that revise H3.3 occupancy during regeneration, overlaid upon a genome-wide reduction of H3.3 from promoters. In transgenic reporter lines, H3.3-enriched elements directed gene expression in subpopulations of CMs. Other elements increased H3.3 enrichment and displayed enhancer activity in settings of injury- and/or Neuregulin1-elicited CM proliferation. Dozens of consensus sequence motifs containing predicted transcription factor binding sites were enriched in genomic regions with regeneration-responsive H3.3 occupancy. Thus, cell-type-specific regulatory programs of tissue regeneration can be revealed by genome-wide H3.3 profiling. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome-wide characterization of the SiDof gene family in foxtail millet (Setaria italica).

PubMed

Zhang, Li; Liu, Baoling; Zheng, Gewen; Zhang, Aiying; Li, Runzhi

2017-01-01

Dof (DNA binding with one finger) proteins, which constitute a class of transcription factors found exclusively in plants, are involved in numerous physiological and biochemical reactions affecting growth and development. A genome-wide analysis of SiDof genes was performed in this study. Thirty-five SiDof genes were identified, and these genes were unevenly distributed across nine chromosomes in the Setaria italica genome. Protein lengths, molecular weights, and theoretical isoelectric points of SiDofs all vary greatly. Gene structure analysis demonstrated that most SiDof genes lack introns. Phylogenetic analysis of SiDof proteins and Dof proteins from Arabidopsis thaliana, rice, sorghum, and Setaria viridis revealed six major groups. Analysis of RNA-Seq data indicated that SiDof gene expression levels varied across roots, stems, leaves, and spike. In addition, expression profiling of SiDof genes in response to stress suggested that SiDof 7 and SiDof 15 are involved in drought stress signalling. Overall, this study could provide novel information on SiDofs for further investigation in foxtail millet. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

An Integrated Encyclopedia of DNA Elements in the Human Genome

PubMed Central

2012-01-01

Summary: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616
Fatty acid-binding protein genes of the ancient, air-breathing, ray-finned fish, spotted gar (Lepisosteus oculatus).

PubMed

Venkatachalam, Ananda B; Fontenot, Quenton; Farrara, Allyse; Wright, Jonathan M

2018-03-01

With the advent of high-throughput DNA sequencing technology, the availability of genomic sequences from many disparate species has led to the relatively new discipline of genomics, the study of genome structure, function and evolution. Much work has been focused on the role of whole genome duplications (WGD) in the architecture of extant vertebrate genomes, particularly those of teleost fishes which underwent a WGD early in the teleost radiation >230 million years ago (mya). Our past work has focused on the fate of duplicated copies of a multigene family coding for the intracellular lipid-binding protein (iLBP) genes in the teleost fishes. Defining the evolutionary processes that determined the fate of duplicated genes and generated the structure of extant fish genomes, however, requires comparative genomic analysis with a fish lineage that diverged before the teleost WGD, such as the spotted gar (Lepisosteus oculatus), an ancient, air-breathing, ray-finned fish. Here, we describe the genomic organization, chromosomal location and tissue-specific expression of a subfamily of the iLBP genes that code for fatty acid-binding proteins (Fabps) in spotted gar. Based on this work, we have defined the minimum suite of fabp genes prior to their duplication in the teleost lineages ~230-400 mya. Spotted gar, therefore, serves as an appropriate outgroup, or ancestral/ancient fish, that did not undergo the teleost-specific WGD. As such, analyses of the spatio-temporal regulation of spotted gar genes provide a foundation to determine whether the duplicated fabp genes have been retained in teleost genomes owing to either sub- or neofunctionalization. Copyright © 2017 Elsevier Inc. All rights reserved.
Conservation of mRNA secondary structures may filter out mutations in Escherichia coli evolution

PubMed Central

Chursov, Andrey; Frishman, Dmitrij; Shneider, Alexander

2013-01-01

Recent reports indicate that mutations in viral genomes tend to preserve RNA secondary structure, and those mutations that disrupt secondary structural elements may reduce gene expression levels, thereby serving as a functional knockout. In this article, we explore the conservation of secondary structures of mRNA coding regions, a previously unknown factor in bacterial evolution, by comparing the structural consequences of mutations in essential and nonessential Escherichia coli genes accumulated over 40 000 generations in the course of the ‘long-term evolution experiment’. We monitored the extent to which mutations influence minimum free energy (MFE) values, assuming that a substantial change in MFE is indicative of structural perturbation. Our principal finding is that purifying selection tends to eliminate those mutations in essential genes that lead to greater changes of MFE values and, therefore, may be more disruptive for the corresponding mRNA secondary structures. This effect implies that synonymous mutations disrupting mRNA secondary structures may directly affect the fitness of the organism. These results demonstrate that the need to maintain intact mRNA structures imposes additional evolutionary constraints on bacterial genomes, which go beyond preservation of structure and function of the encoded proteins. PMID:23783573
The 193-base pair Gsg2 (haspin) promoter region regulates germ cell-specific expression bidirectionally and synchronously.

PubMed

Tokuhiro, Keizo; Miyagawa, Yasushi; Yamada, Shuichi; Hirose, Mika; Ohta, Hiroshi; Nishimune, Yoshitake; Tanaka, Hiromitsu

2007-03-01

Haspin is a unique protein kinase expressed predominantly in haploid male germ cells. The genomic structure of haspin (Gsg2) has revealed it to be intronless, and the entire transcription unit is in an intron of the integrin alphaE (Itgae) gene. Transcription occurs from a bidirectional promoter that also generates an alternatively spliced integrin alphaE-derived mRNA (Aed). In mice, the testis-specific alternative splicing of Aed is expressed bidirectionally downstream from the Gsg2 transcription initiation site, and a segment consisting of 26 bp transcribes both genomic DNA strands between Gsg2 and the Aed transcription initiation sites. To investigate the mechanisms for this unique gene regulation, we cloned and characterized the Gsg2 promoter region. The 193-bp genomic fragment from the 5' end of the Gsg2 and Aed genes, fused with EGFP and DsRed genes, drove the expression of both proteins in haploid germ cells of transgenic mice. This promoter element contained only a GC-rich sequence, and not the previously reported DNA sequences known to bind various transcription factors--with the exception of E2F1, TCFAP2A1 (AP2), and SP1. Here, we show that the 193-bp DNA sequence is sufficient for the specific, bidirectional, and synchronous expression in germ cells in the testis. We also demonstrate the existence of germ cell nuclear factors specifically bound to the promoter sequence. This activity may be regulated by binding to the promoter sequence with germ cell-specific nuclear complex(es) without regulation via DNA methylation.
Transcriptome of the quorum-sensing signal-degrading Rhodococcus erythropolis responds differentially to virulent and avirulent Pectobacterium atrosepticum

PubMed Central

Kwasiborski, A; Mondy, S; Chong, T-M; Barbey, C; Chan, K-G; Beury-Cirou, A; Latour, X; Faure, D

2015-01-01

Social bacteria use chemical communication to coordinate and synchronize gene expression via the quorum-sensing (QS) regulatory pathway. In Pectobacterium, a causative agent of the blackleg and soft-rot diseases on potato plants and tubers, expression of the virulence factors is collectively controlled by the QS-signals N-acylhomoserine lactones (NAHLs). Several soil bacteria, such as the actinobacterium Rhodococcus erythropolis, are able to degrade NAHLs, hence quench the chemical communication and virulence of Pectobacterium. Here, next-generation sequencing was used to investigate structural and functional genomics of the NAHL-degrading R. erythropolis strain R138. The R. erythropolis R138 genome (6.7 Mbp) contained a single circular chromosome, one linear (250 kbp) and one circular (84 kbp) plasmid. Growth of R. erythropolis and P. atrosepticum was not altered in mixed-cultures as compared with monocultures on potato tuber slices. HiSeq-transcriptomics revealed that no R. erythropolis genes were differentially expressed when R. erythropolis was cultivated in the presence vs absence of the avirulent P. atrosepticum mutant expI, which is defective for QS-signal synthesis. By contrast 50 genes (<1% of the R. erythropolis genome) were differentially expressed when R. erythropolis was cultivated in the presence vs absence of the NAHL-producing virulent P. atrosepticum. Among them, quantitative real-time reverse-transcriptase–PCR confirmed that the expression of some alkyl-sulfatase genes decreased in the presence of a virulent P. atrosepticum, as well as deprivation of organic sulfur such as methionine, which is a key precursor in the synthesis of NAHL by P. atrosepticum. PMID:25585922
Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses: Keys to Rice Transcriptome Analysis

NASA Astrophysics Data System (ADS)

Kikuchi, Shoshi

2009-02-01

Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.
Genome-wide identification of sweet orange (Citrus sinensis) histone modification gene families and their expression analysis during the fruit development and fruit-blue mold infection process.

PubMed

Xu, Jidi; Xu, Haidan; Liu, Yuanlong; Wang, Xia; Xu, Qiang; Deng, Xiuxin

2015-01-01

In eukaryotes, histone acetylation and methylation are known to be involved in regulating diverse developmental processes and plant defense. These histone modification events are controlled by a series of histone modification gene families. To date, there has been no genome-wide characterization of histone modification-related genes in citrus species. Based on two recently sequenced sweet orange genome databases, a total of 136 CsHMs (Citrus sinensis histone modification genes), including 47 CsHMTs (histone methyltransferase genes), 23 CsHDMs (histone demethylase genes), 50 CsHATs (histone acetyltransferase genes), and 16 CsHDACs (histone deacetylase genes) were identified. These genes were categorized into 11 gene families. A comprehensive analysis of these 11 gene families was performed, covering chromosome locations, phylogenetic comparison, gene structures, and conserved domain compositions of proteins. In order to gain insight into the potential roles of these genes in citrus fruit development, 42 CsHMs with high mRNA abundance in fruit tissues were selected to further analyze their expression profiles at six stages of fruit development. Interestingly, a number of genes were highly expressed in the flesh of ripening fruit, and some of them showed increasing expression levels during fruit development. Furthermore, we analyzed the expression patterns of all 136 CsHMs in response to infection by blue mold (Penicillium digitatum), which is the most devastating pathogen in the citrus post-harvest process. The results indicated that 20 of them showed strong alterations of their expression levels during fruit-pathogen infection. In conclusion, this study presents a comprehensive analysis of the histone modification gene families in sweet orange and further elucidates their behavior during fruit development and in response to blue mold infection.
Modularity and evolutionary constraints in a baculovirus gene regulatory network

PubMed Central

2013-01-01

Background: The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of AgMNPV gene expression. Results: We observed an earlier onset of expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated key functionally related genes into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed on key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization. 
Conclusions Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates that modularity may be a general feature of biological gene regulatory networks. PMID:24006890
Polytene Chromosomes - A Portrait of Functional Organization of the Drosophila Genome.

PubMed

Zykova, Tatyana Yu; Levitsky, Victor G; Belyaeva, Elena S; Zhimulev, Igor F

2018-04-01

This mini-review is devoted to the genetic meaning of the main polytene chromosome structures, bands and interbands. Generally, densely packed chromatin forms black bands, moderately condensed regions form grey loose bands, whereas decondensed regions of the genome appear as interbands. Recent progress in the annotation of the Drosophila genome and epigenome has made it possible to compare the banding pattern and the structural organization of genes, as well as their activity. This was greatly aided by our ability to establish the borders of bands and interbands on the physical map, which allowed us to perform comprehensive side-by-side comparisons of cytological, genetic and epigenetic maps and to uncover the association between the morphological structures and the functional domains of the genome. These studies largely conclude that interbands contain the 5'-ends of housekeeping genes that are active across all cell types. Interbands are enriched with proteins involved in transcription and nucleosome remodeling, as well as with active histone modifications. Notably, most of the replication origins map to interband regions. As for grey loose bands adjacent to interbands, they typically host the bodies of housekeeping genes. Thus, the bipartite structure composed of an interband and an adjacent grey band functions as a standalone genetic unit. Finally, black bands harbor tissue-specific genes with narrow temporal and tissue expression profiles. Thus, the uniform and permanent activity of interbands combined with the inactivity of genes in bands forms the basis of the universal banding pattern observed in various Drosophila tissues.
Using expression genetics to study the neurobiology of ethanol and alcoholism.

PubMed

Farris, Sean P; Wolen, Aaron R; Miles, Michael F

2010-01-01

Recent simultaneous progress in human and animal model genetics and the advent of microarray whole genome expression profiling have produced prodigious data sets on genetic loci, potential candidate genes, and differential gene expression related to alcoholism and ethanol behaviors. Validated target genes or gene networks functioning in alcoholism are still of meager proportions. Genetical genomics, which combines genetic analysis of both traditional phenotypes and whole genome expression data, offers a potential methodology for characterizing brain gene networks functioning in alcoholism. This chapter will describe concepts, approaches, and recent findings in the field of genetical genomics as it applies to alcohol research. Copyright 2010 Elsevier Inc. All rights reserved.
Experimental evidence supports a sex-specific selective sieve in mitochondrial genome evolution.

PubMed

Innocenti, Paolo; Morrow, Edward H; Dowling, Damian K

2011-05-13

Mitochondria are maternally transmitted; hence, their genome can only make a direct and adaptive response to selection through females, whereas males represent an evolutionary dead end. In theory, this creates a sex-specific selective sieve, enabling deleterious mutations to accumulate in mitochondrial genomes if they exert male-specific effects. We tested this hypothesis, expressing five mitochondrial variants alongside a standard nuclear genome in Drosophila melanogaster, and found striking sexual asymmetry in patterns of nuclear gene expression. Mitochondrial polymorphism had few effects on nuclear gene expression in females but major effects in males, modifying nearly 10% of transcripts. These were mostly male-biased in expression, with enrichment hotspots in the testes and accessory glands. Our results suggest an evolutionary mechanism that results in mitochondrial genomes harboring male-specific mutation loads.
Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava

PubMed Central

Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian

2016-01-01

The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava. PMID:26904033
Parvovirus B19 DNA CpG Dinucleotide Methylation and Epigenetic Regulation of Viral Expression

PubMed Central

Bonvicini, Francesca; Manaresi, Elisabetta; Di Furio, Francesca; De Falco, Luisa; Gallinella, Giorgio

2012-01-01

CpG DNA methylation is one of the main epigenetic modifications playing a role in the control of gene expression. For DNA viruses whose genome has the ability to integrate in the host genome or to maintain as a latent episome, a correlation has been found between the extent of DNA methylation and viral quiescence. No information is available for Parvovirus B19, a human pathogenic virus, which is capable of both lytic and persistent infections. Within Parvovirus B19 genome, the inverted terminal regions display all the characteristic signatures of a genomic CpG island; therefore we hypothesised a role of CpG dinucleotide methylation in the regulation of viral genome expression. The analysis of CpG dinucleotide methylation of Parvovirus B19 DNA was carried out by an aptly designed quantitative real-time PCR assay on bisulfite-modified DNA. The effects of CpG methylation on the regulation of viral genome expression were first investigated by transfection of either unmethylated or in vitro methylated viral DNA in a model cell line, showing that methylation of viral DNA was correlated to lower expression levels of the viral genome. Then, in the course of in vitro infections in different cellular environments, it was observed that absence of viral expression and genome replication were both correlated to increasing levels of CpG methylation of viral DNA. Finally, the presence of CpG methylation was documented in viral DNA present in bioptic samples, indicating the occurrence and a possible role of this epigenetic modification in the course of natural infections. The presence of an epigenetic level of regulation of viral genome expression, possibly correlated to the silencing of the viral genome and contributing to the maintenance of the virus in tissues, can be relevant to the balance and outcome of the different types of infection associated to Parvovirus B19. PMID:22413013
Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

PubMed Central

Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

2015-01-01

Non-model plants, i.e., species with one or all of the characteristics such as a long life cycle, difficulty of growth in the laboratory, or poor fecundity, were previously left out of sequencing projects because of the high running cost of Sanger sequencing. Consequently, information about their genomics and key biological processes is inadequate. However, the recent advent of fast and cost-effective next generation sequencing (NGS) platforms has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight into the mechanisms underlying gene expression and secondary metabolism, and has facilitated the development of genomic resources for diversity characterization, evolutionary analysis and marker-assisted breeding even without prior availability of genomic sequence information. In this review we explore how different NGS platforms, as well as recent advances in NGS-based high-throughput genotyping technologies, are rewarding efforts on de novo whole genome/transcriptome sequencing and the development of genome-wide sequence-based marker resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016
Leukotriene signaling in the extinct human subspecies Homo denisovan and Homo neanderthalensis. Structural and functional comparison with Homo sapiens.

PubMed

Adel, Susan; Kakularam, Kumar Reddy; Horn, Thomas; Reddanna, Pallu; Kuhn, Hartmut; Heydeck, Dagmar

2015-01-01

Mammalian lipoxygenases (LOXs) have been implicated in cell differentiation and in the biosynthesis of pro- and anti-inflammatory lipid mediators. The initial draft sequence of the Homo neanderthalensis genome (coverage of 1.3-fold) suggested defective leukotriene signaling in this archaic human subspecies since expression of essential proteins appeared to be corrupted. Meanwhile high quality genomic sequence data became available for two extinct human subspecies (H. neanderthalensis, Homo denisovan) and completion of the human 1000 genome project provided a comprehensive database characterizing the genetic variability of the human genome. For this study we extracted the nucleotide sequences of selected eicosanoid relevant genes (ALOX5, ALOX15, ALOX12, ALOX15B, ALOX12B, ALOXE3, COX1, COX2, LTA4H, LTC4S, ALOX5AP, CYSLTR1, CYSLTR2, BLTR1, BLTR2) from the corresponding databases. Comparison of the deduced amino acid sequences in connection with site-directed mutagenesis studies and structural modeling suggested that the major enzymes and receptors of leukotriene signaling as well as the two cyclooxygenase isoforms were fully functional in these two extinct human subspecies. Copyright © 2014 Elsevier Inc. All rights reserved.
Leukocyte common antigen-related phosphatase (LRP) gene structure: Conservation of the genomic organization of transmembrane protein tyrosine phosphatases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wong, E.C.C.; Mullersman, J.E.; Thomas, M.L.

1993-07-01

The leukocyte common antigen-related protein tyrosine phosphatase (LRP) is a widely expressed transmembrane glycoprotein thought to be involved in cell growth and differentiation. Similar to most other transmembrane protein tyrosine phosphatases, LRP contains two tandem cytoplasmic phosphatase domains. To understand further the regulation and evolution of LRP, the authors have isolated and characterized mouse λ genomic clones. Thirteen genomic clones could be divided into two non-overlapping clusters. The first cluster contained the transcription initiation site and the exon encoding most of the 5′ untranslated region. The second cluster contained the remaining exons encoding the protein and the 3′ untranslated region. The gene consists of 22 exons spanning over 75 kb. The distance between exon 1 and exon 2 is at least 25 kb. Characterization of the 5′ ends of LRP mRNA by S1 nuclease protection identifies putative initiation start sites within a G/C-rich region. The upstream region does not contain a TATA box. Comparison of the LRP gene structure to the mammalian protein tyrosine phosphatase gene, CD45, shows striking similarities in size and genomic organization. 29 refs., 5 figs., 1 tab.
The d4 gene family in the human genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chestkov, A.V.; Baka, I.D.; Kost, M.V.

1996-08-15

The d4 domain, a novel zinc finger-like structural motif, was first revealed in the rat neuro-d4 protein. Here we demonstrate that the d4 domain is conserved in evolution and that three related genes form a d4 family in the human genome. The human neuro-d4 is very similar to rat neuro-d4 at both the amino acid and the nucleotide levels. Moreover, the same splice variants have been detected among rat and human neuro-d4 transcripts. This gene has been localized on chromosome 19, and two other genes, members of the d4 family isolated by screening of the human genomic library at low stringency, have been mapped to chromosomes 11 and 14. The gene on chromosome 11 is the homolog of the ubiquitously expressed mouse gene ubi-d4/requiem, which is required for cell death after deprivation of trophic factors. A gene with a conserved d4 domain has been found in the genome of the nematode Caenorhabditis elegans. The conservation of d4 proteins from nematodes to vertebrates suggests that they have a general importance, but a diversity of d4 proteins expressed in vertebrate nervous systems suggests that some family members have special functions. 11 refs., 2 figs.
Genome profiling of sterol synthesis shows convergent evolution in parasites and guides chemotherapeutic attack.

PubMed

Fügi, Matthias A; Gunasekera, Kapila; Ochsenreiter, Torsten; Guan, Xueli; Wenk, Markus R; Mäser, Pascal

2014-05-01

Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas's disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile.
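The vector-and-clustering step described in this abstract can be illustrated with a minimal sketch. It assumes each species is reduced to a presence/absence vector over enzyme profiles and that species are grouped by average-linkage agglomerative clustering on Hamming distance. The species labels, the four-profile vectors, and the function names are hypothetical and are not taken from the study, whose library covers 42 enzymes and whose exact distance and linkage choices are not stated here.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) calls for four made-up enzyme profiles.
# Labels and values are illustrative only, not data from the study.
profiles = {
    "free_living_A": [1, 1, 1, 1],
    "free_living_B": [1, 1, 1, 0],
    "parasite_C": [1, 0, 0, 0],
    "parasite_D": [0, 0, 0, 0],
}


def hamming(u, v):
    """Number of positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def average_linkage(c1, c2, vectors):
    """Mean pairwise Hamming distance between the members of two clusters."""
    dists = [hamming(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def cluster(vectors, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [(name,) for name in sorted(vectors)]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], vectors),
        )
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(clusters)


groups = cluster(profiles, 2)
print(groups)
```

In this toy input the two vectors with few profile hits end up in one group, which mirrors the kind of metabolic reduction signal the abstract reports for obligate endoparasites.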
Into the Fourth Dimension: Dysregulation of Genome Architecture in Aging and Alzheimer’s Disease

PubMed Central

Winick-Ng, Warren; Rylett, R. Jane

2018-01-01

Alzheimer’s disease (AD) is a progressive neurodegenerative disease characterized by synapse dysfunction and cognitive impairment. Understanding the development and progression of AD is challenging, as the disease is highly complex and multifactorial. Both environmental and genetic factors play a role in AD pathogenesis, highlighted by observations of complex DNA modifications at the single gene level, and by new evidence that also implicates changes in genome architecture in AD patients. The four-dimensional structure of chromatin in space and time is essential for context-dependent regulation of gene expression in post-mitotic neurons. Dysregulation of epigenetic processes has been observed in the aging brain and in patients with AD, though there is not yet agreement on the impact of these changes on transcription. New evidence shows that proteins involved in genome organization have altered expression and localization in the AD brain, suggesting that the genomic landscape may play a critical role in the development of AD. This review discusses the role of chromatin organizers and epigenetic modifiers in post-mitotic cells, the aging brain, and in the development and progression of AD. How these new insights can be used to help determine disease risk and inform treatment strategies will also be discussed. PMID:29541020

Genome profiling of sterol synthesis shows convergent evolution in parasites and guides chemotherapeutic attack

PubMed Central

Fügi, Matthias A.; Gunasekera, Kapila; Ochsenreiter, Torsten; Guan, Xueli; Wenk, Markus R.; Mäser, Pascal

2014-01-01

Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas’s disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile. PMID:24627128
Genome-Wide Comparative In Silico Analysis of the RNA Helicase Gene Family in Zea mays and Glycine max: A Comparison with Arabidopsis and Oryza sativa

PubMed Central

Huang, Jinguang; Zheng, Chengchao

2013-01-01

RNA helicases are enzymes that are thought to unwind double-stranded RNA molecules in an energy-dependent fashion through the hydrolysis of NTP. RNA helicases are associated with all processes involving RNA molecules, including nuclear transcription, editing, splicing, ribosome biogenesis, RNA export, and organelle gene expression. The involvement of RNA helicase in response to stress and in plant growth and development has been reported previously. While their importance in Arabidopsis and Oryza sativa has been partially studied, the function of RNA helicase proteins is poorly understood in Zea mays and Glycine max. In this study, we identified a total of RNA helicase genes in Arabidopsis and other crop species genome by genome-wide comparative in silico analysis. We classified the RNA helicase genes into three subfamilies according to the structural features of the motif II region, such as DEAD-box, DEAH-box and DExD/H-box, and different species showed different patterns of alternative splicing. Secondly, chromosome location analysis showed that the RNA helicase protein genes were distributed across all chromosomes with different densities in the four species. Thirdly, phylogenetic tree analyses identified the relevant homologs of DEAD-box, DEAH-box and DExD/H-box RNA helicase proteins in each of the four species. Fourthly, microarray expression data showed that many of these predicted RNA helicase genes were expressed in different developmental stages and different tissues under normal growth conditions. Finally, real-time quantitative PCR analysis showed that the expression levels of 10 genes in Arabidopsis and 13 genes in Zea mays were in close agreement with the microarray expression data. To our knowledge, this is the first report of a comparative genome-wide analysis of the RNA helicase gene family in Arabidopsis, Oryza sativa, Zea mays and Glycine max. 
This study provides valuable information for understanding the classification and putative functions of the RNA helicase gene family in crop growth and development. PMID:24265739
Genome-wide comparative in silico analysis of the RNA helicase gene family in Zea mays and Glycine max: a comparison with Arabidopsis and Oryza sativa.

PubMed

Xu, Ruirui; Zhang, Shizhong; Huang, Jinguang; Zheng, Chengchao

2013-01-01

RNA helicases are enzymes that are thought to unwind double-stranded RNA molecules in an energy-dependent fashion through the hydrolysis of NTP. RNA helicases are associated with all processes involving RNA molecules, including nuclear transcription, editing, splicing, ribosome biogenesis, RNA export, and organelle gene expression. The involvement of RNA helicase in response to stress and in plant growth and development has been reported previously. While their importance in Arabidopsis and Oryza sativa has been partially studied, the function of RNA helicase proteins is poorly understood in Zea mays and Glycine max. In this study, we identified a total of RNA helicase genes in Arabidopsis and other crop species genome by genome-wide comparative in silico analysis. We classified the RNA helicase genes into three subfamilies according to the structural features of the motif II region, such as DEAD-box, DEAH-box and DExD/H-box, and different species showed different patterns of alternative splicing. Secondly, chromosome location analysis showed that the RNA helicase protein genes were distributed across all chromosomes with different densities in the four species. Thirdly, phylogenetic tree analyses identified the relevant homologs of DEAD-box, DEAH-box and DExD/H-box RNA helicase proteins in each of the four species. Fourthly, microarray expression data showed that many of these predicted RNA helicase genes were expressed in different developmental stages and different tissues under normal growth conditions. Finally, real-time quantitative PCR analysis showed that the expression levels of 10 genes in Arabidopsis and 13 genes in Zea mays were in close agreement with the microarray expression data. To our knowledge, this is the first report of a comparative genome-wide analysis of the RNA helicase gene family in Arabidopsis, Oryza sativa, Zea mays and Glycine max. 
This study provides valuable information for understanding the classification and putative functions of the RNA helicase gene family in crop growth and development.
Genomic determinants of epidermal appendage patterning and structure in domestic birds

PubMed Central

Boer, Elena F.; Van Hollebeke, Hannah F.; Shapiro, Michael D.

2017-01-01

Variation in regional identity, patterning, and structure of epidermal appendages contributes to skin diversity among many vertebrate groups, and is perhaps most striking in birds. In pioneering work on epidermal appendage patterning, John Saunders and his contemporaries took advantage of epidermal appendage diversity within and among domestic chicken breeds to establish the importance of mesoderm-ectoderm signaling in determining skin patterning. Diversity in chickens and other domestic birds, including pigeons, is driving a new wave of research to dissect the molecular genetic basis of epidermal appendage patterning. Domestic birds are not only outstanding models for embryonic manipulations, as Saunders recognized, but they are also ideal genetic models for discovering the specific genes that control normal development and the mutations that contribute to skin diversity. Here, we review recent genetic and genomic approaches to uncover the basis of epidermal macropatterning, micropatterning, and structural variation. We also present new results that confirm expression changes in two limb identity genes in feather-footed pigeons, a case of variation in appendage structure and identity. PMID:28347644
Immune subversion by chromatin manipulation: a 'new face' of host-bacterial pathogen interaction.

PubMed

Arbibe, Laurence

2008-08-01

Bacterial pathogens have evolved various strategies to avoid immune surveillance, depending on their in vivo 'lifestyle'. The identification of a few bacterial effectors capable of entering the nucleus and modifying chromatin structure in the host raises the fascinating questions of how pathogens modulate chromatin structure and why. Chromatin is a dynamic structure that maintains the stability and accessibility of the host DNA genome to the transcription machinery. This review describes the various strategies used by pathogens to interface with host chromatin. In some cases, chromatin injury can be a strategy to take control of major cellular functions, such as the cell cycle. In other cases, manipulation of chromatin structure at specific genomic locations by modulating epigenetic information provides a way for the pathogen to impose its own transcriptional signature onto host cells. This emerging field should strongly influence our understanding of chromatin regulation in the interphase nucleus and may provide invaluable openings to the control of immune gene expression in inflammatory and infectious diseases.
The Dynamic Architectural and Epigenetic Nuclear Landscape: Developing the Genomic Almanac of Biology and Disease

PubMed Central

Tai, Phillip W. L.; Zaidi, Sayyed K.; Wu, Hai; Grandy, Rodrigo A.; Montecino, Martin M.; van Wijnen, André J.; Lian, Jane B.; Stein, Gary S.; Stein, Janet L.

2014-01-01

Compaction of the eukaryotic genome into the confined space of the cell nucleus must occur faithfully throughout each cell cycle to retain gene expression fidelity. For decades, experimental limitations to study the structural organization of the interphase nucleus restricted our understanding of its contributions towards gene regulation and disease. However, within the past few years, our capability to visualize chromosomes in vivo with sophisticated fluorescence microscopy, and to characterize chromosomal regulatory environments via massively-parallel sequencing methodologies have drastically changed how we currently understand epigenetic gene control within the context of three-dimensional nuclear structure. The rapid rate at which information on nuclear structure is unfolding brings challenges to compare and contrast recent observations with historic findings. In this review, we discuss experimental breakthroughs that have influenced how we understand and explore the dynamic structure and function of the nucleus, and how we can incorporate historical perspectives with insights acquired from the ever-evolving advances in molecular biology and pathology. PMID:24242872
Novel mechanism of conjoined gene formation in the human genome.

PubMed

Kim, Ryong Nam; Kim, Aeri; Choi, Sang-Haeng; Kim, Dae-Soo; Nam, Seong-Hyeuk; Kim, Dae-Won; Kim, Dong-Wook; Kang, Aram; Kim, Min-Young; Park, Kun-Hyang; Yoon, Byoung-Ha; Lee, Kang Seon; Park, Hong-Seog

2012-03-01

Recently, conjoined genes (CGs) have emerged as important genetic factors necessary for understanding the human genome. However, their formation mechanism and precise structures have remained mysterious. Based on a detailed structural analysis of 57 human CG transcript variants (CGTVs, discovered in this study) and all (833) known CGs in the human genome, we discovered that the poly(A) signal site from the upstream parent gene region is completely removed via the skipping or truncation of the final exon; consequently, CG transcription is terminated at the poly(A) signal site of the downstream parent gene. This result led us to propose a novel mechanism of CG formation: the complete removal of the poly(A) signal site from the upstream parent gene is a prerequisite for the CG transcriptional machinery to continue transcribing uninterrupted into the intergenic region and downstream parent gene. The removal of the poly(A) signal sequence from the upstream gene region appears to be caused by a deletion or truncation mutation in the human genome rather than post-transcriptional trans-splicing events. With respect to the characteristics of CG sequence structures, we found that intergenic regions are hot spots for novel exon creation during CGTV formation and that exons farther from the intergenic regions are more highly conserved in the CGTVs. Interestingly, many novel exons newly created within the intergenic and intragenic regions originated from transposable element sequences. Additionally, the CGTVs showed tumor tissue-biased expression. In conclusion, our study provides novel insights into the CG formation mechanism and expands the present concepts of the genetic structural landscape, gene regulation, and gene formation mechanisms in the human genome.
Genomic Expression Patterns in Menstrually-Related Migraine in Adolescents

PubMed Central

Hershey, Andrew; Horn, Paul; Kabbouche, Marielle; O'Brien, Hope; Powers, Scott

2011-01-01

Background Exacerbation of migraine with menses is common in adolescent girls and women with migraine, occurring in up to 60% of females with migraine. These migraines are oftentimes longer and more disabling and may be related to estrogen levels and hormonal fluctuations. Objective This study identifies the unique genomic expression pattern of menstrually-related migraine (MRM) in comparison to migraine occurring outside the menstrual period and headache-free controls. Methods Whole blood samples were obtained from female subjects having an acute migraine during their menstrual period (MRM) or outside of their menstrual period (nonMRM) and controls (C) – females having a menstrual period without any history of headache. The mRNA was isolated from these samples and the genomic profile was assessed. Affymetrix Human Exon ST 1.0 arrays were used to examine the genomic expression pattern differences between these three groups. Results Blood genomic expression patterns were obtained on 56 subjects (MRM = 18, nonMRM = 18 and C = 20). Unique genomic expression patterns were observed for both MRM and nonMRM. For MRM, 77 genes were identified that were unique to MRM, while 61 genes were commonly expressed for MRM and nonMRM and 127 genes appeared to have a unique expression pattern for nonMRM. In addition, there were 279 genes that were differentially expressed for MRM compared to nonMRM that were not differentially expressed for nonMRM. Gene ontology of these samples indicated many of these groups of genes were functionally related and included categories of immunomodulation/inflammation, mitochondrial function and DNA homeostasis. Conclusions Blood genomic patterns can accurately differentiate MRM from nonMRM. These results indicate that MRM involves a unique molecular biology pathway that can be identified with a specific biomarker and suggest that individuals with MRM have a different underlying genetic etiology. PMID:22220971
Molecular cloning, expression pattern, and 3D structural prediction of the cold inducible RNA-binding protein (CIRP) in Japanese flounder (Paralichthys olivaceus)

NASA Astrophysics Data System (ADS)

Yang, Xiao; Gao, Jinning; Ma, Liman; Li, Zan; Wang, Wenji; Wang, Zhongkai; Yu, Haiyang; Qi, Jie; Wang, Xubo; Wang, Zhigang; Zhang, Quanqi

2015-02-01

Cold-inducible RNA-binding protein (CIRP) is an RNA-binding protein that plays important roles in many physiological processes. CIRP has been widely studied in mammals and amphibians since it was first cloned from mammals. In contrast, there are few reports in teleosts. In this study, the PoCIRP gene of the Japanese flounder was cloned and sequenced. The genomic sequence consists of seven exons and six introns. The putative PoCIRP protein of flounder was 198 amino acid residues long and contained the RNA recognition motif (RRM). Phylogenetic analysis showed that the flounder PoCIRP is highly conserved with other teleost CIRPs. The 5' flanking sequence was cloned by genome walking and many transcription factor binding sites were identified. A CpG region is located in the promoter and exon I region, and its methylation level is low. Quantitative real-time PCR analysis showed that the PoCIRP gene was widely expressed in adult tissues, with the highest expression level in the ovary. The PoCIRP mRNA was maternally deposited, and expression of the gene was up-regulated during the gastrula and neurula stages. To learn how the protein interacts with mRNA, we modeled the 3D structure of the flounder PoCIRP. The results showed a cleft on the surface of the molecule. Taken together, the results indicate that CIRP is a multifunctional molecule in teleosts, and the structural findings provide valuable information for understanding the basis of this protein's function.
Heterogeneous expression pattern of tandem duplicated sHsps genes during fruit ripening in two tomato species

NASA Astrophysics Data System (ADS)

Arce, DP; Krsticevic, FJ; Ezpeleta, J.; Ponce, SD; Pratta, GR; Tapia, E.

2016-04-01

The small heat shock proteins (sHSPs) have been found to play a critical role in physiological stress conditions in protecting proteins from irreversible aggregation. To characterize the gene expression profile of four sHsps with a tandem gene structure arrangement in the domesticated Solanum lycopersicum (Heinz 1706) genome and its wild close relative Solanum pimpinellifolium (LA1589), differential gene expression analysis using RNA-Seq was conducted at three ripening stages in the fruits of both cultivars. Gene promoter analysis was performed to explain the heterogeneous pattern of gene expression found for these tandem duplicated sHsps. The in silico results help refocus wet-lab experiments on tomato sHsp family proteins.
Genome-Wide Analyses of the NAC Transcription Factor Gene Family in Pepper (Capsicum annuum L.): Chromosome Location, Phylogeny, Structure, Expression Patterns, Cis-Elements in the Promoter, and Interaction Network

PubMed Central

Diao, Weiping; Snyder, John C.; Liu, Jinbing; Pan, Baogui; Guo, Guangjun; Ge, Wei; Dawood, Mohammad Hasan Salman Ali

2018-01-01

The NAM, ATAF1/2, and CUC2 (NAC) transcription factors form a large plant-specific gene family, which is involved in the regulation of tissue development in response to biotic and abiotic stress. To date, there have been no comprehensive studies investigating chromosomal location, gene structure, gene phylogeny, conserved motifs, or gene expression of NAC in pepper (Capsicum annuum L.). The recent release of the complete genome sequence of pepper allowed us to perform a genome-wide investigation of Capsicum annuum L. NAC (CaNAC) proteins. In the present study, a comprehensive analysis of the CaNAC gene family in pepper was performed, and a total of 104 CaNAC genes were identified. Genome mapping analysis revealed that CaNAC genes were enriched on four chromosomes (chromosomes 1, 2, 3, and 6). In addition, phylogenetic analysis of the NAC domains from pepper, potato, Arabidopsis, and rice showed that CaNAC genes could be clustered into three groups (I, II, and III). Group III, which contained 24 CaNAC genes, was exclusive to the Solanaceae plant family. Gene structure and protein motif analyses showed that these genes were relatively conserved within each subgroup. The number of introns in CaNAC genes varied from 0 to 8, with 83 (78.9%) of CaNAC genes containing two or fewer introns. Promoter analysis confirmed that CaNAC genes are involved in pepper growth, development, and biotic or abiotic stress responses. Further, the expression of 22 selected CaNAC genes in response to seven different biotic and abiotic stresses [salt, heat shock, drought, Phytophthora capsici, abscisic acid, salicylic acid (SA), and methyl jasmonate (MeJA)] was evaluated by quantitative RT-PCR to determine their stress-related expression patterns. Several putative stress-responsive CaNAC genes, including CaNAC72 and CaNAC27, which are orthologs of the known stress-responsive Arabidopsis gene ANAC055 and potato gene StNAC30, respectively, were highly regulated by treatment with different types of stress. Our results also showed that CaNAC36 plays an important role in the interaction network, interacting with 48 genes. Most of these genes are in the mitogen-activated protein kinase (MAPK) family. Taken together, our results provide a platform for further studies to identify the biological functions of CaNAC genes. PMID:29596349
Genome-Wide Identification and Expression Analysis of the Cation Diffusion Facilitator Gene Family in Turnip Under Diverse Metal Ion Stresses.

PubMed

Li, Xiong; Wu, Yuansheng; Li, Boqun; He, Wenqi; Yang, Yonghong; Yang, Yongping

2018-01-01

The cation diffusion facilitator (CDF) family is one of the gene families involved in metal ion uptake and transport in plants, but the understanding of the definite roles and mechanisms of most CDF genes remains limited. In the present study, we identified 18 candidate CDF genes from the turnip genome and named them BrrMTP1.1 to BrrMTP12. Then, we performed a comparative genomic analysis on the phylogenetic relationships, gene structures and chromosome distributions, conserved domains, and motifs of turnip CDFs. The constructed phylogenetic tree indicated that the BrrMTPs were divided into seven groups (groups 1, 5, 6, 7, 8, 9, and 12) and formed three major clusters (Zn-CDFs, Fe/Zn-CDFs, and Mn-CDFs). Moreover, the structural characteristics of the BrrMTP members in the same group were similar but varied among groups. To investigate the potential roles of BrrMTPs in turnip, we conducted an expression analysis on all BrrMTP genes under Mg, Zn, Cu, Mn, Fe, Co, Na, and Cd stresses. Results showed that the expression levels of all BrrMTP members were induced by at least one metal ion, indicating that these genes may be related to the tolerance or transport of those metal ions. Based on the roles of different metal ions for plants, we hypothesized that BrrMTP genes are possibly involved in heavy metal accumulation and tolerance to salt stress apart from their roles in the maintenance of mineral nutrient homeostasis in turnip. These findings are helpful to understand the roles of MTPs in plants and provide preliminary information for the study of the functions of BrrMTP genes.
A 14-3-3 Family Protein from Wild Soybean (Glycine Soja) Regulates ABA Sensitivity in Arabidopsis

PubMed Central

Sun, Xiaoli; Sun, Mingzhe; Jia, Bowei; Chen, Chao; Qin, Zhiwei; Yang, Kejun; Shen, Yang; Meiping, Zhang; Mingyang, Cong; Zhu, Yanming

2015-01-01

It is widely accepted that the 14-3-3 family proteins are key regulators of multiple stress signal transduction cascades. By conducting genome-wide analysis, researchers have identified the soybean 14-3-3 family proteins; however, there has so far been no direct genetic evidence showing the involvement of soybean 14-3-3s in ABA responses. Hence, in this study, based on the latest Glycine max genome on Phytozome v10.3, we initially analyzed the evolutionary relationship, genome organization, gene structure and duplication, and three-dimensional structure of soybean 14-3-3 family proteins systematically. Our results suggested that the soybean 14-3-3 family was highly evolutionarily conserved and had undergone segmental duplication during evolution. Then, based on our previous functional characterization of a Glycine soja 14-3-3 protein GsGF14o in drought stress responses, we further investigated the expression characteristics of GsGF14o in detail, and demonstrated its positive roles in ABA sensitivity. Quantitative real-time PCR analyses in Glycine soja seedlings and GUS activity assays in PGsGF14O:GUS transgenic Arabidopsis showed that GsGF14o expression was moderately and rapidly induced by ABA treatment. As expected, GsGF14o overexpression in Arabidopsis augmented the ABA inhibition of seed germination and seedling growth, promoted ABA-induced stomatal closure, and up-regulated the expression levels of ABA-induced genes. Moreover, through yeast two-hybrid analyses, we further demonstrated that GsGF14o physically interacted with the AREB/ABF transcription factors in yeast cells. Taken together, results presented in this study strongly suggested that GsGF14o played an important role in regulation of ABA sensitivity in Arabidopsis. PMID:26717241
Genome-wide identification and characterization of polygalacturonase genes in Cucumis sativus and Citrullus lanatus.

PubMed

Yu, Youjian; Liang, Ying; Lv, Meiling; Wu, Jian; Lu, Gang; Cao, Jiashu

2014-01-01

Polygalacturonase (PG, EC 3.2.1.15), one of the hydrolytic enzymes associated with the modification of the pectin network in the plant cell wall, has an important role in various cell-separation processes that are essential for plant development. PGs are encoded by a large gene family in plants. However, information on this gene family in plant development remains limited. In the present study, 53 and 62 putative members of the PG gene family in cucumber and watermelon genomes, respectively, were identified by genome-wide search to explore the composition, structure, and evolution of the PG family in Cucurbitaceae crops. The results showed that tandem duplication could be an important factor that contributes to the expansion of the PG genes in the two crops. The phylogenetic and evolutionary analyses suggested that PGs could be classified into seven clades, and that the exon/intron structures and intron phases were conserved within but divergent between clades. At least 24 ancestral PGs were detected in the common ancestor of Arabidopsis and Cucumis sativus. Expression profile analysis by quantitative real-time polymerase chain reaction demonstrated that most CsPGs exhibit a specific or high expression pattern in one of the organs/tissues. The 16 CsPGs associated with fruit development could be divided into three subsets based on their specific expression patterns and on the fruit-specific, endosperm/seed-specific, and ethylene-responsive cis-elements present in their promoter regions. Our comparative analysis provided some basic information on the PG gene family, which would be valuable for further functional analysis of the PG genes during plant development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
SASH1: a candidate tumor suppressor gene on chromosome 6q24.3 is downregulated in breast cancer.

PubMed

Zeller, Constanze; Hinzmann, Bernd; Seitz, Susanne; Prokoph, Helmuth; Burkhard-Goettges, Elke; Fischer, Jörg; Jandrig, Burkhard; Schwarz, Lope-Estevez; Rosenthal, André; Scherneck, Siegfried

2003-05-15

Loss of heterozygosity (LOH) and in silico expression analysis were applied to identify genes significantly downregulated in breast cancer within the genomic interval 6q23-25. Systematic comparison of candidate EST sequences with genomic sequences from this interval revealed the genomic structure of a potential target gene on 6q24.3, which we called SAM and SH3 domain containing 1 (SASH1). Loss of the gene-internal marker D6S311, found in 30% of primary breast cancer, was significantly correlated with poor survival and increase in tumor size. Two SASH1 transcripts of approximately 4.4 and 7.5 kb exist and are predominantly transcribed in the human breast, lung, thyroid, spleen, placenta and thymus. In breast cancer cell lines, SASH1 is only expressed at low levels. SASH1 is downregulated in the majority (74%) of breast tumors in comparison with corresponding normal breast epithelial tissues. In addition, SASH1 is also downregulated in tumors of the lung and thyroid. Analysis of the protein domain structure revealed that SASH1 is a member of a recently described family of SH3/SAM adapter molecules and thus suggests a role in signaling pathways. We assume that SASH1 is a new tumor suppressor gene possibly involved in tumorigenesis of breast and other solid cancers. We were unable to find mutations in the coding region of the gene in primary breast cancers showing LOH within the critical region. We therefore hypothesize that other mechanisms, for instance methylation of the promoter region of SASH1, are responsible for the loss of expression of SASH1 in primary and metastatic breast cancer.
Cell-free protein synthesis for structure determination by X-ray crystallography.

PubMed

Watanabe, Miki; Miyazono, Ken-ichi; Tanokura, Masaru; Sawasaki, Tatsuya; Endo, Yaeta; Kobayashi, Ichizo

2010-01-01

Structure determination has been difficult for those proteins that are toxic to the cells and cannot be prepared in a large amount in vivo. These proteins, even when biologically very interesting, tend to be left uncharacterized in the structural genomics projects. Their cell-free synthesis can bypass the toxicity problem. Among the various cell-free systems, the wheat-germ-based system is of special interest due to the following points: (1) Because the gene is placed under a plant translational signal, its toxic expression in a bacterial host is reduced. (2) It has only little codon preference and, especially, little discrimination between methionine and selenomethionine (SeMet), which allows easy preparation of selenomethionylated proteins for crystal structure determination by SAD and MAD methods. (3) Translation is uncoupled from transcription, so that the toxicity of the translation product on DNA and its transcription, if any, can be bypassed. We have shown that the wheat-germ-based cell-free protein synthesis is useful for X-ray crystallography of one of the 4-bp cutter restriction enzymes, which are expected to be very toxic to all forms of cells retaining the genome. Our report on its structure represents the first report of structure determination by X-ray crystallography using protein overexpressed with the wheat-germ-based cell-free protein expression system. This will be a method of choice for cytotoxic proteins when its cost is not a problem. Its use will become popular when the crystal structure determination technology has evolved to require only a tiny amount of protein.
Genome-wide classification, evolutionary analysis and gene expression patterns of the kinome in Gossypium

PubMed Central

Yan, Jun; Li, Guilin; Guo, Xingqi; Li, Yang; Cao, Xuecheng

2018-01-01

The protein kinase (PK, kinome) family is one of the largest families in plants and regulates almost all aspects of plant processes, including plant development and stress responses. Despite their important functions, a comprehensive functional classification, evolutionary analysis and expression profiling of the cotton PK gene family have yet to be performed. In this study, we identified the cotton kinomes in the Gossypium raimondii, Gossypium arboreum, Gossypium hirsutum and Gossypium barbadense genomes and classified them into 7 groups and 122–24 subfamilies using HMMER v3.0 scanning and neighbor-joining (NJ) phylogenetic analysis. Some conserved exon-intron structures were identified not only in cotton species but also in primitive plants, ferns and moss, suggesting the significant function and ancient origin of these PK genes. Collinearity analysis revealed that cotton-specific whole genome duplication (WGD) events 16.6 million years ago (Mya) may have played a partial role in the expansion of the cotton kinomes, whereas tandem duplication (TD) events mainly contributed to the expansion of the cotton RLK group. Synteny analysis revealed that tetraploidization of G. hirsutum and G. barbadense contributed to the expansion of G. hirsutum and G. barbadense PKs. Global expression analysis of cotton PKs revealed stress-specific and fiber development-related expression patterns, suggesting that many cotton PKs might be involved in the regulation of the stress response and fiber development processes. This study provides foundational information for further studies on the evolution and molecular function of cotton PKs. PMID:29768506
Versatile Cosmid Vectors for the Isolation, Expression, and Rescue of Gene Sequences: Studies with the Human α-globin Gene Cluster

NASA Astrophysics Data System (ADS)

Lau, Yun-Fai; Kan, Yuet Wai

1983-09-01

We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.
Multifaceted biological insights from a draft genome sequence of the tobacco hornworm moth, Manduca sexta.

PubMed

Kanost, Michael R; Arrese, Estela L; Cao, Xiaolong; Chen, Yun-Ru; Chellapilla, Sanjay; Goldsmith, Marian R; Grosse-Wilde, Ewald; Heckel, David G; Herndon, Nicolae; Jiang, Haobo; Papanicolaou, Alexie; Qu, Jiaxin; Soulages, Jose L; Vogel, Heiko; Walters, James; Waterhouse, Robert M; Ahn, Seung-Joon; Almeida, Francisca C; An, Chunju; Aqrawi, Peshtewani; Bretschneider, Anne; Bryant, William B; Bucks, Sascha; Chao, Hsu; Chevignon, Germain; Christen, Jayne M; Clarke, David F; Dittmer, Neal T; Ferguson, Laura C F; Garavelou, Spyridoula; Gordon, Karl H J; Gunaratna, Ramesh T; Han, Yi; Hauser, Frank; He, Yan; Heidel-Fischer, Hanna; Hirsh, Ariana; Hu, Yingxia; Jiang, Hongbo; Kalra, Divya; Klinner, Christian; König, Christopher; Kovar, Christie; Kroll, Ashley R; Kuwar, Suyog S; Lee, Sandy L; Lehman, Rüdiger; Li, Kai; Li, Zhaofei; Liang, Hanquan; Lovelace, Shanna; Lu, Zhiqiang; Mansfield, Jennifer H; McCulloch, Kyle J; Mathew, Tittu; Morton, Brian; Muzny, Donna M; Neunemann, David; Ongeri, Fiona; Pauchet, Yannick; Pu, Ling-Ling; Pyrousis, Ioannis; Rao, Xiang-Jun; Redding, Amanda; Roesel, Charles; Sanchez-Gracia, Alejandro; Schaack, Sarah; Shukla, Aditi; Tetreau, Guillaume; Wang, Yang; Xiong, Guang-Hua; Traut, Walther; Walsh, Tom K; Worley, Kim C; Wu, Di; Wu, Wenbi; Wu, Yuan-Qing; Zhang, Xiufeng; Zou, Zhen; Zucker, Hannah; Briscoe, Adriana D; Burmester, Thorsten; Clem, Rollie J; Feyereisen, René; Grimmelikhuijzen, Cornelis J P; Hamodrakas, Stavros J; Hansson, Bill S; Huguet, Elisabeth; Jermiin, Lars S; Lan, Que; Lehman, Herman K; Lorenzen, Marce; Merzendorfer, Hans; Michalopoulos, Ioannis; Morton, David B; Muthukrishnan, Subbaratnam; Oakeshott, John G; Palmer, Will; Park, Yoonseong; Passarelli, A Lorena; Rozas, Julio; Schwartz, Lawrence M; Smith, Wendy; Southgate, Agnes; Vilcinskas, Andreas; Vogt, Richard; Wang, Ping; Werren, John; Yu, Xiao-Qiang; Zhou, Jing-Jiang; Brown, Susan J; Scherer, Steven E; Richards, Stephen; Blissard, Gary W

2016-09-01

Manduca sexta, known as the tobacco hornworm or Carolina sphinx moth, is a lepidopteran insect that is used extensively as a model system for research in insect biochemistry, physiology, neurobiology, development, and immunity. One important benefit of this species as an experimental model is its extremely large size, reaching more than 10 g in the larval stage. M. sexta larvae feed on solanaceous plants and thus must tolerate a substantial challenge from plant allelochemicals, including nicotine. We report the sequence and annotation of the M. sexta genome, and a survey of gene expression in various tissues and developmental stages. The Msex_1.0 genome assembly resulted in a total genome size of 419.4 Mbp. Repetitive sequences accounted for 25.8% of the assembled genome. The official gene set is comprised of 15,451 protein-coding genes, of which 2498 were manually curated. Extensive RNA-seq data from many tissues and developmental stages were used to improve gene models and for insights into gene expression patterns. Genome wide synteny analysis indicated a high level of macrosynteny in the Lepidoptera. Annotation and analyses were carried out for gene families involved in a wide spectrum of biological processes, including apoptosis, vacuole sorting, growth and development, structures of exoskeleton, egg shells, and muscle, vision, chemosensation, ion channels, signal transduction, neuropeptide signaling, neurotransmitter synthesis and transport, nicotine tolerance, lipid metabolism, and immunity. This genome sequence, annotation, and analysis provide an important new resource from a well-studied model insect species and will facilitate further biochemical and mechanistic experimental studies of many biological systems in insects. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hepatitis E: Molecular Virology and Pathogenesis

PubMed Central

Panda, Subrat K.; Varma, Satya P.K.

2013-01-01

Hepatitis E virus (HEV) is a single-stranded, positive-sense, capped and poly(A)-tailed RNA virus classified under the family Hepeviridae. Enteric transmission, acute self-limiting hepatitis, frequent epidemic and sporadic occurrence, and high mortality in affected pregnant women are hallmarks of hepatitis E infection. The lack of an efficient culture system, and the resulting reductionist approaches to studying HEV replication and pathogenesis, have left it a poorly understood agent. Early studies on animal models, sub-genomic expression of open reading frames (ORFs) and infectious cDNA clones have helped in elucidating the genome organization and important stages in HEV replication and pathogenesis. The genome contains three ORFs and three untranslated regions (UTRs). The 5′ distal ORF, ORF1, is translated by host ribosomes in a cap-dependent manner to form the non-structural polyprotein including the viral replicase. HEV replicates via a negative-sense RNA intermediate, which helps in the formation of the positive-sense genomic RNA and a single bi-cistronic sub-genomic RNA. The 3′ distal ORFs, including the major structural protein pORF2 and the multifunctional host-interacting protein pORF3, are translated from the sub-genomic RNA. Pathogenesis in HEV infections is not well understood and remains a concern because of host-dependent and genotype-specific variations. Animal HEV, zoonosis, chronicity in immunosuppressed patients, and rapid decompensation in patients with chronic liver disease warrant detailed investigation of the underlying pathogenesis. Recent advances in structure, entry, egress and functional characterization of ORF1 domains have furthered our understanding of HEV. This article reviews our present understanding of the molecular biology and pathogenesis of HEV. PMID:25755485

Widespread signatures of local mRNA folding structure selection in four Dengue virus serotypes

PubMed Central

2015-01-01

Background It is known that mRNA folding can affect and regulate various gene expression steps both in living organisms and in viruses. Previous studies have recognized functional RNA structures in the genome of the Dengue virus. However, these studies usually focused either on the viral untranslated regions or on very specific and limited regions at the beginning of the coding sequences, in a limited number of strains, and without considering evolutionary selection. Results Here we performed the first large-scale comprehensive genomics analysis of selection for local mRNA folding strength in the Dengue virus coding sequences, based on a total of 1,670 genomes and 4 serotypes. Our analysis identified clusters of positions along the coding regions that may undergo a conserved evolutionary selection for strong or weak local folding maintained across different viral variants. Specifically, 53-66 clusters for strong folding and 49-73 clusters for weak folding (depending on serotype), each an aggregate of positions with significant conservation of folding energy signals (related to partially overlapping local genomic regions), were recognized. In addition, up to 7% of these positions were found to be conserved in more than 90% of the viral genomes. Although some of the identified positions undergo frequent synonymous / non-synonymous substitutions, the selection for folding strength therein is preserved, and thus cannot be trivially explained based on sequence conservation alone. Conclusions The fact that many of the positions with significant folding-related signals are conserved among different Dengue variants suggests that a better understanding of the mRNA structures in the corresponding regions may promote the development of prospective anti-Dengue vaccination strategies. The comparative genomics approach described here can be employed in the future for detecting functional regions in other pathogens with very high mutation rates. PMID:26449467
Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations.

PubMed

Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin

2014-01-01

The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.
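To make the "module-based dynamic network using a linear ODE model" more concrete, here is a minimal sketch in Python. It assumes the model has the common form dx/dt = A x, where x holds the mean expression of each co-expressed module and A holds the regulatory coefficients. The three-module matrix, the initial state, and the forward-Euler integrator are hypothetical choices for illustration only; they are not taken from the paper, which estimates its coefficients from the mouse time course data.

```python
from typing import List

Matrix = List[List[float]]


def simulate_linear_ode(A: Matrix, x0: List[float],
                        t_end: float, dt: float = 0.01) -> List[List[float]]:
    """Integrate dx/dt = A x with forward Euler and return the full trajectory."""
    n = len(x0)
    steps = int(round(t_end / dt))
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        # Rate of change of each module is a weighted sum of all module levels.
        dx = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
        trajectory.append(list(x))
    return trajectory


# Hypothetical 3-module network (illustrative values, not from the study):
# module 0 decays, module 0 activates module 1, module 1 activates module 2.
A = [
    [-1.0, 0.0, 0.0],
    [0.8, -0.5, 0.0],
    [0.0, 0.6, -0.3],
]
x0 = [1.0, 0.0, 0.0]
traj = simulate_linear_ode(A, x0, t_end=5.0)
```

In this toy setting the off-diagonal entries of A play the role of inter-module regulatory edges and the diagonal entries act as self-decay terms. In practice a tested ODE solver would normally replace the hand-written Euler loop.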
Genetic Structures of Copy Number Variants Revealed by Genotyping Single Sperm

PubMed Central

Luo, Minjie; Cui, Xiangfeng; Fredman, David; Brookes, Anthony J.; Azaro, Marco A.; Greenawalt, Danielle M.; Hu, Guohong; Wang, Hui-Yun; Tereshchenko, Irina V.; Lin, Yong; Shentu, Yue; Gao, Richeng; Shen, Li; Li, Honghua

2009-01-01

Background Copy number variants (CNVs) occupy a significant portion of the human genome and may have important roles in meiotic recombination, human genome evolution and gene expression. Many genetic diseases may be underlain by CNVs. However, because of the presence of their multiple copies, variability in copy numbers and the diploidy of the human genome, detailed genetic structure of CNVs cannot be readily studied by available techniques. Methodology/Principal Findings Single sperm samples were used as the primary subjects for the study so that CNV haplotypes in the sperm donors could be studied individually. Forty-eight CNVs characterized in a previous study were analyzed using a microarray-based high-throughput genotyping method after multiplex amplification. Seventeen single nucleotide polymorphisms (SNPs) were also included as controls. Two single-base variants, either allelic or paralogous, could be discriminated for all markers. Microarray data were used to resolve SNP alleles and CNV haplotypes, to quantitatively assess the numbers and compositions of the paralogous segments in each CNV haplotype. Conclusions/Significance This is the first study of the genetic structure of CNVs on a large scale. Resulting information may help understand evolution of the human genome, gain insight into many genetic processes, and discriminate between CNVs and SNPs. The highly sensitive high-throughput experimental system with haploid sperm samples as subjects may be used to facilitate detailed large-scale CNV analysis. PMID:19384415
Expression Quantitative Trait Locus Mapping across Water Availability Environments Reveals Contrasting Associations with Genomic Features in Arabidopsis

PubMed Central

Lowry, David B.; Logan, Tierney L.; Santuari, Luca; Hardtke, Christian S.; Richards, James H.; DeRose-Wilson, Leah J.; McKay, John K.; Sen, Saunak; Juenger, Thomas E.

2013-01-01

The regulation of gene expression is crucial for an organism’s development and response to stress, and an understanding of the evolution of gene expression is of fundamental importance to basic and applied biology. To improve this understanding, we conducted expression quantitative trait locus (eQTL) mapping in the Tsu-1 (Tsushima, Japan) × Kas-1 (Kashmir, India) recombinant inbred line population of Arabidopsis thaliana across soil drying treatments. We then used genome resequencing data to evaluate whether genomic features (promoter polymorphism, recombination rate, gene length, and gene density) are associated with genes responding to the environment (E) or with genes with genetic variation (G) in gene expression in the form of eQTLs. We identified thousands of genes that responded to soil drying and hundreds of main-effect eQTLs. However, we identified very few statistically significant eQTLs that interacted with the soil drying treatment (GxE eQTL). Analysis of genome resequencing data revealed associations of several genomic features with G and E genes. In general, E genes had lower promoter diversity and local recombination rates. By contrast, genes with eQTLs (G) had significantly greater promoter diversity and were located in genomic regions with higher recombination. These results suggest that genomic architecture may play an important role in the evolution of gene expression. PMID:24045022
Structural Protein VP2 of African Horse Sickness Virus Is Not Essential for Virus Replication In Vitro

PubMed Central

van de Water, Sandra G. P.; Potgieter, Christiaan A.; van Rijn, Piet A.

2016-01-01

ABSTRACT The Reoviridae family consists of nonenveloped multilayered viruses with a double-stranded RNA genome consisting of 9 to 12 genome segments. The Orbivirus genus of the Reoviridae family contains African horse sickness virus (AHSV), bluetongue virus, and epizootic hemorrhagic disease virus, which cause notifiable diseases and are spread by biting Culicoides species. Here, we used reverse genetics for AHSV to study the role of outer capsid protein VP2, encoded by genome segment 2 (Seg-2). Expansion of a previously found deletion in Seg-2 indicates that structural protein VP2 of AHSV is not essential for virus replication in vitro. In addition, in-frame replacement of RNA sequences in Seg-2 by that of green fluorescence protein (GFP) resulted in AHSV expressing GFP, which further confirmed that VP2 is not essential for virus replication. In contrast to virus replication without VP2 expression in mammalian cells, virus replication in insect cells was strongly reduced, and virus release from insect cells was completely abolished. Further, the other outer capsid protein, VP5, was not copurified with virions for virus mutants without VP2 expression. AHSV without VP5 expression, however, could not be recovered, indicating that outer capsid protein VP5 is essential for virus replication in vitro. Our results demonstrate for the first time that a structural viral protein is not essential for orbivirus replication in vitro, which opens new possibilities for research on other members of the Reoviridae family. IMPORTANCE Members of the Reoviridae family cause major health problems worldwide, ranging from lethal diarrhea caused by rotavirus in humans to economic losses in livestock production caused by different orbiviruses. The Orbivirus genus contains many virus species, of which bluetongue virus, epizootic hemorrhagic disease virus, and African horse sickness virus (AHSV) cause notifiable diseases according to the World Organization of Animal Health. 
Recently, it has been shown that nonstructural proteins NS3/NS3a and NS4 are not essential for virus replication in vitro, whereas it is generally assumed that structural proteins VP1 to -7 of these nonenveloped, architecturally complex virus particles are essential. Here we demonstrate for the first time that structural protein VP2 of AHSV is not essential for virus replication in vitro. Our findings are very important for virologists working in the field of nonenveloped viruses, in particular reoviruses. PMID:27903804
Differential expression of photosynthesis-related genes in pentaploid interspecific hybrid and its decaploid of Fragaria spp.

PubMed

Wang, Tao; Huang, Dongya; Chen, Baoyu; Mao, Nini; Qiao, Yushan; Ji, Muxiang

2018-03-01

Polyploidization always induces a series of changes in genome, transcriptome and epigenetics, of which changes in gene expression are the immediate causes of genotype alterations of polyploid plants. In our previous study on strawberry polyploidization, genes related to photosynthesis were found to undergo changes in gene expression and DNA methylation. Therefore, we chose 11 genes that were closely related to plant photosynthesis and analysed their expression during strawberry hybridization and chromosome doubling. Most genes of pentaploids showed expression levels between parents and were more similar to F. × ananassa. Gene expression levels of decaploids were higher than those of pentaploids and F. × ananassa. Different types of photosynthesis-related genes responded differently to hybridization and chromosome doubling. Chloroplast genes and regulatory genes showed complex responses. Structural genes of the photosynthetic system were expressed at a constant level and displayed a clear dosage effect. The methylation levels of one CG site on SIGE, which regulates expression of chloroplast genes, were negatively correlated with gene expression. In pentaploids and decaploids, more transcripts were from F. × ananassa than from F. viridis. The ratio of transcripts from F. × ananassa to those from F. viridis was close to the ratio (4:1) of the genome of F. × ananassa to that of F. viridis in pentaploids and decaploids, but there were also some exceptions with obvious deviation.
Selecting soluble/foldable protein domains through single-gene or genomic ORF filtering: structure of the head domain of Burkholderia pseudomallei antigen BPSL2063.

PubMed

Gourlay, Louise J; Peano, Clelia; Deantonio, Cecilia; Perletti, Lucia; Pietrelli, Alessandro; Villa, Riccardo; Matterazzo, Elena; Lassaux, Patricia; Santoro, Claudio; Puccio, Simone; Sblattero, Daniele; Bolognesi, Martino

2015-11-01

The 1.8 Å resolution crystal structure of a conserved domain of the potential Burkholderia pseudomallei antigen and trimeric autotransporter BPSL2063 is presented as a structural vaccinology target for melioidosis vaccine development. Since BPSL2063 (1090 amino acids) hosts only one conserved domain, and the expression/purification of the full-length protein proved to be problematic, a domain-filtering library was generated using β-lactamase as a reporter gene to select further BPSL2063 domains. As a result, two domains (D1 and D2) were identified and produced in soluble form in Escherichia coli. Furthermore, as a general tool, a genomic open reading frame-filtering library from the B. pseudomallei genome was also constructed to facilitate the selection of domain boundaries from the entire ORFeome. Such an approach allowed the selection of three potential protein antigens that were also produced in soluble form. The results imply the further development of ORF-filtering methods as a tool in protein-based research to improve the selection and production of soluble proteins or domains for downstream applications such as X-ray crystallography.
Multiscale Embedded Gene Co-expression Network Analysis

PubMed Central

Song, Won-Min; Zhang, Bin

2015-01-01

Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778
Multiscale Embedded Gene Co-expression Network Analysis.

PubMed

Song, Won-Min; Zhang, Bin

2015-11-01

Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
Characterization of the fusion core in zebrafish endogenous retroviral envelope protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, Jian; State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, Hubei 430071; Zhang, Huaidong

2015-05-08

Zebrafish endogenous retrovirus (ZFERV) is the unique endogenous retrovirus in zebrafish, as yet, containing intact open reading frames of its envelope protein gene in zebrafish genome. Similarly, several envelope proteins of endogenous retroviruses in human and other mammalian animal genomes (such as syncytin-1 and 2 in human, syncytin-A and B in mouse) were identified and shown to be functional in induction of cell–cell fusion involved in placental development. ZFERV envelope protein (Env) gene appears to be also functional in vivo because it is expressible. After sequence alignment, we found ZFERV Env shares similar structural profiles with syncytin and other type I viral envelopes, especially in the regions of N- and C-terminal heptad repeats (NHR and CHR) which were crucial for membrane fusion. We expressed the regions of N + C protein in the ZFERV Env (residues 459–567, including predicted NHR and CHR) to characterize the fusion core structure. We found N + C protein could form a stable coiled-coil trimer that consists of three helical NHR regions forming a central trimeric core, and three helical CHR regions packing into the grooves on the surface of the central core. The structural characterization of the fusion core revealed the possible mechanism of fusion mediated by ZFERV Env. These results gave comprehensive explanation of how the ancient virus infected the zebrafish and integrated into the genome millions of years ago, and showed a rational clue for discovery of physiological significance (e.g., mediating cell–cell fusion). - Highlights: • ZFERV Env shares similar structural profiles with syncytin and other type I viral envelopes. • The fusion core of ZFERV Env forms stable coiled-coil trimer including three NHRs and three CHRs. • The structural mechanism of viral entry mediated by ZFERV Env is disclosed. • The results are helpful for further discovery of physiological function of ZFERV Env in zebrafish.
A gene expression resource generated by genome-wide lacZ profiling in the mouse

PubMed Central

Tuck, Elizabeth; Estabel, Jeanne; Oellrich, Anika; Maguire, Anna Karin; Adissu, Hibret A.; Souter, Luke; Siragher, Emma; Lillistone, Charlotte; Green, Angela L.; Wardle-Jones, Hannah; Carragher, Damian M.; Karp, Natasha A.; Smedley, Damian; Adams, Niels C.; Bussell, James N.; Adams, David J.; Ramírez-Solis, Ramiro; Steel, Karen P.; Galli, Antonella; White, Jacqueline K.

2015-01-01

ABSTRACT Knowledge of the expression profile of a gene is a critical piece of information required to build an understanding of the normal and essential functions of that gene and any role it may play in the development or progression of disease. High-throughput, large-scale efforts are on-going internationally to characterise reporter-tagged knockout mouse lines. As part of that effort, we report an open access adult mouse expression resource, in which the expression profile of 424 genes has been assessed in up to 47 different organs, tissues and sub-structures using a lacZ reporter gene. Many specific and informative expression patterns were noted. Expression was most commonly observed in the testis and brain and was most restricted in white adipose tissue and mammary gland. Over half of the assessed genes presented with an absent or localised expression pattern (categorised as 0-10 positive structures). A link between complexity of expression profile and viability of homozygous null animals was observed; inactivation of genes expressed in ≥21 structures was more likely to result in reduced viability by postnatal day 14 compared with more restricted expression profiles. For validation purposes, this mouse expression resource was compared with Bgee, a federated composite of RNA-based expression data sets. Strong agreement was observed, indicating a high degree of specificity in our data. Furthermore, there were 1207 observations of expression of a particular gene in an anatomical structure where Bgee had no data, indicating a large amount of novelty in our data set. Examples of expression data corroborating and extending genotype-phenotype associations and supporting disease gene candidacy are presented to demonstrate the potential of this powerful resource. PMID:26398943
Hierarchical Dirichlet process model for gene expression clustering

PubMed Central

2013-01-01

Clustering is an important data processing tool for interpreting microarray data and genomic network inference. In this article, we propose a clustering algorithm based on the hierarchical Dirichlet processes (HDP). The HDP clustering introduces a hierarchical structure in the statistical model which captures the hierarchical features prevalent in biological data such as the gene expression data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor for the HDP clustering. We apply the proposed HDP algorithm to both regulatory network segmentation and gene expression clustering. The HDP algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. For the yeast cell cycle data, we compare the HDP result to the standard result and show that the HDP algorithm provides more information and reduces the unnecessary clustering fragments. PMID:23587447
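The Chinese restaurant metaphor mentioned in the abstract above can be illustrated with a minimal sketch. This is only a draw from a single-level Chinese restaurant process prior, not the hierarchical model or the Gibbs sampler of the paper; the function name `crp_assignments` and the concentration parameter `alpha` are assumptions introduced for illustration.

```python
import random

def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster ("table") labels for n_items from a CRP prior.

    Hypothetical helper for illustration only; alpha > 0 is the
    concentration parameter controlling how often new tables open.
    """
    rng = random.Random(seed)
    assignments = []   # table index chosen by each item, in arrival order
    table_sizes = []   # current number of items seated at each table
    for _ in range(n_items):
        # An existing table k is picked with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[k] += 1        # join an existing table
        assignments.append(k)
    return assignments, table_sizes

assignments, table_sizes = crp_assignments(50, alpha=1.0)
```

Larger values of `alpha` tend to produce more clusters. The number of clusters is not fixed in advance, which is the property that makes this family of priors attractive for expression clustering.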
Gramene 2013: comparative plant genomics resources.

PubMed

Monaco, Marcela K; Stein, Joshua; Naithani, Sushma; Wei, Sharon; Dharmawardhana, Palitha; Kumari, Sunita; Amarasinghe, Vindhya; Youens-Clark, Ken; Thomason, James; Preece, Justin; Pasternak, Shiran; Olson, Andrew; Jiao, Yinping; Lu, Zhenyuan; Bolser, Dan; Kerhornou, Arnaud; Staines, Dan; Walts, Brandon; Wu, Guanming; D'Eustachio, Peter; Haw, Robin; Croft, David; Kersey, Paul J; Stein, Lincoln; Jaiswal, Pankaj; Ware, Doreen

2014-01-01

Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. Whole-genome alignments complemented by phylogenetic gene family trees help infer syntenic and orthologous relationships. Genetic variation data, sequences and genome mappings available for 10 species, including Arabidopsis, rice and maize, help infer putative variant effects on genes and transcripts. The pathways section also hosts 10 species-specific metabolic pathways databases developed in-house or by our collaborators using Pathway Tools software, which facilitates searches for pathway, reaction and metabolite annotations, and allows analyses of user-defined expression datasets. Recently, we released a Plant Reactome portal featuring 133 curated rice pathways. This portal will be expanded for Arabidopsis, maize and other plant species. We continue to provide genetic and QTL maps and marker datasets developed by crop researchers. The project provides a unique community platform to support scientific research in plant genomics including studies in evolution, genetics, plant breeding, molecular biology, biochemistry and systems biology.
Genome-wide Annotation and Comparative Analysis of Long Terminal Repeat Retrotransposons between Pear Species of P. bretschneideri and P. communis

PubMed Central

Yin, Hao; Du, Jianchang; Wu, Jun; Wei, Shuwei; Xu, Yingxiu; Tao, Shutian; Wu, Juyou; Zhang, Shaoling

2015-01-01

Recent sequencing of the Oriental pear (P. bretschneideri Rehd.) genome and the availability of the draft genome sequence of Occidental pear (P. communis L.) have provided a good opportunity to characterize the abundance, distribution, timing, and evolution of long terminal repeat retrotransposons (LTR-RTs) in these two important fruit plants. Here, a total of 7247 LTR-RTs, which can be classified into 148 families, have been identified in the assembled Oriental pear genome. Unlike in other plant genomes, approximately 90% of these elements were found to be randomly distributed along the pear chromosomes. Further analysis revealed that the amplification timeframe of elements varies dramatically in different families, super-families and lineages, and the Copia-like elements have the highest activity in the most recent 0.5 million years (Mys). The data also showed that the two genomes evolved with similar evolutionary rates after their split from the common ancestor ~0.77–1.66 million years ago (Mya). Overall, the data provided here will be a valuable resource for further investigating the impact of transposable elements on gene structure, expression, and epigenetic modification in the pear genomes. PMID:26631625
An integrated clinical and genomic information system for cancer precision medicine.

PubMed

Jang, Yeongjun; Choi, Taekjin; Kim, Jongho; Park, Jisub; Seo, Jihae; Kim, Sangok; Kwon, Yeajee; Lee, Seungjae; Lee, Sanghyuk

2018-04-20

Increasing affordability of next-generation sequencing (NGS) has created an opportunity for realizing genomically-informed personalized cancer therapy as a path to precision oncology. However, the complex nature of genomic information presents a huge challenge for clinicians in interpreting the patient's genomic alterations and selecting the optimum approved or investigational therapy. An elaborate and practical information system is urgently needed to support clinical decision as well as to test clinical hypotheses quickly. Here, we present an integrated clinical and genomic information system (CGIS) based on NGS data analyses. Major components include modules for handling clinical data, NGS data processing, variant annotation and prioritization, drug-target-pathway analysis, and population cohort explorer. We built a comprehensive knowledgebase of genes, variants, drugs by collecting annotated information from public and in-house resources. Structured reports for molecular pathology are generated using standardized terminology in order to help clinicians interpret genomic variants and utilize them for targeted cancer therapy. We also implemented many features useful for testing hypotheses to develop prognostic markers from mutation and gene expression data. Our CGIS software is an attempt to provide useful information for both clinicians and scientists who want to explore genomic information for precision oncology.
Suppression of HBV replication by the expression of nickase- and nuclease dead-Cas9.

PubMed

Kurihara, Takeshi; Fukuhara, Takasuke; Ono, Chikako; Yamamoto, Satomi; Uemura, Kentaro; Okamoto, Toru; Sugiyama, Masaya; Motooka, Daisuke; Nakamura, Shota; Ikawa, Masato; Mizokami, Masashi; Maehara, Yoshihiko; Matsuura, Yoshiharu

2017-07-21

Complete removal of hepatitis B virus (HBV) DNA from nuclei is difficult with current therapies. Recent reports have shown that a novel genome-editing tool using Cas9 with a single-guide RNA (sgRNA) system can cleave the HBV genome in vitro and in vivo. However, induction of a double-strand break (DSB) on the targeted genome by Cas9 risks undesirable off-target cleavage on the host genome. Nickase-Cas9 cleaves a single strand of DNA, and thereby two sgRNAs are required for inducing DSBs. To avoid Cas9-induced off-target mutagenesis, we examined the effects of the expression of nickase-Cas9 and nuclease dead Cas9 (d-Cas9) with sgRNAs on HBV replication. The expression of nickase-Cas9 with a pair of sgRNAs cleaved the target HBV genome and suppressed the viral-protein expression and HBV replication in vitro. Moreover, nickase-Cas9 with the sgRNA pair cleaved the targeted HBV genome in mouse liver. Interestingly, d-Cas9 expression with the sgRNAs also suppressed HBV replication in vitro without cleaving the HBV genome. These results suggest the possible use of nickase-Cas9 and d-Cas9 with a pair of sgRNAs for eliminating HBV DNA from the livers of chronic hepatitis B patients with low risk of undesirable off-target mutation on the host genome.
Identification, variation and transcription of pneumococcal repeat sequences

PubMed Central

2011-01-01

Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
Proteomic strategy for the identification of critical actors in reorganization of the post-meiotic male genome.

PubMed

Govin, Jerome; Gaucher, Jonathan; Ferro, Myriam; Debernardi, Alexandra; Garin, Jerome; Khochbin, Saadi; Rousseaux, Sophie

2012-01-01

After meiosis, during the final stages of spermatogenesis, the haploid male genome undergoes major structural changes, resulting in a shift from a nucleosome-based genome organization to the sperm-specific, highly compacted nucleoprotamine structure. Recent data support the idea that region-specific programming of the haploid male genome is of high importance for the post-fertilization events and for successful embryo development. Although these events constitute a unique and essential step in reproduction, the mechanisms by which they occur have remained completely obscure and the factors involved have mostly remained uncharacterized. Here, we sought a strategy to significantly increase our understanding of proteins controlling the haploid male genome reprogramming, based on the identification of proteins in two specific pools: those with the potential to bind nucleic acids (basic proteins) and proteins capable of binding basic proteins (acidic proteins). For the identification of acidic proteins, we developed an approach involving a transition-protein (TP)-based chromatography, which has the advantage of retaining not only acidic proteins due to the charge interactions, but also potential TP-interacting factors. A second strategy, based on an in-depth bioinformatic analysis of the identified proteins, was then applied to pinpoint within the lists obtained, male germ cells expressed factors relevant to the post-meiotic genome organization. This approach reveals a functional network of DNA-packaging proteins and their putative chaperones and sheds a new light on the way the critical transitions in genome organizations could take place. This work also points to a new area of research in male infertility and sperm quality assessments.
Genome-wide investigation and transcriptome analysis of the WRKY gene family in Gossypium.

PubMed

Ding, Mingquan; Chen, Jiadong; Jiang, Yurong; Lin, Lifeng; Cao, YueFen; Wang, Minhua; Zhang, Yuting; Rong, Junkang; Ye, Wuwei

2015-02-01

WRKY transcription factors play important roles in various stress responses in diverse plant species. In cotton, this family has not been well studied, especially in relation to fiber development. Here, the genomes and transcriptomes of Gossypium raimondii and Gossypium arboreum were investigated to identify fiber development related WRKY genes. This represents the first comprehensive comparative study of WRKY transcription factors in both diploid A and D cotton species. In total, 112 G. raimondii and 109 G. arboreum WRKY genes were identified. No significant gene structure or domain alterations were detected between the two species, but many SNPs were distributed unequally in exon and intron regions. Physical mapping revealed that the WRKY genes in G. arboreum were not located in the corresponding chromosomes of G. raimondii, suggesting great chromosome rearrangement in the diploid cotton genomes. The cotton WRKY genes, especially subgroups I and II, have expanded through multiple whole genome duplications and tandem duplications compared with other plant species. Sequence comparison showed many functionally divergent sites between WRKY subgroups, while the genes within each group are under strong purifying selection. Transcriptome analysis suggested that many WRKY genes participate in specific fiber development processes such as fiber initiation, elongation and maturation with different expression patterns between species. Complex WRKY gene expression such as differential Dt and At allelic gene expression in G. hirsutum and alternative splicing events were also observed in both diploid and tetraploid cottons during the fiber development process. In conclusion, this study provides important information on the evolution and function of the WRKY gene family in cotton species.
Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets

PubMed Central

Macosko, Evan Z.; Basu, Anindita; Satija, Rahul; Nemesh, James; Shekhar, Karthik; Goldman, Melissa; Tirosh, Itay; Bialas, Allison R.; Kamitaki, Nolan; Martersteck, Emily M.; Trombetta, John J.; Weitz, David A.; Sanes, Joshua R.; Shalek, Alex K.; Regev, Aviv; McCarroll, Steven A.

2015-01-01

Summary Cells, the basic units of biological structure and function, vary broadly in type and state. Single-cell genomics can characterize cell identity and function, but limitations of ease and scale have prevented its broad application. Here we describe Drop-Seq, a strategy for quickly profiling thousands of individual cells by separating them into nanoliter-sized aqueous droplets, associating a different barcode with each cell’s RNAs, and sequencing them all together. Drop-Seq analyzes mRNA transcripts from thousands of individual cells simultaneously while remembering transcripts’ cell of origin. We analyzed transcriptomes from 44,808 mouse retinal cells and identified 39 transcriptionally distinct cell populations, creating a molecular atlas of gene expression for known retinal cell classes and novel candidate cell subtypes. Drop-Seq will accelerate biological discovery by enabling routine transcriptional profiling at single-cell resolution. PMID:26000488
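The barcoding idea described above, where transcripts are sequenced together but each retains a record of its cell of origin, can be sketched as a simple grouping step. This is a minimal illustration and not the Drop-Seq software pipeline; the function name `count_by_cell`, the barcode strings and the example reads are hypothetical, and real reads would first need barcode extraction and alignment to genes.

```python
from collections import Counter, defaultdict

def count_by_cell(reads):
    """Build per-cell gene counts from (cell_barcode, gene) pairs.

    Hypothetical helper for illustration: each read is assumed to be
    already reduced to its cell barcode and the gene it aligned to.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        counts[barcode][gene] += 1   # the barcode recovers the cell of origin
    return {barcode: dict(genes) for barcode, genes in counts.items()}

# Made-up reads from a pooled sequencing run.
reads = [
    ("AAAC", "Rho"),
    ("AAAC", "Rho"),
    ("GGTT", "Opn1sw"),
    ("AAAC", "Gnat1"),
]
profiles = count_by_cell(reads)
```

Each key of `profiles` is one cell barcode and each value is that cell's expression profile, which is the kind of input that downstream clustering into cell populations would start from.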

TREATING HEMOGLOBINOPATHIES USING GENE CORRECTION APPROACHES: PROMISES AND CHALLENGES

PubMed Central

Cottle, Renee N.; Lee, Ciaran M.; Bao, Gang

2016-01-01

Hemoglobinopathies are genetic disorders caused by aberrant hemoglobin expression or structure changes, resulting in severe mortality and health disparities worldwide. Sickle cell disease (SCD) and β-thalassemia, the most common forms of hemoglobinopathies, are typically treated using transfusions and pharmacological agents. Allogeneic hematopoietic stem cell transplantation is the only curative therapy, but has limited clinical applicability. Although gene therapy approaches have been proposed based on the insertion and forced expression of wild-type or anti-sickling β-globin variants, safety concerns may impede their clinical application. A novel curative approach is nuclease-based gene correction, which involves the application of precision genome editing tools to correct the disease-causing mutation. This review describes the development and potential application of gene therapy and precision genome editing approaches for treating SCD and β-thalassemia. The opportunities and challenges in advancing a curative therapy for hemoglobinopathies are also discussed. PMID:27314256
Inference of gene regulatory networks from genome-wide knockout fitness data

PubMed Central

Wang, Liming; Wang, Xiaodong; Arkin, Adam P.; Samoilov, Michael S.

2013-01-01

Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous works have shown that genome-wide fitness data can be used to uncover novel gene regulatory interactions, when compared with results of more conventional gene expression analysis. Yet, to date, few algorithms have been proposed for systematically using genome-wide mutant fitness data for gene regulatory network inference. Results: In this article, we describe a model and propose an inference algorithm for using fitness data from knockout libraries to identify underlying gene regulatory networks. Unlike most prior methods, the presented approach captures not only structural, but also dynamical and non-linear nature of biomolecular systems involved. A state–space model with non-linear basis is used for dynamically describing gene regulatory networks. Network structure is then elucidated by estimating unknown model parameters. Unscented Kalman filter is used to cope with the non-linearities introduced in the model, which also enables the algorithm to run in on-line mode for practical use. Here, we demonstrate that the algorithm provides satisfying results for both synthetic data as well as empirical measurements of GAL network in yeast Saccharomyces cerevisiae and TyrR–LiuR network in bacteria Shewanella oneidensis. 
Availability: MATLAB code and datasets are available to download at http://www.duke.edu/~lw174/Fitness.zip and http://genomics.lbl.gov/supplemental/fitness-bioinf/ Contact: wangx@ee.columbia.edu or mssamoilov@lbl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23271269
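The abstract above names the unscented Kalman filter as the tool for handling non-linearities but does not spell out the model. As a rough illustration only, the following Python sketch shows the unscented transform, the step at the core of such a filter, for a scalar state. The function names, the `kappa` value, and the Hill-type regulation term are assumptions made for this example and are not taken from the authors' MATLAB code.

```python
import math


def unscented_transform(f, mean, var, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through a non-linear function f.

    Uses the classic three sigma points for a one-dimensional state.
    kappa=2 is an illustrative choice (n + kappa = 3 for n = 1).
    """
    n = 1  # state dimension of this toy example
    spread = math.sqrt((n + kappa) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    images = [f(x) for x in sigma_points]
    new_mean = sum(w * y for w, y in zip(weights, images))
    new_var = sum(w * (y - new_mean) ** 2 for w, y in zip(weights, images))
    return new_mean, new_var


def hill(x, k=1.0, n=2):
    """Hypothetical saturating regulation term, used only as a non-linear example."""
    return x ** n / (k ** n + x ** n)


# A linear map is reproduced exactly: mean 3*2+1, variance 3^2 * 0.5.
lin_mean, lin_var = unscented_transform(lambda x: 3.0 * x + 1.0, 2.0, 0.5)
# A quadratic map of a standard normal: the sigma points recover E[x^2] and Var[x^2].
sq_mean, sq_var = unscented_transform(lambda x: x * x, 0.0, 1.0)
# A saturating non-linearity, as might appear in a regulatory model.
hill_mean, hill_var = unscented_transform(hill, 1.0, 0.04)
print(lin_mean, lin_var, sq_mean, sq_var, hill_mean, hill_var)
```

A full filter would alternate this propagation with a measurement update and, for network inference, would append the unknown model parameters to the state vector; that part is omitted here.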
The Innate Immune Database (IIDB)

PubMed Central

Korb, Martin; Rust, Aistair G; Thorsson, Vesteinn; Battail, Christophe; Li, Bin; Hwang, Daehee; Kennedy, Kathleen A; Roach, Jared C; Rosenberger, Carrie M; Gilchrist, Mark; Zak, Daniel; Johnson, Carrie; Marzolf, Bruz; Aderem, Alan; Shmulevich, Ilya; Bolouri, Hamid

2008-01-01

Background As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site . Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens. Description We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data are available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results. 
Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser. Conclusion We present the Innate Immune Database (IIDB) as a community resource for immunologists interested in gene regulatory systems underlying innate responses to pathogens. The database website can be freely accessed at . PMID:18321385
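The abstract above mentions binding-site prediction from Position Weight Matrices without describing the scoring. A minimal sketch of one common approach, log-odds scoring of sliding windows, might look as follows in Python. The count matrix, pseudocount, uniform background, and threshold are all hypothetical and chosen for illustration; they are not taken from TRANSFAC or from the IIDB pipeline.

```python
import math

BASES = "ACGT"

# Hypothetical count matrix for a 4-bp motif resembling TGAC (toy data only).
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append(
            {b: math.log2(((column[b] + pseudocount) / total) / background) for b in BASES}
        )
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring at or above threshold.

    Assumes an upper-case sequence containing only A, C, G and T.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = log_odds_matrix(TOY_COUNTS)
hits = scan("AATGACGGTGACTT", pwm, threshold=5.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

In practice the threshold would be calibrated per matrix, and both strands would be scanned; this sketch covers the forward strand only.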
Triplex DNA-binding proteins are associated with clinical outcomes revealed by proteomic measurements in patients with colorectal cancer

PubMed Central

2012-01-01

Background Tri- and tetra-nucleotide repeats in mammalian genomes can induce formation of alternative non-B DNA structures such as triplexes and guanine (G)-quadruplexes. These structures can induce mutagenesis, chromosomal translocations and genomic instability. We wanted to determine if proteins that bind triplex DNA structures are quantitatively or qualitatively different between colorectal tumor and adjacent normal tissue and if this binding activity correlates with patient clinical characteristics. Methods Extracts from 63 human colorectal tumor and adjacent normal tissues were examined by gel shifts (EMSA) for triplex DNA-binding proteins, which were correlated with clinicopathological tumor characteristics using the Mann-Whitney U, Spearman’s rho, Kaplan-Meier and Mantel-Cox log-rank tests. Biotinylated triplex DNA and streptavidin agarose affinity binding were used to purify triplex-binding proteins in RKO cells. Western blotting and reverse-phase protein array were used to measure protein expression in tissue extracts. Results Increased triplex DNA-binding activity in tumor extracts correlated significantly with lymphatic disease, metastasis, and reduced overall survival. We identified three multifunctional splicing factors with biotinylated triplex DNA affinity: U2AF65 in cytoplasmic extracts, and PSF and p54nrb in nuclear extracts. Super-shift EMSA with anti-U2AF65 antibodies produced a shifted band of the major EMSA H3 complex, identifying U2AF65 as the protein present in the major EMSA band. U2AF65 expression correlated significantly with EMSA H3 values in all extracts and was higher in extracts from Stage III/IV vs. Stage I/II colon tumors (p = 0.024). EMSA H3 values and U2AF65 expression also correlated significantly with GSK3 beta, beta-catenin, and NF-κB p65 expression, whereas p54nrb and PSF expression correlated with c-Myc, cyclin D1, and CDK4. EMSA values and expression of all three splicing factors correlated with ErbB1, mTOR, PTEN, and Stat5. 
Western blots confirmed that full-length and truncated beta-catenin expression correlated with U2AF65 expression in tumor extracts. Conclusions Increased triplex DNA-binding activity in vitro correlates with lymph node disease, metastasis, and reduced overall survival in colorectal cancer, and increased U2AF65 expression is associated with total and truncated beta-catenin expression in high-stage colorectal tumors. PMID:22682314
Integrated analysis of chromosome copy number variation and gene expression in cervical carcinoma

PubMed Central

Yan, Deng; Yi, Song; Chiu, Wang Chi; Qin, Liu Gui; Kin, Wong Hoi; Kwok Hung, Chung Tony; Linxiao, Han; Wai, Choy Kwong; Yi, Sui; Tao, Yang; Tao, Tang

2017-01-01

Objective This study was conducted to explore chromosomal copy number variations (CNV) and transcript expression and to examine pathways in cervical pathogenesis using genome-wide high resolution microarrays. Methods Genome-wide chromosomal CNVs were investigated in 6 cervical cancer cell lines by Human Genome CGH Microarray Kit (4x44K). Gene expression profiles in cervical cancer cell lines, primary cervical carcinoma and normal cervical epithelium tissues were also studied using the Whole Human Genome Microarray Kit (4x44K). Results Fifty common chromosomal CNVs were identified in the cervical cancer cell lines. Correlation analysis revealed that gene up-regulation or down-regulation is significantly correlated with genomic amplification (P=0.009) or deletion (P=0.006) events. Expression profiles were identified through cluster analysis. Gene annotation analysis showed that cell cycle pathways were significantly (P=1.15E-08) affected in cervical cancer. Common CNVs were associated with cervical cancer. Conclusion Chromosomal CNVs may contribute to altered transcript expression in cervical cancer. PMID:29312578
Integrated analysis of chromosome copy number variation and gene expression in cervical carcinoma.

PubMed

Yan, Deng; Yi, Song; Chiu, Wang Chi; Qin, Liu Gui; Kin, Wong Hoi; Kwok Hung, Chung Tony; Linxiao, Han; Wai, Choy Kwong; Yi, Sui; Tao, Yang; Tao, Tang

2017-12-12

This study was conducted to explore chromosomal copy number variations (CNV) and transcript expression and to examine pathways in cervical pathogenesis using genome-wide high resolution microarrays. Genome-wide chromosomal CNVs were investigated in 6 cervical cancer cell lines by Human Genome CGH Microarray Kit (4x44K). Gene expression profiles in cervical cancer cell lines, primary cervical carcinoma and normal cervical epithelium tissues were also studied using the Whole Human Genome Microarray Kit (4x44K). Fifty common chromosomal CNVs were identified in the cervical cancer cell lines. Correlation analysis revealed that gene up-regulation or down-regulation is significantly correlated with genomic amplification (P=0.009) or deletion (P=0.006) events. Expression profiles were identified through cluster analysis. Gene annotation analysis showed that cell cycle pathways were significantly (P=1.15E-08) affected in cervical cancer. Common CNVs were associated with cervical cancer. Chromosomal CNVs may contribute to altered transcript expression in cervical cancer.
Lineage-specific expansions of retroviral insertions within the genomes of African great apes but not humans and orangutans.

PubMed

Yohn, Chris T; Jiang, Zhaoshi; McGrath, Sean D; Hayden, Karen E; Khaitovich, Philipp; Johnson, Matthew E; Eichler, Marla Y; McPherson, John D; Zhao, Shaying; Pääbo, Svante; Eichler, Evan E

2005-04-01

Retroviral infections of the germline have the potential to episodically alter gene function and genome structure during the course of evolution. Horizontal transmissions between species have been proposed, but little evidence exists for such events in the human/great ape lineage of evolution. Based on analysis of finished BAC chimpanzee genome sequence, we characterize a retroviral element (Pan troglodytes endogenous retrovirus 1 [PTERV1]) that has become integrated in the germline of African great ape and Old World monkey species but is absent from humans and Asian ape genomes. We unambiguously map 287 retroviral integration sites and determine that approximately 95.8% of the insertions occur at non-orthologous regions between closely related species. Phylogenetic analysis of the endogenous retrovirus reveals that the gorilla and chimpanzee elements share a monophyletic origin with a subset of the Old World monkey retroviral elements, but that the average sequence divergence exceeds neutral expectation for a strictly nuclear inherited DNA molecule. Within the chimpanzee, there is a significant integration bias against genes, with only 14 of these insertions mapping within intronic regions. Six out of ten of these genes, for which there are expression data, show significant differences in transcript expression between human and chimpanzee. Our data are consistent with a retroviral infection that bombarded the genomes of chimpanzees and gorillas independently and concurrently, 3-4 million years ago. We speculate on the potential impact of such recent events on the evolution of humans and great apes.
Noncoding RNAs in DNA Repair and Genome Integrity

PubMed Central

Wan, Guohui; Liu, Yunhua; Han, Cecil; Zhang, Xinna

2014-01-01

Abstract Significance: The well-studied sequences in the human genome are those of protein-coding genes, which account for only 1%–2% of the total genome. However, with the advent of high-throughput transcriptome sequencing technology, we now know that about 90% of our genome is extensively transcribed and that the vast majority of them are transcribed into noncoding RNAs (ncRNAs). It is of great interest and importance to decipher the functions of these ncRNAs in humans. Recent Advances: In the last decade, it has become apparent that ncRNAs play a crucial role in regulating gene expression in normal development, in stress responses to internal and environmental stimuli, and in human diseases. Critical Issues: In addition to those constitutively expressed structural RNA, such as ribosomal and transfer RNAs, regulatory ncRNAs can be classified as microRNAs (miRNAs), Piwi-interacting RNAs (piRNAs), small interfering RNAs (siRNAs), small nucleolar RNAs (snoRNAs), and long noncoding RNAs (lncRNAs). However, little is known about the biological features and functional roles of these ncRNAs in DNA repair and genome instability, although a number of miRNAs and lncRNAs are regulated in the DNA damage response. Future Directions: A major goal of modern biology is to identify and characterize the full profile of ncRNAs with regard to normal physiological functions and roles in human disorders. Clinically relevant ncRNAs will also be evaluated and targeted in therapeutic applications. Antioxid. Redox Signal. 20, 655–677. PMID:23879367
Genome-wide analysis and expression profiling of the GRF gene family in oilseed rape (Brassica napus L.).

PubMed

Ma, Jin-Qi; Jian, Hong-Ju; Yang, Bo; Lu, Kun; Zhang, Ao-Xiang; Liu, Pu; Li, Jia-Na

2017-07-15

Growth-regulating factors (GRFs) are plant-specific transcription factors that help regulate plant growth and development. Genome-wide identification and evolutionary analyses of GRF gene families have been performed in Arabidopsis thaliana, Zea mays, Oryza sativa, and Brassica rapa, but a comprehensive analysis of the GRF gene family in oilseed rape (Brassica napus) has not yet been reported. In the current study, we identified 35 members of the BnGRF family in B. napus. We analyzed the chromosomal distribution, phylogenetic relationships (Bayesian Inference and Neighbor Joining method), gene structures, and motifs of the BnGRF family members, as well as the cis-acting regulatory elements in their promoters. We also analyzed the expression patterns of 15 randomly selected BnGRF genes in various tissues and in plant varieties with different harvest indices and gibberellic acid (GA) responses. The expression levels of BnGRFs under GA treatment suggested the presence of possible negative feedback regulation. The evolutionary patterns and expression profiles of BnGRFs uncovered in this study increase our understanding of the important roles played by these genes in oilseed rape. Copyright © 2017. Published by Elsevier B.V.
kappa-Opioid receptor in humans: cDNA and genomic cloning, chromosomal assignment, functional expression, pharmacology, and expression pattern in the central nervous system.

PubMed Central

Simonin, F; Gavériaux-Ruff, C; Befort, K; Matthes, H; Lannes, B; Micheletti, G; Mattéi, M G; Charron, G; Bloch, B; Kieffer, B

1995-01-01

Using the mouse delta-opioid receptor cDNA as a probe, we have isolated genomic clones encoding the human mu- and kappa-opioid receptor genes. Their organization appears similar to that of the human delta receptor gene, with exon-intron boundaries located after putative transmembrane domains 1 and 4. The kappa gene was mapped at position q11-12 in human chromosome 8. A full-length cDNA encoding the human kappa-opioid receptor has been isolated. The cloned receptor expressed in COS cells presents a typical kappa 1 pharmacological profile and is negatively coupled to adenylate cyclase. The expression of kappa-opioid receptor mRNA in human brain, as estimated by reverse transcription-polymerase chain reaction, is consistent with the involvement of kappa-opioid receptors in pain perception, neuroendocrine physiology, affective behavior, and cognition. In situ hybridization studies performed on human fetal spinal cord demonstrate the presence of the transcript specifically in lamina II of the dorsal horn. Some divergences in structural, pharmacological, and anatomical properties are noted between the cloned human and rodent receptors. PMID:7624359
Structure and expression of genes for a class of cysteine-rich proteins of the cuticle layers of differentiating wool and hair follicles

PubMed Central

1990-01-01

The major histological components of the hair follicle are the hair cortex and cuticle. The hair cuticle cells encase and protect the cortex and undergo a different developmental program to that of the cortex. We report the molecular characterization of a set of evolutionarily conserved hair genes which are transcribed in the hair cuticle late in follicle development. Two genes were isolated and characterized, one expressed in the human follicle and one in the sheep follicle. Each gene encodes a small protein of 16 kD, containing greater than 50 cysteine residues, ranging from 31 to 36 mol% cysteine. Their high cysteine content and in vitro expression data identify them as ultra-high-sulfur (UHS) keratin proteins. The predicted proteins are composed almost entirely of cysteine-rich and glycine-rich repeats. Genomic blots reveal that the UHS keratin proteins are encoded by related multigene families in both the human and sheep genomes. Tissue in situ hybridization demonstrates that the expression of both genes is localized to the hair fiber cuticle and occurs at a late stage in fiber morphogenesis. PMID:1703541
Orthogonal control of expression mean and variance by epigenetic features at different genomic loci

DOE PAGES

Dey, Siddharth S.; Foley, Jonathan E.; Limsirichai, Prajit; ...

2015-05-05

While gene expression noise has been shown to drive dramatic phenotypic variations, the molecular basis for this variability in mammalian systems is not well understood. Gene expression has been shown to be regulated by promoter architecture and the associated chromatin environment. However, the exact contribution of these two factors in regulating expression noise has not been explored. Using a dual-reporter lentiviral model system, we deconvolved the influence of the promoter sequence to systematically study the contribution of the chromatin environment at different genomic locations in regulating expression noise. By integrating a large-scale analysis to quantify mRNA levels by smFISH and protein levels by flow cytometry in single cells, we found that mean expression and noise are uncorrelated across genomic locations. Furthermore, we showed that this independence could be explained by the orthogonal control of mean expression by the transcript burst size and noise by the burst frequency. Finally, we showed that genomic locations displaying higher expression noise are associated with more repressed chromatin, thereby indicating the contribution of the chromatin environment in regulating expression noise.
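The link the abstract above draws between burst size, burst frequency, mean, and noise can be illustrated with the standard bursty-transcription (negative binomial) model. This is an assumption made for this sketch, not necessarily the model fitted by the authors. Under it, with burst frequency a (bursts per mRNA lifetime) and mean burst size b, the mean is a·b and the noise, taken here as the squared coefficient of variation, is (1 + b)/(a·b), which approaches 1/a for large bursts.

```python
def moments(burst_frequency, burst_size):
    """Mean and noise (CV^2) of mRNA counts under a negative binomial bursty model.

    Assumed model: mean = a*b, variance = a*b*(1 + b).
    """
    mean = burst_frequency * burst_size
    variance = mean * (1.0 + burst_size)
    return mean, variance / mean ** 2


def infer_bursts(mean, variance):
    """Invert the same model: recover (burst_frequency, burst_size) from moments."""
    fano = variance / mean
    burst_size = fano - 1.0
    if burst_size <= 0:
        raise ValueError("Fano factor must exceed 1 for a bursty model")
    return mean / burst_size, burst_size


base = moments(burst_frequency=2.0, burst_size=10.0)     # reference locus
bigger = moments(burst_frequency=2.0, burst_size=20.0)   # double the burst size
faster = moments(burst_frequency=4.0, burst_size=10.0)   # double the burst frequency
recovered = infer_bursts(mean=20.0, variance=220.0)
print(base, bigger, faster, recovered)
```

In this toy comparison, doubling the burst size doubles the mean while the noise moves only from 0.55 to 0.525, whereas doubling the burst frequency halves the noise. This is one way that mean and noise can vary largely independently across loci.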
Scanning of Transposable Elements and Analyzing Expression of Transposase Genes of Sweet Potato [Ipomoea batatas]

PubMed Central

Tao, Xiang; Lai, Xian-Jun; Zhang, Yi-Zheng; Tan, Xue-Mei; Wang, Haiyan

2014-01-01

Background Transposable elements (TEs) are the most abundant genomic components in eukaryotes and affect the genome by their replications and movements to generate genetic plasticity. Sweet potato generally reproduces asexually, so TEs may be an important genetic factor for genome reorganization. Complete identification of TEs is essential for the study of genome evolution. However, the TEs of sweet potato are still poorly understood because of its complex hexaploid genome and the difficulty of sequencing it. The recent availability of the sweet potato transcriptome databases provides an opportunity for discovering and characterizing the expressed TEs. Methodology/Principal Findings We first established the integrated-transcriptome database by de novo assembling four published sweet potato transcriptome databases from three cultivars in China. Using sequence-similarity search and analysis, a total of 1,405 TEs including 883 retrotransposons and 522 DNA transposons were predicted and categorized. By mapping sets of raw RNA-Seq short reads to the predicted TEs, we compared the quantities, classifications and expression activities of TEs between and within cultivars. Moreover, the differential expression of TEs in seven tissues of the Xushu 18 cultivar was analyzed using Illumina digital gene expression (DGE) tag profiling. It was found that 417 TEs were expressed in one or more tissues and 107 in all seven tissues. Furthermore, the copy number of 11 transposase genes was determined to be 1–3 copies in the genome of sweet potato by Real-time PCR-based absolute quantification. Conclusions/Significance Our results provide a new method for TE searching in species with transcriptome sequences but no genome information. The searching, identification and expression analysis of TEs provide useful TE information for sweet potato, which is valuable for further studies of TE-mediated gene mutation and optimization in asexual reproduction. 
It contributes to elucidating the roles of TEs in genome evolution. PMID:24608103
Genomic Imprinting Was Evolutionarily Conserved during Wheat Polyploidization

PubMed Central

Yang, Guanghui; Liu, Zhenshan; Gao, Lulu; Yu, Kuohai; Feng, Man; Peng, Huiru; Sun, Qixin; Ni, Zhongfu

2018-01-01

Genomic imprinting is an epigenetic phenomenon that causes genes to be differentially expressed depending on their parent of origin. To evaluate the evolutionary conservation of genomic imprinting and the effects of ploidy on this process, we investigated parent-of-origin-specific gene expression patterns in the endosperm of diploid (Aegilops spp), tetraploid, and hexaploid wheat (Triticum spp) at various stages of development via high-throughput transcriptome sequencing. We identified 91, 135, and 146 maternally or paternally expressed genes (MEGs or PEGs, respectively) in diploid, tetraploid, and hexaploid wheat, respectively, 52.7% of which exhibited dynamic expression patterns at different developmental stages. Gene Ontology enrichment analysis suggested that MEGs and PEGs were involved in metabolic processes and DNA-dependent transcription, respectively. Nearly half of the imprinted genes exhibited conserved expression patterns during wheat hexaploidization. In addition, 40% of the homoeolog pairs originating from whole-genome duplication were consistently maternally or paternally biased in the different subgenomes of hexaploid wheat. Furthermore, imprinted expression was found for 41.2% and 50.0% of homolog pairs that evolved by tandem duplication after genome duplication in tetraploid and hexaploid wheat, respectively. These results suggest that genomic imprinting was evolutionarily conserved between closely related Triticum and Aegilops species and in the face of polyploid hybridization between species in these genera. PMID:29298834
Crystal structure of Enterococcus faecalis SlyA-like transcriptional factor.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, R.; Zhang, R.; Zagnitko, O.

2003-05-30

The crystal structure of a SlyA transcriptional regulator at 1.6 Å resolution is presented, and structural relationships between members of the MarR/SlyA family are discussed. The SlyA family, which includes SlyA, Rap, Hor, and RovA proteins, is widely distributed in bacterial and archaeal genomes. Current evidence suggests that SlyA-like factors act as repressors, activators, and modulators of gene transcription. These proteins have been shown to up-regulate the expression of molecular chaperones, acid-resistance proteins, and cytolysin, and down-regulate several biosynthetic enzymes. The structure of SlyA from Enterococcus faecalis, determined as a part of an ongoing structural genomics initiative (www.mcsg.anl.gov), revealed the same winged helix DNA-binding motif that was recently found in the MarR repressor from Escherichia coli and the MexR repressor from Pseudomonas aeruginosa, a sequence homologue of MarR. Phylogenetic analysis of the MarR/SlyA family suggests that Sly is placed between the SlyA and MarR subfamilies and shows significant sequence similarity to members of both subfamilies.
Probing the Structures of Viral RNA Regulatory Elements with SHAPE and Related Methodologies

PubMed Central

Rausch, Jason W.; Sztuba-Solinska, Joanna; Le Grice, Stuart F. J.

2018-01-01

Viral RNAs were selected by evolution to possess maximum functionality in a minimal sequence. Depending on the classification of the virus and the type of RNA in question, viral RNAs must alternately be replicated, spliced, transcribed, transported from the nucleus into the cytoplasm, translated and/or packaged into nascent virions, and in most cases, provide the sequence and structural determinants to facilitate these processes. One consequence of this compact multifunctionality is that viral RNA structures can be exquisitely complex, often involving intermolecular interactions with RNA or protein, intramolecular interactions between sequence segments separated by several thousands of nucleotides, or specialized motifs such as pseudoknots or kissing loops. The fluidity of viral RNA structure can also present a challenge when attempting to characterize it, as genomic RNAs especially are likely to sample numerous conformations at various stages of the virus life cycle. Here we review advances in chemoenzymatic structure probing that have made it possible to address such challenges with respect to cis-acting elements, full-length viral genomes and long non-coding RNAs that play a major role in regulating viral gene expression. PMID:29375504
Two Δ6-desaturase-like genes in common carp (Cyprinus carpio var. Jian): structure characterization, mRNA expression, temperature and nutritional regulation.

PubMed

Ren, Hong-tao; Zhang, Guang-qin; Li, Jian-lin; Tang, Yong-kai; Li, Hong-xia; Yu, Ju-hua; Xu, Pao

2013-08-01

Δ6-Desaturase is the rate-limiting enzyme involved in highly unsaturated fatty acid (HUFA) biosynthesis. There is very little information on the evolution and functional characterization of Δ6Fad-a and Δ6Fad-b in common carp (Cyprinus carpio var. Jian). In the present study, the genomic sequences and structures of two putative Δ6-desaturase-like genes in the common carp genome were obtained. We investigated the mRNA expression patterns of Δ6Fad-a and Δ6Fad-b in tissues, in hatching embryos, in larvae subjected to temperature shock, and in juveniles under nutritional regulation. Our results showed that the two Δ6Fad genes had identical coding exon structures, each comprising 12 coding exons, with introns of distinct size and sequence composition. They were not allelic variants of a single gene. Both Δ6Fad genes were highly expressed in liver, intestine (pyloric caeca) and brain. The Δ6Fad-a and Δ6Fad-b mRNAs showed an increase in expression from newly hatched to 25 days after hatching. The expression levels of Δ6Fad-a were clearly regulated by temperature, whereas Δ6Fad-b was not affected by temperature. The regulation of Δ6Fad-a and Δ6Fad-b in response to dietary fatty acid composition was determined in liver, brain and intestine (pyloric caeca) of common carp fed one of three diets: diet 1 with fish oil (FO) rich in n-3 HUFA, diet 2 with corn oil (CO, 18:2n-6) and diet 3 with linseed oil (LO, 18:3n-3). The Δ6Fad-a and Δ6Fad-b genes were differentially expressed in the liver, brain and intestine of common carp fed the different oil sources. Further work is in progress to determine the mechanism of differential expression of the Δ6Fad-a and Δ6Fad-b genes in different tissues and the roles of transcription factors in regulating HUFA synthesis. Copyright © 2013 Elsevier B.V. All rights reserved.
Combining functional and structural genomics to sample the essential Burkholderia structome.

PubMed

Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

2013-01-01

The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. 
This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases caused by Burkholderia. All expression clones and proteins created in this study are freely available by request.
Tissue-specific NETs alter genome organization and regulation even in a heterologous system.

PubMed

de Las Heras, Jose I; Zuleger, Nikolaj; Batrakou, Dzmitry G; Czapiewski, Rafal; Kerr, Alastair R W; Schirmer, Eric C

2017-01-02

Different cell types exhibit distinct patterns of 3D genome organization that correlate with changes in gene expression in tissue and differentiation systems. Several tissue-specific nuclear envelope transmembrane proteins (NETs) have been found to influence the spatial positioning of genes and chromosomes that normally occurs during tissue differentiation. Here we study 3 such NETs: NET29, NET39, and NET47, which are expressed preferentially in fat, muscle and liver, respectively. We found that even when exogenously expressed in a heterologous system they can specify particular genome organization patterns and alter gene expression. Each NET affected largely different subsets of genes. Notably, the liver-specific NET47 upregulated many genes in HT1080 fibroblast cells that are normally upregulated in hepatogenesis, showing that tissue-specific NETs can favor expression patterns associated with the tissue where the NET is normally expressed. Similarly, global profiling of peripheral chromatin after exogenous expression of these NETs using lamin B1 DamID revealed that each NET affected the nuclear positioning of distinct sets of genomic regions with a significant tissue-specific component. Thus NET influences on genome organization can contribute to gene expression changes associated with differentiation even in the absence of other factors and overt cellular differentiation changes.
TEcandidates: Prediction of genomic origin of expressed Transposable Elements using RNA-seq data.

PubMed

Valdebenito-Maturana, Braulio; Riadi, Gonzalo

2018-06-01

In recent years, Transposable Elements (TEs) have been related to gene regulation. However, estimating the origin of expression of TEs through RNA-seq is complicated by multimapping reads coming from their repetitive sequences. Current approaches that address multimapping reads focus on expression quantification rather than on finding the origin of expression. Addressing the genomic origin of expressed TEs could further aid in understanding the role that TEs might have in the cell. We have developed a new pipeline called TEcandidates, based on de novo transcriptome assembly, to assess the instances of TEs being expressed, along with their location, for inclusion in downstream DE analysis. TEcandidates takes as input the RNA-seq data, the genome sequence and the TE annotation file, and returns a list of coordinates of candidate TEs being expressed, the TEs that have been removed, and the genome sequence with removed TEs masked. This masked genome is suited to include TEs in downstream expression analysis, as the ambiguity of reads coming from TEs is significantly reduced in the mapping step of the analysis. The script which runs the pipeline can be downloaded at http://www.mobilomics.org/tecandidates/downloads or http://github.com/TEcandidates/TEcandidates. Contact: griadi@utalca.cl. Supplementary data are available at Bioinformatics online.
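The masking step mentioned in the abstract above can be pictured with a short Python sketch. This is not the TEcandidates code: the function name, the use of 0-based half-open intervals (as in BED files), and the choice of `N` as the masking character are all assumptions made for illustration.

```python
def mask_intervals(sequence, intervals, mask_char="N"):
    """Return a copy of sequence with each (start, end) interval replaced by mask_char.

    Intervals are assumed to be 0-based and half-open, i.e. start is included
    and end is excluded. Overlapping intervals are handled naturally.
    """
    masked = list(sequence)
    for start, end in intervals:
        if not 0 <= start <= end <= len(sequence):
            raise ValueError(f"interval ({start}, {end}) lies outside the sequence")
        masked[start:end] = mask_char * (end - start)
    return "".join(masked)


# Toy contig with two hypothetical TE copies that were not selected as expressed.
contig = "ACGTACGTAC"
removed_tes = [(2, 5), (7, 9)]
masked_contig = mask_intervals(contig, removed_tes)
print(masked_contig)
```

Reads originating from the masked copies can then no longer align there, which is the sense in which mapping ambiguity is reduced for the TE copies left unmasked.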

Evaluation of high throughput gene expression platforms using a genomic biomarker signature for prediction of skin sensitization.

PubMed

Forreryd, Andy; Johansson, Henrik; Albrekt, Ann-Sofie; Lindstedt, Malin

2014-05-16

Allergic contact dermatitis (ACD) develops upon exposure to certain chemical compounds termed skin sensitizers. To reduce the occurrence of skin sensitizers, chemicals are regularly screened for their capacity to induce sensitization. The recently developed Genomic Allergen Rapid Detection (GARD) assay is an in vitro alternative to animal testing for identification of skin sensitizers, classifying chemicals by evaluating transcriptional levels of a genomic biomarker signature. During assay development and biomarker identification, genome-wide expression analysis was applied using microarrays covering approximately 30,000 transcripts. However, the microarray platform suffers from drawbacks in terms of low sample throughput, high cost per sample and time consuming protocols and is a limiting factor for adaption of GARD into a routine assay for screening of potential sensitizers. With the purpose to simplify assay procedures, improve technical parameters and increase sample throughput, we assessed the performance of three high throughput gene expression platforms--nCounter®, BioMark HD™ and OpenArray®--and correlated their performance metrics against our previously generated microarray data. We measured the levels of 30 transcripts from the GARD biomarker signature across 48 samples. Detection sensitivity, reproducibility, correlations and overall structure of gene expression measurements were compared across platforms. Gene expression data from all of the evaluated platforms could be used to classify most of the sensitizers from non-sensitizers in the GARD assay. Results also showed high data quality and acceptable reproducibility for all platforms but only medium to poor correlations of expression measurements across platforms. In addition, evaluated platforms were superior to the microarray platform in terms of cost efficiency, simplicity of protocols and sample throughput. 
We evaluated the performance of three non-array based platforms using a limited set of transcripts from the GARD biomarker signature. We demonstrated that it was possible to achieve acceptable discriminatory power in terms of separation between sensitizers and non-sensitizers in the GARD assay while reducing assay costs, simplifying assay procedures and increasing sample throughput by using an alternative platform, providing a first step towards the goal to prepare GARD for formal validation and adaption of the assay for industrial screening of potential sensitizers.
Loss of neuronal 3D chromatin organization causes transcriptional and behavioural deficits related to serotonergic dysfunction.

PubMed

Ito, Satomi; Magalska, Adriana; Alcaraz-Iborra, Manuel; Lopez-Atalaya, Jose P; Rovira, Victor; Contreras-Moreira, Bruno; Lipinski, Michal; Olivares, Roman; Martinez-Hernandez, Jose; Ruszczycki, Blazej; Lujan, Rafael; Geijo-Barrientos, Emilio; Wilczynski, Grzegorz M; Barco, Angel

2014-07-18

The interior of the neuronal cell nucleus is a highly organized three-dimensional (3D) structure where regions of the genome that are linearly millions of bases apart establish sub-structures with specialized functions. To investigate neuronal chromatin organization and dynamics in vivo, we generated bitransgenic mice expressing GFP-tagged histone H2B in principal neurons of the forebrain. Surprisingly, the expression of this chimeric histone in mature neurons caused chromocenter declustering and disrupted the association of heterochromatin with the nuclear lamina. The loss of these structures did not affect neuronal viability but was associated with specific transcriptional and behavioural deficits related to serotonergic dysfunction. Overall, our results demonstrate that the 3D organization of chromatin within neuronal cells provides an additional level of epigenetic regulation of gene expression that critically impacts neuronal function. This in turn suggests that some loci associated with neuropsychiatric disorders may be particularly sensitive to changes in chromatin architecture.
The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

PubMed

Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

2016-04-01

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.
Nucleolus association of chromosomal domains is largely maintained in cellular senescence despite massive nuclear reorganisation.

PubMed

Dillinger, Stefan; Straub, Tobias; Németh, Attila

2017-01-01

Mammalian chromosomes are organized in structural and functional domains of 0.1-10 Mb, which are characterized by high self-association frequencies in the nuclear space and different contact probabilities with nuclear sub-compartments. They exhibit distinct chromatin modification patterns, gene expression levels and replication timing. Recently, nucleolus-associated chromosomal domains (NADs) have been discovered, yet their precise genomic organization and dynamics are still largely unknown. Here, we use nucleolus genomics and single-cell experiments to address these questions in human embryonic fibroblasts during replicative senescence. Genome-wide mapping reveals 1,646 NADs in proliferating cells, which cover about 38% of the annotated human genome. They are mainly heterochromatic and correlate with late replicating loci. Using Hi-C data analysis, we show that interactions of NADs dominate interphase chromosome contacts in the 10-50 Mb distance range. Interestingly, only minute changes in nucleolar association are observed upon senescence. These spatial rearrangements in subdomains smaller than 100 kb are accompanied with local transcriptional changes. In contrast, large centromeric and pericentromeric satellite repeat clusters extensively dissociate from nucleoli in senescent cells. Accordingly, H3K9me3-marked heterochromatin gets remodelled at the perinucleolar space as revealed by immunofluorescence analyses. Collectively, this study identifies connections between the nucleolus, 3D genome structure, and cellular aging at the level of interphase chromosome organization.
Nucleolus association of chromosomal domains is largely maintained in cellular senescence despite massive nuclear reorganisation

PubMed Central

Dillinger, Stefan

2017-01-01

Mammalian chromosomes are organized in structural and functional domains of 0.1–10 Mb, which are characterized by high self-association frequencies in the nuclear space and different contact probabilities with nuclear sub-compartments. They exhibit distinct chromatin modification patterns, gene expression levels and replication timing. Recently, nucleolus-associated chromosomal domains (NADs) have been discovered, yet their precise genomic organization and dynamics are still largely unknown. Here, we use nucleolus genomics and single-cell experiments to address these questions in human embryonic fibroblasts during replicative senescence. Genome-wide mapping reveals 1,646 NADs in proliferating cells, which cover about 38% of the annotated human genome. They are mainly heterochromatic and correlate with late replicating loci. Using Hi-C data analysis, we show that interactions of NADs dominate interphase chromosome contacts in the 10–50 Mb distance range. Interestingly, only minute changes in nucleolar association are observed upon senescence. These spatial rearrangements in subdomains smaller than 100 kb are accompanied with local transcriptional changes. In contrast, large centromeric and pericentromeric satellite repeat clusters extensively dissociate from nucleoli in senescent cells. Accordingly, H3K9me3-marked heterochromatin gets remodelled at the perinucleolar space as revealed by immunofluorescence analyses. Collectively, this study identifies connections between the nucleolus, 3D genome structure, and cellular aging at the level of interphase chromosome organization. PMID:28575119
SINEs as driving forces in genome evolution.

PubMed

Schmitz, J

2012-01-01

SINEs are short interspersed elements derived from cellular RNAs that repetitively retropose via RNA intermediates and integrate more or less randomly back into the genome. SINEs propagate almost entirely vertically within their host cells and, once established in the germline, are passed on from generation to generation. As non-autonomous elements, their reverse transcription (from RNA to cDNA) and genomic integration depends on the activity of the enzymatic machinery of autonomous retrotransposons, such as long interspersed elements (LINEs). SINEs are widely distributed in eukaryotes, but are especially effectively propagated in mammalian species. For example, more than a million Alu-SINE copies populate the human genome (approximately 13% of genomic space), and few master copies of them are still active. In the organisms where they occur, SINEs are a challenge to genomic integrity, but in the long term also can serve as beneficial building blocks for evolution, contributing to phenotypic heterogeneity and modifying gene regulatory networks. They substantially expand the genomic space and introduce structural variation to the genome. SINEs have the potential to mutate genes, to alter gene expression, and to generate new parts of genes. A balanced distribution and controlled activity of such properties is crucial to maintaining the organism's dynamic and thriving evolution. Copyright © 2012 S. Karger AG, Basel.
Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis

PubMed Central

Gianola, Daniel; Fariello, Maria I.; Naya, Hugo; Schön, Chris-Carolin

2016-01-01

Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. PMID:27520956
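
The invariance stated in the abstract above can be checked numerically on simulated data. The sketch below is only an illustration under stated assumptions: the genomic relationship matrix is built as `Z @ Z.T / p` from column-centered marker codes, the variance components `sigma_g2` and `sigma_e2` are fixed by hand and kept identical in both fits (the condition the abstract requires), and the scaling by `p` is left unchanged when the tested marker is dropped. None of these choices, nor the simulated data, come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                                       # individuals, markers (toy sizes)
M = rng.integers(0, 3, size=(n, p)).astype(float)    # simulated 0/1/2 genotype codes
Z = M - M.mean(axis=0)                               # column-centered markers
y = rng.normal(size=n)                               # simulated phenotype
sigma_g2, sigma_e2 = 1.0, 1.0                        # assumed, fixed variance components


def gls(X, y, V):
    """Generalized least-squares estimate (X' V^-1 X)^-1 X' V^-1 y."""
    Vinv_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)


j = 0                                                # index of the tested marker
X = np.column_stack([np.ones(n), M[:, j]])           # intercept + tested marker

# G built from all markers, and G with the tested marker's contribution removed.
G_with = Z @ Z.T / p
G_without = (Z @ Z.T - np.outer(Z[:, j], Z[:, j])) / p

V_with = sigma_g2 * G_with + sigma_e2 * np.eye(n)
V_without = sigma_g2 * G_without + sigma_e2 * np.eye(n)

beta_with = gls(X, y, V_with)
beta_without = gls(X, y, V_without)
print(np.allclose(beta_with, beta_without))
```

The two covariance matrices differ only by a term lying in the column space of `X`, which is why the estimates agree; if the variance components were re-estimated after removing the marker, the agreement would no longer be guaranteed.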
Identification and expression profile analysis of the sucrose phosphate synthase gene family in Litchi chinensis Sonn.

PubMed Central

Wang, Dan; Zhao, Jietang; Hu, Bing; Li, Jiaqi; Qin, Yaqi; Chen, Linhuan; Qin, Yonghua

2018-01-01

Sucrose phosphate synthase (SPS, EC 2.4.1.14) is a key enzyme that regulates sucrose biosynthesis in plants. SPS is encoded by different gene families which display differential expression patterns and functional divergence. Genome-wide identification and expression analyses of SPS gene families have been performed in Arabidopsis, rice, and sugarcane, but a comprehensive analysis of the SPS gene family in Litchi chinensis Sonn. has not yet been reported. In the current study, four SPS genes (LcSPS1, LcSPS2, LcSPS3, and LcSPS4) were isolated from litchi. The genomic organization analysis indicated that the four litchi SPS genes have very similar exon-intron structures. The phylogenetic tree showed that LcSPS1-4 were grouped into different SPS families (LcSPS1 and LcSPS2 in the A family, LcSPS3 in the B family, and LcSPS4 in the C family). LcSPS1 and LcSPS4 were strongly expressed in the flowers, while LcSPS3 was most highly expressed in mature leaves. RT-qPCR results showed that LcSPS genes were expressed differentially during aril development between cultivars with different hexose/sucrose ratios. A higher level of expression of LcSPS genes was detected in Wuheli, which accumulates more sucrose in the aril at maturity. The tissue- and developmental stage-specific expression of LcSPS1-4 genes uncovered in this study increases our understanding of the important roles played by these genes in litchi fruits. PMID:29473005
Neighboring Genes Show Correlated Evolution in Gene Expression

PubMed Central

Ghanbarian, Avazeh T.; Hurst, Laurence D.

2015-01-01

When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543
Latency Entry of Herpes Simplex Virus 1 Is Determined by the Interaction of Its Genome with the Nuclear Environment

PubMed Central

Cohen, Camille; Streichenberger, Nathalie; Texier, Pascale; Takissian, Julie; Rousseau, Antoine; Poccardi, Nolwenn; Welsch, Jérémy; Corpet, Armelle; Schaeffer, Laurent; Labetoulle, Marc; Lomonte, Patrick

2016-01-01

Herpes simplex virus 1 (HSV-1) establishes latency in trigeminal ganglia (TG) sensory neurons of infected individuals. The commitment of infected neurons toward the viral lytic or latent transcriptional program is likely to depend on both viral and cellular factors, and to differ among individual neurons. In this study, we used a mouse model of HSV-1 infection to investigate the relationship between viral genomes and the nuclear environment in terms of the establishment of latency. During acute infection, viral genomes show two major patterns: replication compartments or multiple spots distributed in the nucleoplasm (namely “multiple-acute”). Viral genomes in the “multiple-acute” pattern are systematically associated with the promyelocytic leukemia (PML) protein in structures designated viral DNA-containing PML nuclear bodies (vDCP-NBs). To investigate the viral and cellular features that favor the acquisition of the latency-associated viral genome patterns, we infected mouse primary TG neurons from wild type (wt) mice or knock-out mice for type 1 interferon (IFN) receptor with wt or a mutant HSV-1, which is unable to replicate due to the synthesis of a non-functional ICP4, the major virus transactivator. We found that the inability of the virus to initiate the lytic program, combined with its inability to synthesize a functional ICP0, are the two viral features leading to the formation of vDCP-NBs. The formation of the “multiple-latency” pattern is favored by the type 1 IFN signaling pathway in the context of neurons infected by a virus able to replicate through the expression of a functional ICP4 but unable to express functional VP16 and ICP0. Analyses of TGs harvested from HSV-1 latently infected humans showed that viral genomes and PML occupy similar nuclear areas in infected neurons, eventually forming vDCP-NB-like structures. 
Overall our study designates PML protein and PML-NBs to be major cellular components involved in the control of HSV-1 latency, probably during the entire life of an individual. PMID:27618691
Emergent Self-Organized Criticality in Gene Expression Dynamics: Temporal Development of Global Phase Transition Revealed in a Cancer Cell Line

PubMed Central

Tsuchiya, Masa; Giuliani, Alessandro; Hashimoto, Midori; Erenpreisa, Jekaterina; Yoshikawa, Kenichi

2015-01-01

Background: The underlying mechanism of dynamic control of the genome-wide expression is a fundamental issue in bioscience. We addressed it in terms of phase transition by a systemic approach based on both density analysis and characteristics of temporal fluctuation for the time-course mRNA expression in differentiating MCF-7 breast cancer cells. Methodology: In a recent work, we suggested criticality as an essential aspect of dynamic control of genome-wide gene expression. Criticality was evident by a unimodal-bimodal transition through a flattened unimodal expression profile. The flatness on the transition suggests the existence of a critical transition at which up- and down-regulated expression is balanced. Mean field (averaging) behavior of mRNAs based on the temporal expression changes reveals a sandpile type of transition in the flattened profile. Furthermore, around the transition, a self-similar unimodal-bimodal transition of the whole expression occurs in the density profile of an ensemble of mRNA expression. These singular and scaling behaviors identify the transition as the expression phase transition driven by self-organized criticality (SOC). Principal Findings: Emergent properties of SOC through a mean field approach are revealed: i) SOC, as a form of genomic phase transition, consolidates distinct critical states of expression, ii) Coupling of coherent stochastic oscillations between critical states on different time-scales gives rise to SOC, and iii) Specific gene clusters (barcode genes) ranging in size from kbp to Mbp reveal similar SOC to genome-wide mRNA expression and ON-OFF synchronization to critical states. This suggests that the cooperative gene regulation of topological genome sub-units is mediated by the coherent phase transitions of megadomain-scaled conformations between compact and swollen chromatin states. 
Conclusion and Significance: In summary, our study not only provides a systemic method to demonstrate SOC in whole-genome expression, but also introduces novel, physically grounded concepts for a breakthrough in the study of biological regulation. PMID:26067993
Clinically relevant morphological structures in breast cancer represent transcriptionally distinct tumor cell populations with varied degrees of epithelial-mesenchymal transition and CD44+CD24- stemness

PubMed Central

Denisov, Evgeny V.; Skryabin, Nikolay A.; Gerashchenko, Tatiana S.; Tashireva, Lubov A.; Wilhelm, Jochen; Buldakov, Mikhail A.; Sleptcov, Aleksei A.; Lebedev, Igor N.; Vtorushin, Sergey V.; Zavyalova, Marina V.; Cherdyntseva, Nadezhda V.; Perelmuter, Vladimir M.

2017-01-01

Intratumor morphological heterogeneity in breast cancer is represented by different morphological structures (tubular, alveolar, solid, trabecular, and discrete) and contributes to poor prognosis; however, the mechanisms involved remain unclear. In this study, we performed 3D imaging, laser microdissection-assisted array comparative genomic hybridization and gene expression microarray analysis of different morphological structures and examined their association with the standard immunohistochemistry scorings and CD44+CD24- cancer stem cells. We found that the intratumor morphological heterogeneity is not associated with chromosomal aberrations. By contrast, morphological structures were characterized by specific gene expression profiles and signaling pathways and significantly differed in progesterone receptor and Ki-67 expression. Most importantly, we observed significant differences between structures in the number of expressed genes of the epithelial and mesenchymal phenotypes and the association with cancer invasion pathways. Tubular (tube-shaped) and alveolar (spheroid-shaped) structures were transcriptionally similar and demonstrated co-expression of epithelial and mesenchymal markers. Solid (large shapeless) structures retained epithelial features but demonstrated an increase in mesenchymal traits and collective cell migration hallmarks. Mesenchymal genes and cancer invasion pathways, as well as Ki-67 expression, were enriched in trabecular (one/two rows of tumor cells) and discrete groups (single cells and/or arrangements of 2-5 cells). Surprisingly, the number of CD44+CD24- cells was found to be the lowest in discrete groups and the highest in alveolar and solid structures. Overall, our findings indicate the association of intratumor morphological heterogeneity in breast cancer with the epithelial-mesenchymal transition and CD44+CD24- stemness and the appeal of this heterogeneity as a model for the study of cancer invasion. PMID:28977854
Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

DOE PAGES

Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; ...

2014-12-04

Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea’s extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.
Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

PubMed Central

Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; Samejima, Masahiro; Suzuki, Hitoshi; Master, Emma; Ferreira, Patricia; Ruiz-Dueñas, Francisco J.; Held, Benjamin; Canessa, Paulo; Larrondo, Luis F.; Schmoll, Monika; Druzhinina, Irina S.; Kubicek, Christian P.; Gaskell, Jill A.; Kersten, Phil; St. John, Franz; Glasner, Jeremy; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit; Mgbeahuruike, Anthony C.; Kovalchuk, Andriy; Asiegbu, Fred O.; Lackner, Gerald; Hoffmeister, Dirk; Rencoret, Jorge; Gutiérrez, Ana; Sun, Hui; Lindquist, Erika; Barry, Kerrie; Riley, Robert; Grigoriev, Igor V.; Henrissat, Bernard; Kües, Ursula; Berka, Randy M.; Martínez, Angel T.; Covert, Sarah F.; Blanchette, Robert A.; Cullen, Daniel

2014-01-01

Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea's extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes. PMID:25474575
Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko

Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea’s extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.
A-WINGS: an integrated genome database for Pleurocybella porrigens (Angel's wing oyster mushroom, Sugihiratake).

PubMed

Yamamoto, Naoki; Suzuki, Tomohiro; Kobayashi, Masaaki; Dohra, Hideo; Sasaki, Yohei; Hirai, Hirofumi; Yokoyama, Koji; Kawagishi, Hirokazu; Yano, Kentaro

2014-12-03

The angel's wing oyster mushroom (Pleurocybella porrigens, Sugihiratake) is a well-known delicacy. However, its potential risk in acute encephalopathy was recently revealed by a food poisoning incident. To disclose the genes underlying the accident and provide mechanistic insight, we seek to develop an information infrastructure containing omics data. In our previous work, we sequenced the genome and transcriptome using next-generation sequencing techniques. The next step in achieving our goal is to develop a web database to facilitate the efficient mining of large-scale omics data and identification of genes specifically expressed in the mushroom. This paper introduces a web database A-WINGS (http://bioinf.mind.meiji.ac.jp/a-wings/) that provides integrated genomic and transcriptomic information for the angel's wing oyster mushroom. The database contains structure and functional annotations of transcripts and gene expressions. Functional annotations contain information on homologous sequences from NCBI nr and UniProt, Gene Ontology, and KEGG Orthology. Digital gene expression profiles were derived from RNA sequencing (RNA-seq) analysis in the fruiting bodies and mycelia. The omics information stored in the database is freely accessible through interactive and graphical interfaces by search functions that include 'GO TREE VIEW' browsing, keyword searches, and BLAST searches. The A-WINGS database will accelerate omics studies on specific aspects of the angel's wing oyster mushroom and the family Tricholomataceae.
Identification and Characterization of a PRDM14 Homolog in Japanese Flounder (Paralichthys olivaceus).

PubMed

Fan, Lin; Jiang, Jiajun; Gao, Jinning; Song, Huayu; Liu, Jinxiang; Yang, Likun; Li, Zan; Chen, Yan; Zhang, Quanqi; Wang, Xubo

2015-04-23

PRDM14 is a PR (PRDI-BF1-RIZ1 homologous) domain protein with six zinc fingers and essential roles in genome-wide epigenetic reprogramming. This protein is required for the establishment of germ cells and the maintenance of the embryonic stem cell ground state. In this study, we cloned the full-length cDNA and genomic DNA of the Paralichthys olivaceus prdm14 (Po-prdm14) gene and isolated the 5' regulatory region of Po-prdm14 by whole-genome sequencing. Peptide sequence alignment, gene structure analysis, and phylogenetic analysis revealed that Po-PRDM14 was homologous to mammalian PRDM14. Results of real-time quantitative polymerase chain reaction amplification (RT-qPCR) and in situ hybridization (ISH) in embryos demonstrated that Po-prdm14 was highly expressed between the morula and late gastrula stages, with its expression peaking in the early gastrula stage. Relatively low expression of Po-prdm14 was observed in the other developmental stages. ISH of gonadal tissues revealed that the transcripts were located in the nucleus of the oocytes in the ovaries but only in the spermatogonia and not the spermatocytes in the testes. We also predicted the transcription factor binding sites of Po-prdm14 and their conserved binding region among vertebrates. The combined results suggest that Po-PRDM14 has a conserved function in teleosts and mammals.
Metagenomic and Metatranscriptomic Analyses Reveal the Structure and Dynamics of a Dechlorinating Community Containing Dehalococcoides mccartyi and Corrinoid-Providing Microorganisms under Cobalamin-Limited Conditions

DOE PAGES

Men, Yujie; Yu, Ke; Bælum, Jacob; ...

2017-02-10

The aim of this paper is to obtain a systems-level understanding of the interactions between Dehalococcoides and corrinoid-supplying microorganisms by analyzing community structures and functional compositions, activities, and dynamics in trichloroethene (TCE)-dechlorinating enrichments. Metagenomes and metatranscriptomes of the dechlorinating enrichments with and without exogenous cobalamin were compared. Seven putative draft genomes were binned from the metagenomes. At an early stage (2 days), more transcripts of genes in the Veillonellaceae bin-genome were detected in the metatranscriptome of the enrichment without exogenous cobalamin than in the one with the addition of cobalamin. Among these genes, sporulation-related genes exhibited the highest differential expression when cobalamin was not added, suggesting a possible release route of corrinoids from corrinoid producers. Other differentially expressed genes include those involved in energy conservation and nutrient transport (including cobalt transport). The most highly expressed corrinoid de novo biosynthesis pathway was also assigned to the Veillonellaceae bin-genome. Targeted quantitative PCR (qPCR) analyses confirmed higher transcript abundances of those corrinoid biosynthesis genes in the enrichment without exogenous cobalamin than in the enrichment with cobalamin. Furthermore, the corrinoid salvaging and modification pathway of Dehalococcoides was upregulated in response to the cobalamin stress. Finally, this study provides important insights into the microbial interactions and roles played by members of dechlorinating communities under cobalamin-limited conditions.
Genome-wide identification and expression analysis of the ClTCP transcription factors in Citrullus lanatus.

PubMed

Shi, Pibiao; Guy, Kateta Malangisha; Wu, Weifang; Fang, Bingsheng; Yang, Jinghua; Zhang, Mingfang; Hu, Zhongyuan

2016-04-12

The plant-specific TCP transcription factor family, which is involved in the regulation of cell growth and proliferation, performs diverse functions in multiple aspects of plant growth and development. However, no comprehensive analysis of the TCP family in watermelon (Citrullus lanatus) has been undertaken previously. A total of 27 watermelon TCP encoding genes distributed on nine chromosomes were identified. Phylogenetic analysis clustered the genes into 11 distinct subgroups. Furthermore, phylogenetic and structural analyses distinguished two homology classes within the ClTCP family, designated Class I and Class II. The Class II genes were differentiated into two subclasses, the CIN subclass and the CYC/TB1 subclass. The expression patterns of all members were determined by semi-quantitative PCR. The functions of two ClTCP genes, ClTCP14a and ClTCP15, in regulating plant height were confirmed by ectopic expression in Arabidopsis wild-type and ortholog mutants. This study represents the first genome-wide analysis of the watermelon TCP gene family, which provides valuable information for understanding the classification and functions of the TCP genes in watermelon.
