Science.gov

Sample records for regulatory sequence common

  1. Population sequencing of two endocannabinoid metabolic genes identifies rare and common regulatory variants associated with extreme obesity and metabolite level

    PubMed Central

    2010-01-01

    Background: Targeted re-sequencing of candidate genes in individuals at the extremes of a quantitative phenotype distribution is a method of choice to gain information on the contribution of rare variants to disease susceptibility. The endocannabinoid system mediates signaling in the brain and peripheral tissues involved in the regulation of energy balance, is highly active in obese patients, and represents a strong candidate pathway to examine for genetic association with body mass index (BMI). Results: We sequenced two intervals (covering 188 kb) encoding the endocannabinoid metabolic enzymes fatty-acid amide hydrolase (FAAH) and monoglyceride lipase (MGLL) in 147 normal controls and 142 extremely obese cases. After applying quality filters, we called 1,393 high-quality single nucleotide variants, 55% of which are rare, and 143 indels. Using single marker tests and collapsed marker tests, we identified four intervals associated with BMI: the FAAH promoter, the MGLL promoter, MGLL intron 2, and MGLL intron 3. Two of these intervals are composed of rare variants and the majority of the associated variants are located in promoter sequences or in predicted transcriptional enhancers, suggesting a regulatory role. The set of rare variants in the FAAH promoter associated with BMI is also associated with increased levels of the FAAH substrate anandamide, further implicating a functional role in obesity. Conclusions: Our study, which is one of the first reports of a sequence-based association study using next-generation sequencing of candidate genes, provides insights into study design and analysis approaches and demonstrates the importance of examining regulatory elements rather than exclusively focusing on exon sequences. PMID:21118518
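
The abstract above mentions collapsed marker tests without describing them. As a rough illustration only, the sketch below assumes a simple carrier-collapsing scheme followed by a one-sided Fisher exact test, which is not necessarily the authors' method. The function names, input layout, and toy genotypes are all hypothetical.

```python
from math import comb


def collapse_carriers(genotypes):
    """Return 1 for each individual carrying at least one rare allele, else 0.

    genotypes: list of per-individual lists of rare-allele counts (0, 1 or 2),
    one entry per rare variant in the interval (hypothetical input layout).
    """
    return [int(any(count > 0 for count in person)) for person in genotypes]


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact p-value: P(carriers among cases >= observed)."""
    total_carriers = case_carriers + control_carriers
    n_total = n_cases + n_controls
    denominator = comb(n_total, n_cases)
    upper = min(total_carriers, n_cases)
    # Sum hypergeometric probabilities for outcomes at least as extreme.
    numerator = sum(
        comb(total_carriers, k) * comb(n_total - total_carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


# Toy genotypes, made up for illustration (not data from the study).
cases = [[1, 0], [0, 1], [2, 0]]
controls = [[0, 0], [0, 0], [0, 0]]
case_flags = collapse_carriers(cases)
control_flags = collapse_carriers(controls)
p_value = fisher_one_sided(
    sum(case_flags), len(case_flags), sum(control_flags), len(control_flags)
)
print(p_value)
```

Collapsing trades per-variant resolution for power: individually rare alleles are pooled into a single carrier indicator per interval, so the test asks whether carriers of any rare variant are over-represented among cases.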

  2. Coordinate cytokine regulatory sequences

    DOEpatents

    Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

    2005-05-10

    The present invention provides CNS sequences that regulate cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, and host cells and non-human transgenic animals comprising or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.

  3. Polymorphism in regulatory gene sequences

    PubMed Central

    Mitchison, N A

    2001-01-01

    The extensive polymorphism revealed in non-coding gene-regulatory sequences, particularly in the immune system, suggests that this type of genetic variation is functionally and evolutionarily far more important than has been suspected, and provides a lead to new therapeutic strategies. PMID:11178274

  4. Formation of Regulatory Modules by Local Sequence Duplication

    PubMed Central

    Nourmohammad, Armita; Lässig, Michael

    2011-01-01

    Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms. PMID:21998564

  5. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  6. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

  7. RSAT 2011: regulatory sequence analysis tools

    PubMed Central

    Thomas-Chollier, Morgane; Defrance, Matthieu; Medina-Rivera, Alejandra; Sand, Olivier; Herrmann, Carl; Thieffry, Denis

    2011-01-01

    RSAT (Regulatory Sequence Analysis Tools) comprises a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. Thirteen new programs have been added to the 30 described in the 2008 NAR Web Software Issue, including an automated sequence retrieval from EnsEMBL (retrieve-ensembl-seq), two novel motif discovery algorithms (oligo-diff and info-gibbs), a 100-times faster version of matrix-scan enabling the scanning of genome-scale sequence sets, and a series of facilities for random model generation and statistical evaluation (random-genome-fragments, random-motifs, random-sites, implant-sites, sequence-probability, permute-matrix). Our most recent work also focused on motif comparison (compare-matrices) and evaluation of motif quality (matrix-quality) by combining theoretical and empirical measures to assess the predictive capability of position-specific scoring matrices. To process large collections of peak sequences obtained from ChIP-seq or related technologies, RSAT provides a new program (peak-motifs) that combines several efficient motif discovery algorithms to predict transcription factor binding motifs, match them against motif databases and predict their binding sites. Availability (web site, stand-alone programs and SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services): http://rsat.ulb.ac.be/rsat/. PMID:21715389

  8. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  9. Poisson approach to clustering analysis of regulatory sequences.

    PubMed

    Wang, Haiying; Zheng, Huiru; Hu, Jinglu

    2008-01-01

    The presence of similar patterns in regulatory sequences may aid users in identifying co-regulated genes or inferring regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a log likelihood ratio statistics-based distance measure to calculate pair-wise similarities between regulatory sequences. We employed it within three clustering algorithms: hierarchical clustering, Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, the incorporation of the log likelihood ratio statistics-based distance into the learning process may offer considerable improvements in the process of regulatory sequence-based classification of genes.
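
The abstract above does not give the formula for its distance measure. The following is a minimal sketch of one standard construction, assuming each pattern count is Poisson-distributed and the two sequences have equal length. It sums, over patterns, the log likelihood ratio statistic for "separate rates" versus "one shared rate". The function names are hypothetical and the paper's exact definition may differ.

```python
from math import log


def _x_log_ratio(x, mean):
    """x * ln(x / mean), with the convention 0 * ln(0) = 0."""
    return x * log(x / mean) if x > 0 else 0.0


def poisson_llr_distance(counts_a, counts_b):
    """Sum of per-pattern Poisson log likelihood ratio statistics.

    counts_a, counts_b: occurrence counts of the same patterns in two
    regulatory sequences of equal length (an assumption of this sketch).
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must cover the same patterns")
    total = 0.0
    for x, y in zip(counts_a, counts_b):
        shared_rate = (x + y) / 2  # maximum-likelihood rate under the null
        if shared_rate == 0:
            continue  # pattern absent from both sequences adds nothing
        total += 2 * (_x_log_ratio(x, shared_rate) + _x_log_ratio(y, shared_rate))
    return total


# Hypothetical counts of three patterns in two upstream regions.
print(poisson_llr_distance([4, 0, 2], [0, 3, 2]))
```

The value is zero when the two count vectors are identical and grows as the counts diverge, so it can be passed to any clustering routine that accepts a precomputed dissimilarity matrix.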

  10. Common Carrier Access to Cable Communications: Regulatory and Economic Issues.

    ERIC Educational Resources Information Center

    Kestenbaum, Lionel

    The implications of cable television (CATV) common carrier access and economic and regulatory issues associated with it are examined in this paper. The first section provides a discussion of the feasibility and legal basis of common carrier access; the next section contrasts common carrier access with existing over-the-air television broadcasting…

  11. Proving universal common ancestry with similar sequences

    PubMed Central

    Martins, Leonardo de Oliveira; Posada, David

    2013-01-01

    Douglas Theobald recently developed an interesting test putatively capable of quantifying the evidence for a Universal Common Ancestry uniting the three domains of life (Eukarya, Archaea and Bacteria) against hypotheses of Independent Origins for some of these domains. We review here his model, in particular in relation to the treatment of Horizontal Gene Transfer (HGT) and to the quality of sequence alignment. PMID:23814665

  12. Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture.

    PubMed

    Martin, Alicia R; Costa, Helio A; Lappalainen, Tuuli; Henn, Brenna M; Kidd, Jeffrey M; Yee, Muh-Ching; Grubert, Fabian; Cann, Howard M; Snyder, Michael; Montgomery, Stephen B; Bustamante, Carlos D

    2014-08-01

    Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and regulatory genetics

  13. Regulatory sequences of duck hepatitis B virus C gene transcription.

    PubMed Central

    Schneider, R; Will, H

    1991-01-01

    The regulatory elements involved in transcription of the C gene of duck hepatitis B virus (DHBV) were investigated. Several DHBV DNA fragments were assayed for C gene promoter, enhancer, and silencer activity by using a chloramphenicol acetyltransferase (CAT) reporter gene and transfection of established liver and nonliver cell lines. A major transcript initiating at nucleotide positions 2532 and 2533 and three minor transcripts initiating at positions 2453/2454 and 2461 were identified in cells containing these constructs. These positions correspond to the 5' end of the C mRNA and were close to that of the pre-C mRNAs, respectively, found in infected livers. The pre-C mRNAs were only detected when sequences located between the initiation sites of the pre-C and C mRNAs were deleted. These sequences downregulated, in an orientation-independent fashion, a heterologous promoter and were found to contain a consensus motif common to negative transcriptional regulatory elements previously characterized in other cellular and viral genes. C gene promoter activity was only observed in highly differentiated liver cells and was dependent on a short DHBV DNA fragment containing an enhancer core consensus motif. These data indicate that transcription of the DHBV C gene is regulated by positive, negative, and differentiation factor-responsive elements. PMID:1920612

  14. Conventional and Regulatory CD4+ T Cells That Share Identical TCRs Are Derived from Common Clones.

    PubMed

    Wolf, Kyle J; Emerson, Ryan O; Pingel, Jeanette; Buller, R Mark; DiPaolo, Richard J

    2016-01-01

    Results from studies comparing the diversity and specificity of the TCR repertoires expressed by conventional (Tconv) and regulatory (Treg) CD4+ T cells have varied depending on the experimental system employed. We developed a new model in which T cells express a single fixed TCRα chain, randomly rearranged endogenous TCRβ chains, and a Foxp3-GFP reporter. We purified CD4+Foxp3- and CD4+Foxp3+ cells, then performed bias-controlled multiplex PCR and high-throughput sequencing of endogenous TCRβ chains. We identified >7,000 different TCRβ sequences in the periphery of 5 individual mice. On average, ~12% of TCR sequences were expressed by both conventional and regulatory populations within individual mice. The CD4+ T cells that expressed shared TCR sequences were present at higher frequencies compared to T cells expressing non-shared TCRs. Furthermore, nearly all (>90%) of the TCR sequences that were shared within mice were identical at the DNA sequence level, indicating that conventional and regulatory T cells that express shared TCRs are derived from common clones. Analysis of TCR repertoire overlap in the thymus reveals that a large proportion of Tconv and Treg sharing observed in the periphery is due to clonal expansion in the thymus. Together these data show that there are a limited number of TCR sequences shared between Tconv and Tregs. Also, Tconv and Tregs sharing identical TCRs are found at relatively high frequencies and are derived from common progenitors, of which a large portion are generated in the thymus.
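
The abstract above reports the share of TCR sequences found in both populations but does not state the denominator. As a minimal sketch, assuming the share is computed over the distinct sequences seen in either population, the overlap could be calculated as follows. The function name and the placeholder sequence labels are hypothetical.

```python
def shared_fraction(tconv_seqs, treg_seqs):
    """Fraction of distinct sequences that occur in both populations.

    The denominator is the union of distinct sequences, which is an
    assumption of this sketch rather than the study's stated definition.
    """
    tconv, treg = set(tconv_seqs), set(treg_seqs)
    union = tconv | treg
    if not union:
        return 0.0
    return len(tconv & treg) / len(union)


# Placeholder labels standing in for TCR beta chain sequences from one mouse.
tconv_example = ["seq1", "seq2", "seq3", "seq3"]
treg_example = ["seq3", "seq4"]
print(shared_fraction(tconv_example, treg_example))
```

Working on sets of distinct sequences ignores clone size, so a separate count of reads per sequence would be needed to compare the frequencies of shared and non-shared clones.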

  15. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA

    PubMed Central

    Maurano, Matthew T.; Humbert, Richard; Rynes, Eric; Thurman, Robert E.; Haugen, Eric; Wang, Hao; Reynolds, Alex P.; Sandstrom, Richard; Qu, Hongzhu; Brody, Jennifer; Shafer, Anthony; Neri, Fidencio; Lee, Kristen; Kutyavin, Tanya; Stehling-Sun, Sandra; Johnson, Audra K.; Canfield, Theresa K.; Giste, Erika; Diegel, Morgan; Bates, Daniel; Hansen, R. Scott; Neph, Shane; Sabo, Peter J.; Heimfeld, Shelly; Raubitschek, Antony; Ziegler, Steven; Cotsapas, Chris; Sotoodehnia, Nona; Glass, Ian; Sunyaev, Shamil R.; Kaul, Rajinder; Stamatoyannopoulos, John A.

    2013-01-01

    Genome-wide association studies (GWAS) have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by DNase I hypersensitive sites (DHSs). 88% of such DHSs are active during fetal development, and are enriched for gestational exposure-related phenotypes. We identify distant gene targets for hundreds of DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrate tissue-selective enrichment of more weakly disease-associated variants within DHSs, and the de novo identification of pathogenic cell types for Crohn’s disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease, and provide pathogenic insights into diverse disorders. PMID:22955828

  16. Systematic localization of common disease-associated variation in regulatory DNA.

    PubMed

    Maurano, Matthew T; Humbert, Richard; Rynes, Eric; Thurman, Robert E; Haugen, Eric; Wang, Hao; Reynolds, Alex P; Sandstrom, Richard; Qu, Hongzhu; Brody, Jennifer; Shafer, Anthony; Neri, Fidencio; Lee, Kristen; Kutyavin, Tanya; Stehling-Sun, Sandra; Johnson, Audra K; Canfield, Theresa K; Giste, Erika; Diegel, Morgan; Bates, Daniel; Hansen, R Scott; Neph, Shane; Sabo, Peter J; Heimfeld, Shelly; Raubitschek, Antony; Ziegler, Steven; Cotsapas, Chris; Sotoodehnia, Nona; Glass, Ian; Sunyaev, Shamil R; Kaul, Rajinder; Stamatoyannopoulos, John A

    2012-09-07

    Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.

  17. Regulatory sequence of cupin family gene

    DOEpatents

    Hood, Elizabeth; Teoh, Thomas

    2017-07-25

    This invention is in the field of plant biology and agriculture and relates to novel seed-specific promoter regions. The present invention further provides methods of producing proteins and other products of interest and methods of controlling expression of nucleic acid sequences of interest using the seed-specific promoter regions.

  18. Bacterial Response Regulators: Versatile Regulatory Strategies from Common Domains

    PubMed Central

    Gao, Rong; Mack, Timothy R.; Stock, Ann M.

    2013-01-01

    Response regulators (RRs) comprise a major family of signaling proteins in prokaryotes. A modular architecture which consists of a conserved receiver domain and a variable effector domain allows RRs to function as phosphorylation-regulated switches that couple a wide variety of cellular behaviors to environmental cues. Recently, advances have been made in understanding RR functions both at genome-wide and molecular levels. Global techniques have been developed to analyze RR input and output, expanding the scope of characterization of these versatile components. Meanwhile, structural studies have revealed that despite common structures and mechanisms of function within individual domains, a range of interactions between receiver and effector domains confer great diversity in regulatory strategies, optimizing individual RRs for the specific regulatory needs of different signaling systems. PMID:17433693

  19. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  20. Identification of DVA interneuron regulatory sequences in Caenorhabditis elegans.

    PubMed

    Puckett Robinson, Carmie; Schwarz, Erich M; Sternberg, Paul W

    2013-01-01

    The identity of each neuron is determined by the expression of a distinct group of genes comprising its terminal gene battery. The regulatory sequences that control the expression of such terminal gene batteries in individual neurons are largely unknown. The existence of a complete genome sequence for C. elegans and draft genomes of other nematodes let us use comparative genomics to identify regulatory sequences directing expression in the DVA interneuron. Using phylogenetic comparisons of multiple Caenorhabditis species, we identified conserved non-coding sequences in 3 of 10 genes (fax-1, nmr-1, and twk-16) that direct expression of reporter transgenes in DVA and other neurons. The conserved region and flanking sequences in an 85-bp intronic region of the twk-16 gene direct highly restricted expression in DVA. Mutagenesis of this 85 bp region shows that it has at least four regions. The central 53 bp region contains a 29 bp region that represses expression and a 24 bp region that drives broad neuronal expression. Two short flanking regions restrict expression of the twk-16 gene to DVA. A shared GA-rich motif was identified in three of these genes but had opposite effects on expression when mutated in the nmr-1 and twk-16 DVA regulatory elements. We identified by multi-species conservation regulatory regions within three genes that direct expression in the DVA neuron. We identified four contiguous regions of sequence of the twk-16 gene enhancer with positive and negative effects on expression, which combined to restrict expression to the DVA neuron. For this neuron a single binding site may thus not achieve sufficient specificity for cell-specific expression. One of the positive elements, an 8-bp sequence required for expression, was identified in silico by sequence comparisons of seven nematode species, demonstrating the potential resolution of expanded multi-species phylogenetic comparisons.

  1. Identification of DVA Interneuron Regulatory Sequences in Caenorhabditis elegans

    PubMed Central

    Puckett Robinson, Carmie; Schwarz, Erich M.; Sternberg, Paul W.

    2013-01-01

    Background: The identity of each neuron is determined by the expression of a distinct group of genes comprising its terminal gene battery. The regulatory sequences that control the expression of such terminal gene batteries in individual neurons are largely unknown. The existence of a complete genome sequence for C. elegans and draft genomes of other nematodes let us use comparative genomics to identify regulatory sequences directing expression in the DVA interneuron. Methodology/Principal Findings: Using phylogenetic comparisons of multiple Caenorhabditis species, we identified conserved non-coding sequences in 3 of 10 genes (fax-1, nmr-1, and twk-16) that direct expression of reporter transgenes in DVA and other neurons. The conserved region and flanking sequences in an 85-bp intronic region of the twk-16 gene direct highly restricted expression in DVA. Mutagenesis of this 85 bp region shows that it has at least four regions. The central 53 bp region contains a 29 bp region that represses expression and a 24 bp region that drives broad neuronal expression. Two short flanking regions restrict expression of the twk-16 gene to DVA. A shared GA-rich motif was identified in three of these genes but had opposite effects on expression when mutated in the nmr-1 and twk-16 DVA regulatory elements. Conclusions/Significance: We identified by multi-species conservation regulatory regions within three genes that direct expression in the DVA neuron. We identified four contiguous regions of sequence of the twk-16 gene enhancer with positive and negative effects on expression, which combined to restrict expression to the DVA neuron. For this neuron a single binding site may thus not achieve sufficient specificity for cell-specific expression. One of the positive elements, an 8-bp sequence required for expression, was identified in silico by sequence comparisons of seven nematode species, demonstrating the potential resolution of expanded multi-species phylogenetic comparisons.

  2. Learning gene regulatory networks from next generation sequencing data.

    PubMed

    Jia, Bochao; Xu, Suwa; Xiao, Guanghua; Lamba, Vishal; Liang, Faming

    2017-03-10

    In recent years, next generation sequencing (NGS) has gradually replaced microarray as the major platform in measuring gene expression. Compared to microarray, NGS has many advantages, such as less noise and higher throughput. However, the discreteness of NGS data also challenges the existing statistical methodology. In particular, the literature still lacks an appropriate statistical method for reconstructing gene regulatory networks using NGS data. The existing local Poisson graphical model method is not consistent and can only infer certain local structures of the network. In this article, we propose a random effect model-based transformation to continuize NGS data and then we transform the continuized data to Gaussian via a semiparametric transformation and apply an equivalent partial correlation selection method to reconstruct gene regulatory networks. The proposed method is consistent. The numerical results indicate that the proposed method can lead to much more accurate inference of gene regulatory networks than the local Poisson graphical model and other existing methods. The proposed data-continuized transformation fills the theoretical gap for how to transform discrete data to continuous data and facilitates NGS data analysis. The proposed data-continuized transformation also makes it feasible to integrate different types of data, such as microarray and RNA-seq data, in reconstruction of gene regulatory networks.

  3. Interrogating transcriptional regulatory sequences in Tol2-mediated Xenopus transgenics.

    PubMed

    Loots, Gabriela G; Bergmann, Anne; Hum, Nicholas R; Oldenburg, Catherine E; Wills, Andrea E; Hu, Na; Ovcharenko, Ivan; Harland, Richard M

    2013-01-01

    Identifying gene regulatory elements and their target genes in vertebrates remains a significant challenge. It is now recognized that transcriptional regulatory sequences are critical in orchestrating dynamic controls of tissue-specific gene expression during vertebrate development and in adult tissues, and that these elements can be positioned at great distances in relation to the promoters of the genes they control. While significant progress has been made in mapping DNA binding regions by combining chromatin immunoprecipitation and next generation sequencing, functional validation remains a limiting step in improving our ability to correlate in silico predictions with biological function. We recently developed a computational method that synergistically combines genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to predict tissue-specific enhancers in the human genome. We applied this method to 270 genes highly expressed in skeletal muscle and predicted 190 putative cis-regulatory modules. Furthermore, we optimized Tol2 transgenic constructs in Xenopus laevis to interrogate 20 of these elements for their ability to function as skeletal muscle-specific transcriptional enhancers during embryonic development. We found 45% of these elements expressed only in the fast muscle fibers that are oriented in highly organized chevrons in the Xenopus laevis tadpole. Transcription factor binding site analysis identified >2 Mef2/MyoD sites within ~200 bp regions in 6 of the validated enhancers, and systematic mutagenesis of these sites revealed that they are critical for the enhancer function. The data described herein introduce a new reporter system suitable for interrogating tissue-specific cis-regulatory elements which allows monitoring of enhancer activity in real time, throughout early stages of embryonic development, in Xenopus.

  4. Interrogating Transcriptional Regulatory Sequences in Tol2-Mediated Xenopus Transgenics

    PubMed Central

    Loots, Gabriela G.; Bergmann, Anne; Hum, Nicholas R.; Oldenburg, Catherine E.; Wills, Andrea E.; Hu, Na; Ovcharenko, Ivan; Harland, Richard M.

    2013-01-01

    Identifying gene regulatory elements and their target genes in vertebrates remains a significant challenge. It is now recognized that transcriptional regulatory sequences are critical in orchestrating dynamic controls of tissue-specific gene expression during vertebrate development and in adult tissues, and that these elements can be positioned at great distances in relation to the promoters of the genes they control. While significant progress has been made in mapping DNA binding regions by combining chromatin immunoprecipitation and next generation sequencing, functional validation remains a limiting step in improving our ability to correlate in silico predictions with biological function. We recently developed a computational method that synergistically combines genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to predict tissue-specific enhancers in the human genome. We applied this method to 270 genes highly expressed in skeletal muscle and predicted 190 putative cis-regulatory modules. Furthermore, we optimized Tol2 transgenic constructs in Xenopus laevis to interrogate 20 of these elements for their ability to function as skeletal muscle-specific transcriptional enhancers during embryonic development. We found 45% of these elements expressed only in the fast muscle fibers that are oriented in highly organized chevrons in the Xenopus laevis tadpole. Transcription factor binding site analysis identified >2 Mef2/MyoD sites within ∼200 bp regions in 6 of the validated enhancers, and systematic mutagenesis of these sites revealed that they are critical for the enhancer function. The data described herein introduces a new reporter system suitable for interrogating tissue-specific cis-regulatory elements which allows monitoring of enhancer activity in real time, throughout early stages of embryonic development, in Xenopus. PMID:23874664

  5. Pleiotropy constrains the evolution of protein but not regulatory sequences in a transcription regulatory network influencing complex social behaviors

    PubMed Central

    Molodtsova, Daria; Harpur, Brock A.; Kent, Clement F.; Seevananthan, Kajendra; Zayed, Amro

    2014-01-01

    It is increasingly apparent that genes and networks that influence complex behavior are evolutionarily conserved, which is paradoxical considering that behavior is labile over evolutionary timescales. How does adaptive change in behavior arise if behavior is controlled by conserved, pleiotropic, and likely evolutionarily constrained genes? Pleiotropy and connectedness are known to constrain the general rate of protein evolution, prompting some to suggest that the evolution of complex traits, including behavior, is fuelled by regulatory sequence evolution. However, we seldom have data on the strength of selection on mutations in coding and regulatory sequences, and this hinders our ability to study how pleiotropy influences coding and regulatory sequence evolution. Here we use population genomics to estimate the strength of selection on coding and regulatory mutations for a transcriptional regulatory network that influences complex behavior of honey bees. We found that replacement mutations in highly connected transcription factors and target genes experience significantly stronger negative selection relative to weakly connected transcription factors and targets. Adaptively evolving proteins were significantly more likely to reside at the periphery of the regulatory network, while proteins with signs of negative selection were near the core of the network. Interestingly, connectedness and network structure had minimal influence on the strength of selection on putative regulatory sequences for both transcription factors and their targets. Our study indicates that adaptive evolution of complex behavior can arise because of positive selection on protein-coding mutations in peripheral genes, and on regulatory sequence mutations in both transcription factors and their targets throughout the network. PMID:25566318

  6. Population genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus.

    PubMed

    Garfield, David; Haygood, Ralph; Nielsen, William J; Wray, Gregory A

    2012-01-01

    Despite the fact that noncoding sequences comprise a substantial fraction of functional sites within all genomes, the evolutionary mechanisms that operate on genetic variation within regulatory elements remain poorly understood. In this study, we examine the population genetics of the core, upstream cis-regulatory regions of eight genes (AN, CyIIa, CyIIIa, Endo16, FoxB, HE, SM30 a, and SM50) that function during the early development of the purple sea urchin, Strongylocentrotus purpuratus. Quantitative and qualitative measures of segregating variation are not conspicuously different between cis-regulatory and closely linked "proxy neutral" noncoding regions containing no known functional sites. Length and compound mutations are common in noncoding sequences; conventional descriptive statistics ignore such mutations, under-representing true genetic variation by approximately 28% for these loci in this population. Patterns of variation in the cis-regulatory regions of six of the genes examined (CyIIa, CyIIIa, Endo16, FoxB, AN, and HE) are consistent with directional selection. Genetic variation within annotated transcription factor binding sites is comparable to, and frequently greater than, that of surrounding sequences. Comparisons of two paralog pairs (CyIIa/CyIIIa and AN/HE) suggest that distinct evolutionary processes have operated on their cis-regulatory regions following gene duplication. Together, these analyses provide a detailed view of the evolutionary mechanisms operating on noncoding sequences within a natural population, and underscore how little is known about how these processes operate on cis-regulatory sequences.

  7. Sequence diversity of the Trypanosoma cruzi complement regulatory protein family.

    PubMed

    Beucher, M; Norris, K A

    2008-02-01

    As a central component of innate immunity, complement activation is a critical mechanism of containment and clearance of microbial pathogens in advance of the development of acquired immunity. Several pathogens restrict complement activation through the acquisition of host proteins that regulate complement activation or through the production of their own complement regulatory molecules (M. K. Liszewski, M. K. Leung, R. Hauhart, R. M. Buller, P. Bertram, X. Wang, A. M. Rosengard, G. J. Kotwal, and J. P. Atkinson, J. Immunol. 176:3725-3734, 2006; J. Lubinski, L. Wang, D. Mastellos, A. Sahu, J. D. Lambris, and H. M. Friedman, J. Exp. Med. 190:1637-1646, 1999). The infectious stage of the protozoan parasite Trypanosoma cruzi produces a surface-anchored complement regulatory protein (CRP) that functions to inhibit alternative and classical pathway complement activation (K. A. Norris, B. Bradt, N. R. Cooper, and M. So, J. Immunol. 147:2240-2247, 1991). This study addresses the genomic complexity of the T. cruzi CRP and its relationship to the T. cruzi supergene family comprising active trans-sialidase (TS) and TS-like proteins. The TS superfamily consists of several functionally distinct subfamilies that share a characteristic sialidase domain at their amino termini. These TS families include active TS, adhesions, CRPs, and proteins of unknown functions (G. A. Cross and G. B. Takle, Annu. Rev. Microbiol. 47:385-411, 1993). A sequence comparison search of GenBank using BLASTP revealed several full-length paralogs of CRP. These proteins share significant homology at their amino termini and a strong spatial conservation of cysteine residues. Alternative pathway complement regulation was confirmed for CRP paralogs with 58% (low) and 83% (high) identity to AAB49414. CRPs are functionally similar to the microbial and mammalian proteins that regulate complement activation. Sequence alignment of mammalian complement control proteins to CRP showed that these sequences are

  8. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  9. Complete MHC Haplotype Sequencing for Common Disease Gene Mapping

    PubMed Central

    Stewart, C. Andrew; Horton, Roger; Allcock, Richard J.N.; Ashurst, Jennifer L.; Atrazhev, Alexey M.; Coggill, Penny; Dunham, Ian; Forbes, Simon; Halls, Karen; Howson, Joanna M.M.; Humphray, Sean J.; Hunt, Sarah; Mungall, Andrew J.; Osoegawa, Kazutoyo; Palmer, Sophie; Roberts, Anne N.; Rogers, Jane; Sims, Sarah; Wang, Yu; Wilming, Laurens G.; Elliott, John F.; de Jong, Pieter J.; Sawcer, Stephen; Todd, John A.; Trowsdale, John; Beck, Stephan

    2004-01-01

    The future systematic mapping of variants that confer susceptibility to common diseases requires the construction of a fully informative polymorphism map. Ideally, every base pair of the genome would be sequenced in many individuals. Here, we report 4.75 Mb of contiguous sequence for each of two common haplotypes of the major histocompatibility complex (MHC), to which susceptibility to >100 diseases has been mapped. The autoimmune disease-associated-haplotypes HLA-A3-B7-Cw7-DR15 and HLA-A1-B8-Cw7-DR3 were sequenced in their entirety through a bacterial artificial chromosome (BAC) cloning strategy using the consanguineous cell lines PGF and COX, respectively. The two sequences were annotated to encompass all described splice variants of expressed genes. We defined the complete variation content of the two haplotypes, revealing >18,000 variations between them. Average SNP densities ranged from less than one SNP per kilobase to >60. Acquisition of complete and accurate sequence data over polymorphic regions such as the MHC from large-insert cloned DNA provides a definitive resource for the construction of informative genetic maps, and avoids the limitation of chromosome regions that are refractory to PCR amplification. PMID:15140828

  10. Abnormality of regulatory T cells in common variable immunodeficiency.

    PubMed

    Azizi, Gholamreza; Hafezi, Nasim; Mohammadi, Hamed; Yazdani, Reza; Alinia, Tina; Tavakol, Marzieh; Aghamohammadi, Asghar; Mirshafiey, Abbas

    2017-05-01

    Common variable immunodeficiency (CVID) is a heterogeneous group of primary antibody deficiencies (PAD) which is defined by recurrent infections, hypogammaglobulinemia and defects in B-cell differentiation into plasma cells and memory B cells. T cell abnormalities have also been described in CVID patients. Several studies reported that Treg frequencies and their functional characteristics are disturbed and might account for the aberrant immune responses observed in CVID patients. The aim of this review is to describe phenotypic and functional characteristics of Treg cells, and to review the literature with respect to the reported Treg defects and their association with the clinical manifestations in CVID. Copyright © 2016. Published by Elsevier Inc.

  11. Sex determination strategies in 2012: towards a common regulatory model?

    PubMed Central

    2012-01-01

    Sex determination is a complicated process involving large-scale modifications in gene expression affecting virtually every tissue in the body. Although the evolutionary origin of sex remains controversial, there is little doubt that it has developed as a process of optimizing metabolic control, as well as developmental and reproductive functions within a given setting of limited resources and environmental pressure. Evidence from various model organisms supports the view that sex determination may occur as a result of direct environmental induction or genetic regulation. The first process has been well documented in reptiles and fish, while the second is the classic case for avian species and mammals. Both of the latter have developed a variety of sex-specific/sex-related genes, which ultimately form a complete chromosome pair (sex chromosomes/gonosomes). Interestingly, combinations of environmental and genetic mechanisms have been described among different classes of animals, thus rendering the possibility of a unidirectional continuous evolutionary process from the one type of mechanism to the other unlikely. On the other hand, common elements appear throughout the animal kingdom, with regard to a) conserved key genes and b) a central role of sex steroid control as a prerequisite for ultimately normal sex differentiation. Studies in invertebrates also indicate a role of epigenetic chromatin modification, particularly with regard to alternative splicing options. This review summarizes current evidence from research in this hot field and signifies the need for further study of both normal hormonal regulators of sexual phenotype and patterns of environmental disruption. PMID:22357269

  12. Sex determination strategies in 2012: towards a common regulatory model?

    PubMed

    Angelopoulou, Roxani; Lavranos, Giagkos; Manolakou, Panagiota

    2012-02-22

    Sex determination is a complicated process involving large-scale modifications in gene expression affecting virtually every tissue in the body. Although the evolutionary origin of sex remains controversial, there is little doubt that it has developed as a process of optimizing metabolic control, as well as developmental and reproductive functions within a given setting of limited resources and environmental pressure. Evidence from various model organisms supports the view that sex determination may occur as a result of direct environmental induction or genetic regulation. The first process has been well documented in reptiles and fish, while the second is the classic case for avian species and mammals. Both of the latter have developed a variety of sex-specific/sex-related genes, which ultimately form a complete chromosome pair (sex chromosomes/gonosomes). Interestingly, combinations of environmental and genetic mechanisms have been described among different classes of animals, thus rendering the possibility of a unidirectional continuous evolutionary process from the one type of mechanism to the other unlikely. On the other hand, common elements appear throughout the animal kingdom, with regard to a) conserved key genes and b) a central role of sex steroid control as a prerequisite for ultimately normal sex differentiation. Studies in invertebrates also indicate a role of epigenetic chromatin modification, particularly with regard to alternative splicing options. This review summarizes current evidence from research in this hot field and signifies the need for further study of both normal hormonal regulators of sexual phenotype and patterns of environmental disruption.

  13. Filling the gaps - the generation of full genomic sequences for 15 common and well-documented HLA class I alleles using next-generation sequencing technology.

    PubMed

    Lind, C; Ferriola, D; Mackiewicz, K; Papazoglou, A; Sasson, A; Monos, D

    2013-03-01

    Many common and well-documented (CWD) HLA alleles have only been partially characterized. The DNA sequence of these incomplete alleles, as published in the IMGT/HLA database, is most often limited to exons that code for the extracellular domains of the mature protein. Here we describe the application of next-generation sequencing technology to obtain full length genomic sequence from a single long-range PCR amplicon for 15 common and well-documented HLA Class I alleles. This technology is well suited to fill in the gaps of the current HLA allele sequence database which is largely incomplete. A more comprehensive catalog of HLA allele sequences would be beneficial in the evaluation of mismatches in transplantation, studies of population genetics, the evolution of HLAs, regulatory mechanisms and HLA expression, and issues related to the genomic organization of the MHC. Copyright © 2012. Published by Elsevier Inc.

  14. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  15. Novel players in the AP2-miR172 regulatory network for common bean nodulation.

    PubMed

    Íñiguez, Luis P; Nova-Franco, Bárbara; Hernández, Georgina

    2015-01-01

    The intricate regulatory network for floral organogenesis in plants, which includes AP2/ERF, SPL and AGL transcription factors and miR172 and miR156 along with other components, is well documented, though its complexity and size keep increasing. The miR172/AP2 node was recently proposed as an essential regulator in the legume-rhizobia nitrogen-fixing symbiosis. Research from our group helped demonstrate the control of common bean (Phaseolus vulgaris) nodulation by miR172c/AP2-1; however, no other components of this regulatory network have been reported. Here we propose AGLs as new protagonists in the regulation of common bean nodulation and discuss the relevance of future deeper analysis of the complex AP2 regulatory network for nodule organogenesis in legumes.

  16. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at , is open for business. PMID:17916232

  17. Sequence similarity network reveals common ancestry of multidomain proteins.

    PubMed

    Song, Nan; Joseph, Jacob M; Davis, George B; Durand, Dannie

    2008-05-16

    We address the problem of homology identification in complex multidomain families with varied domain architectures. The challenge is to distinguish sequence pairs that share common ancestry from pairs that share an inserted domain but are otherwise unrelated. This distinction is essential for accuracy in gene annotation, function prediction, and comparative genomics. There are two major obstacles to multidomain homology identification: lack of a formal definition and lack of curated benchmarks for evaluating the performance of new methods. We offer preliminary solutions to both problems: 1) an extension of the traditional model of homology to include domain insertions; and 2) a manually curated benchmark of well-studied families in mouse and human. We further present Neighborhood Correlation, a novel method that exploits the local structure of the sequence similarity network to identify homologs with great accuracy based on the observation that gene duplication and domain shuffling leave distinct patterns in the sequence similarity network. In a rigorous, empirical comparison using our curated data, Neighborhood Correlation outperforms sequence similarity, alignment length, and domain architecture comparison. Neighborhood Correlation is well suited for automated, genome-scale analyses. It is easy to compute, does not require explicit knowledge of domain architecture, and classifies both single and multidomain homologs with high accuracy. Homolog predictions obtained with our method, as well as our manually curated benchmark and a web-based visualization tool for exploratory analysis of the network neighborhood structure, are available at http://www.neighborhoodcorrelation.org. Our work represents a departure from the prevailing view that the concept of homology cannot be applied to genes that have undergone domain shuffling. In contrast to current approaches that either focus on the homology of individual domains or consider only families with identical domain

  18. Noninvasive prenatal diagnosis of common aneuploidies by semiconductor sequencing

    PubMed Central

    Liao, Can; Yin, Ai-hua; Peng, Chun-fang; Fu, Fang; Yang, Jie-xia; Li, Ru; Chen, Yang-yi; Luo, Dong-hong; Zhang, Yong-ling; Ou, Yan-mei; Li, Jian; Wu, Jing; Mai, Ming-qin; Hou, Rui; Wu, Frances; Luo, Hongrong; Li, Dong-zhi; Liu, Hai-liang; Zhang, Xiao-zhuang; Zhang, Kang

    2014-01-01

    Massively parallel sequencing (MPS) of cell-free fetal DNA from maternal plasma has revolutionized our ability to perform noninvasive prenatal diagnosis. This approach avoids the risk of fetal loss associated with more invasive diagnostic procedures. The present study developed an effective method for noninvasive prenatal diagnosis of common chromosomal aneuploidies using a benchtop semiconductor sequencing platform (SSP), which relies on the MPS platform but offers advantages over existing noninvasive screening techniques. A total of 2,275 pregnant subjects was included in the study; of these, 515 subjects who had full karyotyping results were used in a retrospective analysis, and 1,760 subjects without karyotyping were analyzed in a prospective study. In the retrospective study, all 55 fetal trisomy 21 cases were identified using the SSP with a sensitivity and specificity of 99.94% and 99.46%, respectively. The SSP also detected 16 trisomy 18 cases with 100% sensitivity and 99.24% specificity and 3 trisomy 13 cases with 100% sensitivity and 100% specificity. Furthermore, 15 fetuses with sex chromosome aneuploidies (10 45,X, 2 47,XYY, 2 47,XXX, and 1 47,XXY) were detected. In the prospective study, nine fetuses with trisomy 21, three with trisomy 18, three with trisomy 13, and one with 45,X were detected. To our knowledge, this is the first large-scale clinical study to systematically identify chromosomal aneuploidies based on cell-free fetal DNA using the SSP and provides an effective strategy for large-scale noninvasive screening for chromosomal aneuploidies in a clinical setting. PMID:24799683

  19. Noninvasive prenatal diagnosis of common aneuploidies by semiconductor sequencing.

    PubMed

    Liao, Can; Yin, Ai-hua; Peng, Chun-fang; Fu, Fang; Yang, Jie-xia; Li, Ru; Chen, Yang-yi; Luo, Dong-hong; Zhang, Yong-ling; Ou, Yan-mei; Li, Jian; Wu, Jing; Mai, Ming-qin; Hou, Rui; Wu, Frances; Luo, Hongrong; Li, Dong-zhi; Liu, Hai-liang; Zhang, Xiao-zhuang; Zhang, Kang

    2014-05-20

    Massively parallel sequencing (MPS) of cell-free fetal DNA from maternal plasma has revolutionized our ability to perform noninvasive prenatal diagnosis. This approach avoids the risk of fetal loss associated with more invasive diagnostic procedures. The present study developed an effective method for noninvasive prenatal diagnosis of common chromosomal aneuploidies using a benchtop semiconductor sequencing platform (SSP), which relies on the MPS platform but offers advantages over existing noninvasive screening techniques. A total of 2,275 pregnant subjects was included in the study; of these, 515 subjects who had full karyotyping results were used in a retrospective analysis, and 1,760 subjects without karyotyping were analyzed in a prospective study. In the retrospective study, all 55 fetal trisomy 21 cases were identified using the SSP with a sensitivity and specificity of 99.94% and 99.46%, respectively. The SSP also detected 16 trisomy 18 cases with 100% sensitivity and 99.24% specificity and 3 trisomy 13 cases with 100% sensitivity and 100% specificity. Furthermore, 15 fetuses with sex chromosome aneuploidies (10 45,X, 2 47,XYY, 2 47,XXX, and 1 47,XXY) were detected. In the prospective study, nine fetuses with trisomy 21, three with trisomy 18, three with trisomy 13, and one with 45,X were detected. To our knowledge, this is the first large-scale clinical study to systematically identify chromosomal aneuploidies based on cell-free fetal DNA using the SSP and provides an effective strategy for large-scale noninvasive screening for chromosomal aneuploidies in a clinical setting.

  20. Synthetic muscle promoters: activities exceeding naturally occurring regulatory sequences

    NASA Technical Reports Server (NTRS)

    Li, X.; Eastman, E. M.; Schwartz, R. J.; Draghia-Akli, R.

    1999-01-01

    Relatively low levels of expression from naturally occurring promoters have limited the use of muscle as a gene therapy target. Myogenic restricted gene promoters display complex organization usually involving combinations of several myogenic regulatory elements. By random assembly of E-box, MEF-2, TEF-1, and SRE sites into synthetic promoter recombinant libraries, and screening of hundreds of individual clones for transcriptional activity in vitro and in vivo, several artificial promoters were isolated whose transcriptional potencies greatly exceed those of natural myogenic and viral gene promoters.

  2. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences

    PubMed Central

    Hughes, Jim R.; Cheng, Jan-Fang; Ventress, Nicki; Prabhakar, Shyam; Clark, Kevin; Anguita, Eduardo; De Gobbi, Marco; de Jong, Pieter; Rubin, Eddy; Higgs, Douglas R.

    2005-01-01

    An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of ≈238 kb containing the human α globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs. PMID:15998734

  3. Signal sequence detection given noisy, common background image sets.

    NASA Technical Reports Server (NTRS)

    Harger, R. O.

    1972-01-01

    The optimum processing (likelihood functional) is found for a set of M images, each the sum of a member of a signal sequence due to an object that is to be detected and whose parameters are to be estimated, a sample function of a noise field, and a sample function of a common background field. The noise fields are independent, zero-mean, white Gaussian fields, all independent of the background field. The latter is assumed to be either (1) completely unknown or of known mean and covariance functions with (2) a certain fluctuation property or (3) Gaussian. Three equivalent forms of the optimum processing are found: (1) a summation of generalized matched filterings of the images, (2) a summation of matched filtering of certain generalized differences of the images, and (3) a summation of 'estimator-correlator' type filterings. The detection performance and optimum signal/image selection under the Neyman-Pearson criterion are given, and it is shown that optimum processor and signal design can completely eliminate any effect of the background on detectability.

  4. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  5. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    PubMed

    Gordon, Kacy L; Arthur, Robert K; Ruvinsky, Ilya

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  6. Regulatory convergence and harmonization: barriers to effective use and adoption of common standards.

    PubMed

    Pombo, María Luz; Porrás, Analía; Saidon, Patricia Claudia; Cascio, Stephanie M

    2016-05-01

    Objective To evaluate 1) the level of use and adoption of eight Technical Documents (TDs) published by the Pan American Network for Drug Regulatory Harmonization (PANDRH) member states and 2) identify the hurdles that can prevent countries from successfully adopting a common standard. Methods An in-depth analysis of the incorporation of PANDRH Technical Document No. 1 ("TDNo1") recommendations in member states' national requirements was carried out. Results The results illustrate the role of PANDRH in promoting convergence/harmonization among its members. Conclusions The study results show that the rate of use of TDs varied greatly by product/area and country. Timing, TD content, and product/area, and, more importantly, national capacities, are critical determinants of the level of TD guideline adoption. While PANDRH TDs have proven instrumental for the harmonization/convergence of member states' national requirements, as shown by the level of convergence across a majority of the national requirements issued for vaccine licensing, several countries had yet to incorporate common standards due, in large part, to weak national regulatory capacities. Therefore, harmonization/convergence initiatives should include the strengthening of national regulatory capacities as part of their core strategy, which will, in turn, allow for the incorporation and deployment of common standards in all participating countries.

  7. Genetic Diagnosis Using Whole Exome Sequencing in Common Variable Immunodeficiency

    PubMed Central

    Maffucci, Patrick; Filion, Charles A.; Boisson, Bertrand; Itan, Yuval; Shang, Lei; Casanova, Jean-Laurent; Cunningham-Rundles, Charlotte

    2016-01-01

    Whole exome sequencing (WES) has proven an effective tool for the discovery of genetic defects in patients with primary immunodeficiencies (PIDs). However, success in dissecting the genetic etiology of common variable immunodeficiency (CVID) has been limited. We outline a practical framework for using WES to identify causative genetic defects in these subjects. WES was performed on 50 subjects diagnosed with CVID who had at least one of the following criteria: early onset, autoimmune/inflammatory manifestations, low B lymphocytes, and/or familial history of hypogammaglobulinemia. Following alignment and variant calling, exomes were screened for mutations in 269 PID-causing genes. Variants were filtered based on the mode of inheritance and reported frequency in the general population. Each variant was assessed by study of familial segregation and computational predictions of deleteriousness. Out of 433 variations in PID-associated genes, we identified 17 probable disease-causing mutations in 15 patients (30%). These variations were rare or private and included monoallelic mutations in NFKB1, STAT3, CTLA4, PIK3CD, and IKZF1, and biallelic mutations in LRBA and STXBP2. Forty-two other damaging variants were found but were not considered likely disease-causing based on the mode of inheritance and/or patient phenotype. WES combined with analysis of PID-associated genes is a cost-effective approach to identify disease-causing mutations in CVID patients with severe phenotypes and was successful in 30% of our cohort. As targeted therapeutics are becoming the mainstay of treatment for non-infectious manifestations in CVID, this approach will improve management of patients with more severe phenotypes. PMID:27379089

  8. Regulatory sequences for expressing genes in oomycete fungi.

    PubMed

    Judelson, H S; Tyler, B M; Michelmore, R W

    1992-07-01

    Promoter and terminator sequences from a range of species were tested for activity in the oomycetes, a group of lower fungi that bear an uncertain taxonomic affinity to other organisms and in which little is known of the sequences required for transcription. Transient assays, using the reporter gene beta-glucuronidase (GUS), were used to examine the function of these promoters and terminators in the plant pathogens Phytophthora infestans and P. megasperma f. sp. glycinea, and in the saprophytic water mold, Achlya ambisexualis. Oomycete promoters, isolated from the ham34 and hsp70 genes of Bremia lactucae and the actin gene of P. megasperma f. sp. glycinea, resulted in high levels of GUS accumulation in each of the three oomycetes. In contrast, little or no activity was detected when promoters from higher fungi (four ascomycetes and one basidiomycete), plants, and animals were tested. The terminator from the ham34 gene resulted in much higher levels of GUS accumulation than did others, although an oomycete terminator was not absolutely required for expression. Transcript mapping of RNA from stable transformants confirmed accurate initiation from the B. lactucae hsp70 promoter and termination within 3' ham34 sequences in P. infestans. Our results indicate that the transcriptional machinery of the oomycetes differs significantly from that of the higher fungi, but that enough conservation exists within the class to allow vectors developed from one oomycete species to be used for others.

  9. A web site for the computational analysis of yeast regulatory sequences.

    PubMed

    van Helden, J; André, B; Collado-Vides, J

    2000-01-30

    A series of computer programs were developed for the analysis of regulatory sequences, with a special focus on yeast. These tools are publicly available on the web (http://copan.cifn.unam.mx/Computational_Biology/yeast-tools or http://www.ucmb.ulb.ac.be/bioinformatics/rsa-tools/). Basically, three classical problems can be addressed: (a) search for known regulatory patterns in the upstream regions of known genes; (b) discovery of unknown regulatory patterns within a set of upstream regions known to be co-regulated; (c) search for unknown genes potentially regulated by a known transcription factor. Each of these tasks can be performed on the basis of a simple (string) or more refined (matrix) description of the regulatory patterns. A feature-map program automatically generates visual representations of the positions at which patterns were found. The site also provides a series of general utilities, such as generation of random sequence, automatic drawing of XY graphs, interconversions between sequence formats, etc. Several tools are linked together to allow their sequential utilization (piping), but each one can also be used independently by filling the web form with external data. This widens the scope of the site to the analysis of non-regulatory and/or non-yeast sequences. Copyright 2000 John Wiley & Sons, Ltd.
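    Task (a), searching an upstream region for a known pattern given as a string, can be sketched as follows. This is a minimal illustration and not the code behind the web site: the function names, the example motif, and the example sequence are assumptions, and the pattern is interpreted with the standard IUPAC degenerate nucleotide codes.

```python
import re

# Standard IUPAC nucleotide codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an ACGT string."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_pattern(pattern, sequence):
    """Return sorted (start, strand) hits of an IUPAC pattern on both strands.

    Start positions are 0-based and always refer to the forward strand.
    A lookahead is used so that overlapping matches are all reported.
    """
    sequence = sequence.upper()
    body = "".join(IUPAC[symbol] for symbol in pattern.upper())
    regex = re.compile("(?=(" + body + "))")
    width = len(pattern)
    hits = [(m.start(), "+") for m in regex.finditer(sequence)]
    reverse = reverse_complement(sequence)
    hits += [
        (len(sequence) - m.start() - width, "-")
        for m in regex.finditer(reverse)
    ]
    return sorted(hits)


# Hypothetical usage: a GATA-like motif in a made-up upstream fragment.
print(find_pattern("GATWAG", "TTGATAAGCC"))
```

    A palindromic pattern is reported once per strand by this sketch; whether to merge such hits is a design choice left open here.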

  10. Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms

    PubMed Central

    Guo, Michael H.; Nandakumar, Satish K.; Ulirsch, Jacob C.; Zekavat, Seyedeh M.; Buenrostro, Jason D.; Natarajan, Pradeep; Salem, Rany M.; Chiarle, Roberto; Mitt, Mario; Kals, Mart; Pärn, Kalle; Fischer, Krista; Milani, Lili; Mägi, Reedik; Palta, Priit; Gabriel, Stacey B.; Metspalu, Andres; Lander, Eric S.; Kathiresan, Sekar; Hirschhorn, Joel N.; Esko, Tõnu; Sankaran, Vijay G.

    2017-01-01

    Genetic variants affecting hematopoiesis can influence commonly measured blood cell traits. To identify factors that affect hematopoiesis, we performed association studies for blood cell traits in the population-based Estonian Biobank using high-coverage whole-genome sequencing (WGS) in 2,284 samples and SNP genotyping in an additional 14,904 samples. Using up to 7,134 samples with available phenotype data, our analyses identified 17 associations across 14 blood cell traits. Integration of WGS-based fine-mapping and complementary epigenomic datasets provided evidence for causal mechanisms at several loci, including at a previously undiscovered basophil count-associated locus near the master hematopoietic transcription factor CEBPA. The fine-mapped variant at this basophil count association near CEBPA overlapped an enhancer active in common myeloid progenitors and influenced its activity. In situ perturbation of this enhancer by CRISPR/Cas9 mutagenesis in hematopoietic stem and progenitor cells demonstrated that it is necessary for and specifically regulates CEBPA expression during basophil differentiation. We additionally identified basophil count-associated variation at another more pleiotropic myeloid enhancer near GATA2, highlighting regulatory mechanisms for ordered expression of master hematopoietic regulators during lineage specification. Our study illustrates how population-based genetic studies can provide key insights into poorly understood cell differentiation processes of considerable physiologic relevance. PMID:28031487

  11. Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms.

    PubMed

    Guo, Michael H; Nandakumar, Satish K; Ulirsch, Jacob C; Zekavat, Seyedeh M; Buenrostro, Jason D; Natarajan, Pradeep; Salem, Rany M; Chiarle, Roberto; Mitt, Mario; Kals, Mart; Pärn, Kalle; Fischer, Krista; Milani, Lili; Mägi, Reedik; Palta, Priit; Gabriel, Stacey B; Metspalu, Andres; Lander, Eric S; Kathiresan, Sekar; Hirschhorn, Joel N; Esko, Tõnu; Sankaran, Vijay G

    2017-01-17

    Genetic variants affecting hematopoiesis can influence commonly measured blood cell traits. To identify factors that affect hematopoiesis, we performed association studies for blood cell traits in the population-based Estonian Biobank using high-coverage whole-genome sequencing (WGS) in 2,284 samples and SNP genotyping in an additional 14,904 samples. Using up to 7,134 samples with available phenotype data, our analyses identified 17 associations across 14 blood cell traits. Integration of WGS-based fine-mapping and complementary epigenomic datasets provided evidence for causal mechanisms at several loci, including at a previously undiscovered basophil count-associated locus near the master hematopoietic transcription factor CEBPA. The fine-mapped variant at this basophil count association near CEBPA overlapped an enhancer active in common myeloid progenitors and influenced its activity. In situ perturbation of this enhancer by CRISPR/Cas9 mutagenesis in hematopoietic stem and progenitor cells demonstrated that it is necessary for and specifically regulates CEBPA expression during basophil differentiation. We additionally identified basophil count-associated variation at another more pleiotropic myeloid enhancer near GATA2, highlighting regulatory mechanisms for ordered expression of master hematopoietic regulators during lineage specification. Our study illustrates how population-based genetic studies can provide key insights into poorly understood cell differentiation processes of considerable physiologic relevance.

  12. Molecular sled sequences are common in mammalian proteins

    PubMed Central

    Xiong, Kan; Blainey, Paul C.

    2016-01-01

    Recent work revealed a new class of molecular machines called molecular sleds, which are small basic molecules that bind and slide along DNA with the ability to carry cargo along DNA. Here, we performed biochemical and single-molecule flow stretching assays to investigate the basis of sliding activity in molecular sleds. In particular, we identified the functional core of pVIc, the first molecular sled characterized, and the peptide functional groups that control sliding activity, and we propose a model for the sliding activity of molecular sleds. We also observed widespread DNA binding and sliding activity among basic polypeptide sequences that implicate mammalian nuclear localization sequences and many cell penetrating peptides as molecular sleds. These basic protein motifs exhibit weak but physiologically relevant sequence-nonspecific DNA affinity. Our findings indicate that many mammalian proteins contain molecular sled sequences and suggest the possibility that substantial undiscovered sliding activity exists among nuclear mammalian proteins. PMID:26857546

  13. Poster — Thur Eve — 50: Common Regulatory Non-Compliances and How to Avoid Them

    SciTech Connect

    Heimann, M.

    2014-08-15

    The Accelerators and Class II Facilities Division (ACFD) of the Canadian Nuclear Safety Commission (CNSC) is responsible for the oversight of radiotherapy facilities containing Class II prescribed equipment in Canada. Over the past several years, ACFD has been performing compliance inspections of Class II nuclear facilities across the country (medical and otherwise) and, in that time, has issued several hundred corrective actions to licensees due to non-compliance with regulatory requirements. Recently, a study was done to determine the most common regulatory non-compliances. The purpose of this poster presentation is to disseminate information to the licensee community about the nature of these non-compliances and how licensees can avoid them in the future.

  14. Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12.

    PubMed

    Thieffry, D; Salgado, H; Huerta, A M; Collado-Vides, J

    1998-06-01

    As one of the best-characterized free-living organisms, Escherichia coli and its recently completed genomic sequence offer a special opportunity to exploit systematically the variety of regulatory data available in the literature in order to make a comprehensive set of regulatory predictions in the whole genome. The complete genome sequence of E. coli was analyzed for the binding of transcriptional regulators upstream of coding sequences. The biological information contained in RegulonDB (Huerta, A.M. et al., Nucleic Acids Res., 26, 55-60, 1998) for 56 different transcriptional proteins was the support to implement a stringent strategy combining string search and weight matrices. We estimate that our search included representatives of 15-25% of the total number of regulatory binding proteins in E. coli. This search was performed on the set of 4288 putative regulatory regions, each 450 bp long. Within the regions with predicted sites, 89% are regulated by one protein and 81% involve only one site. These numbers are reasonably consistent with the distribution of experimental regulatory sites. Regulatory sites are found in 603 regions corresponding to 16% of operon regions and 10% of intra-operonic regions. Additional evidence gives stronger support to some of these predictions, including the position of the site, biological consistency with the function of the downstream gene, as well as genetic evidence for the regulatory interaction. The predictions described here were incorporated into the map presented in the paper describing the complete E. coli genome (Blattner, F.R. et al., Science, 277, 1453-1461, 1997). The complete set of predictions in GenBank format is available at the URL http://www.cifn.unam.mx/Computational_Biology/E.coli-predictions (contact: ecoli-reg@cifn.unam.mx, collado@cifn.unam.mx).
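    The weight-matrix half of such a search can be sketched in a few lines. This is a generic log-odds position weight matrix scan under stated assumptions (uniform background of 0.25 per base, a pseudocount of 1, sequences containing only A, C, G, T), not the specific matrices or thresholds used in the study. The aligned sites, the sequence, and the threshold below are made up for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for position in range(len(sites[0])):
        column = [site[position] for site in sites]
        pwm.append({
            base: math.log2(
                ((column.count(base) + pseudocount) / total) / background
            )
            for base in "ACGT"
        })
    return pwm


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits


# Hypothetical aligned binding sites and a made-up 9 bp region.
example_sites = ["TGTGA", "TGTGA", "TGTGA", "TGAGA"]
example_pwm = build_pwm(example_sites)
print(scan(example_pwm, "CCTGTGACC", threshold=5.0))
```

    A stringent strategy of the kind the abstract describes would combine such a scan with an exact or degenerate string match, so that a region is reported only when both descriptions agree.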

  15. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

    PubMed

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-10-01

    Accelerated evolution of regulatory sequence can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity.

    PubMed

    Petrovski, Slavé; Gussow, Ayal B; Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H; Allen, Andrew S; Goldstein, David B

    2015-09-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, nc

  17. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance

  18. Use of H19 Gene Regulatory Sequences in DNA-Based Therapy for Pancreatic Cancer

    PubMed Central

    Scaiewicz, V.; Sorin, V.; Fellig, Y.; Birman, T.; Mizrahi, A.; Galula, J.; Abu-lail, R.; Shneider, T.; Ohana, P.; Buscail, L.; Hochberg, A.; Czerniak, A.

    2010-01-01

    Pancreatic cancer is the eighth most common cause of death from cancer in the world, for which palliative treatments are not effective and are frequently accompanied by severe side effects. We propose a DNA-based therapy for pancreatic cancer using a nonviral vector expressing the diphtheria toxin A chain under the control of the H19 gene regulatory sequences. The H19 gene encodes an oncofetal RNA expressed during embryo development and in several types of cancer. We tested the expression of the H19 gene in patients, and found that 65% of human pancreatic tumors analyzed showed moderate to strong expression of the gene. In vitro experiments showed that the vector was effective in reducing Luciferase protein activity in pancreatic carcinoma cell lines. In vivo experiments revealed tumor growth arrest in different animal models for pancreatic cancer. Differences in tumor size between control and treated groups reached 75% in the heterotopic model (P = .037) and 50% in the orthotopic model (P = .007). In addition, no visible metastases were found in the treated group of the orthotopic model. These results indicate that treatment with the vector DTA-H19 might be a viable new therapeutic option for patients with unresectable pancreatic cancer. PMID:21052499

  19. Sequence evidence for common ancestry of eukaryotic endomembrane coatomers

    PubMed Central

    Promponas, Vasilis J.; Katsani, Katerina R.; Blencowe, Benjamin J.; Ouzounis, Christos A.

    2016-01-01

    Eukaryotic cells are defined by compartments through which the trafficking of macromolecules is mediated by large complexes, such as the nuclear pore, transport vesicles and intraflagellar transport. The assembly and maintenance of these complexes is facilitated by endomembrane coatomers, long suspected to be divergently related on the basis of structural and more recently phylogenomic analysis. By performing supervised walks in sequence space across coatomer superfamilies, we uncover subtle sequence patterns that have remained elusive to date, ultimately unifying eukaryotic coatomers by divergent evolution. The conserved residues shared by 3,502 endomembrane coatomer components are mapped onto the solenoid superhelix of nucleoporin and COPII protein structures, thus determining the invariant elements of coatomer architecture. This ancient structural motif can be considered as a universal signature connecting eukaryotic coatomers involved in multiple cellular processes across cell physiology and human disease. PMID:26931514

  20. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.
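    The basic operation behind any such comparison, flagging slowly evolving stretches of an alignment, can be sketched with a plain sliding-window identity scan. This is a deliberately simpler technique than Gumby, which ranks elements by P-value using Karlin-Altschul statistics; the window length, identity cutoff, function name, and toy alignment below are all assumptions for illustration.

```python
def conserved_windows(aligned_a, aligned_b, window=10, min_identity=0.9):
    """Return (start, identity) for alignment windows at or above a cutoff.

    Both inputs are rows of the same pairwise alignment, so they must
    have equal length; '-' marks a gap and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(aligned_a) - window + 1):
        pairs = zip(
            aligned_a[start:start + window],
            aligned_b[start:start + window],
        )
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        identity = matches / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Toy alignment: the first ten columns are identical, the tail diverges.
print(conserved_windows("ACGTACGTACGGGG", "ACGTACGTACTTTT", min_identity=1.0))
```

    A fixed identity cutoff is exactly the kind of parameter that has to be retuned between close and distant comparisons, which is the adjustment the abstract says Gumby avoids.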

  1. Identification of common microRNA-mRNA regulatory biomodules in human epithelial cancers

    PubMed Central

    Yang, Xinan; Lee, Younghee; Fan, Hong; Sun, Xiao; Lussier, Yves A

    2010-01-01

    The complex regulatory network between microRNAs and gene expression remains an unclear domain of active research. We proposed to address part of this complex regulation with a novel approach for the genome-wide identification of biomodules derived from paired microRNA and mRNA profiles, which could reveal correlations associated with a complex network of de-regulation in human cancer. Two published expression datasets for 68 samples with 11 distinct types of epithelial cancers and 21 samples of normal tissues were used, containing microRNA expression (Lu et al. Nature Letters 2005) and gene expression (Ramaswarmy et al. PNAS 2001) profiles, respectively. As a result, microRNA expression used jointly with mRNA expression can provide better classifiers of epithelial cancers against normal epithelial tissue than either dataset alone (p = 1×10^-10, F-test). We identified a combination of six microRNA-mRNA biomodules that optimally classified epithelial cancers from normal epithelial tissue (total accuracy = 93.3%; 95% confidence interval: 86%-97%), using a penalized logistic regression (PLR) algorithm and three-fold cross-validation. Three of these biomodules are individually sufficient to cluster epithelial cancers from normal tissue using mutual information distance. The biomodules contain 10 distinct microRNAs and 98 distinct genes, including well known tumor markers such as miR-15a, miR-30e, IRAK1, TGFBR2, DUSP16, CDC25B and PDCD2. In addition, there is a significant enrichment (Fisher's exact test, p = 3×10^-10) between putative microRNA-target gene pairs reported in five microRNA target databases and the inversely correlated microRNA-mRNA pairs in the biomodules. Further, microRNAs and genes in the biomodules were found in abstracts mentioning epithelial cancers (Fisher's exact test, unadjusted p < 0.05). Taken together, these results strongly suggest that the discovered microRNA-mRNA biomodules correspond to regulatory mechanisms common to human epithelial cancer

  2. The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences

    PubMed Central

    Portales-Casamar, Elodie; Arenillas, David; Lim, Jonathan; Swanson, Magdalena I.; Jiang, Steven; McCallum, Anthony; Kirov, Stefan; Wasserman, Wyeth W.

    2009-01-01

    The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein–DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data ‘boutiques’ within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at http://www.cisreg.ca/ORCAtk. PMID:18971253

  3. Evolution in biosynthetic pathways: two enzymes catalyzing consecutive steps in methionine biosynthesis originate from a common ancestor and possess a similar regulatory region.

    PubMed

    Belfaiza, J; Parsot, C; Martel, A; de la Tour, C B; Margarita, D; Cohen, G N; Saint-Girons, I

    1986-02-01

    The metC gene of Escherichia coli K-12 was cloned and the nucleotide sequence of the metC gene and its flanking regions was determined. The translation initiation codon was identified by sequencing the NH2-terminal part of beta-cystathionase, the MetC gene product. The metC gene (1185 nucleotides) encodes a protein having 395 amino acid residues. The 5' noncoding region was found to contain a "Met box" homologous to sequences suggestive of operator structures upstream from other methionine genes that are controlled by the product of the pleiotropic regulatory metJ gene. The deduced amino acid sequence of beta-cystathionase showed extensive homology with that of the MetB protein (cystathionine gamma-synthase) that catalyzes the preceding step in methionine biosynthesis. The homology strongly suggests that the structural genes for the MetB and MetC proteins evolved from a common ancestral gene.

  4. COMMON WARM DUST TEMPERATURES AROUND MAIN-SEQUENCE STARS

    SciTech Connect

    Morales, Farisa Y.; Werner, M. W.; Bryden, G.; Stapelfeldt, K. R.; Rieke, G. H.; Su, K. Y. L.

    2011-04-01

    We compare the properties of warm dust emission from a sample of main-sequence A-type stars (B8-A7) to those of dust around solar-type stars (F5-K0) with similar Spitzer Space Telescope Infrared Spectrograph/MIPS data and similar ages. Both samples include stars with sources with infrared spectral energy distributions that show evidence of multiple components. Over the range of stellar types considered, we obtain nearly the same characteristic dust temperatures (~190 K and ~60 K for the inner and outer dust components, respectively), slightly above the ice evaporation temperature for the inner belts. The warm inner dust temperature is readily explained if populations of small grains are being released by sublimation of ice from icy planetesimals. Evaporation of low-eccentricity icy bodies at ~150 K can deposit particles into an inner/warm belt, where the small grains are heated to T_dust ~ 190 K. Alternatively, enhanced collisional processing of an asteroid belt-like system of parent planetesimals just interior to the snow line may account for the observed uniformity in dust temperature. The similarity in temperature of the warmer dust across our B8-K0 stellar sample strongly suggests that dust-producing planetesimals are not found at similar radial locations around all stars, but that dust production is favored at a characteristic temperature horizon.
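
    To see why a common dust temperature implies different radial locations around stars of different luminosity, one can use the standard blackbody equilibrium relation T ≈ 278.3 K (L/L_sun)^(1/4) (r/AU)^(-1/2). The sketch below is not from the paper: it assumes ideal blackbody grains (small grains are typically hotter than this at a given distance), and the luminosity of 16 L_sun used for an A-type star is a hypothetical round number.

```python
from math import sqrt

T_BB_1AU_K = 278.3  # blackbody equilibrium temperature at 1 AU from a 1 L_sun star

def blackbody_temperature_k(r_au, lum_solar):
    """Equilibrium temperature (K) of a blackbody grain at r_au from the star."""
    return T_BB_1AU_K * lum_solar ** 0.25 / sqrt(r_au)

def blackbody_radius_au(temp_k, lum_solar):
    """Distance (AU) at which a blackbody grain reaches temp_k."""
    return (T_BB_1AU_K / temp_k) ** 2 * sqrt(lum_solar)

# Same 190 K dust temperature, two stellar luminosities (the second is hypothetical).
for lum in (1.0, 16.0):
    print(f"L = {lum:5.1f} L_sun -> r = {blackbody_radius_au(190.0, lum):.2f} AU")
```

    Under these assumptions the radius at fixed temperature scales as the square root of the luminosity, so the same temperature horizon sits farther out around more luminous stars.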

  5. Common Warm Dust Temperatures Around Main Sequence Stars

    NASA Technical Reports Server (NTRS)

    Morales, Farisa; Rieke, George; Werner, Michael; Stapelfeldt, Karl; Bryden, Geoffrey; Su, Kate

    2011-01-01

    We compare the properties of warm dust emission from a sample of main-sequence A-type stars (B8-A7) to those of dust around solar-type stars (F5-K0) with similar Spitzer Space Telescope Infrared Spectrograph/MIPS data and similar ages. Both samples include stars with sources with infrared spectral energy distributions that show evidence of multiple components. Over the range of stellar types considered, we obtain nearly the same characteristic dust temperatures (∼190 K and ∼60 K for the inner and outer dust components, respectively), slightly above the ice evaporation temperature for the inner belts. The warm inner dust temperature is readily explained if populations of small grains are being released by sublimation of ice from icy planetesimals. Evaporation of low-eccentricity icy bodies at ∼150 K can deposit particles into an inner/warm belt, where the small grains are heated to dust temperatures of ∼190 K. Alternatively, enhanced collisional processing of an asteroid belt-like system of parent planetesimals just interior to the snow line may account for the observed uniformity in dust temperature. The similarity in temperature of the warmer dust across our B8-K0 stellar sample strongly suggests that dust-producing planetesimals are not found at similar radial locations around all stars, but that dust production is favored at a characteristic temperature horizon.

  6. Anti-Sigma Factors in E. coli: Common Regulatory Mechanisms Controlling Sigma Factors Availability

    PubMed Central

    Treviño-Quintanilla, Luis Gerardo; Freyre-González, Julio Augusto; Martínez-Flores, Irma

    2013-01-01

    In bacteria, transcriptional regulation is a key step in cellular gene expression. All bacteria contain a core RNA polymerase that is catalytically competent but requires an additional σ factor for specific promoter recognition and correct transcriptional initiation. The RNAP core is not able to selectively bind to a given σ factor. In contrast, different σ factors have different affinities for the RNAP core. As a consequence, the concentration of alternate σ factors requires strict regulation in order to properly control the delicate interplay among them, which favors the competence for the RNAP core. This control is achieved by different σ/anti-σ controlling mechanisms that shape complex regulatory networks and cascades, and enable the response to sudden environmental cues, whose global understanding is a current challenge for systems biology. Although there have been a number of excellent studies on each of these σ/anti-σ post-transcriptional regulatory systems, no comprehensive comparison of these mechanisms in a single model organism has been conducted. Here, we survey all these systems in E. coli, dissecting and analyzing their inner workings and highlighting their differences. Then, following an integral approach, we identify their commonalities and outline some of the principles exploited by the cell to effectively and globally reprogram the transcriptional machinery. These principles provide guidelines for developing biological synthetic circuits enabling an efficient and robust response to sudden stimuli. PMID:24396271

  7. Anti-Sigma Factors in E. coli: Common Regulatory Mechanisms Controlling Sigma Factors Availability.

    PubMed

    Treviño-Quintanilla, Luis Gerardo; Freyre-González, Julio Augusto; Martínez-Flores, Irma

    2013-09-01

    In bacteria, transcriptional regulation is a key step in cellular gene expression. All bacteria contain a core RNA polymerase that is catalytically competent but requires an additional σ factor for specific promoter recognition and correct transcriptional initiation. The RNAP core is not able to selectively bind to a given σ factor. In contrast, different σ factors have different affinities for the RNAP core. As a consequence, the concentration of alternate σ factors requires strict regulation in order to properly control the delicate interplay among them, which favors the competence for the RNAP core. This control is achieved by different σ/anti-σ controlling mechanisms that shape complex regulatory networks and cascades, and enable the response to sudden environmental cues, whose global understanding is a current challenge for systems biology. Although there have been a number of excellent studies on each of these σ/anti-σ post-transcriptional regulatory systems, no comprehensive comparison of these mechanisms in a single model organism has been conducted. Here, we survey all these systems in E. coli, dissecting and analyzing their inner workings and highlighting their differences. Then, following an integral approach, we identify their commonalities and outline some of the principles exploited by the cell to effectively and globally reprogram the transcriptional machinery. These principles provide guidelines for developing biological synthetic circuits enabling an efficient and robust response to sudden stimuli.

  8. Comparative analysis identifies exonic splicing regulatory sequences--The complex definition of enhancers and silencers.

    PubMed

    Goren, Amir; Ram, Oren; Amit, Maayan; Keren, Hadas; Lev-Maor, Galit; Vig, Ida; Pupko, Tal; Ast, Gil

    2006-06-23

    Exonic splicing regulatory sequences (ESRs) are cis-acting factor binding sites that regulate constitutive and alternative splicing. A computational method based on the conservation level of wobble positions and the overabundance of sequence motifs between 46,103 human and mouse orthologous exons was developed, identifying 285 putative ESRs. Alternatively spliced exons that are either short in length or contain weak splice sites show the highest conservation level of those ESRs, especially toward the edges of exons. ESRs that are abundant in those subgroups show a different distribution between constitutively and alternatively spliced exons. Representatives of these ESRs and two SR protein binding sites were shown, experimentally, to display variable regulatory effects on alternative splicing, depending on their relative locations in the exon. This finding signifies the delicate positional effect of ESRs on alternative splicing regulation.
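
    The abstract does not spell out how the conservation level of wobble positions was computed. As one simple reading, the sketch below measures the fraction of identical third-codon (wobble) positions between two aligned, in-frame orthologous exon sequences. The function name and the toy sequences are hypothetical, and the published method may differ substantially.

```python
def wobble_conservation(seq_a, seq_b):
    """Fraction of identical third-codon positions between two aligned,
    in-frame coding sequences of equal length (no gaps assumed)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if len(seq_a) == 0 or len(seq_a) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    wobble_a = seq_a.upper()[2::3]   # every third base, starting at codon position 3
    wobble_b = seq_b.upper()[2::3]
    matches = sum(x == y for x, y in zip(wobble_a, wobble_b))
    return matches / len(wobble_a)

# Toy example: two codons, one wobble match (G/G) and one mismatch (T/C).
print(wobble_conservation("ATGGCT", "ATGGCC"))
```

    Regions where this fraction is unusually high are candidates for selection acting beyond the encoded protein, which is the kind of signal the abstract attributes to ESRs.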

  9. Divergence in cis-regulatory sequences surrounding the opsin gene arrays of African cichlid fishes

    PubMed Central

    2011-01-01

    Background Divergence within cis-regulatory sequences may contribute to the adaptive evolution of gene expression, but functional alleles in these regions are difficult to identify without abundant genomic resources. Among African cichlid fishes, the differential expression of seven opsin genes has produced adaptive differences in visual sensitivity. Quantitative genetic analysis suggests that cis-regulatory alleles near the SWS2-LWS opsins may contribute to this variation. Here, we sequence BACs containing the opsin genes of two cichlids, Oreochromis niloticus and Metriaclima zebra. We use phylogenetic footprinting and shadowing to examine divergence in conserved non-coding elements, promoter sequences, and 3'-UTRs surrounding each opsin in search of candidate cis-regulatory sequences that influence cichlid opsin expression. Results We identified 20 conserved non-coding elements surrounding the opsins of cichlids and other teleosts, including one known enhancer and a retinal microRNA. Most conserved elements contained computationally-predicted binding sites that correspond to transcription factors that function in vertebrate opsin expression; O. niloticus and M. zebra were significantly divergent in two of these. Similarly, we found a large number of relevant transcription factor binding sites within each opsin's proximal promoter, and identified five opsins that were considerably divergent in both expression and the number of transcription factor binding sites shared between O. niloticus and M. zebra. We also found several microRNA target sites within the 3'-UTR of each opsin, including two 3'-UTRs that differ significantly between O. niloticus and M. zebra. Finally, we examined interspecific divergence among 18 phenotypically diverse cichlids from Lake Malawi for one conserved non-coding element, two 3'-UTRs, and five opsin proximal promoters. We found that all regions were highly conserved with some evidence of CRX transcription factor binding site turnover. We

  10. Divergence in cis-regulatory sequences surrounding the opsin gene arrays of African cichlid fishes.

    PubMed

    O'Quin, Kelly E; Smith, Daniel; Naseer, Zan; Schulte, Jane; Engel, Samuel D; Loh, Yong-Hwee E; Streelman, J Todd; Boore, Jeffrey L; Carleton, Karen L

    2011-05-09

    Divergence within cis-regulatory sequences may contribute to the adaptive evolution of gene expression, but functional alleles in these regions are difficult to identify without abundant genomic resources. Among African cichlid fishes, the differential expression of seven opsin genes has produced adaptive differences in visual sensitivity. Quantitative genetic analysis suggests that cis-regulatory alleles near the SWS2-LWS opsins may contribute to this variation. Here, we sequence BACs containing the opsin genes of two cichlids, Oreochromis niloticus and Metriaclima zebra. We use phylogenetic footprinting and shadowing to examine divergence in conserved non-coding elements, promoter sequences, and 3'-UTRs surrounding each opsin in search of candidate cis-regulatory sequences that influence cichlid opsin expression. We identified 20 conserved non-coding elements surrounding the opsins of cichlids and other teleosts, including one known enhancer and a retinal microRNA. Most conserved elements contained computationally-predicted binding sites that correspond to transcription factors that function in vertebrate opsin expression; O. niloticus and M. zebra were significantly divergent in two of these. Similarly, we found a large number of relevant transcription factor binding sites within each opsin's proximal promoter, and identified five opsins that were considerably divergent in both expression and the number of transcription factor binding sites shared between O. niloticus and M. zebra. We also found several microRNA target sites within the 3'-UTR of each opsin, including two 3'-UTRs that differ significantly between O. niloticus and M. zebra. Finally, we examined interspecific divergence among 18 phenotypically diverse cichlids from Lake Malawi for one conserved non-coding element, two 3'-UTRs, and five opsin proximal promoters. We found that all regions were highly conserved with some evidence of CRX transcription factor binding site turnover. We also found three

  11. Variation in sequence and organization of splicing regulatory elements in vertebrate genes

    PubMed Central

    Yeo, Gene; Hoon, Shawn; Venkatesh, Byrappa; Burge, Christopher B.

    2004-01-01

    Although core mechanisms and machinery of pre-mRNA splicing are conserved from yeast to human, the details of intron recognition often differ, even between closely related organisms. For example, genes from the pufferfish Fugu rubripes generally contain one or more introns that are not properly spliced in mouse cells. Exploiting available genome sequence data, a battery of sequence analysis techniques was used to reach several conclusions about the organization and evolution of splicing regulatory elements in vertebrate genes. The classical splice site and putative branch site signals are completely conserved across the vertebrates studied (human, mouse, pufferfish, and zebrafish), and exonic splicing enhancers also appear broadly conserved in vertebrates. However, another class of splicing regulatory elements, the intronic splicing enhancers, appears to differ substantially between mammals and fish, with G triples (GGG) very abundant in mammalian introns but comparatively rare in fish. Conversely, short repeats of AC and GT are predicted to function as intronic splicing enhancers in fish but are not enriched in mammalian introns. Consistent with this pattern, exonic splicing enhancer-binding SR proteins are highly conserved across all vertebrates, whereas heterogeneous nuclear ribonucleoproteins, which bind many intronic sequences, vary in domain structure and even presence/absence between mammals and fish. Exploiting differences in intronic sequence composition, a statistical model was developed to predict the splicing phenotype of Fugu introns in mammalian systems and was used to engineer the spliceability of a Fugu intron in human cells by insertion of specific sequences, thereby rescuing splicing in human cells. PMID:15505203

  12. High sequence turnover in the regulatory regions of the developmental gene hunchback in insects.

    PubMed

    Hancock, J M; Shaw, P J; Bonneton, F; Dover, G A

    1999-02-01

    Extensive sequence analysis of the developmental gene hunchback and its 5' and 3' regulatory regions in Drosophila melanogaster, Drosophila virilis, Musca domestica, and Tribolium castaneum, using a variety of computer algorithms, reveals regions of high sequence simplicity probably generated by slippage-like mechanisms of turnover. No regions are entirely refractory to the action of slippage, although the density and composition of simple sequence motifs vary from region to region. Interestingly, the 5' and 3' flanking regions share short repetitive motifs despite their separation by the gene itself, and the motifs are different in composition from those in the exons and introns. Furthermore, there are high levels of conservation of motifs in equivalent orthologous regions. Detailed sequence analysis of the P2 promoter and DNA footprinting assays reveal that the number, orientation, sequence, spacing, and protein-binding affinities of the BICOID-binding sites vary between species and that the 'P2' promoter, the nanos response element in the 3' untranslated region, and several conserved boxes of sequence in the gene (e.g., the two zinc-finger regions) are surrounded by cryptically-simple-sequence DNA. We argue that high sequence turnover and genetic redundancy permit both the general maintenance of promoter functions through the establishment of coevolutionary (compensatory) changes in cis- and trans-acting genetic elements and, at the same time, the possibility of subtle changes in the regulation of hunchback in the different species.

  13. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases.

    PubMed

    Gusev, Alexander; Lee, S Hong; Trynka, Gosia; Finucane, Hilary; Vilhjálmsson, Bjarni J; Xu, Han; Zang, Chongzhi; Ripke, Stephan; Bulik-Sullivan, Brendan; Stahl, Eli; Kähler, Anna K; Hultman, Christina M; Purcell, Shaun M; McCarroll, Steven A; Daly, Mark; Pasaniuc, Bogdan; Sullivan, Patrick F; Neale, Benjamin M; Wray, Naomi R; Raychaudhuri, Soumya; Price, Alkes L

    2014-11-06

    Regulatory and coding variants are known to be enriched with associations identified by genome-wide association studies (GWASs) of complex disease, but their contributions to trait heritability are currently unknown. We applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs (h_g^2) across functional categories (while accounting for shared variance due to linkage disequilibrium). Extensive simulations showed that in contrast to current estimates from GWAS summary statistics, the variance-component approach partitions heritability accurately under a wide range of complex-disease architectures. Across the 11 diseases, DNaseI hypersensitivity sites (DHSs) from 217 cell types spanned 16% of imputed SNPs (and 24% of genotyped SNPs) but explained an average of 79% (SE = 8%) of h_g^2 from imputed SNPs (5.1× enrichment; p = 3.7 × 10^-17) and 38% (SE = 4%) of h_g^2 from genotyped SNPs (1.6× enrichment, p = 1.0 × 10^-4). Further enrichment was observed at enhancer DHSs and cell-type-specific DHSs. In contrast, coding variants, which span 1% of the genome, explained <10% of h_g^2 despite having the highest enrichment. We replicated these findings but found no significant contribution from rare coding variants in independent schizophrenia cohorts genotyped on GWAS and exome chips. Our results highlight the value of analyzing components of heritability to unravel the functional architecture of common disease.

  14. Partitioning Heritability of Regulatory and Cell-Type-Specific Variants across 11 Common Diseases

    PubMed Central

    Gusev, Alexander; Lee, S. Hong; Trynka, Gosia; Finucane, Hilary; Vilhjálmsson, Bjarni J.; Xu, Han; Zang, Chongzhi; Ripke, Stephan; Bulik-Sullivan, Brendan; Stahl, Eli; Ripke, Stephan; Neale, Benjamin M.; Corvin, Aiden; Walters, James T.R.; Farh, Kai-How; Holmans, Peter A.; Lee, Phil; Bulik-Sullivan, Brendan; Collier, David A.; Huang, Hailiang; Pers, Tune H.; Agartz, Ingrid; Agerbo, Esben; Albus, Margot; Alexander, Madeline; Amin, Farooq; Bacanu, Silviu A.; Begemann, Martin; Belliveau, Richard A.; Bene, Judit; Bergen, Sarah E.; Bevilacqua, Elizabeth; Bigdeli, Tim B.; Black, Donald W.; Børglum, Anders D.; Bruggeman, Richard; Buccola, Nancy G.; Buckner, Randy L.; Byerley, William; Cahn, Wiepke; Cai, Guiqing; Campion, Dominique; Cantor, Rita M.; Carr, Vaughan J.; Carrera, Noa; Catts, Stanley V.; Chambert, Kimberly D.; Chan, Raymond C.K.; Chen, Ronald Y.L.; Chen, Eric Y.H.; Cheng, Wei; Cheung, Eric F.C.; Chong, Siow Ann; Cloninger, C. Robert; Cohen, David; Cohen, Nadine; Cormican, Paul; Craddock, Nick; Crowley, James J.; Curtis, David; Davidson, Michael; Davis, Kenneth L.; Degenhardt, Franziska; Del Favero, Jurgen; DeLisi, Lynn E.; Demontis, Ditte; Dikeos, Dimitris; Dinan, Timothy; Djurovic, Srdjan; Donohoe, Gary; Drapeau, Elodie; Duan, Jubao; Dudbridge, Frank; Durmishi, Naser; Eichhammer, Peter; Eriksson, Johan; Escott-Price, Valentina; Essioux, Laurent; Fanous, Ayman H.; Farrell, Martilias S.; Frank, Josef; Franke, Lude; Freedman, Robert; Freimer, Nelson B.; Friedl, Marion; Friedman, Joseph I.; Fromer, Menachem; Genovese, Giulio; Georgieva, Lyudmila; Gershon, Elliot S.; Giegling, Ina; Giusti-Rodríguez, Paola; Godard, Stephanie; Goldstein, Jacqueline I.; Golimbet, Vera; Gopal, Srihari; Gratten, Jacob; Grove, Jakob; de Haan, Lieuwe; Hammer, Christian; Hamshere, Marian L.; Hansen, Mark; Hansen, Thomas; Haroutunian, Vahram; Hartmann, Annette M.; Henskens, Frans A.; Herms, Stefan; Hirschhorn, Joel N.; Hoffmann, Per; Hofman, Andrea; Hollegaard, Mads V.; Hougaard, David M.; Ikeda, Masashi; Joa, Inge; Julià, Antonio; Kahn, René S.; Kalaydjieva, Luba; Karachanak-Yankova, Sena; Karjalainen, Juha; Kavanagh, David; Keller, Matthew C.; Kelly, Brian J.; Kennedy, James L.; Khrunin, Andrey; Kim, Yunjung; Klovins, Janis; Knowles, James A.; Konte, Bettina; Kucinskas, Vaidutis; Kucinskiene, Zita Ausrele; Kuzelova-Ptackova, Hana; Kähler, Anna K.; Laurent, Claudine; Keong, Jimmy Lee Chee; Lee, S. Hong; Legge, Sophie E.; Lerer, Bernard; Li, Miaoxin; Li, Tao; Liang, Kung-Yee; Lieberman, Jeffrey; Limborska, Svetlana; Loughland, Carmel M.; Lubinski, Jan; Lönnqvist, Jouko; Macek, Milan; Magnusson, Patrik K.E.; Maher, Brion S.; Maier, Wolfgang; Mallet, Jacques; Marsal, Sara; Mattheisen, Manuel; Mattingsdal, Morten; McCarley, Robert W.; McDonald, Colm; McIntosh, Andrew M.; Meier, Sandra; Meijer, Carin J.; Melegh, Bela; Melle, Ingrid; Mesholam-Gately, Raquelle I.; Metspalu, Andres; Michie, Patricia T.; Milani, Lili; Milanova, Vihra; Mokrab, Younes; Morris, Derek W.; Mors, Ole; Mortensen, Preben B.; Murphy, Kieran C.; Murray, Robin M.; Myin-Germeys, Inez; Müller-Myhsok, Bertram; Nelis, Mari; Nenadic, Igor; Nertney, Deborah A.; Nestadt, Gerald; Nicodemus, Kristin K.; Nikitina-Zake, Liene; Nisenbaum, Laura; Nordin, Annelie; O’Callaghan, Eadbhard; O’Dushlaine, Colm; O’Neill, F. Anthony; Oh, Sang-Yun; Olincy, Ann; Olsen, Line; Van Os, Jim; Pantelis, Christos; Papadimitriou, George N.; Papiol, Sergi; Parkhomenko, Elena; Pato, Michele T.; Paunio, Tiina; Pejovic-Milovancevic, Milica; Perkins, Diana O.; Pietiläinen, Olli; Pimm, Jonathan; Pocklington, Andrew J.; Powell, John; Price, Alkes; Pulver, Ann E.; Purcell, Shaun M.; Quested, Digby; Rasmussen, Henrik B.; Reichenberg, Abraham; Reimers, Mark A.; Richards, Alexander L.; Roffman, Joshua L.; Roussos, Panos; Ruderfer, Douglas M.; Salomaa, Veikko; Sanders, Alan R.; Schall, Ulrich; Schubert, Christian R.; Schulze, Thomas G.; Schwab, Sibylle G.; Scolnick, Edward M.; Scott, Rodney J.; Seidman, Larry J.; Shi, Jianxin; Sigurdsson, Engilbert; Silagadze, Teimuraz; Silverman, Jeremy M.; Sim, Kang; Slominsky, Petr; Smoller, Jordan W.; So, Hon-Cheong; Spencer, Chris C.A.; Stahl, Eli A.; Stefansson, Hreinn; Steinberg, Stacy; Stogmann, Elisabeth; Straub, Richard E.; Strengman, Eric; Strohmaier, Jana; Stroup, T. Scott; Subramaniam, Mythily; Suvisaari, Jaana; Svrakic, Dragan M.; Szatkiewicz, Jin P.; Söderman, Erik; Thirumalai, Srinivas; Toncheva, Draga; Tooney, Paul A.; Tosato, Sarah; Veijola, Juha; Waddington, John; Walsh, Dermot; Wang, Dai; Wang, Qiang; Webb, Bradley T.; Weiser, Mark; Wildenauer, Dieter B.; Williams, Nigel M.; Williams, Stephanie; Witt, Stephanie H.; Wolen, Aaron R.; Wong, Emily H.M.; Wormley, Brandon K.; Wu, Jing Qin; Xi, Hualin Simon; Zai, Clement C.; Zheng, Xuebin; Zimprich, Fritz; Wray, Naomi R.; Stefansson, Kari; Visscher, Peter M.; Adolfsson, Rolf; Andreassen, Ole A.; Blackwood, Douglas H.R.; Bramon, Elvira; Buxbaum, Joseph D.; Børglum, Anders D.; Cichon, Sven; Darvasi, Ariel; Domenici, Enrico; Ehrenreich, Hannelore; Esko, Tõnu; Gejman, Pablo V.; Gill, Michael; Gurling, Hugh; Hultman, Christina M.; Iwata, Nakao; Jablensky, Assen V.; Jönsson, Erik G.; Kendler, Kenneth S.; Kirov, George; Knight, Jo; Lencz, Todd; Levinson, Douglas F.; Li, Qingqin S.; Liu, Jianjun; Malhotra, Anil K.; McCarroll, Steven A.; McQuillin, Andrew; Moran, Jennifer L.; Mortensen, Preben B.; Mowry, Bryan J.; Nöthen, Markus M.; Ophoff, Roel A.; Owen, Michael J.; Palotie, Aarno; Pato, Carlos N.; Petryshen, Tracey L.; Posthuma, Danielle; Rietschel, Marcella; Riley, Brien P.; Rujescu, Dan; Sham, Pak C.; Sklar, Pamela; St. Clair, David; Weinberger, Daniel R.; Wendland, Jens R.; Werge, Thomas; Daly, Mark J.; Sullivan, Patrick F.; O’Donovan, Michael C.; Ripke, Stephan; O’Dushlaine, Colm; Chambert, Kimberly; Moran, Jennifer L.; Kähler, Anna K.; Akterin, Susanne; Bergen, Sarah; Magnusson, Patrik K.E.; Neale, Benjamin M.; Ruderfer, Douglas; Scolnick, Edward; Purcell, Shaun; McCarroll, Steve; Sklar, Pamela; Hultman, Christina M.; Sullivan, Patrick F.; Kähler, Anna K.; Hultman, Christina M.; Purcell, Shaun M.; McCarroll, Steven A.; Daly, Mark; Pasaniuc, Bogdan; Sullivan, Patrick F.; Neale, Benjamin M.; Wray, Naomi R.; Raychaudhuri, Soumya; Price, Alkes L.

    2014-01-01

    Regulatory and coding variants are known to be enriched with associations identified by genome-wide association studies (GWASs) of complex disease, but their contributions to trait heritability are currently unknown. We applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs (h_g^2) across functional categories (while accounting for shared variance due to linkage disequilibrium). Extensive simulations showed that in contrast to current estimates from GWAS summary statistics, the variance-component approach partitions heritability accurately under a wide range of complex-disease architectures. Across the 11 diseases, DNaseI hypersensitivity sites (DHSs) from 217 cell types spanned 16% of imputed SNPs (and 24% of genotyped SNPs) but explained an average of 79% (SE = 8%) of h_g^2 from imputed SNPs (5.1× enrichment; p = 3.7 × 10^-17) and 38% (SE = 4%) of h_g^2 from genotyped SNPs (1.6× enrichment, p = 1.0 × 10^-4). Further enrichment was observed at enhancer DHSs and cell-type-specific DHSs. In contrast, coding variants, which span 1% of the genome, explained <10% of h_g^2 despite having the highest enrichment. We replicated these findings but found no significant contribution from rare coding variants in independent schizophrenia cohorts genotyped on GWAS and exome chips. Our results highlight the value of analyzing components of heritability to unravel the functional architecture of common disease. PMID:25439723
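
    The enrichment figures in this abstract can be read as the share of heritability explained by a category divided by the share of SNPs that category spans. The short sketch below reproduces that arithmetic from the rounded percentages quoted in the abstract. The function name is hypothetical, and the definition is an interpretation of the abstract rather than the paper's estimator. With rounded inputs the imputed-SNP ratio comes out near 4.9 rather than the reported 5.1, presumably because the authors worked from unrounded values.

```python
def enrichment(share_of_heritability, share_of_snps):
    """Fold enrichment: fraction of heritability explained by a category
    divided by the fraction of SNPs that fall in that category."""
    if share_of_snps <= 0:
        raise ValueError("share_of_snps must be positive")
    return share_of_heritability / share_of_snps

# Rounded values quoted in the abstract for DHS regions.
genotyped = enrichment(0.38, 0.24)   # 38% of heritability from 24% of genotyped SNPs
imputed = enrichment(0.79, 0.16)     # 79% of heritability from 16% of imputed SNPs
print(f"genotyped SNPs: {genotyped:.2f}x, imputed SNPs: {imputed:.2f}x")
```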

  15. Cloning and nucleotide sequence of luxR, a regulatory gene controlling bioluminescence in Vibrio harveyi.

    PubMed Central

    Showalter, R E; Martin, M O; Silverman, M R

    1990-01-01

    Mutagenesis with transposon mini-Mulac was used previously to identify a regulatory locus necessary for expression of bioluminescence genes, lux, in Vibrio harveyi (M. Martin, R. Showalter, and M. Silverman, J. Bacteriol. 171:2406-2414, 1989). Mutants with transposon insertions in this regulatory locus were used to construct a hybridization probe which was used in this study to detect recombinants in a cosmid library containing the homologous DNA. Recombinant cosmids with this DNA stimulated expression of the genes encoding enzymes for luminescence, i.e., the luxCDABE operon, which were positioned in trans on a compatible replicon in Escherichia coli. Transposon mutagenesis and analysis of the DNA sequence of the cloned DNA indicated that regulatory function resided in a single gene of about 0.6 kilobases named luxR. Expression of bioluminescence in V. harveyi and in the fish light-organ symbiont Vibrio fischeri is controlled by density-sensing mechanisms involving the accumulation of small signal molecules called autoinducers, but similarity of the two luminescence systems at the molecular level was not apparent in this study. The amino acid sequence of the LuxR product of V. harveyi, which indicates a structural relationship to some DNA-binding proteins, is not similar to the sequence of the protein that regulates expression of luminescence in V. fischeri. In addition, reconstitution of autoinducer-controlled luminescence in recombinant E. coli, already achieved with lux genes cloned from V. fischeri, was not accomplished with the isolation of luxR from V. harveyi, suggesting a requirement for an additional regulatory component. PMID:2160932

  16. Organization of the lexA gene of Escherichia coli and nucleotide sequence of the regulatory region.

    PubMed Central

    Miki, T; Ebina, Y; Kishi, F; Nakazawa, A

    1981-01-01

    The product of the lexA gene of Escherichia coli has been shown to regulate expression of several cellular functions (SOS functions) induced by treatments which abruptly inhibit DNA synthesis. We have cloned and mapped the lexA gene on a small segment of approximately 600 base pairs. The lexA promoter was located by transcription R-loop analysis, and the lexA product of 22,000 daltons was identified by protein synthesis in vitro. An unknown gene was found which directed the synthesis of a protein of 35,000 daltons in a region downstream from the lexA gene. The nucleotide sequence of the regulatory region of the lexA gene was determined. The sequence contained inverted repeats homologous to those of the recA regulatory region. These inverted repeats may be recognized by the lexA protein, because the protein is considered to repress both genes as a common repressor. PMID:6261224

  17. Time Delayed Causal Gene Regulatory Network Inference with Hidden Common Causes

    PubMed Central

    Lo, Leung-Yau; Wong, Man-Leung; Lee, Kin-Hong; Leung, Kwong-Sak

    2015-01-01

    Inferring the gene regulatory network (GRN) is crucial to understanding the workings of the cell. Many computational methods attempt to infer the GRN from time series expression data, instead of through expensive and time-consuming experiments. However, existing methods make the convenient but unrealistic assumption of causal sufficiency, i.e. all the relevant factors in the causal network have been observed and there are no unobserved common causes. In principle, in the real world, it is impossible to be certain that all relevant factors or common causes have been observed, because some factors may not have been conceived of, and therefore are impossible to measure. In view of this, we have developed a novel algorithm named HCC-CLINDE to infer a GRN from time series data allowing the presence of hidden common cause(s). We assume there is a sparse causal graph (possibly with cycles) of interest, where the variables are continuous and each causal link has a delay (possibly more than one time step). A small but unknown number of variables are not observed. Each unobserved variable has only observed variables as children and parents, with at least two children, and the children are not linked to each other. Since it is difficult to obtain very long time series, our algorithm is also capable of utilizing multiple short time series, which is more realistic. To our knowledge, our algorithm is far less restrictive than previous works. We have performed extensive experiments using synthetic data on GRNs of size up to 100, with up to 10 hidden nodes. The results show that our algorithm can adequately recover the true causal GRN and is robust to slight deviation from Gaussian distribution in the error terms. We have also demonstrated the potential of our algorithm on small YEASTRACT subnetworks using limited real data. PMID:26394325
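
    The abstract describes causal links that each carry a time delay. The sketch below is not HCC-CLINDE. It only illustrates the underlying idea of a delayed link by scanning candidate lags between one regulator series and one target series and keeping the lag with the strongest Pearson correlation. The function names and the toy series are hypothetical, and a pairwise scan like this cannot by itself distinguish a direct link from a hidden common cause.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def best_lag(regulator, target, max_lag):
    """Return (lag, correlation) for the delay that maximizes |correlation|
    between regulator[t] and target[t + lag], for lag in 1..max_lag."""
    if len(regulator) != len(target) or max_lag >= len(regulator) - 1:
        raise ValueError("series must match in length and be longer than max_lag + 1")
    scores = {lag: pearson(regulator[:-lag], target[lag:])
              for lag in range(1, max_lag + 1)}
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]

# Toy data: the target copies the regulator two time steps later.
regulator = [0, 1, 0, 2, 0, 3, 1, 0, 2, 1]
target = [0, 0] + regulator[:-2]
print(best_lag(regulator, target, max_lag=3))
```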

  18. Time Delayed Causal Gene Regulatory Network Inference with Hidden Common Causes.

    PubMed

    Lo, Leung-Yau; Wong, Man-Leung; Lee, Kin-Hong; Leung, Kwong-Sak

    2015-01-01

    Inferring the gene regulatory network (GRN) is crucial to understanding the workings of the cell. Many computational methods attempt to infer the GRN from time series expression data, instead of through expensive and time-consuming experiments. However, existing methods make the convenient but unrealistic assumption of causal sufficiency, i.e. all the relevant factors in the causal network have been observed and there are no unobserved common causes. In principle, in the real world, it is impossible to be certain that all relevant factors or common causes have been observed, because some factors may not have been conceived of, and therefore are impossible to measure. In view of this, we have developed a novel algorithm named HCC-CLINDE to infer a GRN from time series data allowing the presence of hidden common cause(s). We assume there is a sparse causal graph (possibly with cycles) of interest, where the variables are continuous and each causal link has a delay (possibly more than one time step). A small but unknown number of variables are not observed. Each unobserved variable has only observed variables as children and parents, with at least two children, and the children are not linked to each other. Since it is difficult to obtain very long time series, our algorithm is also capable of utilizing multiple short time series, which is more realistic. To our knowledge, our algorithm is far less restrictive than previous works. We have performed extensive experiments using synthetic data on GRNs of size up to 100, with up to 10 hidden nodes. The results show that our algorithm can adequately recover the true causal GRN and is robust to slight deviation from Gaussian distribution in the error terms. We have also demonstrated the potential of our algorithm on small YEASTRACT subnetworks using limited real data.

  19. Conserved regulatory elements of the promoter sequence of the gene rpoH of enteric bacteria

    PubMed Central

    Ramírez-Santos, Jesús; Collado-Vides, Julio; García-Varela, Martin; Gómez-Eichelmann, M. Carmen

    2001-01-01

    The rpoH regulatory region of different members of the enteric bacteria family was sequenced or downloaded from GenBank and compared. In addition, the transcriptional start sites of rpoH of Yersinia frederiksenii and Proteus mirabilis, two distant members of this family, were determined. Sequences similar to the σ70 promoters P1, P4 and P5, to the σE promoter P3 and to boxes DnaA1, DnaA2, cAMP receptor protein (CRP) boxes CRP1, CRP2 and box CytR present in Escherichia coli K12, were identified in sequences of closely related bacteria such as E. coli, Shigella flexneri, Salmonella enterica serovar Typhimurium, Citrobacter freundii, Enterobacter cloacae and Klebsiella pneumoniae. In more distant bacteria, Y. frederiksenii and P. mirabilis, the rpoH regulatory region has a distal P1-like σ70 promoter and two proximal promoters: a heat-induced σE-like promoter and a σ70 promoter. Sequences similar to the regulatory boxes were not identified in these bacteria. This study suggests that the general pattern of transcription of the rpoH gene in enteric bacteria includes a distal σ70 promoter, >200 nt upstream of the initiation codon, and two proximal promoters: a heat-induced σE-like promoter and a σ70 promoter. A second proximal σ70 promoter under catabolite regulation is probably present only in bacteria closely related to E. coli. PMID:11139607

  20. LedPred: an R/bioconductor package to predict regulatory sequences using support vector machines.

    PubMed

    Seyres, Denis; Darbo, Elodie; Perrin, Laurent; Herrmann, Carl; González, Aitor

    2016-04-01

    Supervised classification based on support vector machines (SVMs) has successfully been used for the prediction of cis-regulatory modules (CRMs). However, no integrated tool using such heterogeneous data as position-specific scoring matrices, ChIP-seq data or conservation scores is currently available. Here, we present LedPred, a flexible SVM workflow that predicts new regulatory sequences based on the annotation of known CRMs, which are associated with a large variety of feature types. LedPred is provided as an R/Bioconductor package connected to an online server to avoid installation of non-R software. Due to the heterogeneous CRM feature integration, LedPred excels at the prediction of regulatory sequences in Drosophila and mouse datasets compared with similar SVM-based software. LedPred is available on GitHub: https://github.com/aitgon/LedPred and Bioconductor: http://bioconductor.org/packages/release/bioc/html/LedPred.html under the MIT license. Contact: aitor.gonzalez@univ-amu.fr. Supplementary data are available at Bioinformatics online.

  1. Identification of the regulatory sequence of anaerobically expressed locus aeg-46.5.

    PubMed Central

    Choe, M; Reznikoff, W S

    1993-01-01

    A newly identified anaerobically expressed locus, aeg-46.5, which is located at min 46.5 on the Escherichia coli linkage map, was cloned and analyzed. The phenotype of this gene was studied by using a lacZ operon fusion. aeg-46.5 is induced anaerobically in the presence of nitrate in wild-type and narL cells. It is repressed by the narL gene product, as it showed derepressed anaerobic expression in narL mutant cells. We postulate that aeg-46.5 is subject to multiple regulatory systems: activation as a result of anaerobiosis, narL-independent nitrate-dependent activation, and narL-mediated repression. The regulatory region of aeg-46.5 was identified. A 304-bp DNA sequence which includes the regulatory elements was obtained, and the 5' end of aeg-46.5 mRNA was identified. It was verified that the anaerobic regulation of aeg-46.5 expression is controlled at the transcriptional level. Computer analysis predicted possible control sites for the NarL and FNR proteins. The proposed NarL site was found in a perfect-symmetry element. The aeg-46.5 regulatory elements are adjacent to, but divergent from, those of the eco gene. PMID:8432709

  2. A pan-cancer modular regulatory network analysis to identify common and cancer-specific network components.

    PubMed

    Knaack, Sara A; Siahpirani, Alireza Fotuhi; Roy, Sushmita

    2014-01-01

    Many human diseases including cancer are the result of perturbations to transcriptional regulatory networks that control context-specific expression of genes. A comparative approach across multiple cancer types is a powerful approach to illuminate the common and specific network features of this family of diseases. Recent efforts from The Cancer Genome Atlas (TCGA) have generated large collections of functional genomic data sets for multiple types of cancers. An emerging challenge is to devise computational approaches that systematically compare these genomic data sets across different cancer types that identify common and cancer-specific network components. We present a module- and network-based characterization of transcriptional patterns in six different cancers being studied in TCGA: breast, colon, rectal, kidney, ovarian, and endometrial. Our approach uses a recently developed regulatory network reconstruction algorithm, modular regulatory network learning with per gene information (MERLIN), within a stability selection framework to predict regulators for individual genes and gene modules. Our module-based analysis identifies a common theme of immune system processes in each cancer study, with modules statistically enriched for immune response processes as well as targets of key immune response regulators from the interferon regulatory factor (IRF) and signal transducer and activator of transcription (STAT) families. Comparison of the inferred regulatory networks from each cancer type identified a core regulatory network that included genes involved in chromatin remodeling, cell cycle, and immune response. Regulatory network hubs included genes with known roles in specific cancer types as well as genes with potentially novel roles in different cancer types. Overall, our integrated module and network analysis recapitulated known themes in cancer biology and additionally revealed novel regulatory hubs that suggest a complex interplay of immune response, cell

  3. Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

    PubMed Central

    Schmouth, Jean-François; Bonaguro, Russell J.; Corso-Diaz, Ximena; Simpson, Elizabeth M.

    2012-01-01

    An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limits our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant–harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation. PMID:22396661

  4. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA.

    PubMed

    Turner, Tychele N; Hormozdiari, Fereydoun; Duyzend, Michael H; McClymont, Sarah A; Hook, Paul W; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A; Zody, Michael C; Nelson, Bradley J; Huddleston, John; Sandstrom, Richard; Smith, Joshua D; Hanna, David; Swanson, James M; Faustman, Elaine M; Bamshad, Michael J; Stamatoyannopoulos, John; Nickerson, Deborah A; McCallion, Andrew S; Darnell, Robert; Eichler, Evan E

    2016-01-07

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism.

  5. Two lamprey Hedgehog genes share non-coding regulatory sequences and expression patterns with gnathostome Hedgehogs.

    PubMed

    Kano, Shungo; Xiao, Jin-Hua; Osório, Joana; Ekker, Marc; Hadzhiev, Yavor; Müller, Ferenc; Casane, Didier; Magdelenat, Ghislaine; Rétaux, Sylvie

    2010-10-13

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences.

  6. Two Lamprey Hedgehog Genes Share Non-Coding Regulatory Sequences and Expression Patterns with Gnathostome Hedgehogs

    PubMed Central

    Kano, Shungo; Xiao, Jin-Hua; Osório, Joana; Ekker, Marc; Hadzhiev, Yavor; Müller, Ferenc; Casane, Didier; Magdelenat, Ghislaine; Rétaux, Sylvie

    2010-01-01

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences. PMID:20967201

  7. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

    PubMed Central

    Turner, Tychele N.; Hormozdiari, Fereydoun; Duyzend, Michael H.; McClymont, Sarah A.; Hook, Paul W.; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A.; Zody, Michael C.; Nelson, Bradley J.; Huddleston, John; Sandstrom, Richard; Smith, Joshua D.; Hanna, David; Swanson, James M.; Faustman, Elaine M.; Bamshad, Michael J.; Stamatoyannopoulos, John; Nickerson, Deborah A.; McCallion, Andrew S.; Darnell, Robert; Eichler, Evan E.

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  8. Phylogenetic Relationships and the Evolution of Regulatory Gene Sequences in the Parrotfishes

    PubMed Central

    Smith, Lydia L.; Fessler, Jennifer L.; Alfaro, Michael E.; Streelman, J. Todd; Westneat, Mark W.

    2008-01-01

    Regulatory genes control the expression of other genes and are key components of developmental processes such as segmentation and embryonic construction of the skull in vertebrates. Here we examine the variability and evolution of three vertebrate regulatory genes, addressing issues of their utility for phylogenetics and comparing the rates of genetic change seen in regulatory loci to the rates seen in other genes in the parrotfishes. The parrotfishes are a diverse group of colorful fishes from coral reefs and seagrasses worldwide and have been placed phylogenetically within the family Labridae. We tested phylogenetic hypotheses among the parrotfishes, with a focus on the genera Chlorurus and Scarus, by analyzing eight gene fragments for 42 parrotfishes and eight outgroup species. We sequenced mitochondrial 12s rRNA (967 bp), 16s rRNA (577 bp), and cytochrome b (477 bp). From the nuclear genome, we sequenced part of the protein-coding genes rag2 (715 bp), tmo4c4 (485 bp), and the developmental regulatory genes otx1 (672 bp), bmp4 (488 bp), and dlx2 (522 bp). Bayesian, likelihood, and parsimony analyses on the resulting 4903 bp of DNA sequence produced similar topologies that confirm the monophyly of the scarines and provide a phylogeny at the species level for portions of the genera Scarus and Chlorurus. Four major clades of Scarus were recovered, with three distributed in the Indo-Pacific and one containing Caribbean/Atlantic taxa. Molecular rates suggest a Miocene origin of the parrotfishes (22 mya) and a recent divergence of species within Scarus and Chlorurus, within the past 5 million years. Developmentally important genes made a significant contribution to phylogenetic structure, and rates of genetic evolution were high in bmp4, similar to other coding nuclear genes, but low in otx1 and the dlx2 exons. Synonymous and nonsynonymous substitution patterns in developmental regulatory genes support the hypothesis of stabilizing selection during the history of

  9. Integrated analysis of microRNA regulatory network in nasopharyngeal carcinoma with deep sequencing.

    PubMed

    Wang, Fan; Lu, Juan; Peng, Xiaohong; Wang, Jie; Liu, Xiong; Chen, Xiaomei; Jiang, Yiqi; Li, Xiangping; Zhang, Bao

    2016-01-22

    MicroRNAs (miRNAs) have been shown to play a critical role in the development and progression of nasopharyngeal carcinoma (NPC). Although accumulating studies have been performed on the molecular mechanisms of NPC, the miRNA regulatory networks in cancer progression remain largely unknown. Laser capture microdissection (LCM) and deep sequencing are powerful tools that can help us to obtain an integrated view of the miRNA-target network. Illumina HiSeq 2000 deep sequencing was used to screen differentially expressed miRNAs in laser-microdissected biopsies between 12 NPC and 8 chronic nasopharyngitis patients. The result was validated by real-time PCR on 201 NPC and 25 chronic nasopharyngitis patients. The potential candidate target genes of the miRNAs were predicted using published target prediction software (RNAhybrid, TargetScan, Miranda, PITA), and the overlapping set was analyzed for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) biological processes. The miRNA regulatory network analysis was performed using the Ingenuity Pathway Analysis (IPA) software. Eight differentially expressed miRNAs were identified between NPC and chronic nasopharyngitis patients by deep sequencing. Further qRT-PCR assays confirmed 3 down-regulated miRNAs (miR-34c-5p, miR-375 and miR-449c-5p) and 4 up-regulated miRNAs (miR-205-5p, miR-92a-3p, miR-193b-3p and miR-27a-5p). Additionally, the low level of miR-34c-5p (miR-34c) was significantly correlated with advanced TNM stage. GO and KEGG enrichment analyses showed that 914 target genes were involved in cell cycle, cytokine secretion, tumor immunology, and other processes. IPA revealed that cancer was the top disease associated with those dysregulated miRNAs, that the genes regulated by miR-34c, including TP53, CCND1, CDK6, MET and BCL2, were at the center of the miRNA-mRNA regulatory network, and that PI3K/AKT/mTOR signaling was a significant functional pathway in this network. Our study presents the current knowledge of mi

  10. Exploring the reasons for the large density of triplex-forming oligonucleotide target sequences in the human regulatory regions

    PubMed Central

    Goñi, Josep Ramon; Vaquerizas, Juan Manuel; Dopazo, Joaquin; Orozco, Modesto

    2006-01-01

    Background DNA duplex sequences that can be targets for triplex formation are highly over-represented in the human genome, especially in regulatory regions. Results Here we used bioinformatics tools to study several properties of triplex target sequences, in an attempt to determine which of them make these sequences so special in the genome. Conclusion Our results strongly suggest that the unique physical properties of these sequences make them particularly suitable as "separators" between protein-recognition sites in the promoter region. PMID:16566817

  11. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
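
    The idea of new binding sites arising by local point mutation can be illustrated with a toy simulation. The sketch below follows a single sequence lineage and counts point mutations until a chosen motif first appears; it does not model populations or fixation, so it is much simpler than the simulations summarized above. The motif, sequence length, and function names are arbitrary choices for this example.

```python
import random

BASES = "ACGT"


def mutate(seq, rng):
    """Apply one point mutation at a random position (the base always changes)."""
    pos = rng.randrange(len(seq))
    new_base = rng.choice([b for b in BASES if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]


def steps_until_site(motif, length, max_steps, rng):
    """Mutate a random sequence until it contains motif.

    Returns (steps, seq); steps is None if the motif has not appeared
    after max_steps mutations.
    """
    seq = "".join(rng.choice(BASES) for _ in range(length))
    for step in range(max_steps):
        if motif in seq:
            return step, seq
        seq = mutate(seq, rng)
    return (max_steps, seq) if motif in seq else (None, seq)


# Hypothetical parameters: an arbitrary 6-bp motif in a 200-bp region.
steps, final_seq = steps_until_site("TGACTC", length=200,
                                    max_steps=100_000, rng=random.Random(1))
```

    Because every mutation is accepted regardless of its effect, the walk is neutral in the loosest sense only; adding selection or drift in a finite population would require a proper population-genetic model.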

  13. Genetic validation of whole-transcriptome sequencing for mapping expression affected by cis-regulatory variation

    PubMed Central

    2010-01-01

    Background Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given that hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Results Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Conclusion Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing. PMID:20707912

  14. [Identification of common medicinal snakes in medicated liquor of Guangdong by COI barcode sequence].

    PubMed

    Liao, Jing; Chao, Zhi; Zhang, Liang

    2013-11-01

    To identify the common snakes in medicated liquor of Guangdong using COI barcode sequence, and to test the feasibility. The COI barcode sequences of collected medicinal snakes were amplified and sequenced. The sequences combined with the data from GenBank were analyzed for divergence and building a neighbor-joining (NJ) tree with MEGA 5.0. The genetic distance and NJ tree demonstrated that there were 241 variable sites in these species, and the average (A + T) content of 56.2% was higher than the average (G + C) content of 43.7%. The maximum interspecific genetic distance was 0.2568, and the minimum was 0.1519. In the NJ tree, each species formed a monophyletic clade with bootstrap supports of 100%. DNA barcoding identification method based on the COI sequence is accurate and can be applied to identify the common medicinal snakes.
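
    The two summary statistics quoted above, base composition and pairwise genetic distance, can be sketched for aligned sequences as follows. The abstract does not say which distance model was used in MEGA 5.0, so the uncorrected p-distance here is a stand-in chosen for simplicity, and the two short fragments are invented, not real COI data.

```python
def at_content(seq):
    """Fraction of A or T among unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "AT" for b in counted) / len(counted)


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy aligned fragments (invented for illustration).
s1 = "ATGACCAATA"
s2 = "ATGATCAACA"
at_fraction = at_content(s1)    # 7 of 10 bases are A or T
distance = p_distance(s1, s2)   # 2 of 10 sites differ
```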

  15. IL-10-Producing Regulatory B Cells Are Decreased in Patients with Common Variable Immunodeficiency

    PubMed Central

    Costa, Priscilla Ramos; Barros, Myrthes Toledo; Kalil, Jorge; Kokron, Cristina Maria

    2016-01-01

    Common variable immunodeficiency (CVID) is the most prevalent symptomatic primary immunodeficiency in adults. CVID patients often present changes in the frequency and function of B lymphocytes, reduced number of Treg cells, chronic immune activation, recurrent infections, high incidence of autoimmunity and increased risk for malignancies. We hypothesized that the frequency of B10 cells would be diminished in CVID patients because these cells play an important role in the development of Treg cells and in the control of T cell activation and autoimmunity. Therefore, we evaluated the frequency of B10 cells in CVID patients and correlated it with different clinical and immunological characteristics of this disease. Forty-two CVID patients and 17 healthy controls were recruited for this study. Cryopreserved PBMCs were used for analysis of T cell activation, frequency of Treg cells and characterization of B10 cells by flow cytometry. IL-10 production by sorted B cells culture and plasma sCD14 were determined by ELISA. We found that CVID patients presented decreased frequency of IL-10-producing CD24hiCD38hi B cells in different cell culture conditions and decreased frequency of IL-10-producing CD24hiCD27+ B cells stimulated with CpG+PIB. Moreover, we found that CVID patients presented lower secretion of IL-10 by sorting-purified B cells when compared to healthy controls. The frequency of B10 cells had no correlation with autoimmunity, immune activation and Treg cells in CVID patients. This work suggests that CVID patients have a compromised regulatory B cell compartment which is not correlated with clinical and immunological characteristics presented by these individuals. PMID:26991898

  16. Analysis of common k-mers for whole genome sequences using SSB-tree.

    PubMed

    Choi, Jeong-Hyeon; Cho, Hwan-Gue

    2002-01-01

    As sequenced genomes become larger and the sequencing process becomes faster, there is a need to develop a tool to analyze sequences at the whole-genome scale. However, in-memory algorithms such as suffix tree and suffix array are not applicable to the analysis of a whole genome sequence set, since the size of an individual whole genome ranges from several million base pairs to hundreds of billions of base pairs. In order to effectively manipulate the huge sequence data, it is necessary to use an indexed data structure for external memory. In this paper, we introduce a workbench called SequeX for the analysis and visualization of whole genome sequences using SSB-tree (Static SB-tree). It consists of two parts: the analysis query subsystem and the visualization subsystem. The query subsystem supports various transactions such as pattern matching, k-occurrence, and k-mer analysis. The visualization subsystem helps biologists to easily understand whole genome structure and features through a sequence viewer, annotation viewer, CGR (Chaos Game Representation) viewer, and k-mer viewer. The system also supports a user-friendly programming interface based on Java script for batch processing and for extension to a user's specific purpose. SequeX can be used to identify conserved genes or sequences by the analysis of the common k-mers and annotation. We analyze the common k-mers for 72 microbial genomes announced by Entrez, and find an interesting biological fact: the longest common k-mer for the 72 sequences is an 11-mer, and only 11 such sequences exist. Finally we note that many common k-mers occur in conserved regions such as CDS, rRNA, and tRNA.
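
    The notion of a "common k-mer" can be made concrete with a small in-memory sketch: collect the k-mers of each sequence as a set and intersect the sets. This is not the SSB-tree index described above, which is designed for external memory; the function names and toy sequences are assumptions for this example, and the approach only suits inputs that fit in RAM.

```python
def kmers(seq, k):
    """Set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def common_kmers(seqs, k):
    """k-mers present in every sequence of a non-empty list."""
    return set.intersection(*(kmers(s, k) for s in seqs))


def longest_common_k(seqs):
    """Largest k with at least one k-mer shared by all sequences (0 if none).

    A shared (k+1)-mer implies a shared k-mer, so scanning k upward
    and stopping at the first empty intersection is sufficient.
    """
    k = 0
    while common_kmers(seqs, k + 1):
        k += 1
    return k


# Toy sequences (invented for illustration).
seqs = ["GATTACAGG", "CCATTACA", "TTACAT"]
k_max = longest_common_k(seqs)
shared = common_kmers(seqs, k_max)
```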

  17. Detecting Functional Divergence after Gene Duplication through Evolutionary Changes in Posttranslational Regulatory Sequences

    PubMed Central

    Nguyen Ba, Alex N.; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L.; Landry, Christian R.; Moses, Alan M.

    2014-01-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication. PMID:25474245

  18. Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences.

    PubMed

    Nguyen Ba, Alex N; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L; Landry, Christian R; Moses, Alan M

    2014-12-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication.

  19. Sequence Kernel Association Tests for the Combined Effect of Rare and Common Variants

    PubMed Central

    Ionita-Laza, Iuliana; Lee, Seunggeun; Makarov, Vlad; Buxbaum, Joseph D.; Lin, Xihong

    2013-01-01

    Recent developments in sequencing technologies have made it possible to uncover both rare and common genetic variants. Genome-wide association studies (GWASs) can test for the effect of common variants, whereas sequence-based association studies can evaluate the cumulative effect of both rare and common variants on disease risk. Many groupwise association tests, including burden tests and variance-component tests, have been proposed for this purpose. Although such tests do not exclude common variants from their evaluation, they focus mostly on testing the effect of rare variants by upweighting rare-variant effects and downweighting common-variant effects and can therefore lose substantial power when both rare and common genetic variants in a region influence trait susceptibility. There is increasing evidence that the allelic spectrum of risk variants at a given locus might include novel, rare, low-frequency, and common genetic variants. Here, we introduce several sequence kernel association tests to evaluate the cumulative effect of rare and common variants. The proposed tests are computationally efficient and are applicable to both binary and continuous traits. Furthermore, they can readily combine GWAS and whole-exome-sequencing data on the same individuals, when available, and are also applicable to deep-resequencing data of GWAS loci. We evaluate these tests on data simulated under comprehensive scenarios and show that compared with the most commonly used tests, including the burden and variance-component tests, they can achieve substantial increases in power. We next show applications to sequencing studies for Crohn disease and autism spectrum disorders. The proposed tests have been incorporated into the software package SKAT. PMID:23684009
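
    The upweighting of rare variants mentioned above can be illustrated with a weighted burden score. This is a sketch of the weighting idea only, not of the combined rare-and-common tests the abstract introduces. It assumes a Beta(1, 25) density evaluated at the minor-allele frequency as the weight; the function names and the toy genotypes are hypothetical.

```python
from math import gamma

def beta_weight(maf: float, a: float = 1.0, b: float = 25.0) -> float:
    """Beta(a, b) density at the minor-allele frequency (assumed weight)."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)

def burden_scores(genotypes, mafs):
    """Weighted sum of minor-allele counts (0, 1, or 2) per individual."""
    weights = [beta_weight(m) for m in mafs]
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

# Toy data: two individuals, one rare variant (MAF 0.01), one common (MAF 0.30).
scores = burden_scores([[1, 0], [0, 1]], mafs=[0.01, 0.30])
```

    With these weights the carrier of the rare allele receives a much higher score than the carrier of the common allele, which is why a test built this way can lose power when common variants also matter.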

  20. Improved K-means clustering algorithm for exploring local protein sequence motifs representing common structural property.

    PubMed

    Zhong, Wei; Altun, Gulsah; Harrison, Robert; Tai, Phang C; Pan, Yi

    2005-09-01

    Information about local protein sequence motifs is very important to the analysis of biologically significant conserved regions of protein sequences. These conserved regions can potentially determine the diverse conformation and activities of proteins. In this work, recurring sequence motifs of proteins are explored with an improved K-means clustering algorithm on a new dataset. The structural similarity of these recurring sequence clusters to produce sequence motifs is studied in order to evaluate the relationship between sequence motifs and their structures. To the best of our knowledge, the dataset used by our research is the most updated dataset among similar studies for sequence motifs. A new greedy initialization method for the K-means algorithm is proposed to improve traditional K-means clustering techniques. The new initialization method tries to choose suitable initial points, which are well separated and have the potential to form high-quality clusters. Our experiments indicate that the improved K-means algorithm satisfactorily increases the percentage of sequence segments belonging to clusters with high structural similarity. Careful comparison of sequence motifs obtained by the improved and traditional algorithms also suggests that the improved K-means clustering algorithm may discover some relatively weak and subtle sequence motifs, which are undetectable by the traditional K-means algorithms. Many biochemical tests reported in the literature show that these sequence motifs are biologically meaningful. Experimental results also indicate that the improved K-means algorithm generates more detailed sequence motifs representing common structures than previous research. Furthermore, these motifs are universally conserved sequence patterns across protein families, overcoming some weak points of other popular sequence motifs. The satisfactory result of the experiment suggests that this new K-means algorithm may be applied to other areas of bioinformatics
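
    The idea of choosing well-separated initial points can be sketched as follows. The abstract does not specify the greedy procedure, so this is one plausible reading rather than the published method: it scans points in order and accepts a point as a center only if it lies at least an assumed `min_separation` away from every center already chosen. All names and parameters are hypothetical.

```python
import math

def greedy_init(points, k, min_separation):
    """Keep each point that is far enough from all centers chosen so far."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= min_separation for c in centers):
            centers.append(p)
            if len(centers) == k:
                break
    return centers

def kmeans(points, centers, iterations=10):
    """Plain Lloyd iterations starting from the supplied centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else center
            for members, center in zip(clusters, centers)
        ]
    return centers, clusters

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
seeds = greedy_init(points, k=2, min_separation=1.0)
centers, clusters = kmeans(points, seeds)
```

    If fewer than k points satisfy the separation threshold, this sketch returns fewer centers; a real implementation would need a fallback such as lowering the threshold.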

  1. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis

    PubMed Central

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-01-01

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled. PMID:26586576
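
    The N50 statistic mentioned above has a standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch with made-up contig lengths (not data from the study):

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running sum to half the total.
        if 2 * running >= total:
            return length
    raise ValueError("n50 requires at least one contig")

# Hypothetical contig lengths, in arbitrary units.
example = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

    Because N50 depends only on the length distribution, merging short contigs into longer ones raises it even when the total assembled length is unchanged.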

  2. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    PubMed

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.

  3. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    SciTech Connect

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.; Loots, Gabriela G.; Houston, Kathryn A.; Dubchak, Inna; Speed, Terence P.; Rubin, Edward M.

    2002-01-01

    Genome-wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits, and a vast number of SNPs have been generated for this purpose. Because genotyping large numbers of SNPs is limited by cost and throughput, and because the large number of dependent tests on the same data set raises statistical issues, it has been proposed that SNPs should be prioritized based on likely functional importance to make association analysis practical. The most easily identifiable functional SNPs are coding SNPs (cSNPs), and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs, is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse and human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  4. Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes.

    PubMed

    Wood, Andrew R; Tuke, Marcus A; Nalls, Mike; Hernandez, Dena; Gibbs, J Raphael; Lin, Haoxiang; Xu, Christopher S; Li, Qibin; Shen, Juan; Jun, Goo; Almeida, Marcio; Tanaka, Toshiko; Perry, John R B; Gaulton, Kyle; Rivas, Manny; Pearson, Richard; Curran, Joanne E; Johnson, Matthew P; Göring, Harald H H; Duggirala, Ravindranath; Blangero, John; Mccarthy, Mark I; Bandinelli, Stefania; Murray, Anna; Weedon, Michael N; Singleton, Andrew; Melzer, David; Ferrucci, Luigi; Frayling, Timothy M

    2015-03-01

    Initial results from sequencing studies suggest that there are relatively few low-frequency (<5%) variants associated with large effects on common phenotypes. We performed low-pass whole-genome sequencing in 680 individuals from the InCHIANTI study to test two primary hypotheses: (i) that sequencing would detect single low-frequency-large effect variants that explained similar amounts of phenotypic variance as single common variants, and (ii) that some common variant associations could be explained by low-frequency variants. We tested two sets of disease-related common phenotypes for which we had statistical power to detect large numbers of common variant-common phenotype associations: 11 132 cis-gene expression traits in 450 individuals and 93 circulating biomarkers in all 680 individuals. From a total of 11 657 229 high-quality variants of which 6 129 221 and 5 528 008 were common and low frequency (<5%), respectively, low frequency-large effect associations comprised 7% of detectable cis-gene expression traits [89 of 1314 cis-eQTLs at P < 1 × 10^-6 (false discovery rate ∼5%)] and one of eight biomarker associations at P < 8 × 10^-10. Very few (30 of 1232; 2%) common variant associations were fully explained by low-frequency variants. Our data show that whole-genome sequencing can identify low-frequency variants undetected by genotyping based approaches when sample sizes are sufficiently large to detect substantial numbers of common variant associations, and that common variant associations are rarely explained by single low-frequency variants of large effect.

  5. Discovery of common sequences absent in the human reference genome using pooled samples from next generation sequencing.

    PubMed

    Liu, Yu; Koyutürk, Mehmet; Maxwell, Sean; Xiang, Min; Veigl, Martina; Cooper, Richard S; Tayo, Bamidele O; Li, Li; LaFramboise, Thomas; Wang, Zhenghe; Zhu, Xiaofeng; Chance, Mark R

    2014-08-16

    Sequences up to several megabases in length have been found to be present in individual genomes but absent in the human reference genome. These sequences may be common in populations, and their absence in the reference genome may indicate rare variants in the genomes of individuals who served as donors for the human genome project. As the reference genome is used in probe design for microarray technology and mapping short reads in next generation sequencing (NGS), this missing sequence could be a source of bias in functional genomic studies and variant analysis. One End Anchor (OEA) and/or orphan reads from paired-end sequencing have been used to identify novel sequences that are absent in the reference genome. However, no study has investigated the distribution, evolution and functionality of those sequences in human populations. To systematically identify and study the missing common sequences (micSeqs), we extended the previous method by pooling OEA reads from a large number of individuals and applying strict filtering methods to remove false sequences. The pipeline was applied to data from phase 1 of the 1000 Genomes Project. We identified 309 micSeqs that are present in at least 1% of the human population, but absent in the reference genome. We confirmed 76% of these 309 micSeqs by comparison to other primate genomes, individual human genomes, and gene expression data. Furthermore, we randomly selected fifteen micSeqs and confirmed their presence using PCR validation in 38 additional individuals. Functional analysis using published RNA-seq and ChIP-seq data showed that eleven micSeqs are highly expressed in human brain and three micSeqs contain transcription factor (TF) binding regions, suggesting they are functional elements. In addition, the identified micSeqs are absent in non-primates and show dynamic acquisition during primate evolution culminating with most micSeqs being present in Africans, suggesting some micSeqs may be important sources of human

  6. CIPHER: a flexible and extensive workflow platform for integrative next-generation sequencing data analysis and genomic regulatory element prediction.

    PubMed

    Guzman, Carlos; D'Orso, Iván

    2017-08-08

    Next-generation sequencing (NGS) approaches are commonly used to identify key regulatory networks that drive transcriptional programs. Although these technologies are frequently used in biological studies, NGS data analysis remains a challenging, time-consuming, and often irreproducible process. Therefore, there is a need for a comprehensive and flexible workflow platform that can accelerate data processing and analysis so more time can be spent on functional studies. We have developed an integrative, stand-alone workflow platform, named CIPHER, for the systematic analysis of several commonly used NGS datasets including ChIP-seq, RNA-seq, MNase-seq, DNase-seq, GRO-seq, and ATAC-seq data. CIPHER implements various open source software packages, in-house scripts, and Docker containers to analyze and process single-ended and pair-ended datasets. CIPHER's pipelines conduct extensive quality and contamination control checks, as well as comprehensive downstream analysis. A typical CIPHER workflow includes: (1) raw sequence evaluation, (2) read trimming and adapter removal, (3) read mapping and quality filtering, (4) visualization track generation, and (5) extensive quality control assessment. Furthermore, CIPHER conducts downstream analysis such as: narrow and broad peak calling, peak annotation, and motif identification for ChIP-seq, differential gene expression analysis for RNA-seq, nucleosome positioning for MNase-seq, DNase hypersensitive site mapping, site annotation and motif identification for DNase-seq, analysis of nascent transcription from Global-Run On (GRO-seq) data, and characterization of chromatin accessibility from ATAC-seq datasets. In addition, CIPHER contains an "analysis" mode that completes complex bioinformatics tasks such as enhancer discovery and provides functions to integrate various datasets together. Using public and simulated data, we demonstrate that CIPHER is an efficient and comprehensive workflow platform that can analyze several NGS

  7. Interspecific "common" repetitive DNA sequences in salamanders of the genus Plethodon.

    PubMed

    Mizuno, S; Andrews, C; Macgregor, H C

    1976-10-12

    Intermediate repetitive sequences of Plethodon cinereus which comprised about 30% of the genomic DNA were isolated and iodinated with 125I. About 5% of the 125I-repetitive fraction hybridized with a large excess of DNA from P. dunni at Cot 20. About half of the 125I-DNA in the hybrids was resistant to extensive digestion with S-1 nuclease. The average molecular size of the S-1 nuclease-resistant fraction was about 100 nucleotide pairs. The melting temperature of the S-1 nuclease-resistant fraction was about 2 degrees lower than that of the corresponding fraction made with P. cinereus DNA. These results are taken to indicate the presence in the genomes of P. cinereus and P. dunni of evolutionarily stable "common" repetitive sequences. The average frequency of repetition of the common repetitive sequences is about 6,000× in both species. The common repetitive fraction is also present in the genomes of other species of Plethodon, although the general populations of intermediate repetitive sequences are markedly different from one species to another. The cinereus-dunni common repetitive sequences could not be detected in plethodontids belonging to different tribes, nor in more distantly related amphibians. The profiles of binding of the common repetitive sequences to CsCl or Cs2SO4-Ag+ density gradient fractions of P. dunni DNA suggested that these sequences consisted of heterogeneous components with respect to base compositions, and that they did not include large amounts of the genes for ribosomal RNA, 5S RNA, 4S RNA, or histone messenger RNA. In situ hybridization of the 3H-labelled intermediate repetitive sequences of P. cinereus to male meiotic chromosomes of the same species gave autoradiographs after an exposure of seven days showing all 14 chromosomes labelled. The pattern of labelling appeared not to be random, but was impossible to analyse on account of the irregular shapes and different degrees of stretching of diplotene and prometaphase chromosomes. In

  8. Complete mitochondrial genome sequence of the common bean anthracnose pathogen Colletotrichum lindemuthianum.

    PubMed

    Gutiérrez, Pablo; Alzate, Juan; Yepes, Mauricio Salazar; Marín, Mauricio

    2016-01-01

    Colletotrichum lindemuthianum is the causal agent of anthracnose in common bean (Phaseolus vulgaris), one of the most limiting factors for this crop in South and Central America. In this work, the mitochondrial sequence of a Colombian isolate of C. lindemuthianum obtained from a common bean plant (var. Cargamanto) with anthracnose symptoms is presented. The mtDNA codes for 13 proteins of the respiratory chain, 1 ribosomal protein, 2 homing endonucleases, 2 ribosomal RNAs and 28 tRNAs. This is the first report of a complete mtDNA genome sequence from C. lindemuthianum.

  9. SISEQ: manipulation of multiple sequence and large database files for common platforms.

    PubMed

    Sato, N

    2000-02-01

    A multiple sequence file converter for common platforms, SISEQ, is described, which performs extraction of DNA sequences that correspond to the CDS or RNA field of a large database file as well as subsequent multi-sequence conversions for phylogenetic or molecular biological analysis. A command-line interface as well as a GUI and a script-driven operation mode are provided. The program is freely available to academic users in the form of Macintosh FAT binary, DOS executable, or UNIX source code at http://www.molbiol.saitama-u.ac.jp/~naoki/Software.html. Contact: naokisat@molbiol.saitama-u.ac.jp

  10. Complete genome sequences of two novel begomoviruses infecting common bean in Venezuela.

    PubMed

    Fiallo-Olivé, Elvira; Márquez-Martín, Belén; Hassan, Ishtiaq; Chirinos, Dorys T; Geraud-Pouey, Francis; Navas-Castillo, Jesús; Moriones, Enrique

    2013-03-01

    The complete genome sequences of isolates of two new bipartite begomoviruses (genus Begomovirus, family Geminiviridae) found infecting common bean in Venezuela are provided. The names proposed for each of these viruses are "bean yellow chlorosis virus" (BYCV) and "bean white chlorosis mosaic virus" (BWCMV). Phylogenetic analysis showed that they segregated in two distinct clades of New World begomoviruses. This is the first report of begomoviruses infecting common bean in Venezuela.

  11. Exceptionally high heterologous protein levels in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Angenon, Geert; Depicker, Ann

    2003-01-01

    Seeds are concentrated sources of protein and thus may be ideal 'bioreactors' for the production of heterologous proteins. For this application, strong seed-specific expression signals are required. A set of expression cassettes was designed using 5' and 3' regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) from Phaseolus vulgaris, and evaluated for the production of heterologous proteins in dicotyledonous plant species. A murine single-chain variable fragment (scFv) was chosen as model protein because of the current industrial interest in producing antibodies and derived fragments in crops. Because the highest scFv accumulation in seed had previously been achieved in the endoplasmic reticulum (ER), the scFv-encoding sequence was provided with signal sequences for accumulation in the ER. Transgenic Arabidopsis seed stocks, expressing the scFv under control of the 35S promoter, contained scFv accumulation levels in the range of 1% of total soluble protein (TSP). However, the seed storage promoter constructs boosted the scFv to exceptionally high levels. Maximum scFv levels were obtained in homozygous seed stocks, being 12.5% of TSP under control of the arc5-I regulatory sequences and even up to 36.5% of TSP upon replacing the arc5-I promoter by the beta-phaseolin promoter of Phaseolus vulgaris. Even at such very high levels, the scFv proteins retain their full antigen-binding activity. Moreover, the presence of very high scFv levels has only minor effects on seed germination and no effect on seed production. These results demonstrate that the expression levels of arcelin 5-I and beta-phaseolin seed storage protein genes can be transferred to heterologous proteins, giving exceptionally high levels of heterologous proteins, which can be of great value for the molecular farming industry by raising production yield and lowering bio-mass production and purification costs. Finally, the feasibility of heterologous protein production using the

  12. The immunogenicity of viral haemorragic septicaemia rhabdovirus (VHSV) DNA vaccines can depend on plasmid regulatory sequences.

    PubMed

    Chico, V; Ortega-Villaizan, M; Falco, A; Tafalla, C; Perez, L; Coll, J M; Estepa, A

    2009-03-18

    A plasmid DNA encoding the viral hemorrhagic septicaemia virus (VHSV)-G glycoprotein under the control of 5' sequences (enhancer/promoter sequence plus both non-coding 1st exon and 1st intron sequences) from the carp beta-actin gene (pAE6-G(VHSV)) was compared to the vaccine plasmid usually described, in which gene expression is regulated by the human cytomegalovirus (CMV) immediate-early promoter (pMCV1.4-G(VHSV)). We observed that these two plasmids produced a markedly different profile in the level and time of expression of the encoded antigen, and this may have a direct effect upon the intensity and suitability of the in vivo immune response. Thus, fish genetic immunisation assays were carried out to study the immune response to both plasmids. A significantly enhanced specific-antibody response against the viral glycoprotein was found in the fish immunised with pAE6-G(VHSV). However, the protective efficacy against VHSV challenge conferred by both plasmids was similar. Later analysis of the transcription profile of a set of representative immune-related genes in the DNA-immunised fish suggested that, depending on the plasmid-related regulatory sequences controlling its expression, the plasmid might activate distinct patterns of the immune system. Altogether, the results from this study mainly point out that the selection of a particular encoded-antigen/vector combination for genetic immunisation is of extraordinary importance in designing optimised DNA vaccines that, when required for inducing a protective immune response, could elicit responses biased to antigen-specific antibodies or cytotoxic T cell generation.

  13. Prediction of Protein Pairs Sharing Common Active Ligands Using Protein Sequence, Structure, and Ligand Similarity.

    PubMed

    Chen, Yu-Chen; Tolbert, Robert; Aronov, Alex M; McGaughey, Georgia; Walters, W Patrick; Meireles, Lidio

    2016-09-26

    We benchmarked the ability of comparative computational approaches to correctly discriminate protein pairs sharing a common active ligand (positive protein pairs) from protein pairs with no common active ligands (negative protein pairs). Since the target and the off-targets of a drug share at least a common ligand, i.e., the drug itself, the prediction of positive protein pairs may help identify off-targets. We evaluated representative protein-centric and ligand-centric approaches, including (1) 2D and 3D ligand similarity, (2) several measures of protein sequence similarity in conjunction with different sequence sources (e.g., full protein sequence versus binding site residues), and (3) a newly described pocket shape similarity and alignment program called SiteHopper. While the sequence-based alignment of pocket residues achieved the best overall performance, SiteHopper outperformed sequence-based approaches for unrelated proteins with only 20-30% pocket residue identity. Analogously, among ligand-centric approaches, path-based fingerprints achieved the best overall performance, but ROCS-based ligand shape similarity outperformed path-based fingerprints for structurally dissimilar ligands (Tanimoto 25%-40%). A significant drop in recognition performance was observed for ligand-centric approaches when PDB ligands were used instead of ChEMBL ligands. Finally, we analyzed the relationship between pocket shape and ligand shape in our data set and found that similar ligands tend to bind to similar pockets while similar pockets may accept a range of different-shaped ligands.
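
    The Tanimoto values quoted above measure fingerprint overlap. A minimal sketch of the coefficient follows, assuming each fingerprint is represented as the set of its "on" bit positions; the representation, the function name, and the zero returned for two empty fingerprints are illustrative choices, not details taken from the study.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(bits_a & bits_b) / union

# Two hypothetical fingerprints sharing two of four distinct bits.
similarity = tanimoto({1, 2, 3}, {2, 3, 4})
```

    On this scale, the 25%-40% range cited in the abstract corresponds to ligand pairs that share only about a quarter to two fifths of their combined fingerprint bits.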

  14. Complete Genome Sequences of Eight Rhizobium Symbionts Associated with Common Bean (Phaseolus vulgaris)

    PubMed Central

    Santamaría, Rosa Isela; Bustos, Patricia; Pérez-Carrascal, Olga María; Miranda-Sánchez, Fabiola; Vinuesa, Pablo; Martínez-Flores, Irma; Juárez, Soledad; Lozano, Luis; Martínez-Romero, Esperanza; Cevallos, Miguel Ángel; Romero, David; Dávila, Guillermo; Ormeño-Orrillo, Ernesto

    2017-01-01

    We present here the high-quality complete genome sequences of eight strains of Rhizobium-nodulating Phaseolus vulgaris. Comparative analyses showed that some of them belonged to different genomic and evolutionary lineages with common symbiotic properties. Two novel symbiotic plasmids (pSyms) with P. vulgaris specificity are reported here. PMID:28751391

  15. Complete Genome Sequences of Eight Rhizobium Symbionts Associated with Common Bean (Phaseolus vulgaris).

    PubMed

    Santamaría, Rosa Isela; Bustos, Patricia; Pérez-Carrascal, Olga María; Miranda-Sánchez, Fabiola; Vinuesa, Pablo; Martínez-Flores, Irma; Juárez, Soledad; Lozano, Luis; Martínez-Romero, Esperanza; Cevallos, Miguel Ángel; Romero, David; Dávila, Guillermo; Ormeño-Orrillo, Ernesto; González, Víctor

    2017-07-27

    We present here the high-quality complete genome sequences of eight strains of Rhizobium-nodulating Phaseolus vulgaris. Comparative analyses showed that some of them belonged to different genomic and evolutionary lineages with common symbiotic properties. Two novel symbiotic plasmids (pSyms) with P. vulgaris specificity are reported here.

  16. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Treesearch

    Shannon C.K. Straub; Mark Fishbein; Tatyana Livshult; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C. Cronn; Aaron Liston

    2011-01-01

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in...

  17. Complete genome sequence of bean leaf crumple virus, a novel begomovirus infecting common bean in Colombia.

    PubMed

    Carvajal-Yepes, Monica; Zambrano, Leidy; Bueno, Juan M; Raatz, Bodo; Cuellar, Wilmer J

    2017-02-10

    A copy of the complete genome of a novel bipartite begomovirus infecting common bean (Phaseolus vulgaris L.) in Colombia was obtained by rolling-circle amplification (RCA), cloned, and sequenced. The virus is associated with leaf crumple symptoms and significant yield losses in Andean and Mesoamerican beans. Such symptoms have been reported increasingly in Colombia since at least 2002, and we detected the virus in leaf material collected since 2008. Sequence analysis showed that the virus is a member of a distinct species, sharing 81% and 76% nucleotide (nt) sequence identity (in DNA-A and DNA-B, respectively) to other begomoviruses infecting common bean in the Americas. The data obtained support the taxonomic status of this virus (putatively named 'bean leaf crumple virus', BLCrV) as a member of a novel species in the genus Begomovirus.

  18. Candidate regulatory sequence elements for cell cycle-dependent transcription in Saccharomyces cerevisiae.

    PubMed

    Wolfsberg, T G; Gabrielian, A E; Campbell, M J; Cho, R J; Spouge, J L; Landsman, D

    1999-08-01

    Recent developments in genome-wide transcript monitoring have led to a rapid accumulation of data from gene expression studies. Such projects highlight the need for methods to predict the molecular basis of transcriptional coregulation. A microarray project identified the 420 yeast transcripts whose synthesis displays cell cycle-dependent periodicity. We present here a statistical technique we developed to identify the sequence elements that may be responsible for this cell cycle regulation. Because most gene regulatory sites contain a short string of highly conserved nucleotides, any such strings that are involved in gene regulation will occur frequently in the upstream regions of the genes that they regulate, and rarely in the upstream regions of other genes. Our strategy therefore utilizes statistical procedures to identify short oligomers, five or six nucleotides in length, that are over-represented in upstream regions of genes whose expression peaks at the same phase of the cell cycle. We report, with a high level of confidence, that 9 hexamers and 12 pentamers are over-represented in the upstream regions of genes whose expression peaks at the early G(1), late G(1), S, G(2), or M phase of the cell cycle. Some of these sequence elements show a preference for a particular orientation, and others, through a separate statistical test, for a particular position upstream of the ATG start codon. The finding that the majority of the statistically significant sequence elements are located in late G(1) upstream regions correlates with other experiments that identified the late G(1)/early S boundary as a vital cell cycle control point. Our results highlight the importance of MCB, an element implicated previously in late G(1)/early S gene regulation, as most of the late G(1) oligomers contain the MCB sequence or variations thereof. It is striking that most MCB-like sequences localize to a specific region upstream of the ATG start codon. 
    Additional sequences that we have
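
    The over-representation idea in this abstract can be illustrated with a short sketch. It is not the authors' procedure: it assumes a hypergeometric test on the number of genes whose upstream region contains a given oligomer, and the function names, the significance threshold, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import comb


def count_genes_with_kmer(upstreams, k):
    """For each k-mer, count how many upstream sequences contain it at least once."""
    counts = Counter()
    for seq in upstreams:
        seq = seq.upper()
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def hypergeom_tail(x, n, K, N):
    """P(X >= x) when n genes are drawn from N, of which K contain the k-mer."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(x, upper + 1)) / total


def overrepresented(cluster, background, k=6, alpha=0.01):
    """Return (k-mer, genes in cluster, genes in background, p) sorted by p.

    `background` is assumed to include the cluster genes themselves.
    No multiple-testing correction is applied in this sketch.
    """
    in_cluster = count_genes_with_kmer(cluster, k)
    in_all = count_genes_with_kmer(background, k)
    n, N = len(cluster), len(background)
    hits = []
    for kmer, x in in_cluster.items():
        p = hypergeom_tail(x, n, in_all[kmer], N)
        if p < alpha:
            hits.append((kmer, x, in_all[kmer], p))
    return sorted(hits, key=lambda h: h[3])
```

    Counting genes rather than raw occurrences keeps one repeat-rich upstream region from dominating the result. A real analysis would also need to correct for the number of oligomers tested.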

  19. Sequence analysis of the myosin regulatory light chain gene of the vestimentiferan Riftia pachyptila.

    PubMed

    Ravaux, J; Hassanin, A; Deutsch, J; Gaill, F; Markmann-Mulisch, U

    2001-01-24

    We have isolated and characterized a cDNA (DNA complementary to RNA) clone (Rf69) from the vestimentiferan Riftia pachyptila. The cDNA insert consists of 1169 base pairs. The amino acid sequence deduced from the longest reading frame is 193 residues in length and clearly characterizes it as a myosin regulatory light chain (RLC). The RLC primary structure is described in relation to its function in muscle contraction. The comparison with other RLCs suggested that Riftia myosin is probably regulated through its RLC by phosphorylation, like the vertebrate smooth muscle myosins, and/or by Ca2+-binding, like the mollusk myosins. Riftia RLC possesses an N-terminal extension lacking in all other species besides the earthworm Lumbricus terrestris. Amino acid sequence comparisons with a number of RLCs from vertebrates and invertebrates revealed a relatively high identity score (64%) between Riftia RLC and the homologous gene from Lumbricus. The relationships between the members of the myosin RLCs were examined by two phylogenetic methods, i.e. distance matrix and maximum parsimony. The resulting trees depict the grouping of the RLCs according to their role in myosin activity regulation. In all trees, Riftia RLC groups with RLCs that depend on Ca2+-binding for myosin activity regulation.

  20. Insights from exome sequencing in common and rare human endocrine disorders

    PubMed Central

    Dauber, Andrew; de Bruin, Christiaan

    2016-01-01

    Exome sequencing has emerged in recent years as a rapid and effective tool for the elucidation of genetic defects underlying both rare and common human disease. Increased availability and decreased costs of next generation sequencing have enabled researchers worldwide to use this approach not only in individual patients with rare diseases, but also to screen larger cohorts or populations for genetic determinants of disease. Within the field of endocrinology, exome sequencing has led to significant advancements in our understanding of numerous disorders including adrenal disease, growth and pubertal disorders, type 2 diabetes, as well as a multitude of rare genetic syndromes with prominent endocrine involvement. In this review, we aim to provide an overview of these recent new insights and discuss the role that exome sequencing is expected to play in endocrine research and clinical practice in the coming years. PMID:25963271

  1. Mining the LIPG allelic spectrum reveals the contribution of rare and common regulatory variants to HDL cholesterol.

    PubMed

    Khetarpal, Sumeet A; Edmondson, Andrew C; Raghavan, Avanthi; Neeli, Hemanth; Jin, Weijun; Badellino, Karen O; Demissie, Serkalem; Manning, Alisa K; DerOhannessian, Stephanie L; Wolfe, Megan L; Cupples, L Adrienne; Li, Mingyao; Kathiresan, Sekar; Rader, Daniel J

    2011-12-01

    Genome-wide association studies (GWAS) have successfully identified loci associated with quantitative traits, such as blood lipids. Deep resequencing studies are being utilized to catalogue the allelic spectrum at GWAS loci. The goal of these studies is to identify causative variants and missing heritability, including heritability due to low frequency and rare alleles with large phenotypic impact. Whereas rare variant efforts have primarily focused on nonsynonymous coding variants, we hypothesized that noncoding variants in these loci are also functionally important. Using the HDL-C gene LIPG as an example, we explored the effect of regulatory variants identified through resequencing of subjects at HDL-C extremes on gene expression, protein levels, and phenotype. Resequencing a portion of the LIPG promoter and 5' UTR in human subjects with extreme HDL-C, we identified several rare variants in individuals from both extremes. Luciferase reporter assays were used to measure the effect of these rare variants on LIPG expression. Variants conferring opposing effects on gene expression were enriched in opposite extremes of the phenotypic distribution. Minor alleles of a common regulatory haplotype and noncoding GWAS SNPs were associated with reduced plasma levels of the LIPG gene product endothelial lipase (EL), consistent with its role in HDL-C catabolism. Additionally, we found that a common nonfunctional coding variant associated with HDL-C (rs2000813) is in linkage disequilibrium with a 5' UTR variant (rs34474737) that decreases LIPG promoter activity. We attribute the observed association of the coding variant with plasma EL levels and HDL-C to the gene regulatory role of rs34474737. Taken together, the findings show that both rare and common noncoding regulatory variants are important contributors to the allelic spectrum in complex trait loci.

  2. Texture analysis of common renal masses in multiple MR sequences for prediction of pathology

    NASA Astrophysics Data System (ADS)

    Hoang, Uyen N.; Malayeri, Ashkan A.; Lay, Nathan S.; Summers, Ronald M.; Yao, Jianhua

    2017-03-01

    This pilot study performs texture analysis on multiple magnetic resonance (MR) images of common renal masses for differentiation of renal cell carcinoma (RCC). Bounding boxes are drawn around each mass on one axial slice in T1 delayed sequence to use for feature extraction and classification. All sequences (T1 delayed, venous, arterial, pre-contrast phases, T2, and T2 fat saturated sequences) are co-registered and texture features are extracted from each sequence simultaneously. Random forest is used to construct models to classify lesions on 96 normal regions, 87 clear cell RCCs, 8 papillary RCCs, and 21 renal oncocytomas; ground truths are verified through pathology reports. The highest performance is seen in random forest model when data from all sequences are used in conjunction, achieving an overall classification accuracy of 83.7%. When using data from one single sequence, the overall accuracies achieved for T1 delayed, venous, arterial, and pre-contrast phase, T2, and T2 fat saturated were 79.1%, 70.5%, 56.2%, 61.0%, 60.0%, and 44.8%, respectively. This demonstrates promising results of utilizing intensity information from multiple MR sequences for accurate classification of renal masses.

  3. Comprehensive functional analyses of expressed sequence tags in common wheat (Triticum aestivum).

    PubMed

    Manickavelu, Alagu; Kawaura, Kanako; Oishi, Kazuko; Shin-I, Tadasu; Kohara, Yuji; Yahiaoui, Nabila; Keller, Beat; Abe, Reina; Suzuki, Ayako; Nagayama, Taishi; Yano, Kentaro; Ogihara, Yasunari

    2012-04-01

    About 1 million expressed sequence tag (EST) sequences comprising 125.3 Mb nucleotides were accumulated from 51 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including abiotic stresses and pathogen challenges in common wheat (Triticum aestivum). Expressed sequence tags were assembled with stringent parameters after processing with in-built scripts, resulting in 37,138 contigs and 215,199 singlets. In the assembled sequences, 10.6% presented no matches with existing sequences in public databases. Functional characterization of wheat unigenes by gene ontology annotation and by mining of transcription factors, full-length cDNA, and miRNA targeting sites was carried out. A bioinformatics strategy was developed to discover single-nucleotide polymorphisms (SNPs) within our large EST resource, and the SNPs between and within (homoeologous) cultivars are reported. Digital gene expression analysis was performed to find tissue-specific gene expression, and correspondence analysis was executed to identify common and specific gene expression by selecting four biotic stress-related libraries. The assembly and associated information provide a framework for future investigation in functional genomics.

  4. Comprehensive Functional Analyses of Expressed Sequence Tags in Common Wheat (Triticum aestivum)

    PubMed Central

    Manickavelu, Alagu; Kawaura, Kanako; Oishi, Kazuko; Shin-I, Tadasu; Kohara, Yuji; Yahiaoui, Nabila; Keller, Beat; Abe, Reina; Suzuki, Ayako; Nagayama, Taishi; Yano, Kentaro; Ogihara, Yasunari

    2012-01-01

    About 1 million expressed sequence tag (EST) sequences comprising 125.3 Mb nucleotides were accumulated from 51 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including abiotic stresses and pathogen challenges in common wheat (Triticum aestivum). Expressed sequence tags were assembled with stringent parameters after processing with in-built scripts, resulting in 37,138 contigs and 215,199 singlets. In the assembled sequences, 10.6% presented no matches with existing sequences in public databases. Functional characterization of wheat unigenes by gene ontology annotation and by mining of transcription factors, full-length cDNA, and miRNA targeting sites was carried out. A bioinformatics strategy was developed to discover single-nucleotide polymorphisms (SNPs) within our large EST resource, and the SNPs between and within (homoeologous) cultivars are reported. Digital gene expression analysis was performed to find tissue-specific gene expression, and correspondence analysis was executed to identify common and specific gene expression by selecting four biotic stress-related libraries. The assembly and associated information provide a framework for future investigation in functional genomics. PMID:22334568

  5. Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites

    PubMed Central

    Hemberg, Martin; Gray, Jesse M.; Cloonan, Nicole; Kuersten, Scott; Grimmond, Sean; Greenberg, Michael E.; Kreiman, Gabriel

    2012-01-01

    More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNAse-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10 to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements. PMID:22684627
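
    The comparison described in this abstract, counting how many conserved islands coincide with one annotation set versus another, reduces to interval overlap. The sketch below is only an illustration under simplifying assumptions: all intervals lie on a single chromosome, coordinates are half-open (start, end) pairs, and the function name is hypothetical rather than taken from the paper.

```python
from bisect import bisect_left


def count_overlapping(islands, features):
    """Count islands that overlap at least one feature.

    Intervals are half-open (start, end) tuples on the same chromosome.
    """
    feats = sorted(features)
    starts = [s for s, _ in feats]
    # Running maximum of feature ends, so that long features which start
    # early are still found when shorter ones follow them.
    max_end, running = [], float("-inf")
    for _, e in feats:
        running = max(running, e)
        max_end.append(running)

    hits = 0
    for s, e in islands:
        i = bisect_left(starts, e)  # feats[:i] are the features starting before e
        if i > 0 and max_end[i - 1] > s:
            hits += 1
    return hits
```

    Running the function once against binding sites and once against ncRNA annotations, then taking the ratio of the two counts, gives the kind of "N times more likely" figure quoted in the abstract.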

  6. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    PubMed

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  7. Transient and stable GFP expression in germ cells by the vasa regulatory sequences from the red seabream (Pagrus major).

    PubMed

    Lin, Fan; Liu, Qinghua; Li, Mingyou; Li, Zhendong; Hong, Ni; Li, Jun; Hong, Yunhan

    2012-01-01

    Primordial germ cells (PGCs) are the precursors of gametes responsible for genetic transmission to the next generation. They provide an ideal system for cryopreservation and restoration of biodiversity. Recently, considerable attention has been paid to visualizing, isolating and transplanting PGCs within and between species. In fish, stable PGC visualization in live embryos and individuals has been limited to laboratory fish models such as medaka and zebrafish. One exception is the rainbow trout, which represents the only species with aquaculture importance that has GFP-labeled germ cells throughout development. PGCs can be transiently labeled by embryonic injection of mRNA containing the green fluorescent protein gene (GFP) and the 3'-untranslated region (3'-UTR) of a maternal germ gene such as vasa, nos1, etc. Stable PGC labeling can be achieved through production of transgenic animals using transcriptional regulatory sequences from germ genes, such as the vasa promoter and 3'-UTR. In this study, we report the functional analyses of the red seabream vasa (Pmvas) regulatory sequences, using medaka as a model system. Injection of GFP-Pmvas3'UTR mRNA was shown to label medaka PGCs during embryogenesis. In addition, we constructed the pPmvasGFP transgenic vector and established a stable transgenic medaka line exhibiting GFP expression in germ cells, including PGCs and mitotic and meiotic germ cells of both sexes, under control of the Pmvas transcriptional regulatory sequences. It is concluded that the Pmvas regulatory sequences examined in this study are sufficient for germ cell expression and labeling.

  8. Suffix tree searcher: exploration of common substrings in large DNA sequence sets.

    PubMed

    Minkley, David; Whitney, Michael J; Lin, Song-Han; Barsky, Marina G; Kelly, Chris; Upton, Chris

    2014-07-23

    Large DNA sequence data sets require special bioinformatics tools to search and compare them. Such tools should be easy to use so that the data can be easily accessed by a wide array of researchers. In the past, the use of suffix trees for searching DNA sequences has been limited by a practical need to keep the trees in RAM. Newer algorithms solve this problem by using disk-based approaches. However, none of the fastest suffix tree algorithms have been implemented with a graphical user interface, preventing their incorporation into a feasible laboratory workflow. Suffix Tree Searcher (STS) is designed as an easy-to-use tool to index, search, and analyze very large DNA sequence datasets. The program accommodates very large numbers of very large sequences, with aggregate size reaching tens of billions of nucleotides. The program makes use of pre-sorted persistent "building blocks" to reduce the time required to construct new trees. STS is comprised of a graphical user interface written in Java, and four C modules. All components are automatically downloaded when a web link is clicked. The underlying suffix tree data structure permits extremely fast searching for specific nucleotide strings, with wild cards or mismatches allowed. Complete tree traversals for detecting common substrings are also very fast. The graphical user interface allows the user to transition seamlessly between building, traversing, and searching the dataset. Thus, STS provides a new resource for the detection of substrings common to multiple DNA sequences or within a single sequence, for truly huge data sets. The re-searching of sequence hits, allowing wild card positions or mismatched nucleotides, together with the ability to rapidly retrieve large numbers of sequence hits from the DNA sequence files, provides the user with an efficient method of evaluating the similarity between nucleotide sequences by multiple alignment or use of Logos. The ability to re-use existing suffix tree pieces
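
    To make "substrings common to multiple DNA sequences" concrete, here is a small sketch. It deliberately does not implement a suffix tree and is unrelated to the STS code: it swaps in a simpler set-intersection approach with a binary search on substring length, which is only practical for small inputs. The function names are hypothetical.

```python
def common_substrings_of_length(seqs, k):
    """Return the set of length-k substrings present in every sequence."""
    shared = None
    for seq in seqs:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        shared = kmers if shared is None else shared & kmers
        if not shared:
            return set()
    return shared or set()


def longest_common_substrings(seqs):
    """Find the longest substring(s) shared by all sequences in a non-empty list.

    Binary search is valid because a shared substring of length k
    implies a shared substring of every shorter length.
    """
    lo, hi, best = 1, min(map(len, seqs)), set()
    while lo <= hi:
        mid = (lo + hi) // 2
        found = common_substrings_of_length(seqs, mid)
        if found:
            best, lo = found, mid + 1
        else:
            hi = mid - 1
    return best
```

    This approach rebuilds a set of substrings for every candidate length, so memory and time grow quickly with sequence size. That cost is the practical reason an indexed structure such as a suffix tree is preferred for the data volumes the abstract describes.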

  9. Development of taxon-specific sequences of common wheat for the detection of genetically modified wheat.

    PubMed

    Iida, Mayu; Yamashiro, Satomi; Yamakawa, Hirohito; Hayakawa, Katsuyuki; Kuribara, Hideo; Kodama, Takashi; Furui, Satoshi; Akiyama, Hiroshi; Maitani, Tamio; Hino, Akihiro

    2005-08-10

    Qualitative and quantitative Polymerase Chain Reaction (PCR) systems aimed at the specific detection and quantification of common wheat DNA are described. Many countries have issued regulations to label foods that include genetically modified organisms (GMOs). PCR technology is widely recognized as a reliable and useful technique for the qualitative and quantitative detection of GMOs. Detection methods are needed to amplify a target GM gene, and the amplified results should be compared with those of the corresponding taxon-specific reference gene to obtain reliable results. This paper describes the development of a specific DNA sequence in the waxy-D1 gene for common wheat (Triticum aestivum L.) and the design of a specific primer pair and TaqMan probe on the waxy-D1 gene for PCR analysis. The primers amplified a product (Wx012) of 102 bp. The results indicate that the Wx012 DNA sequence is specific to common wheat, showing homogeneity in qualitative PCR results and very similar quantification accuracy across 19 distantly related common wheat varieties. In Southern blot and real-time PCR analyses, this sequence was present as a single copy or at a low copy number. In addition, by qualitative and quantitative PCR using wx012 primers and a wx012-T probe, the limits of detection of the common wheat genome were found to be about 15 copies, and the reproducibility was reliable. Consequently, the PCR system using wx012 primers and the wx012-T probe is considered suitable for use as a taxon-specific reference gene for common wheat in DNA analyses, including GMO tests.

  10. Using mitochondrial nucleotide sequences to investigate diversity and genealogical relationships within common carp (Cyprinus carpio L.).

    PubMed

    Thai, B T; Burridge, C P; Pham, T A; Austin, C M

    2005-02-01

    Direct sequencing of mitochondrial DNA (mtDNA) D-loop (745 bp) and MTATPase6/MTATPase8 (857 bp) regions was used to investigate genetic variation within common carp and develop a global genealogy of common carp strains. The D-loop region was more variable than the MTATPase6/MTATPase8 region, but given the wide distribution of carp the overall levels of sequence divergence were low. Levels of haplotype diversity varied widely among countries with Chinese, Indonesian and Vietnamese carp showing the greatest diversity whereas Japanese Koi and European carp had undetectable nucleotide variation. A genealogical analysis supports a close relationship between Vietnamese, Koi and Chinese Color carp strains and to a lesser extent, European carp. Chinese and Indonesian carp strains were the most divergent, and their relationships do not support the evolution of independent Asian and European lineages and current taxonomic treatments.
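
    The abstract reports haplotype diversity and nucleotide variation without stating the estimators used. As an illustration only, the sketch below assumes the widely used definitions: Nei's haplotype diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies), and nucleotide diversity as the mean number of pairwise differences per site. The function names and inputs are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a sample of haplotype labels (n >= 2)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for aligned, equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)
```

    Under these definitions, a sample in which every individual carries the same sequence gives zero for both measures, which matches the "undetectable nucleotide variation" reported for Japanese Koi and European carp.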

  11. Optimum designs for next-generation sequencing to discover rare variants for common complex disease.

    PubMed

    Shi, Gang; Rao, D C

    2011-09-01

    Recent advances in next-generation sequencing technologies make it affordable to search for rare and functional variants for common complex diseases systematically. We investigated strategies for enriching rare variants in the samples selected for sequencing so as to optimize the power for their discovery. In particular, we investigated the roles of alternative sources of enrichment in families through computer simulations. We showed that linkage information, extreme phenotype, and nonrandom ascertainment, such as multiply affected families, constitute different sources for enriching rare and functional variants in a sequencing study design. Linkage is well known to have limited power for detecting small genetic effects, and hence not considered to be a powerful tool for discovering variants for common complex diseases. However, those families with some degree of family-specific linkage evidence provide an effective sampling strategy to sub-select the most linkage-informative families for sequencing. Compared with selecting subjects with extreme phenotypes, linkage evidence performs better with larger families, while extreme-phenotype method is more efficient with smaller families. Families with multiple affected siblings were found to provide the largest enrichment of rare variants. Finally, we showed that combined strategies, such as selecting linkage-informative families from multiply affected families, provide much higher enrichment of rare functional variants than either strategy alone.

  12. Regulatory codes of conduct and the common law. Part 2: confidentiality.

    PubMed

    Fullbrook, Suzanne

    In Part One, three aspects of the principles that underpin the law of confidentiality were identified from a review of case law. Public interest(s), public safety and the protection of vulnerable people were identified as producing a matrix whereby health providers could see that the rules relating to confidentiality were viewed by all in society as being of the utmost importance. This article concentrates on the codes of conduct that two regulatory bodies have produced to guide the practice of health professionals. The General Medical Council (GMC) and the Nursing and Midwifery Council (NMC) have codes of conduct that are very similar in their guidance. This is not surprising given that the central importance of confidentiality is reflected at the highest possible levels of judicial and political thinking.

  13. Safety and regulatory review of dyes commonly used as excipients in pharmaceutical and nutraceutical applications.

    PubMed

    Pérez-Ibarbia, Leire; Majdanski, Tobias; Schubert, Stephanie; Windhab, Norbert; Schubert, Ulrich S

    2016-10-10

    Color selection is one of the key elements in building strong brand and product identity in the pharmaceutical industry, and it also helps to prevent counterfeiting. Moreover, colored pharmaceutical dosage forms may increase patient compliance and enhance therapy. Although most synthetic dyes are classified as safe, they are regulated more strictly than other classes of excipients. Safety concerns have increased in recent years, but efforts to switch to natural dyes do not seem promising because of their instability, and the development of "non-toxic" dyes is still a challenge. This review focuses specifically on the issues related to dye selection and summarizes the current regulatory status. It provides an in-depth overview of toxicological data in the public domain to support compliance with regulatory and safety standards for successful product development. In addition, synthetic strategies are provided to covalently bind dyes to polymers to possibly overcome toxicity issues. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Understanding the Effects of Users' Behaviors on Effectiveness of Different Exogenous Regulatory Common Pool Resource Management Institutions

    NASA Astrophysics Data System (ADS)

    Madani, K.; Dinar, A.

    2013-12-01

    Tragedy of the commons is generally recognized as one of the possible destinies for common pool resources (CPRs). To avoid the tragedy of the commons and prolong the life of CPRs, users may show different behavioral characteristics and use different rationales for CPR planning and management. Furthermore, regulators may adopt different strategies for sustainable management of CPRs. The effectiveness of different regulatory exogenous management institutions cannot be evaluated through conventional CPR models since they assume that either users base their behavior on individual rationality and adopt a selfish behavior (Nash behavior), or that the users seek the system's optimal solution without giving priority to their own interests. Therefore, conventional models fail to reliably predict the outcome of CPR problems in which parties may have a range of behavioral characteristics, putting them somewhere in between the two types of behaviors traditionally considered. This work examines the effectiveness of different regulatory exogenous CPR management institutions through a user-based model (as opposed to a system-based model). The new modeling framework allows for consideration of sensitivity of the results to different behavioral characteristics of interacting CPR users. The suggested modeling approach is applied to a benchmark groundwater management problem. Results indicate that some well-known exogenous management institutions (e.g. taxing) are ineffective in sustainable management of CPRs in most cases. Bankruptcy-based management can be helpful, but determination of the fair level of cutbacks remains challenging under this type of institution. Furthermore, some bankruptcy rules such as the Constrained Equal Award (CEA) method are more beneficial to wealthier users, failing to establish social justice. Quota-based and CPR status-based management perform as the most promising and robust regulatory exogenous institutions in prolonging the CPR's life and
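
    The Constrained Equal Award rule named in this abstract has a standard definition in the bankruptcy literature: each claimant receives the smaller of their claim and a common cap, with the cap chosen so that the awards exhaust the available resource. The sketch below implements that textbook rule, not the authors' groundwater model. Treating claims as water demands and the estate as the allowable withdrawal is an assumption made here for illustration, as is the function name.

```python
def constrained_equal_awards(claims, estate):
    """Split `estate` so each claimant gets min(claim, cap), awards summing to estate.

    Awards are returned in the same order as `claims`. If the estate covers
    all claims, every claim is met in full.
    """
    if estate >= sum(claims):
        return list(claims)
    awards = [0.0] * len(claims)
    remaining = estate
    # Serve the smallest claims first: each gets an equal share of what is
    # left unless its claim is smaller, in which case the surplus is passed on.
    order = sorted(range(len(claims)), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        share = remaining / (len(claims) - pos)
        awards[i] = min(claims[i], share)
        remaining -= awards[i]
    return awards
```

    With claims of 100, 200, and 300 and an estate of 450, the cap is 175, so the awards are 100, 175, and 175. How the rule's distributional effects should be judged in a groundwater setting is the subject of the study itself and is not something this sketch can settle.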

  15. Unstable microsatellite repeats facilitate rapid evolution of coding and regulatory sequences.

    PubMed

    Jansen, A; Gemayel, R; Verstrepen, K J

    2012-01-01

    Tandem repeats are intrinsically highly variable sequences since repeat units are often lost or gained during replication or following unequal recombination events. Because of their low complexity and their instability, these repeats, which are also called satellite repeats, are often considered to be useless 'junk' DNA. However, recent findings show that tandem repeats are frequently found within promoters of stress-induced genes and within the coding regions of genes encoding cell-surface and regulatory proteins. Interestingly, frequent changes in these repeats often confer phenotypic variability. Examples include variation in the microbial cell surface, rapid tuning of internal molecular clocks in flies, and enhanced morphological plasticity in mammals. This suggests that instead of being useless junk DNA, some variable tandem repeats are useful functional elements that confer 'evolvability', facilitating swift evolution and rapid adaptation to changing environments. Since changes in repeats are frequent and reversible, repeats provide a unique type of mutation that bridges the gap between rare genetic mutations, such as single nucleotide polymorphisms, and highly unstable but reversible epigenetic inheritance.

  16. Direct interaction of the Polycomb protein with Antennapedia regulatory sequences in polytene chromosomes of Drosophila melanogaster.

    PubMed Central

    Zink, B; Engström, Y; Gehring, W J; Paro, R

    1991-01-01

    The Polycomb (Pc) gene is responsible for the elaboration and maintenance of the expression pattern of the homeotic genes during development of Drosophila. In mutant Pc- embryos, homeotic transcripts are ectopically expressed, leading to abdominal transformations in all segments. From this it was suggested that Pc+ acts as a repressor of homeotic gene transcription. We have mapped the cis-acting control sequences of the homeotic Antennapedia (Antp) gene regulated by Pc. Using Antp P1 and P2 promoter fragments linked to the E. coli lacZ reporter gene we show different expression patterns of beta-galactosidase (beta-gal) in transformed Pc+ and Pc- embryos. In addition we are able to visualize by immunocytochemical techniques on polytene chromosomes the direct binding of the Pc protein to the transposed cis-regulatory promoter fragments. However, short Antp P1 promoter constructs which are, due to position effects, ectopically activated in salivary glands do not reveal a Pc binding signal. PMID:1671215

  17. Deep sequencing-based identification of small regulatory RNAs in Synechocystis sp. PCC 6803.

    PubMed

    Xu, Wen; Chen, Hui; He, Chen-Liu; Wang, Qiang

    2014-01-01

    Synechocystis sp. PCC 6803 is a genetically tractable model organism for photosynthesis research. The genome of Synechocystis sp. PCC 6803 consists of a circular chromosome and seven plasmids. The importance of small regulatory RNAs (sRNAs) as mediators of a number of cellular processes in bacteria has begun to be recognized. However, little is known regarding sRNAs in Synechocystis sp. PCC 6803. To provide a comprehensive overview of sRNAs in this model organism, the sRNAs of Synechocystis sp. PCC 6803 were analyzed using deep sequencing, and 7,951,189 reads were obtained. High quality mapping reads (6,127,890) were mapped onto the genome and assembled into 16,192 transcribed regions (clusters) based on read overlap. A total number of 5211 putative sRNAs were revealed from the genome and the 4 megaplasmids, and 27 of these molecules, including four from plasmids, were confirmed by RT-PCR. In addition, possible target genes regulated by all of the putative sRNAs identified in this study were predicted by IntaRNA and analyzed for functional categorization and biological pathways, which provided evidence that sRNAs are indeed involved in many different metabolic pathways, including basic metabolic pathways, such as glycolysis/gluconeogenesis, the citrate cycle, fatty acid metabolism and adaptations to environmentally stress-induced changes. The information from this study provides a valuable reservoir for understanding the sRNA-mediated regulation of the complex physiology and metabolic processes of cyanobacteria.

  18. Epigenetic Modifications of Distinct Sequences of the p1 Regulatory Gene Specify Tissue-Specific Expression Patterns in Maize

    PubMed Central

    Sekhon, Rajandeep S.; Peterson, Thomas; Chopra, Surinder

    2007-01-01

    Tandemly repeated endogenous genes are common in plants, but their transcriptional regulation is not well characterized. In maize, the P1-wr allele of pericarp color1 is composed of multiple copies arranged in a head-to-tail fashion. P1-wr confers a white kernel pericarp and red cob glume pigment phenotype that is stably inherited over generations. To understand the molecular mechanisms that regulate tissue-specific expression of P1-wr, we have characterized P1-wr*, a spontaneous loss-of-function epimutation that shows a white kernel pericarp and white cob glume phenotype. As compared to its progenitor P1-wr, the P1-wr* is hypermethylated in exon 1 and intron 2 regions. In the presence of the epigenetic modifier Ufo1 (Unstable factor for orange1), P1-wr* plants exhibit a range of cob glume pigmentation whereas pericarps remain colorless. In these plants, the level of cob pigmentation directly correlates with the degree of DNA demethylation in the intron 2 region of p1. Further, genomic bisulfite sequencing indicates that a 168-bp region of intron 2 is significantly hypomethylated in both CG and CNG context in P1-wr* Ufo1 plants. Interestingly, P1-wr* Ufo1 plants did not show any methylation change in a distal enhancer region that has previously been implicated in Ufo1-induced gain of pericarp pigmentation of the P1-wr allele. These results suggest that distinct regulatory sequences in the P1-wr promoter and intron 2 regions can undergo independent epigenetic modifications to generate tissue-specific expression patterns. PMID:17179091

  19. Molecular cloning and amino acid sequence of human plakoglobin, the common junctional plaque protein

    SciTech Connect

    Franke, W.W.; Goldschmidt, M.D.; Zimbelmann, R.; Mueller, H.M.; Schiller, D.L.; Cowin, P.

    1989-06-01

    Plakoglobin is a major cytoplasmic protein that occurs in a soluble and a membrane-associated form and is the only known constituent common to the submembranous plaques of both kinds of adhering junctions, the desmosomes and the intermediate junctions. Using a partial cDNA clone for bovine plakoglobin, the authors isolated cDNAs encoding human plakoglobin, determined its nucleotide sequence, and deduced the complete amino acid sequence. The polypeptide encoded by the cDNA was synthesized by in vitro transcription and translation and identified by its comigration with authentic plakoglobin in two-dimensional gel electrophoresis. The identity was further confirmed by comparison of the deduced sequence with the directly determined amino acid sequence of two fragments from bovine plakoglobin. Analysis of the plakoglobin sequence showed the protein to be unrelated to any other known proteins, highly conserved between human and bovine tissues, and characterized by numerous changes between hydrophilic and hydrophobic sections. Only one kind of plakoglobin mRNA was found in most tissues, but an additional mRNA was detected in certain human tumor cell lines. This longer mRNA may be represented by a second type of plakoglobin cDNA, which contains an insertion of 297 nucleotides in the 3' noncoding region.

  20. Selected heterozygosity at cis-regulatory sequences increases the expression homogeneity of a cell population in humans.

    PubMed

    Sung, Min Kyung; Jang, Juneil; Lee, Kang Seon; Ghim, Cheol-Min; Choi, Jung Kyoon

    2016-07-28

    Examples of heterozygote advantage in humans are scarce and limited to protein-coding sequences. Here, we attempt a genome-wide functional inference of advantageous heterozygosity at cis-regulatory regions. The single-nucleotide polymorphisms bearing the signatures of balancing selection are enriched in active cis-regulatory regions of immune cells and epithelial cells, the latter of which provide barrier function and innate immunity. Examples associated with ancient trans-specific balancing selection are also discovered. Allelic imbalance in chromatin accessibility and divergence in transcription factor motif sequences indicate that these balanced polymorphisms cause distinct regulatory variation. However, a majority of these variants show no association with the expression level of the target gene. Instead, single-cell experimental data for gene expression and chromatin accessibility demonstrate that heterozygous sequences can lower cell-to-cell variability in proportion to selection strengths. This negative correlation is more pronounced for highly expressed genes and consistently observed when using different data and methods. Based on mathematical modeling, we hypothesize that extrinsic noise from fluctuations in transcription factor activity may be amplified in homozygotes, whereas it is buffered in heterozygotes. While high expression levels are coupled with intrinsic noise reduction, regulatory heterozygosity can contribute to the suppression of extrinsic noise. This mechanism may confer a selective advantage by increasing cell population homogeneity and thereby enhancing the collective action of the cells, especially of those involved in the defense systems in humans.

  1. Development of polymorphic expressed sequence tag-single sequence repeat markers in the common Chinese cuttlefish, Sepiella maindroni.

    PubMed

    Li, R H; Lu, S K; Zhang, C L; Song, W W; Mu, C K; Wang, C L

    2014-07-25

    The common Chinese cuttlefish (Sepiella maindroni) is one of the popular edible cephalopods consumed across Asia. To facilitate population genetic investigation of this species, we developed fourteen polymorphic microsatellite markers from expressed sequence tags of S. maindroni. The number of alleles at each locus ranged from 6 to 10, with an average of 7.9 alleles per locus. The observed and expected heterozygosity ranged from 0.615 to 0.962 and from 0.685 to 0.888, respectively. Four loci were found to deviate significantly from Hardy-Weinberg equilibrium. The polymorphism information content ranged from 0.638 to 0.833. These polymorphic microsatellite loci will be helpful for population genetic, genetic linkage map, and other genetic studies of S. maindroni.
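
    The expected heterozygosity and polymorphism information content (PIC) reported above are standard per-locus summaries of allele frequencies. The sketch below shows how they are commonly computed, assuming the usual definitions (He = 1 - Σp², and PIC as defined by Botstein et al.); the record does not state which software or estimator the authors used, and the allele frequencies here are hypothetical illustration values, not data from the study.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with four equally frequent alleles (not from the study).
example_freqs = [0.25, 0.25, 0.25, 0.25]
he = expected_heterozygosity(example_freqs)
pic = polymorphism_information_content(example_freqs)
print(f"He = {he:.4f}, PIC = {pic:.4f}")
```

    PIC is never larger than the expected heterozygosity at the same locus, which is consistent with the reported PIC range sitting slightly below the expected heterozygosity range.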

  2. Common DNA sequences with potential for detection of genetically manipulated organisms in food.

    PubMed

    MacCormick, C A; Griffin, H G; Underwood, H M; Gasson, M J

    1998-06-01

    Foods produced by genetic engineering technology are now appearing on the market and many more are likely to emerge in the future. The safety aspects, regulation, and labelling of these foods are still contentious issues in most countries and recent surveys highlight consumer concerns about the safety and labelling of genetically modified food. In most countries it is necessary to have approval for the use of genetically manipulated organisms (GMOs) in the production of food. In order to police regulations, a technology to detect such foods is desirable. In addition, a requirement to label approved genetically modified food would necessitate a monitoring system. One solution is to 'tag' approved GMOs with some form of biological or genetic marker, permitting the surveillance of foods for the presence of approved products of genetic engineering. While non-approved GMOs would not be detected by such a surveillance, they might be detected by a screen for DNA sequences common to all or most GMOs. This review focuses on the potential of using common DNA sequences as detection probes for GMOs. The identification of vector sequences, plant transcription terminators, and marker genes by PCR and hybridization techniques is discussed.

  3. Common features in structures and sequences of sandwich-like proteins

    PubMed Central

    Kister, Alexander E.; Finkelstein, Alexei V.; Gelfand, Israel M.

    2002-01-01

    The goal of this work is to define the structural and sequence features common to sandwich-like proteins (SPs), a group of very different proteins now comprising 69 superfamilies in 38 protein folds. Analysis of the arrangements of strands within main sandwich sheets revealed a rigorously defined constraint on the supersecondary substructure that holds true for 94% of known SP structures. The invariant substructure consists of two interlocked pairs of neighboring β-strands. It is even more typical for centers of SP than the well-known “Greek key” strands arrangement for their edges. As homology among these proteins is not usually detectable even with the most powerful sequence-comparing algorithms, we employed a structure-based approach to sequence alignment. Within the interlocked strands we found 12 positions with fixed structural roles in SP. A residue at any of these positions possesses similar structural properties with residues in the same position of other SPs. The 12 positions lie at the center of the interface between the β-sheets and form the common geometrical core of SPs. Of the 12 positions, 8 are occupied by only four hydrophobic residues in 80% of all SPs. PMID:12384574

  4. Whole-genome sequencing identifies common-to-rare variants associated with human blood metabolites.

    PubMed

    Long, Tao; Hicks, Michael; Yu, Hung-Chun; Biggs, William H; Kirkness, Ewen F; Menni, Cristina; Zierer, Jonas; Small, Kerrin S; Mangino, Massimo; Messier, Helen; Brewerton, Suzanne; Turpaz, Yaron; Perkins, Brad A; Evans, Anne M; Miller, Luke A D; Guo, Lining; Caskey, C Thomas; Schork, Nicholas J; Garner, Chad; Spector, Tim D; Venter, J Craig; Telenti, Amalio

    2017-04-01

    Genetic factors modifying the blood metabolome have been investigated through genome-wide association studies (GWAS) of common genetic variants and through exome sequencing. We conducted a whole-genome sequencing study of common, low-frequency and rare variants to associate genetic variations with blood metabolite levels using comprehensive metabolite profiling in 1,960 adults. We focused the analysis on 644 metabolites with consistent levels across three longitudinal data collections. Genetic sequence variations at 101 loci were associated with the levels of 246 (38%) metabolites (P ≤ 1.9 × 10⁻¹¹). We identified 113 (10.7%) among 1,054 unrelated individuals in the cohort who carried heterozygous rare variants likely influencing the function of 17 genes. Thirteen of the 17 genes are associated with inborn errors of metabolism or other pediatric genetic conditions. This study extends the map of loci influencing the metabolome and highlights the importance of heterozygous rare variants in determining abnormal blood metabolic phenotypes in adults.

  5. Identification of parasite DNA in common bile duct stones by PCR and DNA sequencing

    PubMed Central

    Jang, Ji Sun; Kim, Kyung Ho; Yu, Jae-Ran

    2007-01-01

    We attempted to identify parasite DNA in human biliary stones via PCR and DNA sequencing. Genomic DNA was isolated from each of 15 common bile duct (CBD) stones and 5 gallbladder (GB) stones. The patients with CBD stones suffered from cholangitis, and the patients with GB stones had acute cholecystitis. The 28S and 18S rDNA genes were amplified successfully from 3 and/or 1 common bile duct stone samples, and then cloned and sequenced. The 28S and 18S rDNA sequences were highly conserved among isolates. Identity of the obtained 28S D1 rDNA with that of Clonorchis sinensis was higher than 97.6%, and identity of the 18S rDNA with that of other Ascarididae was 97.9%. Almost no intra-specific variations were detected in the 28S and 18S rDNA, with the exception of a few nucleotide variations, i.e., substitutions and deletions. These findings suggest that C. sinensis and Ascaris lumbricoides may be related to biliary stone formation and development. PMID:18165713

  6. Sequencing of SCN5A identifies rare and common variants associated with cardiac conduction

    PubMed Central

    Magnani, Jared W.; Brody, Jennifer A.; Prins, Bram P.; Arking, Dan E.; Lin, Honghuang; Yin, Xiaoyan; Liu, Ching-Ti; Morrison, Alanna C.; Zhang, Feng; Spector, Tim D.; Alonso, Alvaro; Bis, Joshua C.; Heckbert, Susan R.; Lumley, Thomas; Sitlani, Colleen M.; Cupples, L. Adrienne; Lubitz, Steven A.; Soliman, Elsayed Z.; Pulit, Sara L.; Newton-Cheh, Christopher; O'Donnell, Christopher J.; Ellinor, Patrick T.; Benjamin, Emelia J.; Muzny, Donna M.; Gibbs, Richard A.; Santibanez, Jireh; Taylor, Herman A.; Rotter, Jerome I.; Lange, Leslie A.; Psaty, Bruce M.; Jackson, Rebecca; Rich, Stephen S.; Boerwinkle, Eric; Jamshidi, Yalda; Sotoodehnia, Nona

    2014-01-01

    Background The cardiac sodium channel SCN5A regulates atrioventricular and ventricular conduction. Genetic variants in this gene are associated with PR and QRS intervals. We sought to further characterize the contribution of rare and common coding variation in SCN5A to cardiac conduction. Methods and Results In the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study (CHARGE), we performed targeted exonic sequencing of SCN5A (n=3699, European-ancestry individuals) and identified 4 common (minor allele frequency >1%) and 157 rare variants. Common and rare SCN5A coding variants were examined for association with PR and QRS intervals through meta-analysis of European ancestry participants from CHARGE, NHLBI’s Exome Sequencing Project (ESP, n=607) and the UK10K (n=1275) and by examining ESP African-ancestry participants (N=972). Rare coding SCN5A variants in aggregate were associated with PR interval in European and African-ancestry participants (P=1.3×10⁻³). Three common variants were associated with PR and/or QRS interval duration among European-ancestry participants and one among African-ancestry participants. These included two well-known missense variants: rs1805124 (H558R) was associated with PR and QRS shortening in European-ancestry participants (P=6.25×10⁻⁴ and P=5.2×10⁻³ respectively) and rs7626962 (S1102Y) was associated with PR shortening in those of African ancestry (P=2.82×10⁻³). Among European-ancestry participants, two novel synonymous variants, rs1805126 and rs6599230, were associated with cardiac conduction. Our top signal, rs1805126 was associated with PR and QRS lengthening (P=3.35×10⁻⁷ and P=2.69×10⁻⁴ respectively), and rs6599230 was associated with PR shortening (P=2.67×10⁻⁵). Conclusions By sequencing SCN5A, we identified novel common and rare coding variants associated with cardiac conduction. PMID:24951663

  7. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    SciTech Connect

    Macke, J.P.; Nathans, J.; King, V.L.; Hu, N.; Hu, S.; Hamer, D.; Bailey, M.; Brown, T.

    1993-10-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, Ser205-to-Arg and Glu793-to-Asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  8. MISCORE: a new scoring function for characterizing DNA regulatory motifs in promoter sequences

    PubMed Central

    2012-01-01

    Background Computational approaches for finding DNA regulatory motifs in promoter sequences are useful to biologists because they reduce experimental costs and speed up the discovery of de novo binding sites. It is important for rule-based or clustering-based motif searching schemes to effectively and efficiently evaluate the similarity between a k-mer (a k-length subsequence) and a motif model, without assuming the independence of nucleotides in motif models and without employing computationally expensive Markov chain models to estimate the background probabilities of k-mers. It is also beneficial to use a priori knowledge in developing advanced searching tools. Results This paper presents a new scoring function, termed MISCORE, for functional motif characterization and evaluation. MISCORE is free from (i) any assumption on model dependency and (ii) the use of a Markov chain model for background modeling. It integrates the compositional complexity of motif instances into the function. Performance evaluations with comparison to the well-known Maximum a Posteriori (MAP) score and Information Content (IC) have shown that MISCORE has promising capabilities to separate and recognize functional DNA motifs and their instances from non-functional ones. Conclusions MISCORE is a fast computational tool for candidate motif characterization, evaluation and selection. It allows a priori known motif models to be embedded for computing motif-to-motif similarity, which is an advantage over IC and the MAP score. In addition, MISCORE can automatically filter out some repetitive k-mers from a motif model because compositional complexity is built into the function. Consequently, the merits of MISCORE in terms of both motif signal modeling power and computational efficiency will make it more applicable in the development of computational motif discovery tools. PMID:23282090
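
    For orientation, the Information Content (IC) baseline that the abstract compares against can be sketched as follows. This is not MISCORE itself, whose formula the record does not give; it is a minimal illustration of IC, assuming a uniform background over A, C, G and T, no pseudocounts, and column independence (the very assumption MISCORE is said to avoid). The function name and the example k-mers are hypothetical.

```python
import math
from collections import Counter


def information_content(instances):
    """Total IC (in bits) of a motif built from equal-length aligned k-mers.

    Assumes a uniform DNA background, so each column contributes
    2 - H(column), where H is the Shannon entropy of the observed bases.
    """
    if not instances or len({len(s) for s in instances}) != 1:
        raise ValueError("instances must be non-empty and of equal length")
    n = len(instances)
    total = 0.0
    for column in zip(*instances):  # iterate over alignment columns
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total


# Hypothetical aligned motif instances (k = 4).
conserved = ["ACGT", "ACGT"]
variable = ["ACGT", "ACGA"]
print(information_content(conserved), information_content(variable))
```

    A fully conserved column contributes the maximum of 2 bits, while a column split evenly between two bases contributes 1 bit, so more variable motifs score lower.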

  9. Common 5' beta-globin RFLP haplotypes harbour a surprising level of ancestral sequence mosaicism.

    PubMed

    Webster, Matthew T; Clegg, John B; Harding, Rosalind M

    2003-07-01

    Blocks of linkage disequilibrium (LD) in the human genome represent segments of ancestral chromosomes. To investigate the relationship between LD and genealogy, we analysed diversity associated with restriction fragment length polymorphism (RFLP) haplotypes of the 5' beta-globin gene complex. Genealogical analyses were based on sequence alleles that spanned a 12.2-kb interval, covering 3.1 kb around the psibeta gene and 6.2 kb of the delta-globin gene and its 5' flanking sequence known as the R/T region. Diversity was sampled from a Kenyan Luo population where recent malarial selection has contributed to substantial LD. A single common sequence allele spanning the 12.2-kb interval exclusively identified the ancestral chromosome bearing the "Bantu" beta(s) (sickle-cell) RFLP haplotype. Other common 5' RFLP haplotypes comprised interspersed segments from multiple ancestral chromosomes. Nucleotide diversity was similar between psibeta and R/T-delta-globin but was non-uniformly distributed within the R/T-delta-globin region. High diversity associated with the 5' R/T identified two ancestral lineages that probably date back more than 2 million years. Within this genealogy, variation has been introduced into the 3' R/T by gene conversion from other ancestral chromosomes. Diversity in delta-globin was found to lead through parts of the main genealogy but to coalesce in a more recent ancestor. The well-known recombination hotspot is clearly restricted to the region 3' of delta-globin. Our analyses show that, whereas one common haplotype in a block of high LD represents a long segment from a single ancestral chromosome, others are mosaics of short segments from multiple ancestors related in genealogies of unsuspected complexity.

  10. Nucleotide sequence and functional analysis of regulatory region of the lumP and the lux operon from Photobacterium leiognathi.

    PubMed

    Lin, J W; Chao, Y F; Weng, S F

    1995-05-25

    The lumP gene is linked to the lux operon but is transcribed in the opposite direction in Photobacterium leiognathi PL741. The gene order of lumP and the lux operon is <-lumP-R&R-luxC-luxD-luxA-luxB-luxN-luxE-> (R&R: regulatory region). The nucleotide sequence of the regulatory region (827 bp) between lumP and the lux operon was determined. Sequence analysis shows that the regulatory region includes two divergent promoter systems: the PR-promoter system for the lux operon (R-operon) and the PL-promoter system for the lumP or lum operon (L-operon). Functional analysis of the regulatory region shows that the PR- and PL-promoter systems are both able to drive gene expression. Deletion experiments indicate that the PR- and PL-promoters are coordinately and negatively regulated; they might compete for recognition by RNA polymerase to initiate transcription. The fact that LumP is responsible for the spectral blue shift in P. leiognathi implies that the lumP gene, closely linked to the lux operon, is coordinately regulated with it. In addition, glucose repression of the PR-promoter system shows that expression of the lux operon is regulated by cAMP-CRP induction in E. coli.

  11. Complete nucleotide sequence and affinities of the genomic RNA of Narcissus common latent virus (genus Carlavirus).

    PubMed

    Zheng, H-Y; Chen, J; Adams, M J; Chen, J-P

    2006-08-01

    The complete sequence of an isolate of Narcissus common latent virus (NCLV) from Zhangzhou city, Fujian, China was determined from amplified fragments of purified viral RNA. Excluding the poly(A) tail, the genomic RNA of NCLV was 8539 nucleotides (nt) long and had the typical organization for a member of the genus Carlavirus. The most closely related species were Potato virus M, Hop latent virus and Aconitum latent virus, which had 58-59% nt identity to NCLV in their entire genomes. These relationships were confirmed by a phylogenetic analysis using a composite nucleotide alignment of all the open reading frames.

  12. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    PubMed Central

    2011-01-01

    Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives.

  13. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    PubMed

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first

  14. Omeprazole transactivates human CYP1A1 and CYP1A2 expression through the common regulatory region containing multiple xenobiotic-responsive elements.

    PubMed

    Yoshinari, Kouichi; Ueda, Rika; Kusano, Kazutomi; Yoshimura, Tsutomu; Nagata, Kiyoshi; Yamazoe, Yasushi

    2008-07-01

    Omeprazole induces human CYP1A1 and CYP1A2 in human hepatoma cells and human liver. Aryl hydrocarbon receptor (AHR) has been shown to be involved in this induction. However, its precise molecular mechanism remains unknown because omeprazole activates AHR without binding it directly, in contrast to typical AHR ligands such as 3-methylcholanthrene (3MC) and beta-naphthoflavone (BNF). The human CYP1A1 and CYP1A2 genes are located in a head-to-head orientation, sharing a 5'-flanking region of about 23 kb. Recently, we succeeded in measuring CYP1A1 and CYP1A2 transcriptional activities simultaneously using dual reporter gene constructs containing the 23 kb sequence. In this study, transient transfection assays were performed using a number of single and dual reporter constructs to identify the omeprazole-responsive region for CYP1A1 and CYP1A2 induction. Reporter assays with deletion constructs demonstrated that the omeprazole-induced expression of both CYP1A1 and CYP1A2 is mediated via the common regulatory region containing multiple AHR-binding motifs (the nucleotides from -464 to -1829 of human CYP1A1), which is identical to the region for BNF and 3MC induction. Interestingly, omeprazole activated the transcription of CYP1A1 and CYP1A2 to similar extents, while BNF and 3MC preferentially induced CYP1A1 expression. We also found that primaquine is an omeprazole-like CYP1A inducer, while lansoprazole and albendazole are 3MC/BNF-like in terms of the CYP1A1/CYP1A2 preference. The present results suggest that omeprazole, like BNF and 3MC, activates both human CYP1A1 and CYP1A2 expression through the common regulatory region, although omeprazole may involve a different cellular signal(s) from BNF and 3MC.

  15. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  16. Sequencing Y Chromosomes Resolves Discrepancy in Time to Common Ancestor of Males versus Females

    PubMed Central

    Poznik, G. David; Henn, Brenna M.; Yee, Muh-Ching; Sliwerska, Elzbieta; Euskirchen, Ghia M.; Lin, Alice A.; Snyder, Michael; Quintana-Murci, Lluis; Kidd, Jeffrey M.; Underhill, Peter A.; Bustamante, Carlos D.

    2014-01-01

    The Y chromosome and the mitochondrial genome (mtDNA) have been used to estimate when the common patrilineal and matrilineal ancestors of humans lived. We sequenced the genomes of 69 males from nine populations, including two in which we find basal branches of the Y chromosome tree. We identify ancient phylogenetic structure within African haplogroups and resolve a long-standing ambiguity deep within the tree. Applying equivalent methodologies to the Y and mtDNA, we estimate the time to the most recent common ancestor (TMRCA) of the Y chromosome to be 120–156 thousand years and the mtDNA TMRCA to be 99–148 ky. Our findings suggest that, contrary to prior claims, male lineages do not coalesce significantly more recently than female lineages. PMID:23908239

  17. Polymorphism in the bovine BOLA-DRB3 upstream regulatory regions detected through PCR-SSCP and DNA sequencing.

    PubMed

    Ripoli, M V; Peral-García, P; Dulout, F N; Giovambattista, G

    2004-09-15

    In the present work, we describe through polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) and DNA sequencing the polymorphism within the URR-BoLA-DRB3 in 15 cattle breeds. In total, seven PCR-SSCP defined alleles were detected. The alignment of studied sequences showed six polymorphic sites (four transitions, one transversion and one deletion) in the interconsensus regions of the BoLA-DRB3 upstream regulatory region (URR), while the consensus boxes were invariant. Five out of six detected polymorphic sites were of one nucleotide substitution in the interconsensus regions. It is expected that these mutations do not affect significantly the level of expression. In contrast, the deletion observed in the sequence between CCAAT and TATA boxes could have some effect on affinity interactions between the promoter region and the transcription factors. The URR-BoLA-DRB3 DNA analyzed sequences showed moderate level of nucleotide diversity, high level of identity among them and were grouped in the same clade in the phylogenetic tree. In addition, the phylogenetic tree, the similarity analysis and the sequence structure confirmed that the fragment analyzed in this study corresponds to the URR-BoLA-DRB3. The functional role of the observed polymorphic sites among the regulatory motifs in bovine needs to be analyzed and confirmed by means of gene expression assays.

  18. MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes

    PubMed Central

    Pavesi, Giulio; Mereghetti, Paolo; Zambelli, Federico; Stefani, Marco; Mauri, Giancarlo; Pesole, Graziano

    2006-01-01

    Understanding the complex mechanisms regulating gene expression at the transcriptional and post-transcriptional levels is one of the greatest challenges of the post-genomic era. The MoD (MOtif Discovery) Tools web server comprises a set of tools for the discovery of novel conserved sequence and structure motifs in nucleotide sequences, motifs that in turn are good candidates for regulatory activity. The server includes the following programs: Weeder, for the discovery of conserved transcription factor binding sites (TFBSs) in nucleotide sequences from co-regulated genes; WeederH, for the discovery of conserved TFBSs and distal regulatory modules in sequences from homologous genes; RNAProfile, for the discovery of conserved secondary structure motifs in unaligned RNA sequences whose secondary structure is not known. In this way, a given gene can be compared with other co-regulated genes or with its homologs, or its mRNA can be analyzed for conserved motifs regulating its post-transcriptional fate. The web server thus provides researchers with different strategies and methods to investigate the regulation of gene expression, at both the transcriptional and post-transcriptional levels. Available at and . PMID:16845071

  19. Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes

    PubMed Central

    Gorkin, David U.; Lee, Dongwon; Reed, Xylena; Fletez-Brant, Christopher; Bessling, Seneca L.; Loftus, Stacie K.; Beer, Michael A.; Pavan, William J.; McCallion, Andrew S.

    2012-01-01

    We take a comprehensive approach to the study of regulatory control of gene expression in melanocytes that proceeds from large-scale enhancer discovery facilitated by ChIP-seq; to rigorous validation in silico, in vitro, and in vivo; and finally to the use of machine learning to elucidate a regulatory vocabulary with genome-wide predictive power. We identify 2489 putative melanocyte enhancer loci in the mouse genome by ChIP-seq for EP300 and H3K4me1. We demonstrate that these putative enhancers are evolutionarily constrained, enriched for sequence motifs predicted to bind key melanocyte transcription factors, located near genes relevant to melanocyte biology, and capable of driving reporter gene expression in melanocytes in culture (86%; 43/50) and in transgenic zebrafish (70%; 7/10). Next, using the sequences of these putative enhancers as a training set for a supervised machine learning algorithm, we develop a vocabulary of 6-mers predictive of melanocyte enhancer function. Lastly, we demonstrate that this vocabulary has genome-wide predictive power in both the mouse and human genomes. This study provides deep insight into the regulation of gene expression in melanocytes and demonstrates a powerful approach to the investigation of regulatory sequences that can be applied to other cell types. PMID:23019145
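
    The abstract above describes a vocabulary of 6-mers used as features for a supervised learner. As a minimal sketch of the feature-extraction step only (not the authors' pipeline; the function names `kmer_counts` and `kmer_vector` are hypothetical, and the classifier itself is omitted), a sequence can be turned into a normalized 6-mer frequency vector like this:

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=6):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_vector(seq, k=6):
    """Return k-mer frequencies over the full 4**k vocabulary, in sorted order."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    # Assumption: plain frequency normalization; other weightings are possible.
    return [counts[m] / total if total else 0.0 for m in vocab]


if __name__ == "__main__":
    # Hypothetical toy sequence, not data from the study.
    print(kmer_counts("ACGTACGTAC").most_common(1))
```

    Vectors of this kind could then be passed to any standard classifier; which learner and which k-mer weighting the study used is not stated in the abstract.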

  20. Common sequence variants in the LOXL1 gene in pigment dispersion syndrome and pigmentary glaucoma.

    PubMed

    Giardina, Emiliano; Oddone, Francesco; Lepre, Tiziana; Centofanti, Marco; Peconi, Cristina; Tanga, Lucia; Quaranta, Luciano; Frezzotti, Paolo; Novelli, Giuseppe; Manni, Gianluca

    2014-04-16

    Single nucleotide polymorphisms (SNPs) within the LOXL1 gene are associated with pseudoexfoliation syndrome and pseudoexfoliation glaucoma. The aim of our study is to investigate a potential involvement of the LOXL1 gene in the pathogenesis of pigment dispersion syndrome (PDS) and pigmentary glaucoma (PG). A cohort of Caucasian origin of 84 unrelated and clinically well-characterised patients with PDS/PG and 200 control subjects were included in the study. Genomic DNA from whole blood was extracted and the coding and regulatory regions of the LOXL1 gene were resequenced in both patients and controls to identify unknown sequence variations. Genotype and haplotype analyses were performed with UNPHASED software. The expression levels of LOXL1 were determined on cDNA from peripheral blood lymphocytes by quantitative real-time RT-PCR. A significant allele association was detected for SNP rs2304722 within the fifth intron of LOXL1 (odds ratio (OR) = 2.43, p-value = 3.05e-2). Haplotype analysis revealed the existence of risk and protective haplotypes associated with PG-PDS (OR = 3.35; p-value = 1.00e-5 and OR = 3.35; p-value = 1.00e-4, respectively). Expression analysis suggests that associated haplotypes can regulate the expression level of LOXL1. Haplotypes of LOXL1 are associated with PG-PDS independently from rs1048661, leading to a differential expression of the transcript.

  1. Common sequence variants in the LOXL1 gene in pigment dispersion syndrome and pigmentary glaucoma

    PubMed Central

    2014-01-01

    Background Single nucleotide polymorphisms (SNPs) within the LOXL1 gene are associated with pseudoexfoliation syndrome and pseudoexfoliation glaucoma. The aim of our study is to investigate a potential involvement of the LOXL1 gene in the pathogenesis of pigment dispersion syndrome (PDS) and pigmentary glaucoma (PG). Methods A cohort of Caucasian origin of 84 unrelated and clinically well-characterised patients with PDS/PG and 200 control subjects were included in the study. Genomic DNA from whole blood was extracted and the coding and regulatory regions of the LOXL1 gene were resequenced in both patients and controls to identify unknown sequence variations. Genotype and haplotype analyses were performed with UNPHASED software. The expression levels of LOXL1 were determined on cDNA from peripheral blood lymphocytes by quantitative real-time RT-PCR. Results A significant allele association was detected for SNP rs2304722 within the fifth intron of LOXL1 (odds ratio (OR) = 2.43, p-value = 3.05e-2). Haplotype analysis revealed the existence of risk and protective haplotypes associated with PG-PDS (OR = 3.35; p-value = 1.00e-5 and OR = 3.35; p-value = 1.00e-4, respectively). Expression analysis suggests that associated haplotypes can regulate the expression level of LOXL1. Conclusions Haplotypes of LOXL1 are associated with PG-PDS independently from rs1048661, leading to a differential expression of the transcript. PMID:24739284

  2. Isolation of expressed sequences from the region commonly deleted in Velo-cardio-facial syndrome

    SciTech Connect

    Sirotkin, H.; Morrow, B.; DasGupta, R.

    1994-09-01

    Velo-cardio-facial syndrome (VCFS) is a relatively common autosomal dominant genetic disorder characterized by cleft palate, cardiac abnormalities, learning disabilities and a characteristic facial dysmorphology. Most VCFS patients have interstitial deletions of 22q11 of 1-2 Mb. In an effort to isolate the gene(s) responsible for VCFS, we have utilized a hybrid selection protocol to recover expressed sequences from three non-overlapping YACs comprising almost 1 Mb of the commonly deleted region. Total yeast genomic DNA or isolated YAC DNA was immobilized on Hybond-N filters, blocked with yeast and human ribosomal and human repetitive sequences and hybridized with a mixture of random primed short fragment cDNA libraries. Six human short fragment libraries derived from total fetus, fetal brain, adult brain, testes, thymus and spleen have been used for the selections. Short fragment cDNAs retained on the filter were passed through a second round of selection and cloned into lambda gt10. cDNAs shown to originate from the YACs and from chromosome 22 are being used to isolate full length cDNAs. Three genes known to be present on these YACs, catechol-O-methyltransferase, tuple 1 and clathrin heavy chain, have been recovered. Additionally, a gene related to the murine p120 gene and a number of novel short cDNAs have been isolated. The role of these genes in VCFS is being investigated.

  3. Predisposition gene identification in common cancers by exome sequencing: insights from familial breast cancer

    PubMed Central

    Snape, Katie; Ruark, Elise; Tarpey, Patrick; Renwick, Anthony; Turnbull, Clare; Seal, Sheila; Murray, Anne; Hanks, Sandra; Douglas, Jenny; Stratton, Michael R.; Rahman, Nazneen

    2013-01-01

    The genetic component of breast cancer predisposition remains largely unexplained. Candidate-gene case-control resequencing has identified predisposition genes characterised by rare, protein truncating mutations that confer moderate risks of disease. In theory, exome sequencing should yield additional genes of this class. Here, we explore the feasibility and design considerations of this approach. We performed exome sequencing in 50 individuals with familial breast cancer, applying frequency and protein function filters to identify variants most likely to be pathogenic. We identified 867,378 variants that passed the call quality filters of which 1,296 variants passed the frequency and protein truncation filters. The median number of validated, rare, protein truncating variants (PTVs) was 10 in individuals with, and without, mutations in known genes. The functional candidacy of mutated genes was similar in both groups. Without prior knowledge, the known genes would not have been recognisable as breast cancer predisposition genes. Everyone carries multiple rare mutations that are plausibly related to disease. Exome sequencing in common conditions will therefore require intelligent sample and variant prioritisation strategies in large case-control studies to deliver robust genetic evidence of disease association. PMID:22527104

  4. Reprint of: The effectiveness of common thermo-regulatory behaviours in a cool temperate grasshopper.

    PubMed

    Harris, Rebecca M B; McQuillan, Peter; Hughes, Lesley

    2015-12-01

    Behavioural thermoregulation has the potential to alleviate the short-term impacts of climate change on some small ectotherms, without the need for changes to species distributions or genetic adaptation. We illustrate this by measuring the effect of behaviour in a cool temperate species of grasshopper (Phaulacridium vittatum) over a range of spatial and temporal scales in laboratory and natural field experiments. Microhabitat selection at the site scale was tested in free-ranging grasshoppers and related to changing thermal quality over a daily period. Artificial warming experiments were then used to measure the temperature at which common thermoregulatory behaviours are initiated and the subsequent reductions in body temperature. Behavioural means such as timing of activity, choice of substrates with optimum surface temperatures, shade seeking and postural adjustments (e.g. stilting, vertical orientation) were found to be highly effective at maintaining preferred body temperature. The maximum voluntarily tolerated temperature (MVT) was determined to be 44 °C ± 0.4 °C, indicating the upper bounds of thermal flexibility in this species. Behavioural thermoregulation effectively enables small ectotherms to regulate exposure to changing environmental temperatures and utilize the spatially and temporally heterogeneous environments they occupy. Species such as the wingless grasshopper, although adapted to cool temperate conditions, are likely to be well equipped to respond successfully to coarse scale climate change.

  5. The effectiveness of common thermo-regulatory behaviours in a cool temperate grasshopper.

    PubMed

    Harris, Rebecca M B; McQuillan, Peter; Hughes, Lesley

    2015-08-01

    Behavioural thermoregulation has the potential to alleviate the short-term impacts of climate change on some small ectotherms, without the need for changes to species distributions or genetic adaptation. We illustrate this by measuring the effect of behaviour in a cool temperate species of grasshopper (Phaulacridium vittatum) over a range of spatial and temporal scales in laboratory and natural field experiments. Microhabitat selection at the site scale was tested in free-ranging grasshoppers and related to changing thermal quality over a daily period. Artificial warming experiments were then used to measure the temperature at which common thermoregulatory behaviours are initiated and the subsequent reductions in body temperature. Behavioural means such as timing of activity, choice of substrates with optimum surface temperatures, shade seeking and postural adjustments (e.g. stilting, vertical orientation) were found to be highly effective at maintaining preferred body temperature. The maximum voluntarily tolerated temperature (MVT) was determined to be 44°C±0.4°C, indicating the upper bounds of thermal flexibility in this species. Behavioural thermoregulation effectively enables small ectotherms to regulate exposure to changing environmental temperatures and utilize the spatially and temporally heterogeneous environments they occupy. Species such as the wingless grasshopper, although adapted to cool temperate conditions, are likely to be well equipped to respond successfully to coarse scale climate change.

  6. Identification of expressed resistance gene-like sequences by data mining in 454-derived transcriptomic sequences of common bean (Phaseolus vulgaris L.)

    PubMed Central

    2012-01-01

    Background Common bean (Phaseolus vulgaris L.) is one of the most important legumes in the world. Several diseases severely reduce bean production and quality; therefore, it is very important to better understand disease resistance in common bean in order to prevent these losses. More than 70 resistance (R) genes which confer resistance against various pathogens have been cloned from diverse plant species. Most R genes share highly conserved domains which facilitates the identification of new candidate R genes from the same species or other species. The goals of this study were to isolate expressed R gene-like sequences (RGLs) from 454-derived transcriptomic sequences and expressed sequence tags (ESTs) of common bean, and to develop RGL-tagged molecular markers. Results A data-mining approach was used to identify tentative P. vulgaris R gene-like sequences from approximately 1.69 million 454-derived sequences and 116,716 ESTs deposited in GenBank. A total of 365 non-redundant sequences were identified and named as common bean (P. vulgaris = Pv) resistance gene-like sequences (PvRGLs). Among the identified PvRGLs, about 60% (218 PvRGLs) were from 454-derived sequences. Reverse transcriptase-polymerase chain reaction (RT-PCR) analysis confirmed that PvRGLs were actually expressed in the leaves of common bean. Upon comparison to P. vulgaris genomic sequences, 105 (28.77%) of the 365 tentative PvRGLs could be integrated into the existing common bean physical map. Based on the syntenic blocks between common bean and soybean, 237 (64.93%) PvRGLs were anchored on the P. vulgaris genetic map and will need to be mapped to determine order. In addition, 11 sequence-tagged-site (STS) and 19 cleaved amplified polymorphic sequence (CAPS) molecular markers were developed for 25 unique PvRGLs. 
    Conclusions In total, 365 PvRGLs were successfully identified from 454-derived transcriptomic sequences and ESTs available in GenBank and about 65% of PvRGLs were integrated into the common…

  7. Identification of expressed resistance gene-like sequences by data mining in 454-derived transcriptomic sequences of common bean (Phaseolus vulgaris L.).

    PubMed

    Liu, Zhanji; Crampton, Mollee; Todd, Antonette; Kalavacharla, Venu

    2012-03-23

    Common bean (Phaseolus vulgaris L.) is one of the most important legumes in the world. Several diseases severely reduce bean production and quality; therefore, it is very important to better understand disease resistance in common bean in order to prevent these losses. More than 70 resistance (R) genes which confer resistance against various pathogens have been cloned from diverse plant species. Most R genes share highly conserved domains which facilitates the identification of new candidate R genes from the same species or other species. The goals of this study were to isolate expressed R gene-like sequences (RGLs) from 454-derived transcriptomic sequences and expressed sequence tags (ESTs) of common bean, and to develop RGL-tagged molecular markers. A data-mining approach was used to identify tentative P. vulgaris R gene-like sequences from approximately 1.69 million 454-derived sequences and 116,716 ESTs deposited in GenBank. A total of 365 non-redundant sequences were identified and named as common bean (P. vulgaris = Pv) resistance gene-like sequences (PvRGLs). Among the identified PvRGLs, about 60% (218 PvRGLs) were from 454-derived sequences. Reverse transcriptase-polymerase chain reaction (RT-PCR) analysis confirmed that PvRGLs were actually expressed in the leaves of common bean. Upon comparison to P. vulgaris genomic sequences, 105 (28.77%) of the 365 tentative PvRGLs could be integrated into the existing common bean physical map. Based on the syntenic blocks between common bean and soybean, 237 (64.93%) PvRGLs were anchored on the P. vulgaris genetic map and will need to be mapped to determine order. In addition, 11 sequence-tagged-site (STS) and 19 cleaved amplified polymorphic sequence (CAPS) molecular markers were developed for 25 unique PvRGLs. In total, 365 PvRGLs were successfully identified from 454-derived transcriptomic sequences and ESTs available in GenBank and about 65% of PvRGLs were integrated into the common bean genetic map. 
    A total of 30…

  8. Commonality.

    ERIC Educational Resources Information Center

    Beaton, Albert E., Jr.

    Commonality analysis is an attempt to understand the relative predictive power of the regressor variables, both individually and in combination. The squared multiple correlation is broken up into elements assigned to each individual regressor and to each possible combination of regressors. The elements have the property that the appropriate sums…
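
    For the simplest case of two regressors, the partition described above can be written out directly. The sketch below is an illustration under stated assumptions, not the report's own notation: it uses the standard two-predictor identities (unique part of x1 = R²(x1,x2) − R²(x2); unique part of x2 = R²(x1,x2) − R²(x1); common part = R²(x1) + R²(x2) − R²(x1,x2)), and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def commonality_two(y, x1, x2):
    """Split R^2 of y on (x1, x2) into two unique parts and one common part."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # Squared multiple correlation for two predictors (requires |r12| < 1).
    r2_full = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return {
        "r2_full": r2_full,
        "unique_x1": r2_full - r2 ** 2,
        "unique_x2": r2_full - r1 ** 2,
        "common": r1 ** 2 + r2 ** 2 - r2_full,
    }


if __name__ == "__main__":
    # Hypothetical toy data with uncorrelated predictors.
    x1 = [1, -1, 1, -1]
    x2 = [1, 1, -1, -1]
    y = [3, -1, 1, -3]  # y = 2*x1 + x2
    print(commonality_two(y, x1, x2))
```

    By construction the three elements sum to the full squared multiple correlation; with more than two regressors the number of elements grows as 2^k − 1, which this sketch does not cover.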

  9. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and a rSNP (rs113067944) residing on NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  10. IgH sequences in common variable immune deficiency reveal altered B cell development and selection

    PubMed Central

    Roskin, Krishna M.; Simchoni, Noa; Liu, Yi; Lee, Ji-Yeun; Seo, Katie; Hoh, Ramona A.; Pham, Tho; Park, Joon H.; Furman, David; Dekker, Cornelia L.; Davis, Mark M.; James, Judith A.; Nadeau, Kari C.; Cunningham-Rundles, Charlotte; Boyd, Scott D.

    2015-01-01

    Common variable immune deficiency (CVID) is the most common symptomatic primary immune deficiency, affecting ∼1 in 25,000 persons. These patients suffer from impaired antibody responses, autoimmunity, and susceptibility to lymphoid cancers. To explore the cellular basis for these clinical phenotypes, we conducted high-throughput DNA sequencing of immunoglobulin heavy chain gene rearrangements from 93 CVID patients and 105 control subjects and sorted naïve and memory B cells from 13 of the CVID patients and 10 of the control subjects. CVID patients showed abnormal VDJ rearrangement and abnormal formation of complementarity determining region 3 (CDR3). We observed decreased selection against antibodies with long CDR3 regions in memory repertoires and decreased V gene replacement, offering possible mechanisms for increased patient autoreactivity. Our data indicate that patient immunodeficiency might derive both from decreased diversity of the naïve B cell pool and decreased somatic hypermutation in memory repertoires. CVID patients also exhibited abnormal clonal expansion of unmutated B cells relative to controls. Although impaired B cell germinal center activation is commonly viewed as causative in CVID, these data indicate that CVID B cells diverge from controls as early as the pro-B cell stage and suggest possible explanations for the increased incidence of autoimmunity, immunodeficiency, and lymphoma in CVID patients. PMID:26311730

  11. Identification of novel craniofacial regulatory domains located far upstream of SOX9 and disrupted in Pierre Robin sequence

    PubMed Central

    Gordon, Christopher T.; Attanasio, Catia; Bhatia, Shipra; Benko, Sabina; Ansari, Morad; Tan, Tiong Y.; Munnich, Arnold; Pennacchio, Len A.; Abadie, Véronique; Temple, I. Karen; Goldenberg, Alice; van Heyningen, Veronica; Amiel, Jeanne; FitzPatrick, David; Kleinjan, Dirk A.; Visel, Axel; Lyonnet, Stanislas

    2015-01-01

    Mutations in the coding sequence of SOX9 cause campomelic dysplasia (CD), a disorder of skeletal development associated with 46,XY disorders of sex development (DSDs). Translocations, deletions and duplications within a ~2 Mb region upstream of SOX9 can recapitulate the CD-DSD phenotype fully or partially, suggesting the existence of an unusually large cis-regulatory control region. Pierre Robin sequence (PRS) is a craniofacial disorder that is frequently an endophenotype of CD and a locus for isolated PRS at ~1.2-1.5 Mb upstream of SOX9 has been previously reported. The craniofacial regulatory potential within this locus, and within the greater genomic domain surrounding SOX9, remains poorly defined. We report two novel deletions upstream of SOX9 in families with PRS, allowing refinement of the regions harbouring candidate craniofacial regulatory elements. In parallel, ChIP-Seq for p300 binding sites in mouse craniofacial tissue led to the identification of several novel craniofacial enhancers at the SOX9 locus, which were validated in transgenic reporter mice and zebrafish. Notably, some of the functionally validated elements fall within the PRS deletions. These studies suggest that multiple non-coding elements contribute to the craniofacial regulation of SOX9 expression, and that their disruption results in PRS. PMID:24934569

  12. Common neural mechanism for processing onset-to-onset intervals and silent gaps in sound sequences.

    PubMed

    Takegata, R; Syssoeva, O; Winkler, I; Paavilainen, P; Näätänen, R

    2001-06-13

    Stimulus onset asynchrony (SOA) and inter-stimulus interval (ISI) are important factors in the perceptual organization of sound sequences. The present study tested whether these two temporal parameters are independently processed in the auditory system. Independence was studied by testing the additivity of mismatch negativity (MMN). Four conditions differing in their temporal regularities were administered: (1) constant SOA and ISI, (2) constant SOA and variable ISI, (3) constant ISI and variable SOA, and (4) variable SOA and ISI. The MMN elicited by simultaneous deviance from the constant SOA and ISI (Condition 1) was compared with an additive model calculated from the MMNs elicited in the other conditions. The amplitude of the MMN in Condition 1 was significantly larger than that of the modeled MMN, suggesting that SOA and ISI are processed by interactive or common neural mechanisms.

  13. Common recognition principles across diverse sequence and structural families of sialic acid binding proteins.

    PubMed

    Bhagavat, Raghu; Chandra, Nagasuma

    2014-01-01

    Sialic acids form a large family of 9-carbon monosaccharides and are integral components of glycoconjugates. They are known to bind to a wide range of receptors belonging to diverse sequence families and fold classes and are key mediators in a plethora of cellular processes. Thus, it is of great interest to understand the features that give rise to such a recognition capability. Structural analysis of a non-redundant data set of known sialic acid binding proteins was carried out, including exhaustive binding site comparisons and site alignments using in-house algorithms, followed by clustering and tree computation, which led to the derivation of sialic acid recognition principles. Although the proteins in the data set belong to several sequence and structure families, their binding sites could be grouped into only six types. Structural comparison of the binding sites indicates that all sites contain one or more different combinations of key structural features over a common scaffold. The six binding site types thus serve as structural motifs for recognizing sialic acid. Scanning the motifs against a non-redundant set of binding sites from PDB indicated the motifs to be specific for sialic acid recognition. Knowledge of determinants obtained from this study will be useful for detecting function in unknown proteins. As an example analysis, a genome-wide scan for the motifs in structures of the Mycobacterium tuberculosis proteome identified 17 hits that contain combinations of the features, suggesting a possible function of sialic acid binding by these proteins.

  14. Multiple sequence assembly from reads alignable to a common reference genome.

    PubMed

    Peng, Qian; Smith, Andrew D

    2011-01-01

    We describe a set of computational problems motivated by certain analysis tasks in genome resequencing. These are assembly problems for which multiple distinct sequences must be assembled, but where the relative positions of reads to be assembled are already known. This information is obtained from a common reference genome and is characteristic of resequencing experiments. The simplest variant of the problem aims at determining a minimum set of superstrings such that each sequenced read matches at least one superstring. We give an algorithm with time complexity O(N), where N is the sum of the lengths of reads, substantially improving on previous algorithms for solving the same problem. We also examine the problem of finding the smallest number of reads to remove such that the remaining reads are consistent with k superstrings. By exploiting a surprising relationship with the minimum cost flow problem, we show that this problem can be solved in polynomial time when nested reads are excluded. If nested reads are permitted, this problem of removing the minimum number of reads becomes NP-hard. We show that permitting mismatches between reads and their nearest superstrings generally renders these problems NP-hard.

  15. Enhancer Sequence Variants and Transcription Factor Deregulation Synergize to Construct Pathogenic Regulatory Circuits in B Cell Lymphoma

    PubMed Central

    Koues, Olivia I.; Kowalewski, Rodney A.; Chang, Li-Wei; Pyfrom, Sarah C.; Schmidt, Jennifer A.; Luo, Hong; Sandoval, Luis E.; Hughes, Tyler B.; Bednarski, Jeffrey J.; Cashen, Amanda F.; Payton, Jacqueline E.; Oltz, Eugene M.

    2014-01-01

    Summary Most B cell lymphomas arise in the germinal center (GC), where humoral immune responses evolve from potentially oncogenic cycles of mutation, proliferation, and clonal selection. Although lymphoma gene expression diverges significantly from GC-B cells, underlying mechanisms that alter the activities of corresponding regulatory elements (REs) remain elusive. Here we define the complete pathogenic circuitry of human follicular lymphoma (FL), which activates or decommissions REs from normal GC-B cells and commandeers enhancers from other lineages. Moreover, independent sets of transcription factors, whose expression was deregulated in FL, targeted commandeered versus decommissioned REs. Our approach revealed two distinct subtypes of low-grade FL, whose pathogenic circuitries resembled GC-B or activated B cells. FL-altered enhancers also were enriched for sequence variants, including somatic mutations, which disrupt transcription factor binding and expression of circuit-linked genes. Thus, the pathogenic regulatory circuitry of FL reveals distinct genetic and epigenetic etiologies for GC-B transformation. PMID:25607463

  16. Maps of open chromatin highlight cell type-restricted patterns of regulatory sequence variation at hematological trait loci.

    PubMed

    Paul, Dirk S; Albers, Cornelis A; Rendon, Augusto; Voss, Katrin; Stephens, Jonathan; van der Harst, Pim; Chambers, John C; Soranzo, Nicole; Ouwehand, Willem H; Deloukas, Panos

    2013-07-01

    Nearly three-quarters of the 143 genetic signals associated with platelet and erythrocyte phenotypes identified by meta-analyses of genome-wide association (GWA) studies are located at non-protein-coding regions. Here, we assessed the role of candidate regulatory variants associated with cell type-restricted, closely related hematological quantitative traits in biologically relevant hematopoietic cell types. We used formaldehyde-assisted isolation of regulatory elements followed by next-generation sequencing (FAIRE-seq) to map regions of open chromatin in three primary human blood cells of the myeloid lineage. In the precursors of platelets and erythrocytes, as well as in monocytes, we found that open chromatin signatures reflect the corresponding hematopoietic lineages of the studied cell types and associate with the cell type-specific gene expression patterns. Dependent on their signal strength, open chromatin regions showed correlation with promoter and enhancer histone marks, distance to the transcription start site, and ontology classes of nearby genes. Cell type-restricted regions of open chromatin were enriched in sequence variants associated with hematological indices. The majority (63.6%) of such candidate functional variants at platelet quantitative trait loci (QTLs) coincided with binding sites of five transcription factors key in regulating megakaryopoiesis. We experimentally tested 13 candidate regulatory variants at 10 platelet QTLs and found that 10 (76.9%) affected protein binding, suggesting that this is a frequent mechanism by which regulatory variants influence quantitative trait levels. Our findings demonstrate that combining large-scale GWA data with open chromatin profiles of relevant cell types can be a powerful means of dissecting the genetic architecture of closely related quantitative traits.

  17. CLAMMS: a scalable algorithm for calling common and rare copy number variants from exome sequencing data

    PubMed Central

    Packer, Jonathan S.; Maxwell, Evan K.; O’Dushlaine, Colm; Lopez, Alexander E.; Dewey, Frederick E.; Chernomorsky, Rostislav; Baras, Aris; Overton, John D.; Habegger, Lukas; Reid, Jeffrey G.

    2016-01-01

    Motivation: Several algorithms exist for detecting copy number variants (CNVs) from human exome sequencing read depth, but previous tools have not been well suited for large population studies on the order of tens or hundreds of thousands of exomes. Their limitations include being difficult to integrate into automated variant-calling pipelines and being ill-suited for detecting common variants. To address these issues, we developed a new algorithm—Copy number estimation using Lattice-Aligned Mixture Models (CLAMMS)—which is highly scalable and suitable for detecting CNVs across the whole allele frequency spectrum. Results: In this note, we summarize the methods and intended use-case of CLAMMS, compare it to previous algorithms and briefly describe results of validation experiments. We evaluate the adherence of CNV calls from CLAMMS and four other algorithms to Mendelian inheritance patterns on a pedigree; we compare calls from CLAMMS and other algorithms to calls from SNP genotyping arrays for a set of 3164 samples; and we use TaqMan quantitative polymerase chain reaction to validate CNVs predicted by CLAMMS at 39 loci (95% of rare variants validate; across 19 common variant loci, the mean precision and recall are 99% and 94%, respectively). In the Supplementary Materials (available at the CLAMMS Github repository), we present our methods and validation results in greater detail. Availability and implementation: https://github.com/rgcgithub/clamms (implemented in C). Contact: jeffrey.reid@regeneron.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26382196

  18. Nucleotide sequence conservation of novel and established cis-regulatory sites within the tyrosine hydroxylase gene promoter

    PubMed Central

    Wang, Meng; Banerjee, Kasturi; Baker, Harriet; Cave, John W.

    2015-01-01

    Tyrosine hydroxylase (TH) is the rate-limiting enzyme in catecholamine biosynthesis and its gene proximal promoter (<1 kb upstream from the transcription start site) is essential for regulating transcription in both the developing and adult nervous systems. Several putative regulatory elements within the TH proximal promoter have been reported, but evolutionary conservation of these elements has not been thoroughly investigated. Since many vertebrate species are used to model development, function and disorders of human catecholaminergic neurons, identifying evolutionarily conserved transcription regulatory mechanisms is a high priority. In this study, we align TH proximal promoter nucleotide sequences from several vertebrate species to identify evolutionarily conserved motifs. This analysis identified three elements (a TATA box, cyclic AMP response element (CRE) and a 5′-GGTGG-3′ site) that constitute the core of an ancient vertebrate TH promoter. Focusing on only eutherian mammals, two regions of high conservation within the proximal promoter were identified: a ∼250 bp region adjacent to the transcription start site and a ∼85 bp region located approximately 350 bp further upstream. Within both regions, conservation of previously reported cis-regulatory motifs and human single nucleotide variants was evaluated. Transcription reporter assays in a TH-expressing cell line demonstrated the functionality of highly conserved motifs in the proximal promoter regions and electromobility shift assays showed that brain-region specific complexes assemble on these motifs. These studies also identified a non-canonical CRE binding (CREB) protein recognition element in the proximal promoter. Together, these studies provide a detailed analysis of evolutionary conservation within the TH promoter and identify potential cis-regulatory motifs that underlie a core set of regulatory mechanisms in mammals. PMID:25774193

  19. The nucleosome landscape of Plasmodium falciparum reveals chromatin architecture and dynamics of regulatory sequences.

    PubMed

    Kensche, Philip Reiner; Hoeijmakers, Wieteke Anna Maria; Toenhake, Christa Geeke; Bras, Maaike; Chappell, Lia; Berriman, Matthew; Bártfai, Richárd

    2016-03-18

    In eukaryotes, the chromatin architecture has a pivotal role in regulating all DNA-associated processes and it is central to the control of gene expression. For Plasmodium falciparum, a causative agent of human malaria, the nucleosome positioning profile of regulatory regions deserves particular attention because of their extreme AT-content. With the aid of a highly controlled MNase-seq procedure we reveal how positioning of nucleosomes provides a structural and regulatory framework to the transcriptional unit by demarcating landmark sites (transcription/translation start and end sites). In addition, our analysis provides strong indications for the function of positioned nucleosomes in splice site recognition. Transcription start sites (TSSs) are bordered by a small nucleosome-depleted region, but lack the stereotypic downstream nucleosome arrays, highlighting a key difference in chromatin organization compared to model organisms. Furthermore, we observe transcription-coupled eviction of nucleosomes on strong TSSs during intraerythrocytic development and demonstrate that nucleosome positioning and dynamics can be predictive for the functionality of regulatory DNA elements. Collectively, the strong nucleosome positioning over splice sites and surrounding putative transcription factor binding sites highlights the regulatory capacity of the nucleosome landscape in this deadly human pathogen.

  20. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  1. The downstream regulatory sequence of the adenovirus type 2 major late promoter is functionally redundant.

    PubMed Central

    Li, X C; Huang, W L; Flint, S J

    1992-01-01

    Mutagenesis of promoter sequences and oligonucleotide competition assays have been used to demonstrate that the late-phase-specific stimulation of the adenovirus type 2 major late promoter is mediated by functionally redundant elements located between positions +75 and +125. These octamer motif-related sequences are recognized by multiple factors. PMID:1501301

  2. FINDING REGULATORY ELEMENTS USING JOINT LIKELIHOODS FOR SEQUENCE AND EXPRESSION PROFILE DATA.

    SciTech Connect

    IAN HOLMES, UC BERKELEY, CA, WILLIAM J. BRUNO, LANL

    2000-08-20

    A recent, popular method of finding promoter sequences is to look for conserved motifs up-stream of genes clustered on the basis of expression data. This method presupposes that the clustering is correct. Theoretically, one should be better able to find promoter sequences and create more relevant gene clusters by taking a unified approach to these two problems. We present a likelihood function for a sequence-expression model giving a joint likelihood for a promoter sequence and its corresponding expression levels. An algorithm to estimate sequence-expression model parameters using Gibbs sampling and Expectation/Maximization is described. A program, called kimono, that implements this algorithm has been developed and the source code is freely available over the internet.

  3. SNP marker development for linkage map construction, anchoring of the common bean whole genome sequence and genetic research

    USDA-ARS's Scientific Manuscript database

    Our objectives were to identify SNP DNA markers based on a diverse set of common bean cultivars via next generation sequencing technologies; to develop Illumina Infinium BeadChip assays containing SNPs with high polymorphism within and between common bean market classes, to create high density genet...

  4. Levels of dendritic cell populations and regulatory T cells vary significantly between two commonly used mouse strains.

    PubMed

    Vogelsang, Petra; Hovden, Arnt-Ove; Jonsson, Roland; Appel, Silke

    2009-12-01

    Dendritic cells (DC) are a heterogeneous group of professional antigen-presenting cells (APC) involved in both initiating immune responses and maintaining tolerance. Roughly, DC can be divided into plasmacytoid DC (pDC) and conventional DC (cDC). By controlling regulatory T cells (Treg), DC can influence the outcome of both immunity and autoimmunity. Since the use of mice as in vivo models became a practical tool for researchers studying pathological events in all kinds of human diseases, we decided to compare levels of cDC, pDC and Treg in both spleen and blood between two inbred mouse strains. Here we show that two commonly used mouse strains, BALB/c and C57BL/10J mice, have significantly different levels of distinct CD11c(+)/CD4(-)/CD8a(+), CD11c(+)/CD4(+)/CD8a(-) and CD11c(+)/CD4(-)/CD8a(-) cDC populations, pDC and Treg. Therefore, we emphasize the importance of considering the proper model when comparing data sets from different mouse strains.

  5. Autoimmunity and its association with regulatory T cells and B cell subsets in patients with common variable immunodeficiency.

    PubMed

    Azizi, G; Abolhassani, H; Kiaee, F; Tavakolinia, N; Rafiemanesh, H; Yazdani, R; Mahdaviani, S A; Mohammadikhajehdehi, S; Tavakol, M; Ziaee, V; Negahdari, B; Mohammadi, J; Mirshafiey, A; Aghamohammadi, A

    2017-07-20

    Common variable immunodeficiency (CVID) is one of the most prevalent symptomatic primary immunodeficiencies (PIDs), which manifests a wide clinical variability such as autoimmunity, as well as T cell and B cell abnormalities. A total of 72 patients with CVID were enrolled in this study. Patients were evaluated for clinical manifestations and classified according to the presence or absence of autoimmune disease. We measured regulatory T cells (Tregs) and B-cell subsets using flow cytometry, as well as specific antibody response (SAR) to pneumococcal vaccine, autoantibodies and anti-IgA in patients. Twenty-nine patients (40.3%) have shown at least one autoimmune manifestation. Autoimmune cytopenias and autoimmune gastrointestinal diseases were the most common. A significant association was detected between autoimmunity and presence of hepatomegaly and splenomegaly. Among CVID patients, 38.5% and 79.3% presented a defect in Tregs and switched memory B-cells, respectively, whereas 69.0% presented CD21(low) B cell expansion. Among patients with a defect in Treg, switched memory and CD21(low) B cell, the frequency of autoimmunity was 80.0%, 52.2% and 55.0%, respectively. A negative correlation was observed between the frequency of Tregs and CD21(low) B cell population. 82.2% of patients had a defective SAR which was associated with the lack of autoantibodies. Autoimmunity may be the first clinical manifestation of CVID, thus routine screening of immunoglobulins is suggested for patients with autoimmunity. Lack of SAR in CVID is associated with the lack of specific autoantibodies in patients with autoimmunity. It is suggested that physicians use alternative diagnostic procedures. Copyright © 2017. Published by Elsevier España, S.L.U.

  6. Weak Palindromic Consensus Sequences Are a Common Feature Found at the Integration Target Sites of Many Retroviruses

    PubMed Central

    Wu, Xiaolin; Li, Yuan; Crise, Bruce; Burgess, Shawn M.; Munroe, David J.

    2005-01-01

    Integration into the host genome is one of the hallmarks of the retroviral life cycle and is catalyzed by virus-encoded integrases. While integrase has strict sequence requirements for the viral DNA ends, target site sequences have been shown to be very diverse. We carefully examined a large number of integration target site sequences from several retroviruses, including human immunodeficiency virus type 1, simian immunodeficiency virus, murine leukemia virus, and avian sarcoma-leukosis virus, and found that a statistical palindromic consensus, centered on the virus-specific duplicated target site sequence, was a common feature at integration target sites for these retroviruses. PMID:15795304
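
    The idea of a statistical (weak) palindromic consensus can be sketched as follows: individual target sites need not be palindromes, but the per-column majority base across many aligned sites may equal its own reverse complement. This is a minimal illustration, not the authors' analysis; the example sites are made up and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def consensus(sites):
    """Majority base at each column of equal-length, pre-aligned sites."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq):
    """A DNA palindrome reads the same as its reverse complement."""
    return seq == reverse_complement(seq)


# Made-up aligned sites: two of them are not palindromes on their own.
sites = ["GTTAAC", "GTTAAC", "GATAAC", "GTTATC"]
cons = consensus(sites)
print(cons, is_palindromic(cons))
print([is_palindromic(s) for s in sites])
```

    A real analysis would work with base frequencies and a significance test rather than a hard majority vote; the sketch only shows why the symmetry can be visible in aggregate while absent from single sites.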

  7. A Catalog of Regulatory Sequences for Trait Gene for the Genome Editing of Wheat.

    PubMed

    Makai, Szabolcs; Tamás, László; Juhász, Angéla

    2016-01-01

    Wheat has been cultivated for 10,000 years and ever since the origin of hexaploid wheat it has been exempt from natural selection. Instead, it was under the constant selective pressure of human agriculture from harvest to sowing during every year, producing a vast array of varieties. Wheat has been adopted globally, accumulating variation for genes involved in yield traits, environmental adaptation and resistance. However, one small but important part of the wheat genome has hardly changed: the regulatory regions of both the x- and y-type high molecular weight glutenin subunit (HMW-GS) genes, which are alone responsible for approximately 12% of the grain protein content. The phylogeny of the HMW-GS regulatory regions of the Triticeae demonstrates that a genetic bottleneck may have led to its decreased diversity during domestication and the subsequent cultivation. It has also highlighted the fact that the wild relatives of wheat may offer an unexploited genetic resource for the regulatory region of these genes. Significant research efforts have been made in the public sector and by international agencies, using wild crosses to exploit the available genetic variation, and as a result synthetic hexaploids are now being utilized by a number of breeding companies. However, a newly emerging tool of genome editing provides significantly improved efficiency in exploiting the natural variation in HMW-GS genes and incorporating this into elite cultivars and breeding lines. Recent advancement in the understanding of the regulation of these genes underlines the need for an overview of the regulatory elements for genome editing purposes.

  8. A Catalog of Regulatory Sequences for Trait Gene for the Genome Editing of Wheat

    PubMed Central

    Makai, Szabolcs; Tamás, László; Juhász, Angéla

    2016-01-01

    Wheat has been cultivated for 10,000 years and ever since the origin of hexaploid wheat it has been exempt from natural selection. Instead, it was under the constant selective pressure of human agriculture from harvest to sowing during every year, producing a vast array of varieties. Wheat has been adopted globally, accumulating variation for genes involved in yield traits, environmental adaptation and resistance. However, one small but important part of the wheat genome has hardly changed: the regulatory regions of both the x- and y-type high molecular weight glutenin subunit (HMW-GS) genes, which are alone responsible for approximately 12% of the grain protein content. The phylogeny of the HMW-GS regulatory regions of the Triticeae demonstrates that a genetic bottleneck may have led to its decreased diversity during domestication and the subsequent cultivation. It has also highlighted the fact that the wild relatives of wheat may offer an unexploited genetic resource for the regulatory region of these genes. Significant research efforts have been made in the public sector and by international agencies, using wild crosses to exploit the available genetic variation, and as a result synthetic hexaploids are now being utilized by a number of breeding companies. However, a newly emerging tool of genome editing provides significantly improved efficiency in exploiting the natural variation in HMW-GS genes and incorporating this into elite cultivars and breeding lines. Recent advancement in the understanding of the regulation of these genes underlines the need for an overview of the regulatory elements for genome editing purposes. PMID:27766102

  9. Conservation of position and sequence of a novel, widely expressed gene containing the major human α-globin regulatory element

    SciTech Connect

    Vyas, P.; Vickers, M.A.; Picketts, D.J.; Higgs, D.R.

    1995-10-10

    We have determined the cDNA and genomic structure of a gene (-14 gene) that lies adjacent to the human α-globin cluster. Although it is expressed in a wide range of cell lines and tissues, a previously described erythroid-specific regulatory element that controls expression of the α-globin genes lies within intron 5 of this gene. Analysis of the -14 gene promoter shows that it is GC rich and associated with a constitutively expressed DNase I hypersensitive site; unlike the α-globin promoter, it does not contain a TATA or CCAAT box. These and other differences in promoter structure may explain why the erythroid regulatory element interacts specifically with the α-globin promoters and not the -14 gene promoter, which lies between the α promoters and their regulatory element. Interspecies comparisons demonstrate that the sequence and location of the -14 gene adjacent to the α cluster have been maintained since the bird/mammal divergence, 270 million years ago. 38 refs., 6 figs.

  10. 'Size leap' algorithm: an efficient extraction of the longest common motifs from a molecular sequence set. Application to the DNA sequence reconstruction.

    PubMed

    Danckaert, A; Chappey, C; Hazout, S

    1991-10-01

    We propose a new method, the 'size leap' algorithm, for finding motifs of maximum size that are common to at least two fragments. It creates, from a set of sequences, a reduced database of motifs whose sizes follow the Fibonacci series. Its advantage lies in the efficiency of the motif extraction. It can be applied to establishing overlap regions for DNA sequence reconstruction and to multiple alignment of biological sequences. Complete DNA sequence reconstruction by extraction of the longest motifs ('anchor motifs') is presented as an application of the size leap algorithm. The details of a reconstruction from three sequenced fragments are given as an example.
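
    One way to picture searching by Fibonacci-sized leaps is sketched below. This is an assumption-laden illustration, not the published algorithm, whose details the abstract does not give. It relies only on the fact that if two sequences share a motif of length k, they also share one of every shorter length, so the search can leap through sizes 1, 2, 3, 5, 8, ... until a size fails and then refine between the last success and the first failure. All function names are hypothetical.

```python
def has_common_motif(a, b, k):
    """True if some substring of length k occurs in both a and b."""
    if k == 0:
        return True
    if k > min(len(a), len(b)):
        return False
    seen = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[j:j + k] in seen for j in range(len(b) - k + 1))


def longest_common_motif_length(a, b):
    """Length of the longest motif shared by a and b."""
    lo = 0                 # largest size known to succeed
    prev, cur = 1, 1       # Fibonacci probe sizes: 1, 2, 3, 5, 8, ...
    while has_common_motif(a, b, cur):
        lo = cur
        prev, cur = cur, prev + cur
    hi = cur               # smallest size known to fail
    # Binary refinement between the last success and the first failure.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_common_motif(a, b, mid):
            lo = mid
        else:
            hi = mid
    return lo


print(longest_common_motif_length("GATTACAGATTACA", "CCTTACAGGG"))
```

    In the example, the shared motif is TTACAG: sizes 1, 2, 3 and 5 succeed, size 8 fails, and the refinement step settles on 6.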

  11. Comprehensive analysis and characterization of the TCR alpha chain sequences in the common marmoset.

    PubMed

    Fujii, Yoshiki; Matsutani, Takaji; Kitaura, Kazutaka; Suzuki, Satsuki; Itoh, Tsunetoshi; Takasaki, Tomohiko; Suzuki, Ryuji; Kurane, Ichiro

    2010-06-01

    The common marmoset (Callithrix jacchus) is useful as a nonhuman primate model of human diseases. Although the marmoset model has great potential for studying autoimmune diseases and immune responses against pathogens, little information is available regarding the genes involved in adaptive immunity. Here, we identified one TCR alpha constant (TRAC), 46 TRAJ (joining), and 35 TRAV (variable) segments from marmoset cDNA. Marmoset TRAC, TRAJ, and TRAV shared 80%, 68-100%, and 79-98% identity with their human counterparts at the amino acid level, respectively. The amino acid sequences were less conserved in TRAC than in TCRbeta chain constant (TRBC). Comparative analysis of TRAV between marmosets and humans showed that the rates of synonymous substitutions per site (d(S)) were not significantly different between the framework regions (FRs) and complementarity determining regions (CDRs), whereas the rates of nonsynonymous substitutions per site (d(N)) were significantly lower in the FRs than in CDRs. Interestingly, the d(N) values of the CDRs were greater for TRBV than TRAV. These results suggested that after the divergence of Catarrhini from Platyrrhini, amino acid substitutions were decreased in the FRs by purifying selection and occurred more frequently in CDRbeta than in CDRalpha by positive selection, probably depending on structural and functional constraints. This study provides not only useful information facilitating the investigation of adaptive immunity using the marmoset model but also new insight into the molecular evolution of the TCR heterodimer in primate species.

  12. Whole genome sequence of Desulfovibrio magneticus strain RS-1 revealed common gene clusters in magnetotactic bacteria

    PubMed Central

    Nakazawa, Hidekazu; Arakaki, Atsushi; Narita-Yamada, Sachiko; Yashiro, Isao; Jinno, Koji; Aoki, Natsuko; Tsuruyama, Ai; Okamura, Yoshiko; Tanikawa, Satoshi; Fujita, Nobuyuki; Takeyama, Haruko; Matsunaga, Tadashi

    2009-01-01

    Magnetotactic bacteria are ubiquitous microorganisms that synthesize intracellular magnetite particles (magnetosomes) by accumulating Fe ions from aquatic environments. Recent molecular studies, including comprehensive proteomic, transcriptomic, and genomic analyses, have considerably improved our hypotheses of the magnetosome-formation mechanism. However, most of these studies have been conducted using pure-cultured bacterial strains of α-proteobacteria. Here, we report the whole-genome sequence of Desulfovibrio magneticus strain RS-1, the only isolate of magnetotactic microorganisms classified under δ-proteobacteria. Comparative genomics of the RS-1 and four α-proteobacterial strains revealed the presence of three separate gene regions (nuo and mamAB-like gene clusters, and gene region of a cryptic plasmid) conserved in all magnetotactic bacteria. The nuo gene cluster, encoding NADH dehydrogenase (complex I), was also common to the genomes of three iron-reducing bacteria exhibiting uncontrolled extracellular and/or intracellular magnetite synthesis. A cryptic plasmid, pDMC1, encodes three homologous genes that exhibit high similarities with those of other magnetotactic bacterial strains. In addition, the mamAB-like gene cluster, encoding the key components for magnetosome formation such as iron transport and magnetosome alignment, was conserved only in the genomes of magnetotactic bacteria as a similar genomic island-like structure. Our findings suggest the presence of core genetic components for magnetosome biosynthesis; these genes may have been acquired into the magnetotactic bacterial genomes by multiple gene-transfer events during proteobacterial evolution. PMID:19675025

  13. Combining effects from rare and common genetic variants in an exome-wide association study of sequence data.

    PubMed

    Aschard, Hugues; Qiu, Weiliang; Pasaniuc, Bogdan; Zaitlen, Noah; Cho, Michael H; Carey, Vincent

    2011-11-29

    Recent breakthroughs in next-generation sequencing technologies allow cost-effective methods for measuring a growing list of cellular properties, including DNA sequence and structural variation. Next-generation sequencing has the potential to revolutionize complex trait genetics by directly measuring common and rare genetic variants within a genome-wide context. Because for a given gene both rare and common causal variants can coexist and have independent effects on a trait, strategies that model the effects of both common and rare variants could enhance the power of identifying disease-associated genes. To date, little work has been done on integrating signals from common and rare variants into powerful statistics for finding disease genes in genome-wide association studies. In this analysis of the Genetic Analysis Workshop 17 data, we evaluate various strategies for association of rare, common, or a combination of both rare and common variants on quantitative phenotypes in unrelated individuals. We show that the analysis of common variants only using classical approaches can achieve higher power to detect causal genes than recently proposed rare variant methods and that strategies that combine association signals derived independently in rare and common variants can slightly increase the power compared to strategies that focus on the effect of either the rare variants or the common variants.
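
    As an illustration of combining association signals derived independently from rare and common variants, the sketch below applies Fisher's method to two p-values. The abstract does not say which combination strategy was used, so Fisher's method is an assumption chosen for its simplicity, and the function name is hypothetical. For two independent p-values the statistic -2 ln(p1 p2) follows a chi-square distribution with 4 degrees of freedom under the null, whose tail probability has the closed form p1 p2 (1 - ln(p1 p2)).

```python
import math


def fisher_combine_two(p_rare, p_common):
    """Combine two independent p-values with Fisher's method.

    Uses the closed-form chi-square (4 d.f.) tail probability:
    P = p1 * p2 * (1 - ln(p1 * p2)).
    """
    if not (0.0 < p_rare <= 1.0 and 0.0 < p_common <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    prod = p_rare * p_common
    return prod * (1.0 - math.log(prod))


# Two modest signals for the same gene combine into a stronger one.
print(fisher_combine_two(0.05, 0.05))
```

    The result is only valid when the two tests are independent; correlated rare-variant and common-variant statistics would need a different null distribution.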

  14. In situ detection of a heat-shock regulatory element binding protein using a soluble synthetic enhancer sequence.

    PubMed Central

    Harel-Bellan, A; Brini, A T; Ferris, D K; Robin, P; Farrar, W L

    1989-01-01

    In various studies, enhancer binding proteins have been successfully absorbed out by competing sequences inserted into plasmids, resulting in the inhibition of plasmid expression. Theoretically, such a result could be achieved using synthetic enhancer sequences not inserted into plasmids. In this study, a double-stranded DNA sequence corresponding to the human heat shock regulatory element was chemically synthesized. By in vitro retardation assays, the synthetic sequence was shown to specifically bind a protein in extracts from the human T cell line Jurkat. When the synthetic enhancer was electroporated into Jurkat cells, it not only remained undegraded in the cells for up to 2 days but also bound a protein intracellularly. The binding was specific and was modulated upon heat shock. Furthermore, the binding protein was shown to be of the expected molecular weight by UV crosslinking. However, when the synthetic enhancer element was co-electroporated with an HSP 70-CAT reporter construct, the expression of the reporter plasmid was consistently enhanced in the presence of the exogenous synthetic enhancer. PMID:2740211

  15. Spatio-temporal sequence of cross-regulatory events in root meristem growth

    PubMed Central

    Scacchi, Emanuele; Salinas, Paula; Gujas, Bojan; Santuari, Luca; Krogan, Naden; Ragni, Laura; Berleth, Thomas; Hardtke, Christian S.

    2010-01-01

    A central question in developmental biology is how multicellular organisms coordinate cell division and differentiation to determine organ size. In Arabidopsis roots, this balance is controlled by cytokinin-induced expression of SHORT HYPOCOTYL 2 (SHY2) in the so-called transition zone of the meristem, where SHY2 negatively regulates auxin response factors (ARFs) by protein–protein interaction. The resulting down-regulation of PIN-FORMED (PIN) auxin efflux carriers is considered the key event in promoting differentiation of meristematic cells. Here we show that this regulation involves additional, intermediary factors and is spatio-temporally constrained. We found that the described cytokinin–auxin crosstalk antagonizes BREVIS RADIX (BRX) activity in the developing protophloem. BRX is an auxin-responsive target of the prototypical ARF MONOPTEROS (MP), a key promoter of vascular development, and transiently enhances PIN3 expression to promote meristem growth in young roots. At later stages, cytokinin induction of SHY2 in the vascular transition zone restricts BRX expression to down-regulate PIN3 and thus limit meristem growth. Interestingly, proper SHY2 expression requires BRX, which could reflect feedback on the auxin responsiveness of SHY2 because BRX protein can directly interact with MP, likely acting as a cofactor. Thus, cross-regulatory antagonism between BRX and SHY2 could determine ARF activity in the protophloem. Our data suggest a model in which the regulatory interactions favor BRX expression in the early proximal meristem and SHY2 prevails because of supplementary cytokinin induction in the later distal meristem. The complex equilibrium of this regulatory module might represent a universal switch in the transition toward differentiation in various developmental contexts. PMID:21149702

  16. Sox2 regulatory region 2 sequence works as a DNA nuclear targeting sequence enhancing the efficiency of an exogenous gene expression in ES cells.

    PubMed

    Funabashi, Hisakage; Takatsu, Makoto; Saito, Mikako; Matsuoka, Hideaki

    2010-10-01

    In this report, the effects of two DNA nuclear targeting sequence (DTS) candidates on the gene expression efficiency in ES cells were investigated. Reporter plasmids containing the simian virus 40 (SV40) promoter/enhancer sequence (SV40-DTS), a DTS for various types of cells but not yet reported for ES cells, and the 81 base pairs of Sox2 regulatory region 2 (SRR2) where two transcriptional factors in ES cells, Oct3/4 and Sox2, are bound (SRR2-DTS), were introduced into the cytoplasm of living cells by femtoinjection. The gene expression efficiencies of each plasmid in mouse insulinoma cell line MIN6 cells and mouse ES cells were then evaluated. Plasmids including SV40-DTS and SRR2-DTS exhibited higher gene expression efficiency compared with plasmids without these DTSs, and thus it was concluded that both sequences work as a DTS in ES cells. In addition, it was suggested that SRR2-DTS works as an ES cell-specific DTS. To the best of our knowledge, this is the first report to confirm the function of DTSs in ES cells.

  17. Formulaic Sequences as a Regulatory Mechanism for Cognitive Perturbations During the Achievement of Social Goals.

    PubMed

    Wray, Alison

    2017-07-01

    This paper explores two questions central to understanding the nature of formulaic sequences: (1) What are they for? and (2) What determines how many there are? The "Communicative Impact" model draws into a single account how language is shaped by cognitive processing on the one hand and socio-interactional function on the other: Formulaic sequences play a range of coordinated roles in neutralizing unanticipated perturbations in the cognitive management of language, so the speaker's socio-interactional goals can still be achieved. One role involves compensatory actions to sustain fluency. However, these actions are themselves context-sensitive, so the balance of types of formulaic sequence will vary according to situation. The model applies equally to temporary cognitive pressure and chronic problems such as dementia and limited linguistic competency in a foreign language. Copyright © 2017 Cognitive Science Society, Inc.

  18. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway. Copyright © 2015 Elsevier Inc. All rights reserved.

  19. Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

    PubMed Central

    2011-01-01

    Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also compared the full-length cDNA library with two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought-tolerant Mesoamerican genotype BAT477 and the other for the acid-soil-tolerant Andean genotype G19833, which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress, and tissues were collected from both roots and above-ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified, consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean, and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences.

  20. Development of molecular markers linked to disease resistance genes in common bean based on whole genome sequence.

    PubMed

    Meziadi, Chouaïb; Richard, Manon M S; Derquennes, Amandine; Thareau, Vincent; Blanchet, Sophie; Gratias, Ariane; Pflieger, Stéphanie; Geffroy, Valérie

    2016-01-01

    Common bean (Phaseolus vulgaris) is the most important grain legume for direct human consumption in the world, particularly in developing countries where it constitutes the main source of protein. Unfortunately, common bean yield stability is constrained by a number of pests and diseases. As use of resistant genotypes is the most economic and ecologically safe means for controlling plant diseases, efforts have been made to genetically characterize resistance genes (R genes) in common bean. Despite its agronomic importance, genomic resources available in common bean were limited until the recent sequencing of common bean genome (Andean genotype G19833). Besides allowing the annotation of Nucleotide Binding-Leucine Rich Repeat (NB-LRR) encoding gene family, which is the prevalent class of disease R genes in plants, access to the whole genome sequence of common bean can be of great help for intense selection to increase the overall efficiency of crop improvement programs using marker-assisted selection (MAS). This review presents the state of the art of common bean NB-LRR gene clusters, their peculiar location in subtelomeres and correlation with genetically characterized monogenic R genes, as well as how the availability of the whole genome sequence can boost the development of molecular markers for MAS.

  1. Monte Carlo Simulations of the Post-Common-Envelope White-Dwarf Main-Sequence Binary Population

    SciTech Connect

    Camacho, Judit; Torres, Santiago; Garcia-Berro, Enrique; Schreiber, Matthias R.

    2010-12-22

    We present a detailed Monte Carlo simulator of the population of binary systems within the solar neighborhood. We have used the most up-to-date stellar evolutionary models, a complete treatment of the Roche lobe overflow episode, as well as a full implementation of the orbital evolution of the binary system. Preliminary results are presented for the population of white-dwarf main-sequence binaries, resulting from a common envelope episode. We also study the role played by the binding energy parameter, $\lambda$, and by the common envelope efficiency, $\alpha_{\mathrm{CE}}$. Finally, results are compared with the population of identified white-dwarf main-sequence binaries.
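
    As an illustration of how the binding energy parameter and the common envelope efficiency enter such a simulation, the sketch below solves one widely used form of the common-envelope energy balance for the final orbital separation. The abstract does not state which formulation the authors adopt, so the equation, the function name and all argument names are assumptions made for this example only.

```python
def post_ce_separation(m1, m_core, m2, a_initial, r1, alpha_ce, lam):
    """Final orbital separation after a common-envelope episode (illustrative).

    Assumed energy balance (one common convention, not taken from the record):
        G*m1*m_env / (lam*r1)
            = alpha_ce * (G*m_core*m2 / (2*a_final) - G*m1*m2 / (2*a_initial))
    G cancels, so masses may be in any consistent unit; the result has the
    same length unit as a_initial and r1.
    """
    m_env = m1 - m_core  # envelope mass of the donor star
    if m_env <= 0:
        raise ValueError("donor core mass must be smaller than its total mass")
    if min(a_initial, r1, alpha_ce, lam) <= 0:
        raise ValueError("a_initial, r1, alpha_ce and lam must be positive")
    denominator = m1 * m2 / a_initial + 2.0 * m1 * m_env / (alpha_ce * lam * r1)
    return m_core * m2 / denominator


# Hypothetical system: 2.0 + 1.0 solar masses, 0.5 solar-mass core.
a_final = post_ce_separation(2.0, 0.5, 1.0, a_initial=100.0, r1=50.0,
                             alpha_ce=1.0, lam=1.0)
```

    Under this convention only the product of the two parameters matters, and lowering either one shrinks the post-common-envelope orbit, which is one reason population studies vary them.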

  2. [Dual-index sequence analysis of common and variant peak ratio in far-infrared fingerprint of Pyritum].

    PubMed

    Huang, Liping; Wu, Jing

    2011-06-01

    The aim was to establish a dual-index sequence analysis method for far-infrared fingerprints, in which the two indexes are the common peak ratio and the variant peak ratio. These two new indexes were calculated by means of sequential analysis: a far-infrared fingerprint spectrum was set up for each Pyritum sample, and the common peak ratio sequences were arranged in order of size in comparison with the other samples. The analytical results suggested that samples S3 and S4, S5, S6 and S7, S8 and S9 from the same region showed higher common peak ratios and lower variant peak ratios. However, sample S1 from Anhui showed little similarity to the others. The method, applied to distinguish Pyritum from different areas and batches, is suitable for characterizing traditional Chinese medicines.

  3. Identification of a novel regulatory sequence of actin nucleation promoting factor encoded by Autographa californica multiple nucleopolyhedrovirus.

    PubMed

    Wang, Yun; Zhang, Yongli; Han, Shili; Hu, Xue; Zhou, Yuan; Mu, Jingfang; Pei, Rongjuan; Wu, Chunchen; Chen, Xinwen

    2015-04-10

    Actin polymerization induced by nucleation promoting factors (NPFs) is one of the most fundamental biological processes in eukaryotic cells. NPFs contain a conserved output domain (VCA domain) near the C terminus, which interacts with and activates the cellular actin-related protein 2/3 complex (Arp2/3) to induce actin polymerization and a diverse regulatory domain near the N terminus. Autographa californica multiple nucleopolyhedrovirus (AcMNPV) nucleocapsid protein P78/83 is a virus-encoded NPF that contains a C-terminal VCA domain and induces actin polymerization in virus-infected cells. However, there is no similarity between the N terminus of P78/83 and that of other identified NPFs, suggesting that P78/83 may possess a unique regulatory mechanism. In this study, we identified a multifunctional regulatory sequence (MRS) located near the N terminus of P78/83 and determined that one of its functions is to serve as a degron to mediate P78/83 degradation in a proteasome-dependent manner. In AcMNPV-infected cells, the MRS also binds to another nucleocapsid protein, BV/ODV-C42, which stabilizes P78/83 and modulates the P78/83-Arp2/3 interaction to orchestrate actin polymerization. In addition, the MRS is also essential for the incorporation of P78/83 into the nucleocapsid, ensuring virion mobility powered by P78/83-induced actin polymerization. The triple functions of the MRS enable P78/83 to serve as an essential viral protein in the AcMNPV replication cycle, and the possible roles of the MRS in orchestrating the virus-induced actin polymerization and viral genome decapsidation are discussed.

  4. Mutation analysis of TRPS1 gene including core promoter, 5'UTR, and 3'UTR regulatory sequences with insight into their organization.

    PubMed

    Solc, Roman; Klugerova, Michaela; Vcelak, Josef; Baxova, Alice; Kuklik, Miloslav; Vseticka, Jan; Beharka, Rastislav; Hirschfeldova, Katerina

    2017-01-01

    The TRPS1 protein is a potent regulator of proliferation, differentiation, and apoptosis. Aberrations of the TRPS1 gene are strongly associated with development of the rare trichorhinophalangeal syndrome (TRPS). We conducted MLPA analysis to capture deletions within the crucial 8q24.1 chromosomal region, in combination with mutation analysis of the TRPS1 gene including core promoter, 5'UTR, and 3'UTR sequences, in nine TRPS patients. The low complexity or extent of the untranslated regulatory sequences excluded them from analysis in previous studies; the amplicon-based next-generation sequencing used in our study overcomes these technical limitations. Finally, we performed an extended in silico analysis of the organization of the TRPS1 gene regulatory sequences. A single contiguous deletion and an intragenic deletion spanning several exons were detected. Mutation analysis revealed five TRPS1 gene aberrations (two structural rearrangements, two nonsense mutations, and one missense substitution), giving an overall detection rate of 78%. Several polymorphic variants were detected within the analysed regulatory sequences, but none with a proposed pathogenic effect. In silico analysis suggested alternative promoter usage and differing expression efficiency for different TRPS1 transcripts. Haploinsufficiency of the TRPS1 gene was responsible for most of the TRPS phenotype. The structure of the TRPS1 gene regulatory sequences is indicative of generally low single-allele expression and its tight control.

  5. CTCF binding site sequence differences are associated with unique regulatory and functional trends during embryonic stem cell differentiation

    PubMed Central

    Plasschaert, Robert N.; Vigneau, Sébastien; Tempera, Italo; Gupta, Ravi; Maksimoska, Jasna; Everett, Logan; Davuluri, Ramana; Mamorstein, Ronen; Lieberman, Paul M.; Schultz, David; Hannenhalli, Sridhar; Bartolomei, Marisa S.

    2014-01-01

    CTCF (CCCTC-binding factor) is a highly conserved multifunctional DNA-binding protein with thousands of binding sites genome-wide. Our previous work suggested that differences in CTCF’s binding site sequence may affect the regulation of CTCF recruitment and its function. To investigate this possibility, we characterized changes in genome-wide CTCF binding and gene expression during differentiation of mouse embryonic stem cells. After separating CTCF sites into three classes (LowOc, MedOc and HighOc) based on similarity to the consensus motif, we found that developmentally regulated CTCF binding occurs preferentially at LowOc sites, which have lower similarity to the consensus. By measuring the affinity of CTCF for selected sites, we show that sites lost during differentiation are enriched in motifs associated with weaker CTCF binding in vitro. Specifically, enrichment for T at the 18th position of the CTCF binding site is associated with regulated binding in the LowOc class and can predictably reduce CTCF affinity for binding sites. Finally, by comparing changes in CTCF binding with changes in gene expression during differentiation, we show that LowOc and HighOc sites are associated with distinct regulatory functions. Our results suggest that the regulatory control of CTCF is dependent in part on specific motifs within its binding site. PMID:24121688

  6. Linkage disequilibrium among commonly genotyped SNP variants detected from bull sequence.

    PubMed

    Snelling, W M; Kuehn, L A; Keel, B N; Thallman, R M; Bennett, G L

    2017-10-01

    Genomic prediction utilizing causal variants could increase selection accuracy above that achieved with SNPs genotyped by currently available arrays used for genomic selection. A number of variants detected from sequencing influential sires are likely to be causal, but noticeable improvements in prediction accuracy using imputed sequence variant genotypes have not been reported. Improvement in accuracy of predicted breeding values may be limited by the accuracy of imputed sequence variants. Using genotypes of SNPs on a high-density array and non-synonymous SNPs detected in sequence from influential sires of a multibreed population, results of this examination suggest that linkage disequilibrium between non-synonymous and array SNPs may be insufficient for accurate imputation from the array to sequence. In contrast to 75% of array SNPs being strongly correlated to another SNP on the array, less than 25% of the non-synonymous SNPs were strongly correlated to an array SNP. When correlations between non-synonymous and array SNPs were strong, distances between the SNPs were greater than separation that might be expected based on linkage disequilibrium decay. Consistently near-perfect whole-genome linkage disequilibrium between the full array and each non-synonymous SNP within the sequenced bulls suggests that whole-genome approaches to infer sequence variants might be more accurate than imputation based on local haplotypes. Opportunity for strong linkage disequilibrium between sequence and array SNPs may be limited by discrepancies in allele frequency distributions, so investigating alternate genotyping approaches and panels providing greater chances of frequency-matched SNPs strongly correlated to sequence variants is also warranted. Genotypes used for this study are available from https://www.animalgenome.org/repository/pub/;USDA2017.0519/. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
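
    The abstract speaks of SNPs being "strongly correlated" without giving the statistic or threshold. A common way to quantify pairwise linkage disequilibrium from unphased data is the squared Pearson correlation of genotype dosages; the minimal sketch below assumes genotypes coded as 0, 1 or 2 copies of an allele, and the function name is hypothetical rather than taken from the study.

```python
from statistics import mean


def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation between two genotype dosage vectors.

    Each vector holds one value per animal (0, 1 or 2 allele copies).
    Returns 0.0 when either SNP is monomorphic, since the correlation
    is undefined in that case.
    """
    if len(snp_a) != len(snp_b) or len(snp_a) < 2:
        raise ValueError("need two equal-length vectors with at least 2 animals")
    mean_a, mean_b = mean(snp_a), mean(snp_b)
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        return 0.0
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    return cov * cov / (var_a * var_b)


# Toy data: an array SNP and a sequence variant typed on the same four animals.
array_snp = [0, 1, 2, 1]
sequence_variant = [0, 1, 2, 1]
r2 = genotype_r2(array_snp, sequence_variant)
```

    Imputation from array to sequence relies on many sequence variants having a high value of this statistic with at least one nearby array SNP, which is the property the abstract reports as lacking for most non-synonymous SNPs.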

  7. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. 

  9. Maternal stress, preterm birth, and DNA methylation at imprint regulatory sequences in humans.

    PubMed

    Vidal, Adriana C; Benjamin Neelon, Sara E; Liu, Ying; Tuli, Abbas M; Fuemmeler, Bernard F; Hoyo, Cathrine; Murtha, Amy P; Huang, Zhiqing; Schildkraut, Joellen; Overcash, Francine; Kurtzberg, Joanne; Jirtle, Randy L; Iversen, Edwin S; Murphy, Susan K

    2014-01-01

    In infants exposed to maternal stress in utero, phenotypic plasticity through epigenetic events may mechanistically explain increased risk of preterm birth (PTB), which confers increased risk for neurodevelopmental disorders, cardiovascular disease, and cancers in adulthood. We examined associations between prenatal maternal stress and PTB, evaluating the role of DNA methylation at imprint regulatory regions. We enrolled women from prenatal clinics in Durham, NC. Stress was measured in 537 women at 12 weeks of gestation using the Perceived Stress Scale. DNA methylation at differentially methylated regions (DMRs) associated with H19, IGF2, MEG3, MEST, SGCE/PEG10, PEG3, NNAT, and PLAGL1 was measured from peripheral and cord blood using bisulfite pyrosequencing in a sub-sample of 79 mother-infant pairs. We examined associations between PTB and stress and evaluated differences in DNA methylation at each DMR by stress. Maternal stress was not associated with PTB (OR = 0.98; 95% CI, 0.40-2.40; P = 0.96), after adjustment for maternal body mass index (BMI), income, and raised blood pressure. However, elevated stress was associated with higher infant DNA methylation at the MEST DMR (2.8% difference, P < 0.01) after adjusting for PTB. Maternal stress may be associated with epigenetic changes at MEST, a gene relevant to maternal care and obesity. Reduced prenatal stress may support the epigenomic profile of a healthy infant.

  10. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum)

    PubMed Central

    Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

    2015-01-01

    We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon sizes were identical to those expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355

  11. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    PubMed

    Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

    2015-01-01

    We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon sizes were identical to those expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  12. Inverted duplication of histone genes in chicken and disposition of regulatory sequences.

    PubMed Central

    Wang, S W; Robins, A J; d'Andrea, R; Wells, J R

    1985-01-01

    Sequence analysis of an 8.4 kb fragment containing five chicken histone genes shows that an H4-H2A gene pair is duplicated and inverted around a central H3 gene. A left and right region, each of 2.1 kb are 97% homologous and the boundaries of homology coincide with ten base pair repeats. These boundary regions also contain highly conserved gene promoter elements, suggesting that interaction of transcriptional machinery with histone genes may be connected with recombination in promoter regions, resulting in the inverted duplication structure seen in this cluster. PMID:4000938

  13. A variant in the sonic hedgehog regulatory sequence (ZRS) is associated with triphalangeal thumb and deregulates expression in the developing limb

    PubMed Central

    Furniss, Dominic; Lettice, Laura A.; Taylor, Indira B.; Critchley, Paul S.; Giele, Henk; Hill, Robert E.; Wilkie, Andrew O.M.

    2008-01-01

    A locus for triphalangeal thumb, variably associated with pre-axial polydactyly, was previously identified in the zone of polarizing activity regulatory sequence (ZRS), a long range limb-specific enhancer of the Sonic Hedgehog (SHH) gene at human chromosome 7q36.3. Here, we demonstrate that a 295T>C variant in the human ZRS, previously thought to represent a neutral polymorphism, acts as a dominant allele with reduced penetrance. We found this variant in three independently ascertained probands from southern England with triphalangeal thumb, demonstrated significant linkage of the phenotype to the variant (LOD = 4.1), and identified a shared microsatellite haplotype around the ZRS, suggesting that the probands share a common ancestor. An individual homozygous for the 295C allele presented with isolated bilateral triphalangeal thumb resembling the heterozygous phenotype, suggesting that the variant is largely dominant to the wild-type allele. As a functional test of the pathogenicity of the 295C allele, we utilized a mutated ZRS construct to demonstrate that it can drive ectopic anterior expression of a reporter gene in the developing mouse forelimb. We conclude that the 295T>C variant is in fact pathogenic and, in southern England, appears to be the most common cause of triphalangeal thumb. Depending on the dispersal of the founding mutation, it may play a wider role in the aetiology of this disorder. PMID:18463159

  14. Draft Genome Sequences of Two Isolates of Colletotrichum lindemuthianum, the Causal Agent of Anthracnose in Common Beans

    PubMed Central

    de Queiroz, Casley Borges; Correia, Hilberty L. Nunes; Menicucci, Renato Pedrozo

    2017-01-01

    ABSTRACT Colletotrichum lindemuthianum is the causal agent of anthracnose in common beans, one of the main limiting factors of their culture. Here, we report for the first time, to our knowledge, a draft of the complete genome sequences of two isolates belonging to 83.501 and 89 A2 2-3 of C. lindemuthianum. PMID:28473373

  15. The Effects of Common Knowledge Construction Model Sequence of Lessons on Science Achievement and Relational Conceptual Change

    ERIC Educational Resources Information Center

    Ebenezer, Jazlin; Chacko, Sheela; Kaya, Osman Nafiz; Koya, Satya Kiran; Ebenezer, Devairakkam Luke

    2010-01-01

    The purpose of this study was to investigate the effects of the Common Knowledge Construction Model (CKCM) lesson sequence, an intervention based both in conceptual change theory and in Phenomenography, a subset of conceptual change theory. A mixed approach was used to investigate whether this model had a significant effect on 7th grade students'…

  16. Small RNA deep sequencing revealed that mixed infection of known and unknown viruses were common in field collected vegetable samples

    USDA-ARS's Scientific Manuscript database

    In an effort to characterize the causal agents for plant diseases in field collected samples using the small RNA deep sequencing technology, numerous known or novel viruses and viroids were identified. In many cases, a mixed infection with multiple pathogen species was common. Such situation compl...

  17. Draft Genome Sequence of Catellicoccus marimammalium, a Novel Species Commonly Found in Gull Feces

    EPA Science Inventory

    Catellicoccus marimammalium is a relatively uncharacterized Gram-positive, facultative anaerobe with potential utility as an indicator of waterfowl fecal contamination. Here we report an annotated draft genome sequence that suggests this organism may be a symbiotic gut microbe.

  18. Complete Genome Sequences of Three Rhizobium gallicum Symbionts Associated with Common Bean (Phaseolus vulgaris)

    PubMed Central

    Bustos, Patricia; Santamaría, Rosa Isela; Pérez-Carrascal, Olga María; Acosta, José Luis; Lozano, Luis; Juárez, Soledad; Martínez-Romero, Esperanza; Cevallos, Miguel Ángel; Romero, David; Dávila, Guillermo; Vinuesa, Pablo; Miranda, Fabiola; Ormeño, Ernesto

    2017-01-01

    ABSTRACT The whole-genome sequences of three strains of Rhizobium gallicum reported here support the concept that the distinct nodulation host ranges displayed by the symbiovars gallicum and phaseoli can be largely explained by different symbiotic plasmids. PMID:28302777

  20. Patterns and Sequences: Interactive Exploration of Clickstreams to Understand Common Visitor Paths.

    PubMed

    Liu, Zhicheng; Wang, Yang; Dontcheva, Mira; Hoffman, Matthew; Walker, Seth; Wilson, Alan

    2017-01-01

    Modern web clickstream data consists of long, high-dimensional sequences of multivariate events, making it difficult to analyze. Following the overarching principle that the visual interface should provide information about the dataset at multiple levels of granularity and allow users to easily navigate across these levels, we identify four levels of granularity in clickstream analysis: patterns, segments, sequences and events. We present an analytic pipeline consisting of three stages: pattern mining, pattern pruning and coordinated exploration between patterns and sequences. Based on this approach, we discuss properties of maximal sequential patterns, propose methods to reduce the number of patterns and describe design considerations for visualizing the extracted sequential patterns and the corresponding raw sequences. We demonstrate the viability of our approach through an analysis scenario and discuss the strengths and limitations of the methods based on user feedback.

  1. Identification of regulatory sequences in the gene for 5-aminolevulinate synthase from rat.

    PubMed

    Braidotti, G; Borthwick, I A; May, B K

    1993-01-15

    The housekeeping enzyme 5-aminolevulinate synthase (ALAS) regulates the supply of heme for respiratory cytochromes. Here we report on the isolation of a genomic clone for the rat ALAS gene. The 5'-flanking region was fused to the chloramphenicol acetyltransferase gene and transient expression analysis revealed the presence of both positive and negative cis-acting sequences. Expression was substantially increased by the inclusion of the first intron located in the 5'-untranslated region. Sequence analysis of the promoter identified two elements at positions -59 and -88 bp with strong similarity to the binding site for nuclear respiratory factor 1 (NRF-1). Gel shift analysis revealed that both NRF-1 elements formed nucleoprotein complexes which could be abolished by an authentic NRF-1 oligomer. Mutagenesis of each NRF-1 motif in the ALAS promoter gave substantially lowered levels of chloramphenicol acetyltransferase expression, whereas mutagenesis of both NRF-1 motifs resulted in the almost complete loss of expression. These results establish that the NRF-1 motifs in the ALAS promoter are critical for promoter activity. NRF-1 binding sites have been identified in the promoters of several nuclear genes encoding mitochondrial proteins concerned with oxidative phosphorylation. The present studies suggest that NRF-1 may co-ordinate the supply of mitochondrial heme with the synthesis of respiratory cytochromes by regulating expression of ALAS. In erythroid cells, NRF-1 may be less important for controlling heme levels since an erythroid ALAS gene is strongly expressed and the promoter for this gene apparently lacks NRF-1 binding sites.

  2. Platelet function is modified by common sequence variation in megakaryocyte super enhancers.

    PubMed

    Petersen, Romina; Lambourne, John J; Javierre, Biola M; Grassi, Luigi; Kreuzhuber, Roman; Ruklisa, Dace; Rosa, Isabel M; Tomé, Ana R; Elding, Heather; van Geffen, Johanna P; Jiang, Tao; Farrow, Samantha; Cairns, Jonathan; Al-Subaie, Abeer M; Ashford, Sofie; Attwood, Antony; Batista, Joana; Bouman, Heleen; Burden, Frances; Choudry, Fizzah A; Clarke, Laura; Flicek, Paul; Garner, Stephen F; Haimel, Matthias; Kempster, Carly; Ladopoulos, Vasileios; Lenaerts, An-Sofie; Materek, Paulina M; McKinney, Harriet; Meacham, Stuart; Mead, Daniel; Nagy, Magdolna; Penkett, Christopher J; Rendon, Augusto; Seyres, Denis; Sun, Benjamin; Tuna, Salih; van der Weide, Marie-Elise; Wingett, Steven W; Martens, Joost H; Stegle, Oliver; Richardson, Sylvia; Vallier, Ludovic; Roberts, David J; Freson, Kathleen; Wernisch, Lorenz; Stunnenberg, Hendrik G; Danesh, John; Fraser, Peter; Soranzo, Nicole; Butterworth, Adam S; Heemskerk, Johan W; Turro, Ernest; Spivakov, Mikhail; Ouwehand, Willem H; Astle, William J; Downes, Kate; Kostadima, Myrto; Frontini, Mattia

    2017-07-13

    Linking non-coding genetic variants associated with the risk of diseases or disease-relevant traits to target genes is a crucial step to realize GWAS potential in the introduction of precision medicine. Here we set out to determine the mechanisms underpinning variant association with platelet quantitative traits using cell type-matched epigenomic data and promoter long-range interactions. We identify potential regulatory functions for 423 of 565 (75%) non-coding variants associated with platelet traits and we demonstrate, through ex vivo and proof of principle genome editing validation, that variants in super enhancers play an important role in controlling archetypical platelet functions.

  3. Platelet function is modified by common sequence variation in megakaryocyte super enhancers

    PubMed Central

    Petersen, Romina; Lambourne, John J.; Javierre, Biola M.; Grassi, Luigi; Kreuzhuber, Roman; Ruklisa, Dace; Rosa, Isabel M.; Tomé, Ana R.; Elding, Heather; van Geffen, Johanna P.; Jiang, Tao; Farrow, Samantha; Cairns, Jonathan; Al-Subaie, Abeer M.; Ashford, Sofie; Attwood, Antony; Batista, Joana; Bouman, Heleen; Burden, Frances; Choudry, Fizzah A.; Clarke, Laura; Flicek, Paul; Garner, Stephen F.; Haimel, Matthias; Kempster, Carly; Ladopoulos, Vasileios; Lenaerts, An-Sofie; Materek, Paulina M.; McKinney, Harriet; Meacham, Stuart; Mead, Daniel; Nagy, Magdolna; Penkett, Christopher J.; Rendon, Augusto; Seyres, Denis; Sun, Benjamin; Tuna, Salih; van der Weide, Marie-Elise; Wingett, Steven W.; Martens, Joost H.; Stegle, Oliver; Richardson, Sylvia; Vallier, Ludovic; Roberts, David J.; Freson, Kathleen; Wernisch, Lorenz; Stunnenberg, Hendrik G.; Danesh, John; Fraser, Peter; Soranzo, Nicole; Butterworth, Adam S.; Heemskerk, Johan W.; Turro, Ernest; Spivakov, Mikhail; Ouwehand, Willem H.; Astle, William J.; Downes, Kate; Kostadima, Myrto; Frontini, Mattia

    2017-01-01

    Linking non-coding genetic variants associated with the risk of diseases or disease-relevant traits to target genes is a crucial step to realize GWAS potential in the introduction of precision medicine. Here we set out to determine the mechanisms underpinning variant association with platelet quantitative traits using cell type-matched epigenomic data and promoter long-range interactions. We identify potential regulatory functions for 423 of 565 (75%) non-coding variants associated with platelet traits and we demonstrate, through ex vivo and proof of principle genome editing validation, that variants in super enhancers play an important role in controlling archetypical platelet functions. PMID:28703137

  4. Complete genome sequences of two novel bipartite begomoviruses infecting common bean in Cuba.

    PubMed

    Chang-Sidorchuk, Lidia; González-Alvarez, Heidy; Navas-Castillo, Jesús; Fiallo-Olivé, Elvira; Martínez-Zubiaur, Yamila

    2017-05-01

    The common bean is a host for a large number of begomoviruses (genus Begomovirus, family Geminiviridae) in the New World. Based on the current taxonomic criteria established for the genus Begomovirus, two new members of this genus infecting common bean (Phaseolus vulgaris) in Cuba are herein reported. The cloned bipartite genomes, composed of DNA-A and DNA-B, showed the typical organization of the New World begomoviruses. We propose the names common bean severe mosaic virus and common bean mottle virus for the new begomovirus species.

  5. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php.

  6. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  7. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing

    PubMed Central

    2011-01-01

    useful markers for population genetics studies and marker-assisted selection. Conclusion We have produced the first comprehensive transcriptome-wide analysis in A. auriculiformis and A. mangium using de novo assembly techniques. Our high quality and comprehensive assemblies allowed the identification of many genes in the lignin biosynthesis and secondary cell wall formation in Acacia hybrids. Our results demonstrated that Next Generation Sequencing is a cost-effective method for gene discovery, identification of regulatory sequences, and informative markers in a non-model plant. PMID:21729267

  8. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing.

    PubMed

    Wong, Melissa M L; Cannon, Charles H; Wickneswari, Ratnam

    2011-07-05

    population genetics studies and marker-assisted selection. We have produced the first comprehensive transcriptome-wide analysis in A. auriculiformis and A. mangium using de novo assembly techniques. Our high quality and comprehensive assemblies allowed the identification of many genes in the lignin biosynthesis and secondary cell wall formation in Acacia hybrids. Our results demonstrated that Next Generation Sequencing is a cost-effective method for gene discovery, identification of regulatory sequences, and informative markers in a non-model plant.

  9. Regulation of nrf operon expression in pathogenic enteric bacteria: sequence divergence reveals new regulatory complexity

    PubMed Central

    Godfrey, Rita E.; Lee, David J.; Busby, Stephen J. W.

    2017-01-01

    Summary The Escherichia coli K‐12 nrf operon encodes a periplasmic nitrite reductase, the expression of which is driven from a single promoter, pnrf. Expression from pnrf is activated by the FNR transcription factor in response to anaerobiosis and further increased in response to nitrite by the response regulator proteins, NarL and NarP. FNR‐dependent transcription is suppressed by the binding of two nucleoid associated proteins, IHF and Fis. As Fis levels increase in cells grown in rich medium, the positioning of its binding site, overlapping the promoter −10 element, ensures that pnrf is sharply repressed. Here, we investigate the expression of the nrf operon promoter from various pathogenic enteric bacteria. We show that pnrf from enterohaemorrhagic E. coli is more active than its K‐12 counterpart, exhibits substantial FNR‐independent activity and is insensitive to nutrient quality, due to an improved −10 element. We also demonstrate that the Salmonella enterica serovar Typhimurium core promoter is more active than previously thought, due to differences around the transcription start site, and that its expression is repressed by downstream sequences. We identify the CsrA RNA binding protein as being responsible for this, and show that CsrA differentially regulates the E. coli K‐12 and Salmonella nrf operons. PMID:28211111

  10. Linkage disequilibrium among commonly genotyped SNP and variants detected from bull sequence

    USDA-ARS's Scientific Manuscript database

    Genomic prediction utilizing causal variants could increase selection accuracy above that achieved with SNP genotyped by commercial assays. A number of variants detected from sequencing influential sires are likely to be causal, but noticeable improvements in prediction accuracy using imputed sequen...

  11. Draft Genome Sequence of Acholeplasma laidlawii, a Common Contaminant of Cell Cultures

    PubMed Central

    Siqueira, Franciele Maboni; Cibulski, Samuel Paulo; Teixeira, Thais Fumaco; Mayer, Fabiana Quoos

    2017-01-01

    ABSTRACT Mollicutes are important cell culture contaminants which may eventually affect the results of biological assays or affect their interpretation. Acholeplasma laidlawii is one of the most frequent contaminants of cell cultures. Here, we report the complete genome sequence of A. laidlawii strain MDBK/IPV, recovered from Madin-Darby bovine kidney (MDBK) cells. PMID:28153907

  12. Draft Genome Sequence of Acholeplasma laidlawii, a Common Contaminant of Cell Cultures.

    PubMed

    Siqueira, Franciele Maboni; Cibulski, Samuel Paulo; Teixeira, Thais Fumaco; Mayer, Fabiana Quoos; Roehe, Paulo Michel

    2017-02-02

    Mollicutes are important cell culture contaminants which may eventually affect the results of biological assays or affect their interpretation. Acholeplasma laidlawii is one of the most frequent contaminants of cell cultures. Here, we report the complete genome sequence of A. laidlawii strain MDBK/IPV, recovered from Madin-Darby bovine kidney (MDBK) cells. Copyright © 2017 Siqueira et al.

  13. CORE-SINEs: Eukaryotic short interspersed retroposing elements with common sequence motifs

    PubMed Central

    Gilbert, Nicolas; Labuda, Damian

    1999-01-01

    A 65-bp “core” sequence is dispersed in hundreds of thousands copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3′ ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome. PMID:10077603

  14. CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs.

    PubMed

    Gilbert, N; Labuda, D

    1999-03-16

    A 65-bp "core" sequence is dispersed in hundreds of thousands copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.

  15. Full length nucleotide sequences of 30 common SLC44A2 alleles encoding human neutrophil antigen-3 (HNA-3)

    PubMed Central

    Chen, Qing; Srivastava, Kshitij; Ardinski, Stefanie C.; Lam, Kevin; Huvard, Michael J.; Schmid, Pirmin; Flegel, Willy A.

    2015-01-01

    Background HNA-3a alloantibodies can cause severe transfusion-related acute lung injury (TRALI). The frequency of the single nucleotide polymorphisms (SNPs) indicative of the two clinically relevant HNA-3a/b antigens are known in many populations. In the present study, we determined the full length nucleotide sequence of common SLC44A2 alleles encoding the choline transporter-like protein-2 (CTL2) that harbors HNA-3a/b antigens. Study design and methods A method was devised to determine the full length coding sequence and adjacent intron sequences from genomic DNA by 8 polymerase chain reaction (PCR) amplifications covering all 22 SLC44A2 exons. Samples from 200 African American, 96 Caucasian, 2 Hispanic and 4 Asian blood donors were analyzed. We developed a decision tree to determine alleles (confirmed haplotypes) from the genotype data. Results A total of 10 SNPs were detected in the SLC44A2 coding sequence. The non-coding sequences harbored an additional 28 SNPs (1 in the 5’-untranslated region (UTR); 23 in the introns; and 4 in the 3’-UTR). No SNP indicative of a non-functional allele was detected. The nucleotide sequences for 30 SLC44A2 alleles (haplotypes) were confirmed. There may be 66 haplotypes among the 604 chromosomes screened. Conclusions We found 38 SNPs, including 1 novel SNP, in 8192 nucleotides covering the coding sequence of the SLC44A2 gene among 302 blood donors. Population frequencies of these SNPs were established for African Americans and Caucasians. Because alleles encoding HNA-3b are more common than non-functional SLC44A2 alleles, we confirmed our previous postulate that African American donors are less likely to form HNA-3a antibodies compared to Caucasians. PMID:26437811

  16. RNA-Sequencing Analysis Reveals a Regulatory Role for Transcription Factor Fezf2 in the Mature Motor Cortex.

    PubMed

    Clare, Alison J; Wicky, Hollie E; Empson, Ruth M; Hughes, Stephanie M

    2017-01-01

    Forebrain embryonic zinc finger (Fezf2) encodes a transcription factor essential for the specification of layer 5 projection neurons (PNs) in the developing cerebral cortex. As with many developmental transcription factors, Fezf2 continues to be expressed into adulthood, suggesting it remains crucial to the maintenance of neuronal phenotypes. Despite the continued expression, a function has yet to be explored for Fezf2 in the PNs of the developed cortex. Here, we investigated the role of Fezf2 in mature neurons, using lentiviral-mediated delivery of a shRNA to conditionally knockdown the expression of Fezf2 in the mouse primary motor cortex (M1). RNA-sequencing analysis of Fezf2-reduced M1 revealed significant changes to the transcriptome, identifying a regulatory role for Fezf2 in the mature M1. Kyoto Encyclopedia Genes and Genomes (KEGG) pathway analyses of Fezf2-regulated genes indicated a role in neuronal signaling and plasticity, with significant enrichment of neuroactive ligand-receptor interaction, cell adhesion molecules and calcium signaling pathways. Gene Ontology analysis supported a functional role for Fezf2-regulated genes in neuronal transmission and additionally indicated an importance in the regulation of behavior. Using the mammalian phenotype ontology database, we identified a significant overrepresentation of Fezf2-regulated genes associated with specific behavior phenotypes, including associative learning, social interaction, locomotor activation and hyperactivity. These roles were distinct from that of Fezf2-regulated genes identified in development, indicating a dynamic transition in Fezf2 function. Together our findings demonstrate a regulatory role for Fezf2 in the mature brain, with Fezf2-regulated genes having functional roles in sustaining normal neuronal and behavioral phenotypes. 
    These results support the hypothesis that developmental transcription factors are important for maintaining neuron transcriptomes and that disruption of their

  17. Whole Genome Shotgun Sequencing Shows Selection on Leptospira Regulatory Proteins during in vitro Culture Attenuation

    PubMed Central

    Lehmann, Jason S.; Corey, Victoria C.; Ricaldi, Jessica N.; Vinetz, Joseph M.; Winzeler, Elizabeth A.; Matthias, Michael A.

    2016-01-01

    Leptospirosis is the most common zoonotic disease worldwide with an estimated 500,000 severe cases reported annually, and case fatality rates of 12–25%, due primarily to acute kidney and lung injuries. Despite its prevalence, the molecular mechanisms underlying leptospirosis pathogenesis remain poorly understood. To identify virulence-related genes in Leptospira interrogans, we delineated cumulative genome changes that occurred during serial in vitro passage of a highly virulent strain of L. interrogans serovar Lai into a nearly avirulent isogenic derivative. Comparison of protein coding and computationally predicted noncoding RNA (ncRNA) genes between these two polyclonal strains identified 15 nonsynonymous single nucleotide variant (nsSNV) alleles that increased in frequency and 19 that decreased, whereas no changes in allelic frequency were observed among the ncRNA genes. Some of the nsSNV alleles were in six genes shown previously to be transcriptionally upregulated during exposure to in vivo-like conditions. Five of these nsSNVs were in evolutionarily conserved positions in genes related to signal transduction and metabolism. Frequency changes of minor nsSNV alleles identified in this study likely contributed to the loss of virulence during serial in vitro culture. The identification of new virulence-associated genes should spur additional experimental inquiry into their potential role in Leptospira pathogenesis. PMID:26711524

  18. The genome sequence of Geobacter metallireducens: features of metabolism, physiology and regulation common and dissimilar to Geobacter sulfurreducens

    SciTech Connect

    Aklujkar, Muktak; Krushkal, Julia; DiBartolo, Genevieve; Lapidus, Alla L.; Land, Miriam L; Lovley, Derek

    2009-01-01

    Background. The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results. The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion. The genomic evidence suggests that metabolism, physiology

  19. Common and distinct DNA-binding and regulatory activities of the BEN-solo transcription factor family.

    PubMed

    Dai, Qi; Ren, Aiming; Westholm, Jakub O; Duan, Hong; Patel, Dinshaw J; Lai, Eric C

    2015-01-01

    Recently, the BEN (BANP, E5R, and NAC1) domain was recognized as a new class of conserved DNA-binding domain. The fly genome encodes three proteins that bear only a single BEN domain ("BEN-solo" factors); namely, Insensitive (Insv), Bsg25A (Elba1), and CG9883 (Elba2). Insv homodimers preferentially bind CCAATTGG palindromes throughout the genome to mediate transcriptional repression, whereas Bsg25A and Elba2 heterotrimerize with their obligate adaptor, Elba3 (i.e., the ELBA complex), to recognize a CCAATAAG motif in the Fab-7 insulator. While these data suggest distinct DNA-binding properties of BEN-solo proteins, we performed reporter assays that indicate that both Bsg25A and Elba2 can individually recognize Insv consensus sites efficiently. We confirmed this by solving the structure of Bsg25A complexed to the Insv site, which showed that key aspects of the BEN:DNA recognition strategy are similar between these proteins. We next show that both Insv and ELBA proteins are competent to mediate transcriptional repression via Insv consensus sequences but that the ELBA complex appears to be selective for the ELBA site. Reciprocally, genome-wide analysis reveals that Insv exhibits significant cobinding to class I insulator elements, indicating that it may also contribute to insulator function. Indeed, we observed abundant Insv binding within the Hox complexes with substantial overlaps with class I insulators, many of which bear Insv consensus sites. Moreover, Insv coimmunoprecipitates with the class I insulator factor CP190. Finally, we observed that Insv harbors exclusive activity among fly BEN-solo factors with respect to regulation of Notch-mediated cell fate choices in the peripheral nervous system. This in vivo activity is recapitulated by BEND6, a mammalian BEN-solo factor that conserves the Notch corepressor function of Insv but not its capacity to bind Insv consensus sites. Altogether, our data define an array of common and distinct biochemical and functional

  20. Common and distinct DNA-binding and regulatory activities of the BEN-solo transcription factor family

    PubMed Central

    Dai, Qi; Ren, Aiming; Westholm, Jakub O.; Duan, Hong; Patel, Dinshaw J.

    2015-01-01

    Recently, the BEN (BANP, E5R, and NAC1) domain was recognized as a new class of conserved DNA-binding domain. The fly genome encodes three proteins that bear only a single BEN domain (“BEN-solo” factors); namely, Insensitive (Insv), Bsg25A (Elba1), and CG9883 (Elba2). Insv homodimers preferentially bind CCAATTGG palindromes throughout the genome to mediate transcriptional repression, whereas Bsg25A and Elba2 heterotrimerize with their obligate adaptor, Elba3 (i.e., the ELBA complex), to recognize a CCAATAAG motif in the Fab-7 insulator. While these data suggest distinct DNA-binding properties of BEN-solo proteins, we performed reporter assays that indicate that both Bsg25A and Elba2 can individually recognize Insv consensus sites efficiently. We confirmed this by solving the structure of Bsg25A complexed to the Insv site, which showed that key aspects of the BEN:DNA recognition strategy are similar between these proteins. We next show that both Insv and ELBA proteins are competent to mediate transcriptional repression via Insv consensus sequences but that the ELBA complex appears to be selective for the ELBA site. Reciprocally, genome-wide analysis reveals that Insv exhibits significant cobinding to class I insulator elements, indicating that it may also contribute to insulator function. Indeed, we observed abundant Insv binding within the Hox complexes with substantial overlaps with class I insulators, many of which bear Insv consensus sites. Moreover, Insv coimmunoprecipitates with the class I insulator factor CP190. Finally, we observed that Insv harbors exclusive activity among fly BEN-solo factors with respect to regulation of Notch-mediated cell fate choices in the peripheral nervous system. This in vivo activity is recapitulated by BEND6, a mammalian BEN-solo factor that conserves the Notch corepressor function of Insv but not its capacity to bind Insv consensus sites. Altogether, our data define an array of common and distinct biochemical and functional

  1. [Examination of processed vegetable foods for the presence of common DNA sequences of genetically modified tomatoes].

    PubMed

    Kitagawa, Mamiko; Nakamura, Kosuke; Kondo, Kazunari; Ubukata, Shoji; Akiyama, Hiroshi

    2014-01-01

    The contamination of processed vegetable foods with genetically modified tomatoes was investigated by the use of qualitative PCR methods to detect the cauliflower mosaic virus 35S promoter (P35S) and the kanamycin resistance gene (NPTII). DNA fragments of P35S and NPTII were detected in vegetable juice samples, possibly due to contamination with the genomes of cauliflower mosaic virus infecting juice ingredients of Brassica species and soil bacteria, respectively. Therefore, to detect the transformation construct sequences of GM tomatoes, primer pairs were designed for qualitative PCR to specifically detect the border region between P35S and NPTII, and the border region between nopaline synthase gene promoter and NPTII. No amplification of the targeted sequences was observed using genomic DNA purified from the juice ingredients. The developed qualitative PCR method is considered to be a reliable tool to check contamination of products with GM tomatoes.

  2. Single-molecule real-time transcript sequencing facilitates common wheat genome annotation and grain transcriptome research.

    PubMed

    Dong, Lingli; Liu, Hongfang; Zhang, Juncheng; Yang, Shuangjuan; Kong, Guanyi; Chu, Jeffrey S C; Chen, Nansheng; Wang, Daowen

    2015-12-09

    The large and complex hexaploid genome has greatly hindered genomics studies of common wheat (Triticum aestivum, AABBDD). Here, we investigated transcripts in common wheat developing caryopses using the emerging single-molecule real-time (SMRT) sequencing technology PacBio RSII, and assessed the resultant data for improving common wheat genome annotation and grain transcriptome research. We obtained 197,709 full-length non-chimeric (FLNC) reads, 74.6 % of which were estimated to carry a complete open reading frame. A total of 91,881 high-quality FLNC reads were identified and mapped to 16,188 chromosomal loci, corresponding to 13,162 known genes and 3026 new genes not annotated previously. Although some FLNC reads could not be unambiguously mapped to the current draft genome sequence, many of them are likely useful for studying highly similar homoeologous or paralogous loci or for improving chromosomal contig assembly in further research. The 91,881 high-quality FLNC reads represented 22,768 unique transcripts, 9591 of which were newly discovered. We found 180 transcripts each spanning two or three previously annotated adjacent loci, suggesting that they should be merged to form correct gene models. Finally, our data facilitated the identification of 6030 genes differentially regulated during caryopsis development, and full-length transcripts for 72 transcribed gluten gene members that are important for the end-use quality control of common wheat. Our work demonstrated the value of PacBio transcript sequencing for improving common wheat genome annotation through uncovering the loci and full-length transcripts not discovered previously. The resource obtained may aid further structural genomics and grain transcriptome studies of common wheat.

  3. One common structural feature of "words" in protein sequences and human texts.

    PubMed

    Zemková, M; Trifonov, E N; Zahradník, D

    2014-01-01

    The frequently discussed analogy between genetic and human texts is explored by comparing the alternation of polar and non-polar amino-acid residues in proteins with the alternation of consonants and vowels in human texts. In human languages, the usage of possible combinations of consonants and vowels is influenced by the pronounceability of the combinations. Similarly, the oligopeptide composition of proteins is influenced by the requirements of protein folding and stability. One special type of structure often present in proteins is the amphipathic α-helix, in which polar and non-polar amino acids alternate with a period of 3.5 residues, not unlike the alternation of consonants and vowels. In this study, we evaluated the contribution made by amphipathic alternations to the protein sequence texts (20-24%). Their proportion is lower than the respective values for alternating words in human texts (57-89%). The proteomes (full sets of proteins for selected organisms) were transformed into ranked sequences of n-grams (words of length n), including periodical amphipathic structures. Similarly, human texts were transformed into sequences of alternating consonants and vowels. Analysis of the vocabularies shows that in both types of texts (human languages and proteins) the alternating words are dominant or highly preferred, thus strengthening the analogy between these two types of texts. The contribution of amphipathic words in the upper parts of the ranked lists for 10 analyzed proteomes varies between 58 and 74%. In human texts the respective values range between 90 and 100%.
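The transformation of human text into consonant/vowel sequences described in the abstract above can be illustrated with a minimal sketch. It is not the authors' procedure: the vowel set, the treatment of "y" as a vowel, the whitespace tokenization, and the function names are all assumptions made for illustration, and the amphipathic (period 3.5) analysis of proteins is not reproduced here.

```python
# Illustrative sketch only; vowel set and tokenization are assumptions.
VOWELS = set("aeiouy")  # assumption: 'y' is counted as a vowel


def cv_pattern(word: str) -> str:
    """Map a word to a string of 'c' (consonant) and 'v' (vowel) symbols."""
    return "".join(
        "v" if ch in VOWELS else "c"
        for ch in word.lower()
        if ch.isalpha()  # ignore digits and punctuation
    )


def is_alternating(pattern: str) -> bool:
    """True when no two adjacent symbols in the pattern are equal."""
    return all(a != b for a, b in zip(pattern, pattern[1:]))


def alternating_fraction(text: str) -> float:
    """Fraction of words (two or more letters) whose pattern strictly alternates."""
    patterns = [cv_pattern(w) for w in text.split()]
    patterns = [p for p in patterns if len(p) >= 2]
    if not patterns:
        return 0.0
    return sum(is_alternating(p) for p in patterns) / len(patterns)
```

For instance, "banana" maps to the strictly alternating pattern `cvcvcv`, whereas "tree" maps to `ccvv`, which does not alternate.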

  4. A 454 sequencing approach for large scale phylogenomic analysis of the common emperor scorpion (Pandinus imperator).

    PubMed

    Roeding, Falko; Borner, Janus; Kube, Michael; Klages, Sven; Reinhardt, Richard; Burmester, Thorsten

    2009-12-01

    In recent years, phylogenetic tree reconstructions that rely on multiple gene alignments that had been deduced from expressed sequence tags (ESTs) have become a popular method in molecular systematics. Here, we present a 454 pyrosequencing approach to infer the transcriptome of the Emperor scorpion Pandinus imperator. We obtained 428,844 high-quality reads (mean length 223 ± 50 b) from total cDNA, which were assembled into 8334 contigs (mean length 422 ± 313 bp) and 26,147 singletons. About 1200 contigs were successfully annotated by BLAST and orthology search. Specific analyses of eight distinct hemocyanin sequences provided further proof for the quality of the 454 reads and the assembly process. The P. imperator sequences were included in a concatenated alignment of 149 orthologous genes of 67 metazoan taxa that covers 39,842 amino acids. After removal of low-quality regions, 11,168 positions were employed for phylogenetic reconstructions. Using Bayesian and maximum likelihood methods, we obtained strongly supported monophyletic Ecdysozoa, Arthropoda (excluding Tardigrada), Euarthropoda, Pancrustacea and Hexapoda. We also recovered the Myriochelata (Chelicerata+Myriapoda). Within the chelicerates, Pycnogonida form the sister group of Euchelicerata. However, Arachnida were found paraphyletic because the Acari (mites and ticks) were recovered as sister group of a clade comprising Xiphosura, Scorpiones and Araneae. In summary, we have shown that 454 pyrosequencing is a cost-effective method that provides sufficient data and coverage depth for gene detection and multigene-based phylogenetic analyses.

  5. The genome sequence of Geobacter metallireducens: features of metabolism, physiology and regulation common and dissimilar to Geobacter sulfurreducens

    SciTech Connect

    Aklujkar, Muktak; Krushkal, Julia; DiBartolo, Genevieve; Lapidus, Alla; Land, Miriam L.; Lovley, Derek R.

    2008-12-01

    Background: The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results: The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion: The genomic evidence suggests that metabolism, physiology and regulation of gene expression in G. metallireducens may be dramatically different from other Geobacteraceae.

  6. The genome sequence of Geobacter metallireducens: features of metabolism, physiology and regulation common and dissimilar to Geobacter sulfurreducens

    PubMed Central

    2009-01-01

    Background The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second putative succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion The genomic evidence suggests that metabolism, physiology and regulation of gene expression in G. metallireducens may be dramatically different from other Geobacteraceae. PMID:19473543

  7. De novo assembly and characterization of the spleen transcriptome of common carp (Cyprinus carpio) using Illumina paired-end sequencing.

    PubMed

    Li, Guoxi; Zhao, Yinli; Liu, Zhonghu; Gao, Chunsheng; Yan, Fengbin; Liu, Bianzhi; Feng, Jianxin

    2015-06-01

    Common carp (Cyprinus carpio) is one of the most important aquacultured species of the family Cyprinidae, and breeding this species for disease resistance is becoming more and more important. However, at the genome or transcriptome levels, study of the immunogenetics of disease resistance in the common carp is lacking. In this study, 60,316,906 and 75,200,328 paired-end clean reads were obtained from two cDNA libraries of the common carp spleen by Illumina paired-end sequencing technology. In total, 130,293 unique transcript fragments (unigenes) were assembled, with an average length of 1400.57 bp. Approximately 105,612 (81.06%) unigenes could be annotated according to their homology with matches in the Nr, Nt, Swiss-Prot, COG, GO, or KEGG databases, and they were found to represent 46,747 non-redundant genes. Comparative analysis showed that 59.82% of the unigenes have significant similarity to zebrafish Refseq proteins. Gene expression comparison revealed that 10,432 and 6889 annotated unigenes were, respectively, up- and down-regulated with at least twofold changes between two developmental stages of the common carp spleen. Gene ontology and KEGG analysis were performed to classify all unigenes into functional categories for understanding gene functions and regulation pathways. In addition, 46,847 simple sequence repeats (SSRs) were detected from 35,618 unigenes, and a large number of single nucleotide polymorphism (SNP) and insertion/deletion (INDEL) sites were identified in the spleen transcriptome of common carp. This study has characterized the spleen transcriptome of the common carp for the first time, providing a valuable resource for a better understanding of the common carp immune system and defense mechanisms. This knowledge will also facilitate future functional studies on common carp immunogenetics that may eventually be applied in breeding programs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Identification of sequences common to more than one therapeutic target to treat complex diseases: simulating the high variance in sequence interactivity evolved to modulate robust phenotypes.

    PubMed

    Varela, Miguel Angel

    2015-07-18

    Genome-wide association studies show that most human traits and diseases are caused by a combination of environmental and genetic causes, with each one of these having a relatively small effect. In contrast, most therapies based on macromolecules like antibodies, antisense oligonucleotides or peptides focus on a single gene product. On the other hand, complex organisms seem to have a plethora of functional molecules able to bind specifically to multiple genes or gene products based on their sequences, but the mechanisms that lead organisms to recruit these multispecific regulators remain unclear. The mutational biases inferred from the genomic sequences of six organisms show an increase in the variance of sequence interactivity in complex organisms. The high variance in the interactivity of sequences presents an ideal evolutionary substrate to recruit sequence-specific regulators able to target multiple gene products. For example, here it is shown how the 3'UTR can fluctuate between sequences likely to be complementary to other sites in the genome in the search for advantageous interactions. A library of nucleotide- and peptide-based tools was built using a script to search for candidates (e.g. peptides, antigens to raise antibodies or antisense oligonucleotides) to target sequences shared by key pathways in human disorders, such as cancer and immune diseases. This resource will be accessible to the community at www.wikisequences.org. This study describes and encourages the adoption of the same multitarget strategy (e.g., miRNAs, Hsp90) that has evolved in organisms to modify complex traits to treat diseases with robust pathological phenotypes. The increase in the variance of sequence interactivity detected in the human and mouse genomes when compared with less complex organisms could have expedited the evolution of regulators able to interact with multiple gene products and modulate robust phenotypes. The identification of sequences common to more than one

  9. Application of whole genome and RNA sequencing to investigate the genomic landscape of common variable immunodeficiency disorders.

    PubMed

    van Schouwenburg, Pauline A; Davenport, Emma E; Kienzler, Anne-Kathrin; Marwah, Ishita; Wright, Benjamin; Lucas, Mary; Malinauskas, Tomas; Martin, Hilary C; Lockstone, Helen E; Cazier, Jean-Baptiste; Chapel, Helen M; Knight, Julian C; Patel, Smita Y

    2015-10-01

    Common Variable Immunodeficiency Disorders (CVIDs) are the most prevalent cause of primary antibody failure. CVIDs are highly variable and genetic causes have been identified in <5% of patients. Here, we performed whole genome sequencing (WGS) of 34 CVID patients (94% sporadic) and combined them with transcriptomic profiling (RNA-sequencing of B cells) from three patients and three healthy controls. We identified variants in CVID disease genes TNFRSF13B, TNFRSF13C, LRBA and NLRP12 and enrichment of variants in known and novel disease pathways. The pathways identified include B-cell receptor signalling, non-homologous end-joining, regulation of apoptosis, T cell regulation and ICOS signalling. Our data confirm the polygenic nature of CVID and suggest individual-specific aetiologies in many cases. Together our data show that WGS in combination with RNA-sequencing allows for a better understanding of CVIDs and the identification of novel disease associated pathways.

  10. Application of whole genome and RNA sequencing to investigate the genomic landscape of common variable immunodeficiency disorders

    PubMed Central

    van Schouwenburg, Pauline A.; Davenport, Emma E.; Kienzler, Anne-Kathrin; Marwah, Ishita; Wright, Benjamin; Lucas, Mary; Malinauskas, Tomas; Martin, Hilary C.; Lockstone, Helen E.; Cazier, Jean-Baptiste; Chapel, Helen M.; Knight, Julian C.; Patel, Smita Y.

    2015-01-01

    Common Variable Immunodeficiency Disorders (CVIDs) are the most prevalent cause of primary antibody failure. CVIDs are highly variable and genetic causes have been identified in <5% of patients. Here, we performed whole genome sequencing (WGS) of 34 CVID patients (94% sporadic) and combined them with transcriptomic profiling (RNA-sequencing of B cells) from three patients and three healthy controls. We identified variants in CVID disease genes TNFRSF13B, TNFRSF13C, LRBA and NLRP12 and enrichment of variants in known and novel disease pathways. The pathways identified include B-cell receptor signalling, non-homologous end-joining, regulation of apoptosis, T cell regulation and ICOS signalling. Our data confirm the polygenic nature of CVID and suggest individual-specific aetiologies in many cases. Together our data show that WGS in combination with RNA-sequencing allows for a better understanding of CVIDs and the identification of novel disease associated pathways. PMID:26122175

  11. Draft Genome Sequences of Two Isolates of Colletotrichum lindemuthianum, the Causal Agent of Anthracnose in Common Beans.

    PubMed

    de Queiroz, Casley Borges; Correia, Hilberty L Nunes; Menicucci, Renato Pedrozo; Vidigal, Pedro M Pereira; de Queiroz, Marisa Vieira

    2017-05-04

    Colletotrichum lindemuthianum is the causal agent of anthracnose in common beans, one of the main limiting factors of their culture. Here, we report for the first time, to our knowledge, a draft of the complete genome sequences of two isolates belonging to 83.501 and 89 A2 2-3 of C. lindemuthianum. Copyright © 2017 de Queiroz et al.

  12. Empirical power of very rare variants for common traits and disease: results from Sanger sequencing 1998 individuals

    PubMed Central

    Ladouceur, Martin; Zheng, Hou-Feng; Greenwood, Celia M T; Richards, J Brent

    2013-01-01

    The optimal study design for identifying rare variants associated with common disease is not yet clear and researchers have to decide whether to prioritize lower sequencing coverage on larger sample sizes, or higher coverage on smaller sample sizes. High-coverage sequencing affords several advantages, such as genotype accuracy and improved identification of very rare variants, but this comes at increased cost. However, the magnitude of the contribution of very rare variants to the statistical power of gene-based association tests is unknown. By using Sanger sequence data on seven genes from 1998 subjects with simulated phenotypes, we provide evidence that excluding very rare variants, in general, reduces the statistical power of rare variant association tests only modestly. However, if the probability of being causal and the effect size of the causal variants are inversely related to the minor allele frequency, then very rare variants do contribute some power, although the absolute power remains low. As very rare variants constitute the majority of variants identified in sequencing studies, these findings suggest that careful attention needs to be paid to the plausible relationship that exists between very rare variants and common disease. PMID:23321613
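The idea of including or excluding very rare variants in a gene-based test, as discussed in the abstract above, can be sketched with a simple burden score. This is an illustration under stated assumptions, not the study's method: the genotype coding (0, 1 or 2 copies of the minor allele per subject and variant), the frequency thresholds, and the function names are hypothetical.

```python
# Illustrative sketch only; thresholds and names are assumptions.
def minor_allele_freqs(genotypes):
    """Per-variant minor allele frequency from a subjects x variants matrix.

    Each entry is assumed to be the count (0, 1 or 2) of the minor allele.
    """
    n_subjects = len(genotypes)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_subjects)
        for j in range(n_variants)
    ]


def burden_scores(genotypes, maf_min=0.0, maf_max=0.05):
    """Sum minor-allele counts per subject over variants with maf_min <= MAF < maf_max.

    Raising maf_min above zero drops the rarest variants from the score.
    """
    mafs = minor_allele_freqs(genotypes)
    keep = [j for j, f in enumerate(mafs) if maf_min <= f < maf_max]
    return [sum(row[j] for j in keep) for row in genotypes]
```

Comparing the scores obtained with `maf_min=0.0` against those obtained with a higher `maf_min` shows which subjects are affected when the rarest variants are left out of the collapsed marker.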

  13. Multilocus sequence analysis of Bacillus thuringiensis serovars navarrensis, bolivia and vazensis and Bacillus weihenstephanensis reveals a common phylogeny.

    PubMed

    Soufiane, Brahim; Baizet, Mathilde; Côté, Jean-Charles

    2013-01-01

    The Bacillus cereus group sensu lato includes six closely-related bacterial species: Bacillus cereus, Bacillus anthracis, Bacillus thuringiensis, Bacillus mycoides, Bacillus pseudomycoides and Bacillus weihenstephanensis. B. thuringiensis is distinguished from the other species mainly by the appearance of an inclusion body upon sporulation. B. weihenstephanensis is distinguished based on its psychrotolerance and the presence of specific signature sequences in the 16S rRNA gene and cspA genes. A total of seven housekeeping genes (glpF, gmK, ilvD, pta, purH, pycA and tpi) from different B. thuringiensis serovars and B. weihenstephanensis strains were amplified and their nucleotide sequences determined. A maximum likelihood phylogenetic tree was inferred from comparisons of the concatenated sequences. B. thuringiensis serovars navarrensis, bolivia and vazensis clustered not with the other B. thuringiensis serovars but rather with the B. weihenstephanensis strains, indicative of a common phylogeny. In addition, specific signature sequences and single nucleotide polymorphisms common to B. thuringiensis serovars navarrensis, bolivia and vazensis and the B. weihenstephanensis strains, and absent in the other B. thuringiensis serovars, were identified.

  14. Sequence-based introgression mapping identifies candidate white mold tolerance genes in common bean

    USDA-ARS's Scientific Manuscript database

    White mold disease, caused by the necrotrophic fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a major pathogen of common bean (Phaseolus vulgaris L.). More than 20 QTL were reported using multiple bi-parental populations. To study the disease in more detail, advanced back-cross populations seg...

  15. Complete Genome Sequence of a Genomovirus Associated with Common Bean Plant Leaves in Brazil

    PubMed Central

    Lamas, Natalia Silva; Fontenele, Rafaela Salgado; Melo, Fernando Lucas; Costa, Antonio Felix; Varsani, Arvind

    2016-01-01

    A new genomovirus has been identified in three common bean plants in Brazil. This virus has a circular genome of 2,220 nucleotides and 3 major open reading frames. It shares 80.7% genome-wide pairwise identity with a genomovirus recovered from Tongan fruit bat guano. PMID:27834705

  16. Massively parallel sequencing of 17 commonly used forensic autosomal STRs and amelogenin with small amplicons.

    PubMed

    Kim, Eun Hye; Lee, Hwan Young; Yang, In Seok; Jung, Sang-Eun; Yang, Woo Ick; Shin, Kyoung-Jin

    2016-05-01

    The next-generation sequencing (NGS) method has been utilized to analyze short tandem repeat (STR) markers, which are routinely used for human identification purposes in the forensic field. Some researchers have demonstrated the successful application of the NGS system to STR typing, suggesting that NGS technology may be an alternative or additional method to overcome limitations of capillary electrophoresis (CE)-based STR profiling. However, there has been no available multiplex PCR system that is optimized for NGS analysis of forensic STR markers. Thus, we constructed a multiplex PCR system for the NGS analysis of 18 markers (13 CODIS STRs, D2S1338, D19S433, Penta D, Penta E and amelogenin) by designing amplicons in the size range of 77-210 base pairs. Then, PCR products were generated from two single-source samples, mixed samples and artificially degraded DNA samples using the multiplex PCR system, and were prepared for sequencing on the MiSeq system through construction of a subsequent barcoded library. By performing NGS and analyzing the data, we confirmed that the resultant STR genotypes were consistent with those of CE-based typing. Moreover, sequence variations were detected in targeted STR regions. Through the use of small-sized amplicons, the developed multiplex PCR system enables researchers to obtain successful STR profiles even from artificially degraded DNA, including at STR loci that are analyzed with large-sized amplicons in the CE-based commercial kits. In addition, successful profiles can be obtained from mixtures up to a 1:19 ratio. Consequently, the developed multiplex PCR system, which produces small-sized amplicons, can be successfully applied to STR NGS analysis of forensic casework samples such as mixtures and degraded DNA samples.
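Sequence-based STR typing, as described in the abstract above, ultimately rests on counting tandem copies of a repeat motif in a read. The sketch below shows that counting step only; it is not the authors' pipeline, the function name is hypothetical, and real STR alleles often contain compound or interrupted repeats that this simple approach does not handle.

```python
# Illustrative sketch only; not the published analysis pipeline.
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    motif = motif.upper()
    # One or more back-to-back copies of the motif
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [
        len(match.group(0)) // len(motif)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)
```

A read such as `TTGATAGATAGATACC` contains three uninterrupted copies of the motif `GATA`, so the function would report a repeat count of 3 for that motif.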

  17. Substitution of Feline Leukemia Virus Long Terminal Repeat Sequences into Murine Leukemia Virus Alters the Pattern of Insertional Activation and Identifies New Common Insertion Sites

    PubMed Central

    Johnson, Chassidy; Lobelle-Rich, Patricia A.; Puetter, Adriane; Levy, Laura S.

    2005-01-01

    The recombinant retrovirus, MoFe2-MuLV (MoFe2), was constructed by replacing the U3 region of Moloney murine leukemia virus (M-MuLV) with homologous sequences from the FeLV-945 LTR. NIH/Swiss mice neonatally inoculated with MoFe2 developed T-cell lymphomas of immature thymocyte surface phenotype. MoFe2 integrated infrequently (0 to 9%) near common insertion sites (CISs) previously identified for either parent virus. Using three different strategies, CISs in MoFe2-induced tumors were identified at six loci, none of which had been previously reported as CISs in tumors induced by either parent virus in wild-type animals. Two of the newly identified CISs had not previously been implicated in lymphoma in any retrovirus model. One of these, designated 3-19, encodes the p101 regulatory subunit of phosphoinositide-3-kinase-gamma. The other, designated Rw1, is predicted to encode a protein that functions in the immune response to virus infection. Thus, substitution of FeLV-945 U3 sequences into the M-MuLV long terminal repeat (LTR) did not alter the target tissue for M-MuLV transformation but significantly altered the pattern of CIS utilization in the induction of T-cell lymphoma. These observations support a growing body of evidence that the distinctive sequence and/or structure of the retroviral LTR determines its pattern of insertional activation. The findings also demonstrate the oligoclonal nature of retrovirus-induced lymphomas by demonstrating proviral insertions at CISs in subdominant populations in the tumor mass. Finally, the findings demonstrate the utility of novel recombinant retroviruses such as MoFe2 to contribute new genes potentially relevant to the induction of lymphoid malignancy. PMID:15596801

  18. Next-generation sequencing of common osteogenesis imperfecta-related genes in clinical practice

    PubMed Central

    Árvai, Kristóf; Horváth, Péter; Balla, Bernadett; Tobiás, Bálint; Kató, Karina; Kirschner, Gyöngyi; Klujber, Valéria; Lakatos, Péter; Kósa, János P.

    2016-01-01

    Next generation sequencing (NGS) is a rapidly developing area in genetics. Utilizing this technology in the management of disorders with a complex genetic background and no recurrent mutation hot spots can be extremely useful. In this study, we applied NGS, namely semiconductor sequencing, to determine the most significant osteogenesis imperfecta-related genetic variants in clinical practice. We selected the genes coding collagen type I alpha-1 and -2 (COL1A1, COL1A2), which are responsible for more than 90% of all cases. The CRTAP and LEPRE1/P3H1 genes, involved in the background of the recessive forms with relatively high frequency (type VII and VIII), represent less than 10% of the disease. In our six patients (1–41 years), we identified 23 different variants. We found a total of 14 single nucleotide variants (SNV) in COL1A1 and COL1A2, 5 in CRTAP and 4 in LEPRE1. Two novel and two already well-established pathogenic SNVs have been identified. Among the newly recognized mutations, one results in an amino acid change and one of them is a stop codon. We have shown that a new full-scale cost-effective NGS method can be developed and utilized to supplement the diagnostic process of osteogenesis imperfecta with molecular genetic data in clinical practice. PMID:27335225

  19. Next-generation sequencing of common osteogenesis imperfecta-related genes in clinical practice.

    PubMed

    Árvai, Kristóf; Horváth, Péter; Balla, Bernadett; Tobiás, Bálint; Kató, Karina; Kirschner, Gyöngyi; Klujber, Valéria; Lakatos, Péter; Kósa, János P

    2016-06-23

    Next generation sequencing (NGS) is a rapidly developing area in genetics. Utilizing this technology in the management of disorders with a complex genetic background and no recurrent mutation hot spots can be extremely useful. In this study, we applied NGS, namely semiconductor sequencing, to determine the most significant osteogenesis imperfecta-related genetic variants in clinical practice. We selected the genes coding collagen type I alpha-1 and -2 (COL1A1, COL1A2), which are responsible for more than 90% of all cases. The CRTAP and LEPRE1/P3H1 genes, involved in the background of the recessive forms with relatively high frequency (type VII and VIII), represent less than 10% of the disease. In our six patients (1-41 years), we identified 23 different variants. We found a total of 14 single nucleotide variants (SNV) in COL1A1 and COL1A2, 5 in CRTAP and 4 in LEPRE1. Two novel and two already well-established pathogenic SNVs have been identified. Among the newly recognized mutations, one results in an amino acid change and one of them is a stop codon. We have shown that a new full-scale cost-effective NGS method can be developed and utilized to supplement the diagnostic process of osteogenesis imperfecta with molecular genetic data in clinical practice.

  20. A regulatory sequence from the retinoid X receptor γ gene directs expression to horizontal cells and photoreceptors in the embryonic chicken retina

    PubMed Central

    Blixt, Maria K. E.

    2016-01-01

    Purpose Combining techniques of episomal vector gene-specific Cre expression and genomic integration using the piggyBac transposon system enables studies of gene expression–specific cell lineage tracing in the chicken retina. In this work, we aimed to target the retinal horizontal cell progenitors. Methods A 208 bp gene regulatory sequence from the chicken retinoid X receptor γ gene (RXRγ208) was used to drive Cre expression. RXRγ is expressed in progenitors and photoreceptors during development. The vector was combined with a piggyBac “donor” vector containing a floxed STOP sequence followed by enhanced green fluorescent protein (EGFP), as well as a piggyBac helper vector for efficient integration into the host cell genome. The vectors were introduced into the embryonic chicken retina with in ovo electroporation. Tissue electroporation targets specific developmental time points and in specific structures. Results Cells that drove Cre expression from the regulatory RXRγ208 sequence excised the floxed STOP-sequence and expressed GFP. The approach generated a stable lineage with robust expression of GFP in retinal cells that have activated transcription from the RXRγ208 sequence. Furthermore, GFP was expressed in cells that express horizontal or photoreceptor markers when electroporation was performed between developmental stages 22 and 28. Electroporation of a stage 12 optic cup gave multiple cell types in accordance with RXRγ gene expression in the early retina. Conclusions In this study, we describe an easy, cost-effective, and time-efficient method for testing regulatory sequences in general. More specifically, our results open up the possibility for further studies of the RXRγ-gene regulatory network governing the formation of photoreceptor and horizontal cells. In addition, the method presents approaches to target the expression of effector genes, such as regulators of cell fate or cell cycle progression, to these cells and their progenitor. PMID

  1. SNP discovery in common bean by restriction-associated DNA (RAD) sequencing for genetic diversity and population structure analysis.

    PubMed

    Valdisser, Paula Arielle M R; Pappas, Georgios J; de Menezes, Ivandilson P P; Müller, Bárbara S F; Pereira, Wendell J; Narciso, Marcelo G; Brondani, Claudio; Souza, Thiago L P O; Borba, Tereza C O; Vianello, Rosana P

    2016-06-01

    Researchers have made great advances in the development and application of genomic approaches for common beans, creating opportunities to drive more realistic and applicable strategies for sustainable management of the genetic resource towards plant breeding. This work provides useful polymorphic single-nucleotide polymorphisms (SNPs) for high-throughput common bean genotyping developed by RAD (restriction site-associated DNA) sequencing. The RAD tags were generated from DNA pooled from 12 common bean genotypes, including breeding lines of different gene pools and market classes. The aligned sequences identified 23,748 putative RAD-SNPs, of which 3357 were adequate for genotyping; 1032 RAD-SNPs with the highest ADT (assay design tool) score are presented in this article. The RAD-SNPs were structurally annotated in different coding (47.00 %) and non-coding (53.00 %) sequence components of genes. A subset of 384 RAD-SNPs with broad genome distribution was used to genotype a diverse panel of 95 common bean germplasms and revealed a successful amplification rate of 96.6 %, showing 73 % of polymorphic SNPs within the Andean group and 83 % in the Mesoamerican group. A slightly increased He (0.161, n = 21) value was estimated for the Andean gene pool, compared to the Mesoamerican group (0.156, n = 74). For the linkage disequilibrium (LD) analysis, from a group of 580 SNPs (289 RAD-SNPs and 291 BARC-SNPs) genotyped for the same set of genotypes, 70.2 % were in LD, decreasing to 0.10 % in the Andean group and 0.77 % in the Mesoamerican group. Haplotype patterns spanning 310 Mb of the genome (60 %) were characterized in samples from different origins. However, the haplotype frameworks were under-represented for the Andean (7.85 %) and Mesoamerican (5.55 %) gene pools separately. In conclusion, RAD sequencing allowed the discovery of hundreds of useful SNPs for broad genetic analysis of common bean germplasm. From now, this approach provides an excellent panel

  2. Exome Sequencing Links Corticospinal Motor Neuron Disease to Common Neurodegenerative Disorders

    PubMed Central

    Hofree, Matan; Silhavy, Jennifer L.; Heiberg, Andrew D.; Abdellateef, Mostafa; Rosti, Basak; Scott, Eric; Mansour, Lobna; Masri, Amira; Kayserili, Hulya; Al-Aama, Jumana Y.; Abdel-Salam, Ghada M. H.; Karminejad, Ariana; Kara, Majdi; Kara, Bulent; Bozorgmehri, Bita; Ben-Omran, Tawfeg; Mojahedi, Faezeh; El Din Mahmoud, Iman Gamal; Bouslam, Naima; Bouhouche, Ahmed; Benomar, Ali; Hanein, Sylvain; Raymond, Laure; Forlani, Sylvie; Mascaro, Massimo; Selim, Laila; Shehata, Nabil; Al-Allawi, Nasir; Bindu, P.S.; Azam, Matloob; Gunel, Murat; Caglayan, Ahmet; Bilguvar, Kaya; Tolun, Aslihan; Issa, Mahmoud Y.; Schroth, Jana; Spencer, Emily G.; Rosti, Rasim O.; Akizu, Naiara; Vaux, Keith K.; Johansen, Anide; Koh, Alice A.; Megahed, Hisham; Durr, Alexandra; Brice, Alexis; Stevanin, Giovanni; Gabriel, Stacy B.; Ideker, Trey; Gleeson, Joseph G.

    2014-01-01

    Hereditary spastic paraplegias (HSPs) are neurodegenerative motor neuron diseases characterized by progressive age-dependent loss of corticospinal motor tract function. Although the genetic basis is partly understood, only a fraction of cases can receive a genetic diagnosis, and a global view of HSP is lacking. By using whole-exome sequencing in combination with network analysis, we identified 18 previously unknown putative HSP genes and validated nearly all of these genes functionally or genetically. The pathways highlighted by these mutations link HSP to cellular transport, nucleotide metabolism, and synapse and axon development. Network analysis revealed a host of further candidate genes, of which three were mutated in our cohort. Our analysis links HSP to other neurodegenerative disorders and can facilitate gene discovery and mechanistic understanding of disease. PMID:24482476

  3. RegTransBase--a database of regulatory sequences and interactions based on literature: a resource for investigating transcriptional regulation in prokaryotes.

    PubMed

    Cipriano, Michael J; Novichkov, Pavel N; Kazakov, Alexey E; Rodionov, Dmitry A; Arkin, Adam P; Gelfand, Mikhail S; Dubchak, Inna

    2013-04-02

    Due to the constantly growing number of sequenced microbial genomes, comparative genomics has been playing a major role in the investigation of regulatory interactions in bacteria. Regulon inference mostly remains a field of semi-manual examination, since the absence of a knowledgebase and informatics platform for automated and systematic investigation restricts opportunities for computational prediction. Additionally, confirming computationally inferred regulons with experimental data is critically important. RegTransBase is an open-access platform with a user-friendly web interface publicly available at http://regtransbase.lbl.gov. It consists of two databases: a manually collected hierarchical regulatory interactions database based on more than 7000 scientific papers, which can serve as a knowledgebase for verification of predictions, and a large set of expert-curated transcription factor binding sites used in regulon inference by a variety of tools. RegTransBase captures the knowledge from published scientific literature using controlled vocabularies and contains various types of experimental data, such as: the activation or repression of transcription by an identified direct regulator; determination of the transcriptional regulatory function of a protein (or RNA) directly binding to DNA or RNA; mapping of binding sites for a regulatory protein; characterization of regulatory mutations. Analysis of the data collected from the literature resulted in the creation of Putative Regulons from Experimental Data, which are also available in RegTransBase. RegTransBase is a powerful user-friendly platform for the investigation of regulation in prokaryotes. It uses a collection of validated regulatory sequences that can be easily extracted and used to infer regulatory interactions by comparative genomics techniques, thus assisting researchers in the interpretation of transcriptional regulation data.

  4. A Next-Generation Sequencing Strategy for Evaluating the Most Common Genetic Abnormalities in Multiple Myeloma.

    PubMed

    Jiménez, Cristina; Jara-Acevedo, María; Corchete, Luis A; Castillo, David; Ordóñez, Gonzalo R; Sarasquete, María E; Puig, Noemí; Martínez-López, Joaquín; Prieto-Conde, María I; García-Álvarez, María; Chillón, María C; Balanzategui, Ana; Alcoceba, Miguel; Oriol, Albert; Rosiñol, Laura; Palomera, Luis; Teruel, Ana I; Lahuerta, Juan J; Bladé, Joan; Mateos, María V; Orfão, Alberto; San Miguel, Jesús F; González, Marcos; Gutiérrez, Norma C; García-Sanz, Ramón

    2017-01-01

    Identification and characterization of genetic alterations are essential for diagnosis of multiple myeloma and may guide therapeutic decisions. Currently, genomic analysis of myeloma to cover the diverse range of alterations with prognostic impact requires fluorescence in situ hybridization (FISH), single nucleotide polymorphism arrays, and sequencing techniques, which are costly and labor intensive and require large numbers of plasma cells. To overcome these limitations, we designed a targeted-capture next-generation sequencing approach for one-step identification of IGH translocations, V(D)J clonal rearrangements, the IgH isotype, and somatic mutations to rapidly identify risk groups and specific targetable molecular lesions. Forty-eight newly diagnosed myeloma patients were tested with the panel, which included IGH and six genes that are recurrently mutated in myeloma: NRAS, KRAS, HRAS, TP53, MYC, and BRAF. We identified 14 of 17 IGH translocations previously detected by FISH and three confirmed translocations not detected by FISH, with the additional advantage of breakpoint identification, which can be used as a target for evaluating minimal residual disease. IgH subclass and V(D)J rearrangements were identified in 77% and 65% of patients, respectively. Mutation analysis revealed the presence of missense protein-coding alterations in at least one of the evaluated genes in 16 of 48 patients (33%). This method may represent a time- and cost-effective diagnostic method for the molecular characterization of multiple myeloma. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  5. Role of common and rare APP DNA sequence variants in Alzheimer disease

    PubMed Central

    Hooli, B.V.; Mohapatra, G.; Mattheisen, M.; Parrado, A.R.; Roehr, J.T.; Shen, Y.; Gusella, J.F.; Moir, R.; Saunders, A.J.; Lange, C.; Tanzi, R.E.

    2012-01-01

    Objectives: More than 30 different rare mutations, including copy number variants (CNVs), in the amyloid precursor protein gene (APP) cause early-onset familial Alzheimer disease (EOFAD), whereas the contribution of common APP variants to disease risk remains controversial. In this study we systematically assessed the role of both rare and common APP DNA variants in Alzheimer disease (AD) families. Methods: Families with EOFAD genetically linked to the APP region were screened for missense mutations and locus duplications of APP. Further, using genome-wide DNA microarray data, we examined the APP locus for CNVs in a total of 797 additional early- and late-onset AD pedigrees. Finally, 423 single nucleotide polymorphisms (SNPs) in the APP locus, including 2 promoter polymorphisms previously associated with AD risk, were tested in up to 4,200 individuals from multiplex AD families. Results: Analyses of 8 21q21-linked families revealed one family carrying a nonsynonymous mutation in exon 17 (Val717Leu) and another family with a partially penetrant 3.5-Mb locus duplication encompassing APP. CNV analysis in the APP locus revealed an additional family carrying a fully penetrant 380-kb duplication, merely spanning APP. Last, contrary to previous reports, association analyses of more than 400 different SNPs in or near APP failed to show significant effects on AD risk. Conclusion: Our study shows that APP mutations and locus duplications are a very rare cause of EOFAD and that the contribution of common APP variants to AD susceptibility is insignificant. Furthermore, duplications of APP may not be fully penetrant, possibly indicating the existence of hitherto unknown protective genetic factors. PMID:22491860

  6. cGRNB: a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets.

    PubMed

    Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue

    2013-01-01

    We are witnessing rapid progress in the development of methodologies for building combinatorial gene regulatory networks involving both TFs (transcription factors) and miRNAs (microRNAs). A few tools are available for this task, but most of them are not easy to use and are not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R code of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released the cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. The cGRNB enables two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module, and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets
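
    The "seed-matching sequence information" mentioned in this record refers to the usual way of predicting miRNA targets: a site in the target transcript that is complementary to the miRNA seed (nucleotides 2 to 8). The abstract does not describe how cGRNB implements this, so the following is only a generic sketch of seed matching. The function names and the UTR string are hypothetical; the let-7a sequence is the commonly cited mature sequence.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def seed_match_sites(mirna, utr, seed_start=2, seed_end=8):
    """Return 0-based start positions in `utr` that pair with the miRNA seed.

    The seed is taken as nucleotides seed_start..seed_end of the miRNA
    (1-based, inclusive); a site is an exact match to its reverse complement.
    """
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement_rna(seed)
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"      # mature let-7a (5' to 3')
TOY_UTR = "AAGCUACCUCAGGGCUACCUCUU"   # hypothetical 3' UTR fragment, not real data

print(seed_match_sites(LET7A, TOY_UTR))  # [3, 14]
```

    A network builder would typically keep only those predicted miRNA-gene pairs that are also supported by the uploaded expression data; that filtering step is not shown because the abstract gives no detail on it.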

  7. The evolution of heat shock protein sequences, cis-regulatory elements, and expression profiles in the eusocial Hymenoptera.

    PubMed

    Nguyen, Andrew D; Gotelli, Nicholas J; Cahan, Sara Helms

    2016-01-19

    The eusocial Hymenoptera have radiated across a wide range of thermal environments, exposing them to significant physiological stressors. We reconstructed the evolutionary history of three families of Heat Shock Proteins (Hsp90, Hsp70, Hsp40), the primary molecular chaperones protecting against thermal damage, across 12 Hymenopteran species and four other insect orders. We also predicted and tested for thermal inducibility of eight Hsps from the presence of cis-regulatory heat shock elements (HSEs). We tested whether Hsp induction patterns in ants were associated with different thermal environments. We found evidence for duplications, losses, and cis-regulatory changes in two of the three gene families. One member of the Hsp90 gene family, hsp83, duplicated basally in the Hymenoptera, with shifts in HSE motifs in the novel copy. Both copies were retained in bees, but ants retained only the novel HSE copy. For Hsp70, Hymenoptera lack the primary heat-inducible orthologue from Drosophila melanogaster and instead induce the cognate form, hsc70-4, which also underwent an early duplication. Episodic diversifying selection was detected along the branch predating the duplication of hsc70-4 and continued along one of the paralogue branches after duplication. Four out of eight Hsp genes were heat-inducible and matched the predictions based on presence of conserved HSEs. For the inducible homologues, the more thermally tolerant species, Pogonomyrmex barbatus, had greater Hsp basal expression and induction in response to heat stress than did the less thermally tolerant species, Aphaenogaster picea. Furthermore, there was no trade-off between basal expression and induction. Our results highlight the unique evolutionary history of Hsps in eusocial Hymenoptera, which has been shaped by gains, losses, and changes in cis-regulation. Ants, and most likely other Hymenoptera, utilize lineage-specific heat inducible Hsps, whose expression patterns are associated with adaptive

  8. Mesoamerican origin of the common bean (Phaseolus vulgaris L.) is revealed by sequence data.

    PubMed

    Bitocchi, Elena; Nanni, Laura; Bellucci, Elisa; Rossi, Monica; Giardini, Alessandro; Zeuli, Pierluigi Spagnoletti; Logozzo, Giuseppina; Stougaard, Jens; McClean, Phillip; Attene, Giovanna; Papa, Roberto

    2012-04-03

    Knowledge about the origins and evolution of crop species represents an important prerequisite for efficient conservation and use of existing plant materials. This study was designed to solve the ongoing debate on the origins of the common bean by investigating the nucleotide diversity at five gene loci of a large sample that represents the entire geographical distribution of the wild forms of this species. Our data clearly indicate a Mesoamerican origin of the common bean. They also strongly support the occurrence of a bottleneck during the formation of the Andean gene pool that predates the domestication, which was suggested by recent studies based on multilocus molecular markers. Furthermore, a remarkable result was the genetic structure that was seen for the Mesoamerican accessions, with the identification of four different genetic groups that have different relationships with the sets of wild accessions from the Andes and northern Peru-Ecuador. This finding implies that both of the gene pools from South America originated through different migration events from the Mesoamerican populations that were characteristic of central Mexico.

  9. A regulatory governance perspective on health technology assessment (HTA) in France: the contextual mediation of common functional pressures.

    PubMed

    Barron, Anthony J G; Klinger, Corinna; Shah, Sara Mehmood Birchall; Wright, John S F

    2015-02-01

    The new regulatory governance perspective has introduced several insights to the study of health technology assessment (HTA): it has broadened the scope for the analysis of HTA; it has provided a more sophisticated account of national diversity and the potential for cross-border policy learning; and it has dissolved the distinction between HTA assessment and appraisal processes. In this paper, we undertake a qualitative study of the French process for HTA with a view to introducing a fourth insight: that the emergence and continuing function of national agencies for HTA follows a broadly evolutionary pattern in which contextual factors play an important mediating role. We demonstrate that the French process for HTA is characterised by distinctive institutions, processes and evidential requirements. Consistent with the mediating role of this divergent policy context, we argue that even initiatives for the harmonisation of national approaches to HTA are likely to meet with divergent national policy responses.

  10. Genome sequence and genetic diversity of the common carp, Cyprinus carpio.

    PubMed

    Xu, Peng; Zhang, Xiaofeng; Wang, Xumin; Li, Jiongtang; Liu, Guiming; Kuang, Youyi; Xu, Jian; Zheng, Xianhu; Ren, Lufeng; Wang, Guoliang; Zhang, Yan; Huo, Linhe; Zhao, Zixia; Cao, Dingchen; Lu, Cuiyun; Li, Chao; Zhou, Yi; Liu, Zhanjiang; Fan, Zhonghua; Shan, Guangle; Li, Xingang; Wu, Shuangxiu; Song, Lipu; Hou, Guangyuan; Jiang, Yanliang; Jeney, Zsigmond; Yu, Dan; Wang, Li; Shao, Changjun; Song, Lai; Sun, Jing; Ji, Peifeng; Wang, Jian; Li, Qiang; Xu, Liming; Sun, Fanyue; Feng, Jianxin; Wang, Chenghui; Wang, Shaolin; Wang, Baosen; Li, Yan; Zhu, Yaping; Xue, Wei; Zhao, Lan; Wang, Jintu; Gu, Ying; Lv, Weihua; Wu, Kejing; Xiao, Jingfa; Wu, Jiayan; Zhang, Zhang; Yu, Jun; Sun, Xiaowen

    2014-11-01

    The common carp, Cyprinus carpio, is one of the most important cyprinid species and globally accounts for 10% of freshwater aquaculture production. Here we present a draft genome of domesticated C. carpio (strain Songpu), whose current assembly contains 52,610 protein-coding genes and approximately 92.3% coverage of its paleotetraploidized genome (2n = 100). The latest round of whole-genome duplication has been estimated to have occurred approximately 8.2 million years ago. Genome resequencing of 33 representative individuals from worldwide populations demonstrates a single origin for C. carpio in 2 subspecies (C. carpio haematopterus and C. carpio carpio). Integrative genomic and transcriptomic analyses were used to identify loci potentially associated with traits including scaling patterns and skin color. In combination with the high-resolution genetic map, the draft genome paves the way for better molecular studies and improved genome-assisted breeding of C. carpio and other closely related species.

  11. Natriuretic peptide pharmacogenetics: membrane metallo-endopeptidase (MME): common gene sequence variation, functional characterization and degradation.

    PubMed

    Pereira, Naveen L; Aksoy, Pinar; Moon, Irene; Peng, Yi; Redfield, Margaret M; Burnett, John C; Wieben, Eric D; Yee, Vivien C; Weinshilboum, Richard M

    2010-11-01

    Membrane metallo-endopeptidase (MME), also known as neutral endopeptidase 24.11 (EC 3.4.24.11), is involved in the metabolism of natriuretic peptides that play a key role in modulating cardiac structure and function. Common genetic variation in MME has not been addressed by resequencing the gene using DNA from different ethnic populations. We set out to identify and functionally characterize common genetic variation in MME in three ethnic groups. DNA samples from 96 European-American, 96 African-American, and 96 Han Chinese-American healthy subjects were used to resequence MME. Ninety polymorphisms, 65 novel, were identified, including 8 nonsynonymous single nucleotide polymorphisms (nsSNPs). Expression constructs for the nsSNPs were created and COS-1 cells were transfected with constructs for wild type (WT) and variant allozymes. Recombinant proteins were analyzed by quantitative Western blot analysis and by a one-step fluorometric assay. A significant reduction in enzyme activity (21% of WT) and immunoreactive protein (29% of WT) for the Val73 variant allozyme was observed. Proteasome-mediated degradation and autophagy participated in the degradation of this variant allozyme. The chaperone proteins, BiP and GRP94, were upregulated after transfection with Val73 MME, suggesting protein misfolding, compatible with conclusions based on the MME X-ray crystal structure. Multiple novel polymorphisms of MME were identified in three ethnic groups. The Val73 variant allozyme displayed a significant decrease in MME protein quantity and activity, with degradation mediated by both proteasome and autophagy pathways. This polymorphism could have a significant effect on the metabolism of natriuretic peptides. Copyright © 2010 Elsevier Ltd. All rights reserved.

  12. Exploring a Nonmodel Teleost Genome Through RAD Sequencing-Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis.

    PubMed

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C; Tsigenopoulos, Costas S

    2015-12-29

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. The development of genomic tools, such as genetic linkage maps, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. Copyright © 2016 Manousaki et al.

  13. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae

    PubMed Central

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusually long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences. PMID:28617867
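
    The region sizes quoted above can be checked against the reported total, since a typical chloroplast genome consists of one large single copy region, one small single copy region and two copies of the inverted repeat. The short check below uses only figures from the abstract; reading the 21,022 bp as the length of each IR copy is an assumption, which the arithmetic supports.

```python
# Region lengths in bp, as quoted in the abstract.
LSC = 79_732   # large single copy region
SSC = 12_521   # small single copy region
IR = 21_022    # one inverted repeat copy (assumed to be the per-copy length)

genome_length = LSC + SSC + 2 * IR
print(genome_length)  # 134297, matching the reported 134,297 bp

# Gene categories quoted in the abstract should add up to the reported 128.
gene_counts = {"protein_coding": 82, "tRNA": 38, "rRNA": 8}
print(sum(gene_counts.values()))  # 128
```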

  14. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae.

    PubMed

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusually long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences.

  15. Complete Coding Sequence of Usutu Virus Strain Gracula religiosa/U1609393/Belgium/2016 Obtained from the Brain Tissue of an Infected Captive Common Hill Myna (Gracula religiosa)

    PubMed Central

    Lambrecht, Bénédicte; Vandenbussche, Frank; Steensels, Mieke

    2017-01-01

    The complete and annotated coding sequence and partial noncoding sequence of an Usutu virus genome were sequenced from RNA extracted from a clinical brain tissue sample obtained from a common hill myna (Gracula religiosa), demonstrating close homology with Usutu viruses circulating in Europe. PMID:28336592

  16. Molecular identification of the traditional herbal medicines, Arisaematis Rhizoma and Pinelliae Tuber, and common adulterants via universal DNA barcode sequences.

    PubMed

    Moon, B C; Kim, W J; Ji, Y; Lee, Y M; Kang, Y M; Choi, G

    2016-02-19

    Methods to identify Pinelliae Tuber and Arisaematis Rhizoma are required because of frequent reciprocal substitution between these two herbal medicines and the existence of several closely related plant materials. As a result of the morphological similarity of dried tubers, correct discrimination of authentic herbal medicines is difficult by conventional methods. Therefore, we analyzed DNA barcode sequences to identify each herbal medicine and the common adulterants at a species level. To verify the identity of these herbal medicines, we collected five authentic species (Pinellia ternata for Pinelliae Tuber, and Arisaema amurense, A. amurense var. serratum, A. erubescens, and A. heterophyllum for Arisaematis Rhizoma) and six common adulterant plant species. Maturase K (matK) and ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL) genes were then amplified using universal primers. In comparative analyses of two DNA barcode sequences, we obtained 45 species-specific nucleotides sufficient to identify each species (except A. erubescens with matK) and 28 marker nucleotides for each species (except P. pedatisecta with rbcL). Sequence differences at corresponding positions of the two combined DNA barcodes provided genetic marker nucleotides that could be used to identify specimens of the correct species among the analyzed medicinal plants. Furthermore, we generated a phylogenetic tree showing nine distinct groups depending on the species. These results can be used to authenticate Pinelliae Tuber and Arisaematis Rhizoma from their adulterants and to identify each species. Thus, comparative analyses of plant DNA barcode sequences identified useful genetic markers for the authentication of Pinelliae Tuber and Arisaematis Rhizoma from several adulterant herbal materials.
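
    The idea of "species-specific nucleotides" in aligned barcode sequences can be illustrated with a small sketch: for every alignment column, report a base that occurs in exactly one species. This is a simplified illustration and not the authors' procedure; the function name, the species labels and the toy sequences are hypothetical, and a real analysis would compare several accessions per species.

```python
def species_specific_positions(alignment):
    """Find bases that are unique to one species at an alignment column.

    `alignment` maps species name -> aligned sequence (all of equal length).
    Returns species name -> list of (1-based position, base). Gap characters
    ("-") are never reported as diagnostic.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()

    result = {species: [] for species in alignment}
    for pos in range(length):
        column = {species: seq[pos] for species, seq in alignment.items()}
        for species, base in column.items():
            others = [b for name, b in column.items() if name != species]
            if base != "-" and base not in others:
                result[species].append((pos + 1, base))
    return result


# Hypothetical toy alignment (not real matK or rbcL data).
toy_alignment = {
    "species_A": "ATGCCGTA",
    "species_B": "ATGACGTA",
    "species_C": "ATGCCGTT",
}

for name, sites in species_specific_positions(toy_alignment).items():
    print(name, sites)
```

    In this toy case species_B is marked by an A at position 4 and species_C by a T at position 8, while species_A has no base unique to it and could only be identified by a combination of positions.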

  17. A common cis-acting sequence in the DiGeorge critical region regulates bi-directional transcription of UFD1L and CDC45L.

    PubMed

    Kunte, A; Ivey, K; Yamagishi, C; Garg, V; Yamagishi, H; Srivastava, D

    2001-10-01

    Two to three megabase deletions on chromosome 22q11 are the cytogenetic findings most commonly associated with cardiac and craniofacial defects in humans. The constellation of clinical findings associated with these deletions is termed the 22q11 deletion syndrome. We had earlier described a patient with the 22q11 deletion phenotype who was hemizygous for an atypical 20 kb microdeletion in this region. The deletion included coding regions of two genes organized head-to-head, UFD1L and CDC45L, along with an 884 bp CpG-rich intervening region. Based on this genomic organization, we hypothesized that both genes may be co-expressed and co-regulated by sequences within this region. We demonstrate that expression of both genes is enhanced in a similar pattern in precursors of structures affected by the deletion. The intergenic region is sufficient to direct transcription most strongly in the developing pharyngeal arches and limb buds of transgenic mice and can also direct bi-directional transcriptional activation in a neural crest-derived cell line. Deletion analyses revealed that a 404 bp fragment closest to UFD1L is necessary and sufficient to direct this bi-directional transcriptional activity. These results reveal the presence of a conserved regulatory region in the 22q11 deletion locus that can direct simultaneous transcription of genes involved in ubiquitin mediated protein processing (UFD1L) and cell cycle control (CDC45L).

  18. p38 MAPK down-regulates fibulin 3 expression through methylation of gene regulatory sequences: role in migration and invasion.

    PubMed

    Arechederra, María; Priego, Neibla; Vázquez-Carballo, Ana; Sequera, Celia; Gutiérrez-Uzquiza, Álvaro; Cerezo-Guisado, María Isabel; Ortiz-Rivero, Sara; Roncero, Cesáreo; Cuenda, Ana; Guerrero, Carmen; Porras, Almudena

    2015-02-13

    p38 MAPKs regulate migration and invasion. However, the mechanisms involved are only partially known. We had previously identified fibulin 3, which plays a role in migration, invasion, and tumorigenesis, as a gene regulated by p38α. We have characterized in detail how p38 MAPK regulates fibulin 3 expression and its role. We describe here for the first time that p38α, p38γ, and p38δ down-regulate fibulin 3 expression. p38α has a stronger effect, and it does so through hypermethylation of CpG sites in the regulatory sequences of the gene. This would be mediated by the DNA methylase, DNMT3A, which is down-regulated in cells lacking p38α, but once re-introduced represses Fibulin 3 expression. p38α through HuR stabilizes dnmt3a mRNA leading to an increase in DNMT3A protein levels. Moreover, by knocking-down fibulin 3, we have found that Fibulin 3 inhibits migration and invasion in MEFs by mechanisms involving p38α/β inhibition. Hence, p38α pro-migratory/invasive effect might be, at least in part, mediated by fibulin 3 down-regulation in MEFs. In contrast, in HCT116 cells, Fibulin 3 promotes migration and invasion through a mechanism dependent on p38α and/or p38β activation. Furthermore, Fibulin 3 promotes in vitro and in vivo tumor growth of HCT116 cells through a mechanism dependent on p38α, which surprisingly acts as a potent inducer of tumor growth. At the same time, p38α limits fibulin 3 expression, which might represent a negative feed-back loop.

  19. Regulatory Mechanisms That Prevent Re-initiation of DNA Replication Can Be Locally Modulated at Origins by Nearby Sequence Elements

    PubMed Central

    Richardson, Christopher D.; Li, Joachim J.

    2014-01-01

    Eukaryotic cells must inhibit re-initiation of DNA replication at each of the thousands of origins in their genome because re-initiation can generate genomic alterations with extraordinary frequency. To minimize the probability of re-initiation from so many origins, cells use a battery of regulatory mechanisms that reduce the activity of replication initiation proteins. Given the global nature of these mechanisms, it has been presumed that all origins are inhibited identically. However, origins re-initiate with diverse efficiencies when these mechanisms are disabled, and this diversity cannot be explained by differences in the efficiency or timing of origin initiation during normal S phase replication. This observation raises the possibility of an additional layer of replication control that can differentially regulate re-initiation at distinct origins. We have identified novel genetic elements that are necessary for preferential re-initiation of two origins and sufficient to confer preferential re-initiation on heterologous origins when the control of re-initiation is partially deregulated. The elements do not enhance the S phase timing or efficiency of adjacent origins and thus are specifically acting as re-initiation promoters (RIPs). We have mapped the two RIPs to ∼60 bp AT rich sequences that act in a distance- and sequence-dependent manner. During the induction of re-replication, Mcm2-7 reassociates both with origins that preferentially re-initiate and origins that do not, suggesting that the RIP elements can overcome a block to re-initiation imposed after Mcm2-7 associates with origins. Our findings identify a local level of control in the block to re-initiation. This local control creates a complex genomic landscape of re-replication potential that is revealed when global mechanisms preventing re-replication are compromised. Hence, if re-replication does contribute to genomic alterations, as has been speculated for cancer cells, some regions of the genome

  20. Large Sequence Diversity within the Biosynthesis Locus and Common Biochemical Features of Campylobacter coli Lipooligosaccharides.

    PubMed

    Culebro, Alejandra; Revez, Joana; Pascoe, Ben; Friedmann, Yasmin; Hitchings, Matthew D; Stupak, Jacek; Sheppard, Samuel K; Li, Jianjun; Rossi, Mirko

    2016-10-15

    Despite the importance of lipooligosaccharides (LOSs) in the pathogenicity of campylobacteriosis, little is known about the genetic and phenotypic diversity of LOS in Campylobacter coli. In this study, we investigated the distribution of LOS locus classes among a large collection of unrelated C. coli isolates sampled from several different host species. Furthermore, we paired C. coli genomic information and LOS chemical composition for the first time to investigate possible associations between LOS locus class sequence diversity and biochemical heterogeneity. After identifying three new LOS locus classes, only 85% of the 144 isolates tested were assigned to a class, suggesting higher genetic diversity than previously thought. This genetic diversity is at the basis of a completely unexplored LOS structural heterogeneity. Mass spectrometry analysis of the LOSs of nine isolates, representing four different LOS classes, identified two features distinguishing C. coli LOS from that of Campylobacter jejuni. 2-Amino-2-deoxy-d-glucose (GlcN)-GlcN disaccharides were present in the lipid A backbone, in contrast to the β-1'-6-linked 3-diamino-2,3-dideoxy-d-glucopyranose (GlcN3N)-GlcN backbone observed in C. jejuni. Moreover, despite the fact that many of the genes putatively involved in 3-acylamino-3,6-dideoxy-d-glucose (Quip3NAcyl) were apparently absent from the genomes of various isolates, this rare sugar was found in the outer core of all C. coli isolates. Therefore, regardless of the high genetic diversity of the LOS biosynthesis locus in C. coli, we identified species-specific phenotypic features of C. coli LOS that might explain differences between C. jejuni and C. coli in terms of population dynamics and host adaptation. Despite the importance of C. coli to human health and its controversial role as a causative agent of Guillain-Barré syndrome, little is known about the genetic and phenotypic diversity of C. coli LOSs. Therefore, we paired C. coli genomic information

  1. Opossum carboxylesterases: sequences, phylogeny and evidence for CES gene duplication events predating the marsupial-eutherian common ancestor

    PubMed Central

    2008-01-01

    Background Carboxylesterases (CES) perform diverse metabolic roles in mammalian organisms in the detoxification of a broad range of drugs and xenobiotics and may also serve in specific roles in lipid, cholesterol, pheromone and lung surfactant metabolism. Five CES families have been reported in mammals with human CES1 and CES2 the most extensively studied. Here we describe the genetics, expression and phylogeny of CES isozymes in the opossum and report on the sequences and locations of CES1, CES2 and CES6 'like' genes within two gene clusters on chromosome one. We also discuss the likely sequence of gene duplication events generating multiple CES genes during vertebrate evolution. Results We report a cDNA sequence for an opossum CES and present evidence for CES1 and CES2 like genes expressed in opossum liver and intestine and for distinct gene locations of five opossum CES genes, CES1, CES2.1, CES2.2, CES2.3 and CES6, on chromosome 1. Phylogenetic and sequence alignment studies compared the predicted amino acid sequences for opossum CES with those for human, mouse, chicken, frog, salmon and Drosophila CES gene products. Phylogenetic analyses produced congruent phylogenetic trees depicting a rapid early diversification into at least five distinct CES gene family clusters: CES2, CES1, CES7, CES3, and CES6. Molecular divergence estimates based on a Bayesian relaxed clock approach revealed an origin for the five mammalian CES gene families between 328–378 MYA. Conclusion The deduced amino acid sequence for an opossum cDNA was consistent with its identity as a mammalian CES2 gene product (designated CES2.1). Distinct gene locations for opossum CES1 (1: 446,222,550–446,274,850), three CES2 genes (1: 677,773,395–677,927,030) and a CES6 gene (1: 677,585,520–677,730,419) were observed on chromosome 1. Opossum CES1 and multiple CES2 genes were expressed in liver and intestine. Amino acid sequences for opossum CES1 and three CES2 gene products revealed conserved

  2. Opossum carboxylesterases: sequences, phylogeny and evidence for CES gene duplication events predating the marsupial-eutherian common ancestor.

    PubMed

    Holmes, Roger S; Chan, Jeannie; Cox, Laura A; Murphy, William J; VandeBerg, John L

    2008-02-20

    Carboxylesterases (CES) perform diverse metabolic roles in mammalian organisms in the detoxification of a broad range of drugs and xenobiotics and may also serve in specific roles in lipid, cholesterol, pheromone and lung surfactant metabolism. Five CES families have been reported in mammals with human CES1 and CES2 the most extensively studied. Here we describe the genetics, expression and phylogeny of CES isozymes in the opossum and report on the sequences and locations of CES1, CES2 and CES6 'like' genes within two gene clusters on chromosome one. We also discuss the likely sequence of gene duplication events generating multiple CES genes during vertebrate evolution. We report a cDNA sequence for an opossum CES and present evidence for CES1 and CES2 like genes expressed in opossum liver and intestine and for distinct gene locations of five opossum CES genes, CES1, CES2.1, CES2.2, CES2.3 and CES6, on chromosome 1. Phylogenetic and sequence alignment studies compared the predicted amino acid sequences for opossum CES with those for human, mouse, chicken, frog, salmon and Drosophila CES gene products. Phylogenetic analyses produced congruent phylogenetic trees depicting a rapid early diversification into at least five distinct CES gene family clusters: CES2, CES1, CES7, CES3, and CES6. Molecular divergence estimates based on a Bayesian relaxed clock approach revealed an origin for the five mammalian CES gene families between 328-378 MYA. The deduced amino acid sequence for an opossum cDNA was consistent with its identity as a mammalian CES2 gene product (designated CES2.1). Distinct gene locations for opossum CES1 (1: 446,222,550-446,274,850), three CES2 genes (1: 677,773,395-677,927,030) and a CES6 gene (1: 677,585,520-677,730,419) were observed on chromosome 1. Opossum CES1 and multiple CES2 genes were expressed in liver and intestine. Amino acid sequences for opossum CES1 and three CES2 gene products revealed conserved residues previously reported for human CES1

  3. A Common Functional Regulatory Variant at a Type 2 Diabetes Locus Upregulates ARAP1 Expression in the Pancreatic Beta Cell

    PubMed Central

    Kulzer, Jennifer R.; Stitzel, Michael L.; Morken, Mario A.; Huyghe, Jeroen R.; Fuchsberger, Christian; Kuusisto, Johanna; Laakso, Markku; Boehnke, Michael; Collins, Francis S.; Mohlke, Karen L.

    2014-01-01

    Genome-wide association studies (GWASs) have identified more than 70 loci associated with type 2 diabetes (T2D), but for most, the underlying causal variants, associated genes, and functional mechanisms remain unknown. At a T2D- and fasting-proinsulin-associated locus on 11q13.4, we have identified a functional regulatory DNA variant, a candidate target gene, and a plausible underlying molecular mechanism. Fine mapping, conditional analyses, and exome array genotyping in 8,635 individuals from the Metabolic Syndrome in Men study confirmed a single major association signal between fasting proinsulin and noncoding variants (p = 7.4 × 10^−50). Measurement of allele-specific mRNA levels in human pancreatic islet samples heterozygous for rs11603334 showed that the T2D-risk and proinsulin-decreasing allele (C) is associated with increased ARAP1 expression (p < 0.02). We evaluated four candidate functional SNPs for allelic effects on transcriptional activity by performing reporter assays in rodent pancreatic beta cell lines. The C allele of rs11603334, located near one of the ARAP1 promoters, exhibited 2-fold higher transcriptional activity than did the T allele (p < 0.0001); three other candidate SNPs showed no allelic differences. Electrophoretic mobility shift assays demonstrated decreased binding of pancreatic beta cell transcriptional regulators PAX6 and PAX4 to the rs11603334 C allele. Collectively, these data suggest that the T2D-risk allele of rs11603334 could abrogate binding of a complex containing PAX6 and PAX4 and thus lead to increased promoter activity and ARAP1 expression in human pancreatic islets. This work suggests that increased ARAP1 expression might contribute to T2D susceptibility at this GWAS locus. PMID:24439111

  4. Large-Scale Evaluation of Common Variation in Regulatory T Cell-Related Genes and Ovarian Cancer Outcome

    PubMed Central

    Charbonneau, Bridget; Moysich, Kirsten B.; Kalli, Kimberly R.; Oberg, Ann L.; Vierkant, Robert A.; Fogarty, Zachary C.; Block, Matthew S.; Maurer, Matthew J.; Goergen, Krista M.; Fridley, Brooke L.; Cunningham, Julie M.; Rider, David N.; Preston, Claudia; Hartmann, Lynn C.; Lawrenson, Kate; Wang, Chen; Tyrer, Jonathan; Song, Honglin; deFazio, Anna; Johnatty, Sharon E.; Doherty, Jennifer A.; Phelan, Catherine M.; Sellers, Thomas A.; Ramirez, Starr M.; Vitonis, Allison F.; Terry, Kathryn L.; Van Den Berg, David; Pike, Malcolm C.; Wu, Anna H.; Berchuck, Andrew; Gentry-Maharaj, Aleksandra; Ramus, Susan J.; Diergaarde, Brenda; Shen, Howard; Jensen, Allan; Menkiszak, Janusz; Cybulski, Cezary; Lubiński, Jan; Ziogas, Argyrios; Rothstein, Joseph H.; McGuire, Valerie; Sieh, Weiva; Lester, Jenny; Walsh, Christine; Vergote, Ignace; Lambrechts, Sandrina; Despierre, Evelyn; Garcia-Closas, Montserrat; Yang, Hannah; Brinton, Louise A.; Spiewankiewicz, Beata; Rzepecka, Iwona K.; Dansonka-Mieszkowska, Agnieszka; Seibold, Petra; Rudolph, Anja; Paddock, Lisa E.; Orlow, Irene; Lundvall, Lene; Olson, Sara H.; Hogdall, Claus K.; Schwaab, Ira; du Bois, Andreas; Harter, Philipp; Flanagan, James M.; Brown, Robert; Paul, James; Ekici, Arif B.; Beckmann, Matthias W.; Hein, Alexander; Eccles, Diana; Lurie, Galina; Hays, Laura E.; Bean, Yukie T.; Pejovic, Tanja; Goodman, Marc T.; Campbell, Ian; Fasching, Peter A.; Konecny, Gottfried; Kaye, Stanley B.; Heitz, Florian; Hogdall, Estrid; Bandera, Elisa V.; Chang-Claude, Jenny; Kupryjanczyk, Jolanta; Wentzensen, Nicolas; Lambrechts, Diether; Karlan, Beth Y.; Whittemore, Alice S.; Culver, Hoda Anton; Gronwald, Jacek; Levine, Douglas A.; Kjaer, Susanne K.; Menon, Usha; Schildkraut, Joellen M.; Pearce, Celeste Leigh; Cramer, Daniel W.; Rossing, Mary Anne; Chenevix-Trench, Georgia; Pharoah, Paul D.P.; Gayther, Simon A.; Ness, Roberta B.; Odunsi, Kunle; Sucheston, Lara E.; Knutson, Keith L.; Goode, Ellen L.

    2014-01-01

    The presence of regulatory T cells (Tregs) in solid tumors is known to play a role in patient survival in ovarian cancer and other malignancies. We assessed inherited genetic variations via 749 tag SNPs in 25 Treg-associated genes (CD28, CTLA4, FOXP3, IDO1, IL10, IL10RA, IL15, IL17RA, IL23A, IL23R, IL2RA, IL6, IL6R, IL8, LGALS1, LGALS9, MAP3K8, STAT5A, STAT5B, TGFB1, TGFB2, TGFB3, TGFBR1, TGFBR2, and TGFBR3) in relation to ovarian cancer survival. We analyzed genotype and overall survival in 10,084 women with invasive epithelial ovarian cancer, including 5,248 high-grade serous, 1,452 endometrioid, 795 clear cell, and 661 mucinous carcinoma cases of European descent across 28 studies from the Ovarian Cancer Association Consortium (OCAC). The strongest associations were found for endometrioid carcinoma and IL2RA SNPs rs11256497 [HR=1.42, 95% CI: 1.22–1.64; p=5.7 × 10^−6], rs791587 [HR=1.36, 95% CI: 1.17–1.57; p=6.2 × 10^−5], rs2476491 [HR=1.40, 95% CI: 1.19–1.64; p=5.6 × 10^−5], and rs10795763 [HR=1.35, 95% CI: 1.17–1.57; p=7.9 × 10^−5], and for clear cell carcinoma and CTLA4 SNP rs231775 [HR=0.67, 95% CI: 0.54–0.82; p=9.3 × 10^−5] after adjustment for age, study site, population stratification, stage, grade, and oral contraceptive use. The rs231775 allele associated with improved survival in our study also results in an amino acid change in CTLA4 and previously has been reported to be associated with autoimmune conditions. Thus, we found evidence that SNPs in genes related to Tregs appear to play a role in ovarian cancer survival, particularly in patients with clear cell and endometrioid EOC. PMID:24764580

  5. Targeted re-sequencing analysis of 25 genes commonly mutated in myeloid disorders in del(5q) myelodysplastic syndromes

    PubMed Central

    Fernandez-Mercado, Marta; Burns, Adam; Pellagatti, Andrea; Giagounidis, Aristoteles; Germing, Ulrich; Agirre, Xabier; Prosper, Felipe; Aul, Carlo; Killick, Sally; Wainscoat, James S.; Schuh, Anna; Boultwood, Jacqueline

    2013-01-01

    Interstitial deletion of chromosome 5q is the most common chromosomal abnormality in myelodysplastic syndromes. The catalogue of genes involved in the molecular pathogenesis of myelodysplastic syndromes is rapidly expanding and next-generation sequencing technology allows detection of these mutations at great depth. Here we describe the design, validation and application of a targeted next-generation sequencing approach to simultaneously screen 25 genes mutated in myeloid malignancies. We used this method alongside single nucleotide polymorphism-array technology to characterize the mutational and cytogenetic profile of 43 cases of early or advanced del(5q) myelodysplastic syndromes. A total of 29 mutations were detected in our cohort. Overall, 45% of early and 66.7% of advanced cases had at least one mutation. Genes with the highest mutation frequency among advanced cases were TP53 and ASXL1 (25% of patients each). These showed a lower mutation frequency in cases of 5q- syndrome (4.5% and 13.6%, respectively), suggesting a role in disease progression in del(5q) myelodysplastic syndromes. Fifty-two percent of mutations identified were in genes involved in epigenetic regulation (ASXL1, TET2, DNMT3A and JAK2). Six mutations had allele frequencies <20%, likely below the detection limit of traditional sequencing methods. Genomic array data showed that cases of advanced del(5q) myelodysplastic syndrome had a complex background of cytogenetic aberrations, often encompassing genes involved in myeloid disorders. Our study is the first to investigate the molecular pathogenesis of early and advanced del(5q) myelodysplastic syndromes using next-generation sequencing technology on a large panel of genes frequently mutated in myeloid malignancies, further illuminating the molecular landscape of del(5q) myelodysplastic syndromes. PMID:23831921

  6. Immunity related genes in dipterans share common enrichment of AT-rich motifs in their 5' regulatory regions that are potentially involved in nucleosome formation

    PubMed Central

    Hernandez-Romano, Jesus; Carlos-Rivera, Francisco J; Salgado, Heladia; Lamadrid-Figueroa, Hector; Valverde-Garduño, Veronica; Rodriguez, Mario H; Martinez-Barnetche, Jesus

    2008-01-01

    Background Understanding the transcriptional regulation mechanisms in response to environmental challenges is of fundamental importance in biology. Transcription factors associated with response elements and chromatin structure have been shown to play important roles in the regulation of gene expression. We have analyzed promoter regions of dipteran genes induced in response to immune challenge, in search of particular sequence patterns involved in their transcriptional regulation. Results 5' upstream regions of D. melanogaster and A. gambiae immunity-induced genes and their corresponding orthologous genes in 11 non-melanogaster drosophilid species and Ae. aegypti share enrichment in AT-rich short motifs. AT-rich motifs are associated with nucleosome formation as predicted by two different algorithms. In A. gambiae and D. melanogaster, the 5' upstream sequences of many immunity genes also showed NFκB response elements, located within 500 bp of the transcription start site. In A. gambiae, the frequency of the ATAA motif near the NFκB response elements was increased, suggesting a functional link between nucleosome formation/remodelling and NFκB regulation of transcription. Conclusion AT-rich motif enrichment in 5' upstream sequences in A. gambiae, Ae. aegypti and the Drosophila genus immunity genes suggests a particular pattern of nucleosome formation/chromatin organization. The co-occurrence of such motifs with the NFκB response elements suggests that these sequence signatures may be functionally involved in transcriptional activation during dipteran immune response. AT-rich motif enrichment in regulatory regions in this group of co-regulated genes could represent an evolutionarily constrained signature in dipterans and perhaps other distantly related species. PMID:18613977
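
The motif counting this record relies on can be illustrated with a small sketch. It only shows how occurrences of a short AT-rich motif (ATAA, the one named in the abstract) might be tallied in the 500 bp nearest the transcription start site. The function names and the toy sequence are hypothetical, and the abstract does not describe the authors' actual enrichment pipeline or the nucleosome-prediction algorithms.

```python
def count_motif(seq: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of motif in seq, ignoring case."""
    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def motif_density(seq: str, motif: str, window: int = 500) -> float:
    """Occurrences per 100 bp in the last `window` bases of an upstream sequence.

    Assumes the sequence is written 5'->3' and ends at the transcription
    start site, so the last bases are those closest to the gene.
    """
    region = seq[-window:]
    if not region:
        return 0.0
    return 100.0 * count_motif(region, motif) / len(region)


# Hypothetical toy upstream sequence, not taken from any genome.
upstream = "GCGCATAATTATAAGCGC"
print(count_motif(upstream, "ATAA"))  # 2
print(round(motif_density(upstream, "ATAA"), 2))
```

A real enrichment analysis would compare such counts against a background set of non-immunity genes; that comparison is left out here.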

  7. Common interruptions in the repeating tripeptide sequence of non-fibrillar collagens: sequence analysis and structural studies on triple-helix peptide models.

    PubMed

    Thiagarajan, Geetha; Li, Yingjie; Mohs, Angela; Strafaci, Christopher; Popiel, Magdalena; Baum, Jean; Brodsky, Barbara

    2008-02-22

    Interruptions in the repeating (Gly-X1-X2)(n) amino acid sequence pattern are found in the triple-helix domains of all non-fibrillar collagens, and perturbations to the triple-helix at such sites are likely to play a role in collagen higher-order structure and function. This study defines the sequence features and structural consequences of the most common interruption, where one residue is missing from the tripeptide pattern, Gly-X1-X2-Gly-AA(1)-Gly-X1-X2, designated G1G interruptions. Residues found within G1G interruptions are predominantly hydrophobic (70%), followed by a significant amount of charged residues (16%), and the Gly-X1-X2 triplets flanking the interruption are atypical. Studies on peptide models indicate the degree of destabilization is much greater when Pro is in the interruption, GP, than when hydrophobic residues (GF, GY) are present, and a rigid Gly-Pro-Hyp tripeptide adjacent to the interruption leads to greater destabilization than a flexible Gly-Ala-Ala sequence. Modeling based on NMR data indicates the Phe residue within a GF interruption is located on the outside of the triple helix. The G1G interruptions resemble a previously studied collagen interruption GPOGAAVMGPO, designated G4G-type, in that both are destabilizing, but allow continuation of rod-like triple helices and maintenance of the single residue stagger throughout the imperfection, with a loss of axial register of the superhelix on both sides. Both kinds of interruptions result in a highly localized perturbation in hydrogen bonding and dihedral angles, but the hydrophobic residue of a G4G interruption packs near the central axis of the superhelix, while the hydrophobic residue of a G1G interruption is located on the triple-helix surface. The different structural consequences of G1G and G4G interruptions in the repeating tripeptide sequence pattern suggest a physical basis for their differential susceptibility to matrix metalloproteinases in type X collagen.

  8. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to identifying the various factors contributing to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study, multiple regression analyses of mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...
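
To make the idea of a multiple regression of mRNA abundance on several sequence features concrete, here is a minimal ordinary-least-squares sketch in plain Python. The two predictors (a motif score and a codon-usage score) and all numbers are hypothetical toy values chosen for illustration; the abstract does not give the study's features, data, or software.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(features, y):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(row) for row in features]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def r_squared(features, y, coef):
    """Fraction of variance in y explained by the fitted model."""
    pred = [coef[0] + sum(c * x for c, x in zip(coef[1:], row)) for row in features]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical genes: (motif score, codon-usage score) -> log mRNA abundance.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
abundance = [1.0, 3.0, 4.0, 6.0, 8.0]  # built as 1 + 2*x1 + 3*x2
coef = fit_ols(features, abundance)
print([round(c, 6) for c in coef], round(r_squared(features, abundance, coef), 6))
```

Comparing R² for models fitted with coding features only, non-coding features only, and both together is one simple way to ask how much each class contributes.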

  9. Evaluation of global sequence comparison and one-to-one FASTA local alignment in regulatory allergenicity assessment of transgenic proteins in food crops.

    PubMed

    Song, Ping; Herman, Rod A; Kumpatla, Siva

    2014-09-01

    To address the high false positive rate using >35% identity over 80 amino acids in the regulatory assessment of transgenic proteins for potential allergenicity and the change of E-value with database size, the Needleman-Wunsch global sequence alignment and a one-to-one (1:1) local FASTA search (one protein in the target database at a time) were evaluated by comparing proteins randomly selected from Arabidopsis, rice, corn, and soybean with known allergens in a peer-reviewed allergen database (http://www.allergenonline.org/). Compared with the approach of searching >35%/80aa+, the false positive rate measured by specificity rate for identification of true allergens was reduced by a 1:1 global sequence alignment with a cut-off threshold of ≥30% identity and a 1:1 FASTA local alignment with a cut-off E-value of ≤1.0E-09 while maintaining the same sensitivity. Hence, a 1:1 sequence comparison, especially using the FASTA local alignment tool with a biologically relevant E-value of 1.0E-09 as a threshold, is recommended for the regulatory assessment of sequence identities between transgenic proteins in food crops and known allergens.
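
The 1:1 global comparison described above can be sketched with a bare-bones Needleman-Wunsch alignment that reports percent identity and applies the ≥30% cut-off from the abstract. This is a simplified illustration, not the study's method: the scoring scheme (match +1, mismatch −1, linear gap −1), the choice of identity denominator (alignment columns), and the toy peptides are assumptions, whereas real assessments use substitution matrices and established alignment software.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over the columns of one optimal global alignment."""
    n, m = len(a), len(b)
    # Dynamic-programming score matrix with gap-initialised borders.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 0.0


def flags_as_potential_allergen(query: str, allergen: str, cutoff: float = 30.0) -> bool:
    """Apply the >=30% global-identity threshold quoted in the abstract."""
    return global_identity(query, allergen) >= cutoff


# Hypothetical toy peptides, far shorter than real proteins.
print(global_identity("ACDE", "ACE"))  # 75.0
print(flags_as_potential_allergen("ACDE", "WWWW"))
```

In a 1:1 design this function would be called once per (query, allergen) pair, so the result for one pair does not depend on the size of the allergen database.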

  10. Amino acid sequence of a neurotoxic phospholipase A2 enzyme from common death adder (Acanthophis antarcticus) venom.

    PubMed

    van der Weyden, L; Hains, P; Broady, K; Shaw, D; Milburn, P

    2001-02-01

    The amino acid sequence of the first neurotoxic phospholipase A2, acanthoxin A1, purified from the venom of the Common death adder (Acanthophis antarcticus) was determined. Acanthoxin A1 shows high homology with other Australian elapid PLA2 neurotoxins, in particular Acanthin-I and -II, also from Death adder, Pseudexin A from the Red-bellied black snake (Pseudechis porphyriacus), and Pa-12a and Pa-9c from the King brown snake (Pseudechis australis). Acanthoxin A1 is a single-chain 118 amino acid residue PLA2, including 14 half cystine residues and the essential residues forming the ubiquitous calcium binding pocket and catalytic site. Critical analysis of the residues hypothesized to be important for neurotoxicity is presented.

  11. Genomic sequence analysis of the MHC class I G/F segment in common marmoset (Callithrix jacchus).

    PubMed

    Kono, Azumi; Brameier, Markus; Roos, Christian; Suzuki, Shingo; Shigenari, Atsuko; Kametani, Yoshie; Kitaura, Kazutaka; Matsutani, Takaji; Suzuki, Ryuji; Inoko, Hidetoshi; Walter, Lutz; Shiina, Takashi

    2014-04-01

    The common marmoset (Callithrix jacchus) is a New World monkey that is used frequently as a model for various human diseases. However, detailed knowledge about the MHC is still lacking. In this study, we sequenced and annotated a total of 854 kb of the common marmoset MHC region that corresponds to the HLA-A/G/F segment (Caja-G/F) between the Caja-G1 and RNF39 genes. The sequenced region contains 19 MHC class I genes, of which 14 are of the MHC-G (Caja-G) type, and 5 are of the MHC-F (Caja-F) type. Six putatively functional Caja-G and Caja-F genes (Caja-G1, Caja-G3, Caja-G7, Caja-G12, Caja-G13, and Caja-F4), 13 pseudogenes related either to Caja-G or Caja-F, three non-MHC genes (ZNRD1, PPP1R11, and RNF39), two miscRNA genes (ZNRD1-AS1 and HCG8), and one non-MHC pseudogene (ETF1P1) were identified. Phylogenetic analysis suggests segmental duplications of units consisting of basically five (four Caja-G and one Caja-F) MHC class I genes, with subsequent expansion/deletion of genes. A similar genomic organization of the Caja-G/F segment has not been observed in catarrhine primates, indicating that this genomic segment was formed in New World monkeys after the split of New World and Old World monkeys.

  12. Commonality among fluoroquinolone-resistant sequence type ST131 extraintestinal Escherichia coli isolates from humans and companion animals in Australia.

    PubMed

    Platell, Joanne L; Cobbold, Rowland N; Johnson, James R; Heisig, Anke; Heisig, Peter; Clabots, Connie; Kuskowski, Michael A; Trott, Darren J

    2011-08-01

    Escherichia coli sequence type 131 (ST131), an emergent multidrug-resistant extraintestinal pathogen, has spread epidemically among humans and was recently isolated from companion animals. To assess for human-companion animal commonality among ST131 isolates, 214 fluoroquinolone-resistant extraintestinal E. coli isolates (205 from humans, 9 from companion animals) from diagnostic laboratories in Australia, provisionally identified as ST131 by PCR, selectively underwent PCR-based O typing and bla(CTX-M-15) detection. A subset then underwent multilocus sequence typing (MLST), pulsed-field gel electrophoresis (PFGE) analysis, extended virulence genotyping, antimicrobial susceptibility testing, and fluoroquinolone resistance genotyping. All isolates were O25b positive, except for two O16 isolates and one O157 isolate, which (along with six O25b-positive isolates) were confirmed by MLST to be ST131. Only 12% of isolates (25 human, 1 canine) exhibited bla(CTX-M-15). PFGE analysis of 20 randomly selected human and all 9 companion animal isolates showed multiple instances of ≥94% profile similarity across host species; 12 isolates (6 human, 6 companion animal) represented pulsotype 968, the most prevalent ST131 pulsotype in North America (representing 23% of a large ST131 reference collection). Virulence gene and antimicrobial resistance profiles differed minimally, without host species specificity. The analyzed ST131 isolates also exhibited a conserved, host species-independent pattern of chromosomal fluoroquinolone resistance mutations. However, eight (89%) companion animal isolates, versus two (10%) human isolates, possessed the plasmid-borne qnrB gene (P < 0.001). This extensive across-species strain commonality, plus the similarities between Australian and non-Australian ST131 isolates, suggest that ST131 isolates are exchanged between humans and companion animals both within Australia and intercontinentally.
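
The qnrB comparison reported above (eight companion animal isolates versus two human isolates, P < 0.001) can be checked with a two-sided Fisher's exact test, sketched below. Two assumptions are made here: the abstract does not name the test that was used, and the denominators (9 companion animal and 20 human isolates) are inferred from the quoted percentages of 89% and 10%.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: companion animal, human; columns: qnrB present, qnrB absent.
p = fisher_exact_two_sided(8, 1, 2, 18)
print(p < 0.001)  # True
```

Under these assumptions the result is consistent with the P < 0.001 stated in the record.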

  13. [Cloning and function identification of gene 'admA' and up-stream regulatory sequence related to antagonistic activity of Enterobacter cloacae B8].

    PubMed

    Zhu, Jun-Li; Li, De-Bao; Yu, Xu-Ping

    2012-04-01

    To reveal the antagonistic mechanism of B8 strain to Xanthomonas oryzae pv. oryzae, transposon tagging method and chromosome walking were deployed to clone antagonistic related fragments around Tn5 insertion site in the mutant strain B8B. The function of up-stream regulatory sequence of gene 'admA' involved in the antagonistic activity was further identified by gene knock-out technique. An antagonistic related left fragment of Tn5 insertion site, 2 608 bp in length, was obtained by tagging with Kan resistance gene of Tn5. A 2 354 bp right fragment of Tn5 insertion site was amplified with 2 rounds of chromosome walking. The length of the B contig around the Tn5 insertion site was 4 611 bp, containing 7 open reading frames (ORFs). Bioinformatic analysis revealed that these ORFs corresponded to the partial coding regions of glyceraldehyde-3-phosphate dehydrogenase, two LysR family transcriptional regulators, hypothetical protein VSWAT3-20465 of Vibrionales and admA, admB, and partial sequence of admC gene of Pantoea agglomerans biosynthetic gene cluster, respectively. Tn5 was inserted 200 bp or 894 bp up-stream of the sequence corresponding to anrP ORF or admA gene on B8B, respectively. The B-1 and B-2 mutants that lost antagonistic activity were selected by homologous recombination technology in association with the knock-out plasmid pMB-BG. These results suggested that the transcription and expression of anrP gene might be disrupted as a result of the knocking out of up-stream regulatory sequence by Tn5 in B8B strain, thereby affecting biosynthesis regulation of the antagonistic related gene cluster. Thus, the antagonistic related genes in B8 strain form a gene family similar to the andrimid biosynthetic gene cluster, and the upstream regulatory region appears to be critical for the antibiotic biosynthesis.

  14. The upstream regulatory sequence of the light harvesting complex Lhcf2 gene of the marine diatom Phaeodactylum tricornutum enhances transcription in an orientation- and distance-independent fashion.

    PubMed

    Russo, Monia Teresa; Annunziata, Rossella; Sanges, Remo; Ferrante, Maria Immacolata; Falciatore, Angela

    2015-12-01

    Diatoms are a key phytoplankton group in the contemporary ocean, showing extraordinary adaptation capacities to rapidly changing environments. The recent availability of whole genome sequences from representative species has revealed distinct features in their genomes, like novel combinations of genes encoding distinct metabolisms and a significant number of diatom-specific genes. However, the regulatory mechanisms driving diatom gene expression are still largely uncharacterized. Considering the wide variety of fields of study orbiting diatoms, ranging from ecology and evolutionary biology to biotechnology, it is thus essential to increase our understanding of fundamental gene regulatory processes such as transcriptional regulation. To this aim, we explored the functional properties of the 5'-flanking region of the Phaeodactylum tricornutum Lhcf2 gene, encoding a member of the Light Harvesting Complex superfamily, and we showed that this region enhances transcription of a GUS reporter gene in an orientation- and distance-independent fashion. This represents the first example of a cis-regulatory sequence with enhancer-like features discovered in diatoms and it is instrumental for the generation of novel genetic tools and diatom exploitation in different areas of study.

  15. Phylogeography and population structure of the common warthog (Phacochoerus africanus) inferred from variation in mitochondrial DNA sequences and microsatellite loci.

    PubMed

    Muwanika, V B; Nyakaana, S; Siegismund, H R; Arctander, P

    2003-10-01

    Global climate fluctuated considerably throughout the Pliocene and Pleistocene, influencing the evolutionary history of a wide range of species. Using both mitochondrial sequences and microsatellites, we have investigated the evolutionary consequences of such environmental fluctuation for the patterns of genetic variation in the common warthog, sampled from 24 localities in Africa. In the sample of 181 individuals, 70 mitochondrial DNA haplotypes were identified and an overall nucleotide diversity of 4.0% was observed. The haplotypes cluster in three well-differentiated clades (estimated net sequence divergence of 3.1-6.6%) corresponding to the geographical origins of individuals (i.e. eastern, western and southern African clades). At the microsatellite loci, high polymorphism was observed both in the number of alleles per locus (6-21), and in the gene diversity (in each population 0.59-0.80). Analysis of population differentiation indicates greater subdivision at the mitochondrial loci (FST=0.85) than at nuclear loci (FST=0.20), but both mitochondrial and nuclear loci support the existence of the three warthog lineages. We interpret our results in terms of the large-scale climatic fluctuations of the Pleistocene.

  16. Prokaryotic regulatory systems biology: Common principles governing the functional architectures of Bacillus subtilis and Escherichia coli unveiled by the natural decomposition approach.

    PubMed

    Freyre-González, Julio A; Treviño-Quintanilla, Luis G; Valtierra-Gutiérrez, Ilse A; Gutiérrez-Ríos, Rosa María; Alonso-Pavón, José A

    2012-10-31

    Escherichia coli and Bacillus subtilis are two of the best-studied prokaryotic model organisms. Previous analyses of their transcriptional regulatory networks have shown that they exhibit high plasticity during evolution and suggested that both converge to scale-free-like structures. Nevertheless, beyond this suggestion, no analyses have been carried out to identify the common systems-level components and principles governing these organisms. Here we show that these two phylogenetically distant organisms follow a set of common novel biologically consistent systems principles revealed by the mathematically and biologically founded natural decomposition approach. The discovered common functional architecture is a diamond-shaped, matryoshka-like, three-layer (coordination, processing, and integration) hierarchy exhibiting feedback, which is shaped by four systems-level components: global transcription factors (global TFs), locally autonomous modules, basal machinery and intermodular genes. The first mathematical criterion to identify global TFs, the κ-value, was reassessed on B. subtilis and confirmed its high predictive power by identifying all the previously reported, plus three potential, master regulators and eight sigma factors. The functionally conserved cores of modules, basal cell machinery, and a set of non-orthologous common physiological global responses were identified via both orthologous genes and non-orthologous conserved functions. This study reveals novel common systems principles maintained between two phylogenetically distant organisms and provides a comparison of their lifestyle adaptations. Our results shed new light on the systems-level principles and the fundamental functions required by bacteria to sustain life. Copyright © 2012 Elsevier B.V. All rights reserved.

  17. Characterization of mutations of the phosphoinositide-3-kinase regulatory subunit, PIK3R2, in perisylvian polymicrogyria: a next generation sequencing study

    PubMed Central

    Mirzaa, Ghayda; Conti, Valerio; Timms, Andrew E.; Smyser, Christopher D.; Ahmed, Sarah; Carter, Melissa; Barnett, Sarah; Hufnagel, Robert B.; Goldstein, Amy; Narumi-Kishimoto, Yoko; Olds, Carissa; Collins, Sarah; Johnston, Kathreen; Deleuze, Jean-François; Nitschké, Patrick; Friend, Kathryn; Harris, Catharine; Goetsch, Allison; Martin, Beth; Boyle, Evan August; Parrini, Elena; Mei, Davide; Tattini, Lorenzo; Slavotinek, Anne; Blair, Ed; Barnett, Christopher; Shendure, Jay; Chelly, Jamel; Dobyns, William B.; Guerrini, Renzo

    2015-01-01

    SUMMARY Background Bilateral perisylvian polymicrogyria (BPP), the most common form of regional polymicrogyria, causes the congenital bilateral perisylvian syndrome, featuring oromotor dysfunction, cognitive impairment and epilepsy. BPP is etiologically heterogeneous, but only a few genetic causes have been reported. The aim of this study was to identify additional genetic etiologies of BPP and delineate their frequency in this patient population. Methods We performed child-parent (trio)-based whole exome sequencing (WES) on eight children with BPP. Following the identification of mosaic PIK3R2 mutations in two of these eight children, we performed targeted screening of PIK3R2 in a cohort of 118 children with BPP who were ascertained from 1980 until 2015 using two methods. First, we performed targeted sequencing of the entire PIK3R2 gene by single molecule molecular inversion probes (smMIPs) on 38 patients with BPP with normal-large head size. Second, we performed amplicon sequencing of the recurrent PIK3R2 mutation (p.Gly373Arg) on 80 children with various types of polymicrogyria including BPP. One additional patient underwent clinical WES independently, and was included in this study given the phenotypic similarity to our cohort. All patients included in this study were children (< 18 years of age) with polymicrogyria enrolled in our research program. Findings Using WES, we identified a mosaic mutation (p.Gly373Arg) in the regulatory subunit of the PI3K-AKT-MTOR pathway, PIK3R2, in two children with BPP. Of the 38 patients with BPP and normal-large head size who underwent targeted next generation sequencing by smMIPs, we identified constitutional and mosaic PIK3R2 mutations in 17 additional children. In parallel, one patient was found to have the recurrent PIK3R2 mutation by clinical WES. Seven patients had BPP alone, and 13 had BPP in association with features of the megalencephaly-polymicrogyria-polydactyly-hydrocephalus syndrome (MPPH). 
    Nineteen patients had

  18. Fully automated segmentation and tracking of the intima media thickness in ultrasound video sequences of the common carotid artery.

    PubMed

    Ilea, Dana E; Duffy, Caoimhe; Kavanagh, Liam; Stanton, Alice; Whelan, Paul F

    2013-01-01

    The robust identification and measurement of the intima media thickness (IMT) has a high clinical relevance because it represents one of the most precise predictors used in the assessment of potential future cardiovascular events. To facilitate the analysis of arterial wall thickening in serial clinical investigations, in this paper we have developed a novel fully automatic algorithm for the segmentation, measurement, and tracking of the intima media complex (IMC) in B-mode ultrasound video sequences. The proposed algorithm entails a two-stage image analysis process that initially addresses the segmentation of the IMC in the first frame of the ultrasound video sequence using a model-based approach; in the second step, a novel customized tracking procedure is applied to robustly detect the IMC in the subsequent frames. For the video tracking procedure, we introduce a spatially coherent algorithm called adaptive normalized correlation that prevents the tracking process from converging to wrong arterial interfaces. This represents the main contribution of this paper and was developed to deal with inconsistencies in the appearance of the IMC over the cardiac cycle. The quantitative evaluation has been carried out on 40 ultrasound video sequences of the common carotid artery (CCA) by comparing the results returned by the developed algorithm with respect to ground truth data that has been manually annotated by clinical experts. The measured IMT(mean) ± standard deviation recorded by the proposed algorithm is 0.60 mm ± 0.10, with a mean coefficient of variation (CV) of 2.05%, whereas the corresponding result obtained for the manually annotated ground truth data is 0.60 mm ± 0.11 with a mean CV equal to 5.60%. The numerical results reported in this paper indicate that the proposed algorithm is able to correctly segment and track the IMC in ultrasound CCA video sequences, and we were encouraged by the stability of our technique when applied to data captured under

  19. Variations in a hotspot region of chloroplast DNAs among common wheat and Aegilops revealed by nucleotide sequence analysis.

    PubMed

    Guo, Chang-Hong; Terachi, Toru

    2005-08-01

    The second largest BamHI fragment (B2) of the chloroplast DNA in Triticum (wheat) and Aegilops contains a highly variable region (a hotspot), resulting in four types of B2 of different size, i.e. B2l (10.5kb), B2m (10.2kb), B2 (9.6kb) and B2s (9.4kb). In order to gain a better understanding of the molecular nature of the variations in length and explain unexpected identity among B2 of Ae. ovata, Ae. speltoides and common wheat (T. aestivum), the nucleotide sequence between a stop codon of rbcL and a HindIII site in cemA in the hotspot was determined for Ae. ovata, Ae. speltoides, Ae. caudata and Ae. mutica. The total number of nucleotides in the region was 2808, 2810, 3302, and 3594 bp, for Ae. speltoides, Ae. ovata, Ae. caudata and Ae. mutica, respectively, and the sequences were compared with the corresponding ones of Ae. crassa 4x, T. aestivum and Ae. squarrosa. Compared with the largest B2l fragment of Ae. mutica, a 791bp and a 793 bp deletion were found in Ae. speltoides and Ae. ovata, respectively, and the possible site of deletion in the two species is the same as that of T. aestivum. However, a deleted segment in Ae. ovata is 2 bp longer than that of Ae. speltoides (and T. aestivum), demonstrating that recurrent deletions had occurred in the chloroplast genomes of both species. Comparison of the sequences from Ae. caudata and Ae. crassa 4x with that of Ae. mutica revealed a 289 bp and a 61 bp deletion at the same site in Ae. caudata and Ae. crassa 4x, respectively. Sequence comparison using wild Aegilops plants showed that the large length variations in a hotspot are fixed to each species. A considerable number of polymorphisms are observed in a loop in the 3' of rbcL. The study reveals the relative importance of the large and small indels and minute inversions to account for variations in the chloroplast genomes among closely related species.

  20. Exome Sequencing and cis-Regulatory Mapping Identify Mutations in MAK, a Gene Encoding a Regulator of Ciliary Length, as a Cause of Retinitis Pigmentosa

    PubMed Central

    Özgül, Rıza Köksal; Siemiatkowska, Anna M.; Yücel, Didem; Myers, Connie A.; Collin, Rob W.J.; Zonneveld, Marijke N.; Beryozkin, Avigail; Banin, Eyal; Hoyng, Carel B.; van den Born, L. Ingeborgh; Bose, Ron; Shen, Wei; Sharon, Dror; Cremers, Frans P.M.; Klevering, B. Jeroen; den Hollander, Anneke I.; Corbo, Joseph C.

    2011-01-01

    A fundamental challenge in analyzing exome-sequence data is distinguishing pathogenic mutations from background polymorphisms. To address this problem in the context of a genetically heterogeneous disease, retinitis pigmentosa (RP), we devised a candidate-gene prioritization strategy called cis-regulatory mapping that utilizes ChIP-seq data for the photoreceptor transcription factor CRX to rank candidate genes. Exome sequencing combined with this approach identified a homozygous nonsense mutation in male germ cell-associated kinase (MAK) in the single affected member of a consanguineous Turkish family with RP. MAK encodes a cilium-associated mitogen-activated protein kinase whose function is conserved from the ciliated alga, Chlamydomonas reinhardtii, to humans. Mutations in MAK orthologs in mice and other model organisms result in abnormally long cilia and, in mice, rapid photoreceptor degeneration. Subsequent sequence analyses of additional individuals with RP identified five probands with missense mutations in MAK. Two of these mutations alter amino acids that are conserved in all known kinases, and an in vitro kinase assay indicates that these mutations result in a loss of kinase activity. Thus, kinase activity appears to be critical for MAK function in humans. This study highlights a previously underappreciated role for CRX as a direct transcriptional regulator of ciliary genes in photoreceptors. In addition, it demonstrates the effectiveness of CRX-based cis-regulatory mapping in prioritizing candidate genes from exome data and suggests that this strategy should be generally applicable to a range of retinal diseases. PMID:21835304

  1. RegTransBase - A Database Of Regulatory Sequences and Interactions in a Wide Range of Prokaryotic Genomes

    SciTech Connect

    Kazakov, Alexei E.; Cipriano, Michael J.; Novichkov, Pavel S.; Minovitsky, Simon; Vinogradov, Dmitry V.; Arkin, Adam; Mironov, Andrey A.; Gelfand, Mikhail S.; Dubchak, Inna

    2006-07-01

    RegTransBase, a manually curated database of regulatory interactions in prokaryotes, captures the knowledge in published scientific literature using a controlled vocabulary. Although a number of databases describing interactions between regulatory proteins and their binding sites are currently being maintained, they focus mostly on the model organisms Escherichia coli and Bacillus subtilis, or are entirely computationally derived. RegTransBase describes a large number of regulatory interactions reported in many organisms and contains various types of experimental data, in particular: the activation or repression of transcription by an identified direct regulator; determining the transcriptional regulatory function of a protein (or RNA) directly binding to DNA (RNA); mapping or prediction of binding site for a regulatory protein; characterization of regulatory mutations. Currently, the RegTransBase content is derived from about 3000 relevant articles describing over 7000 experiments in relation to 128 microbes. It contains data on the regulation of about 7500 genes and evidence for 6500 interactions with 650 regulators. RegTransBase also contains manually created position weight matrices (PWM) that can be used to identify candidate regulatory sites in over 60 species. RegTransBase is available at http://regtransbase.lbl.gov.
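
    The use of a position weight matrix to flag candidate regulatory sites can be illustrated with a small sketch. It assumes a toy set of aligned sites, a uniform background, and an arbitrary score threshold; the site list, the function names and the threshold are hypothetical and are not taken from RegTransBase.

```python
import math

# Hypothetical aligned binding sites (illustrative only, not RegTransBase data).
SITES = ["TTGACA", "TTGACT", "TTTACA", "TTGATA"]
BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one dict per alignment column mapping base -> log2-odds score."""
    pwm = []
    total = len(sites) + pseudocount * len(BASES)
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Yield (start, window, score) for every window scoring at or above threshold."""
    width = len(pwm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            yield start, window, score


pwm = build_pwm(SITES)
# Assumed threshold of 4.0 bits; in practice it would be calibrated per matrix.
hits = list(scan("GGCTTGACAGGATCC", pwm, threshold=4.0))
for start, window, score in hits:
    print(start, window, round(score, 2))
```

    Log-odds scoring with a pseudocount is one common convention; the database's own matrices may be built and thresholded differently.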

  2. Exploring Temporal Sequences of Regulatory Phases and Associated Interactions in Low- and High-Challenge Collaborative Learning Sessions

    ERIC Educational Resources Information Center

    Sobocinski, Márta; Malmberg, Jonna; Järvelä, Sanna

    2017-01-01

    Investigating the temporal order of regulatory processes can explain in more detail the mechanisms behind success or lack of success during collaborative learning. The aim of this study is to explore the differences between high- and low-challenge collaborative learning sessions. This is achieved through examining how the three phases of…

  3. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions

    PubMed Central

    Besemer, John; Lomsadze, Alexandre; Borodovsky, Mark

    2001-01-01

    Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-coding regions and models of regulatory sites near gene start within an iterative Hidden Markov model based algorithm. The new gene prediction method, called GeneMarkS, utilizes a non-supervised training procedure and can be used for a newly sequenced prokaryotic genome with no prior knowledge of any protein or rRNA genes. The GeneMarkS implementation uses an improved version of the gene finding program GeneMark.hmm, heuristic Markov models of coding and non-coding regions and the Gibbs sampling multiple alignment program. GeneMarkS predicted precisely 83.2% of the translation starts of GenBank annotated Bacillus subtilis genes and 94.4% of translation starts in an experimentally validated set of Escherichia coli genes. We have also observed that GeneMarkS detects prokaryotic genes, in terms of identifying open reading frames containing real genes, with an accuracy matching the level of the best currently used gene detection methods. Accurate translation start prediction, in addition to the refinement of protein sequence N-terminal data, provides the benefit of precise positioning of the sequence region situated upstream to a gene start. Therefore, sequence motifs related to transcription and translation regulatory sites can be revealed and analyzed with higher precision. These motifs were shown to possess a significant variability, the functional and evolutionary connections of which are discussed. PMID:11410670

  4. Species delimitation of common reef corals in the genus Pocillopora using nucleotide sequence phylogenies, population genetics and symbiosis ecology.

    PubMed

    Pinzón, Jorge H; LaJeunesse, Todd C

    2011-01-01

    Stony corals in the genus Pocillopora are among the most common and widely distributed of Indo-Pacific corals and, as such, are often the subject of physiological and ecological research. In the far Tropical Eastern Pacific (TEP), they are major constituents of shallow coral communities, exhibiting considerable variability in colony shape and branch morphology and marked differences in response to thermal stress. Numerous intermediates occur between morphospecies that may relate to extensive hybridization. The diversity of the Pocillopora genus in the TEP was analysed genetically using nuclear ribosomal (ITS2) and mitochondrial (ORF) sequences, and population genetic markers (seven microsatellite loci). The resident dinoflagellate endosymbiont (Symbiodinium sp.) in each sample was also characterized using sequences of the internal transcribed spacer 2 (ITS2) rDNA and the noncoding region of the chloroplast psbA minicircle. From these analyses, three symbiotically distinct, reproductively isolated, nonhybridizing, evolutionarily divergent animal lineages were identified. Designated types 1, 2 and 3, these groupings were incongruent with traditional morphospecies classification. Type 1 was abundant and widespread throughout the TEP; type 2 was restricted to the Clipperton Atoll; and type 3 was found only in Panama and the Galapagos Islands. Each type harboured a different Symbiodinium 'species lineage' in Clade C, and only type 1 associated with the 'stress-tolerant' Symbiodinium glynni (D1). The accurate delineation of species and implementation of a proper taxonomy may profoundly improve our assessment of Pocillopora's reproductive biology, biogeographic distributions, and resilience to climate warming, information that must be considered when planning for the conservation of reef corals. © 2010 Blackwell Publishing Ltd.

  5. Characterization of the FoxL2 proximal promoter and coding sequence from the common snapping turtle (Chelydra serpentina).

    PubMed

    Guo, Lei; Rhen, Turk

    2017-10-01

    Sex is determined by temperature during embryogenesis in snapping turtles, Chelydra serpentina. Previous studies in this species show that dihydrotestosterone (DHT) induces ovarian development at temperatures that normally produce males or mixed sex ratios. The feminizing effect of DHT is associated with increased expression of FoxL2, suggesting that androgens regulate transcription of FoxL2. To test this hypothesis, we cloned the proximal promoter (1.6kb) and coding sequence for snapping turtle FoxL2 (tFoxL2) in frame with mCherry to produce a fluorescent reporter. The tFoxL2-mCherry fusion plasmid or mCherry control plasmid were stably transfected into mouse KK1 granulosa cells. These cells were then treated with 0, 1, 10, or 100nM DHT to assess androgen effects on tFoxL2-mCherry expression. In contrast to the main hypothesis, DHT did not alter expression of the tFoxL2-mCherry reporter. However, normal serum increased expression of tFoxL2-mCherry when compared to charcoal-stripped serum, indicating that the cloned region of tFoxL2 contains cis regulatory elements. We also used the tFoxL2-mCherry plasmid as an expression vector to test the hypothesis that DHT and tFoxL2 interact to regulate expression of endogenous genes in granulosa cells. While tFoxL2-mCherry and DHT had independent effects on mouse FoxL2, FshR, GnRHR, and StAR expression, tFoxL2-mCherry potentiated low concentration DHT effects on mouse aromatase expression. Further studies will be required to determine whether synergistic regulation of aromatase by DHT and FoxL2 also occurs in turtle gonads during the sex-determining period, which would explain the feminizing effect of DHT in this species. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Sequencing the GRHL3 Coding Region Reveals Rare Truncating Mutations and a Common Susceptibility Variant for Nonsyndromic Cleft Palate

    PubMed Central

    Mangold, Elisabeth; Böhmer, Anne C.; Ishorst, Nina; Hoebel, Ann-Kathrin; Gültepe, Pinar; Schuenke, Hannah; Klamt, Johanna; Hofmann, Andrea; Gölz, Lina; Raff, Ruth; Tessmann, Peter; Nowak, Stefanie; Reutter, Heiko; Hemprich, Alexander; Kreusch, Thomas; Kramer, Franz-Josef; Braumann, Bert; Reich, Rudolf; Schmidt, Gül; Jäger, Andreas; Reiter, Rudolf; Brosch, Sibylle; Stavusis, Janis; Ishida, Miho; Seselgyte, Rimante; Moore, Gudrun E.; Nöthen, Markus M.; Borck, Guntram; Aldhorae, Khalid A.; Lace, Baiba; Stanier, Philip; Knapp, Michael; Ludwig, Kerstin U.

    2016-01-01

    Nonsyndromic cleft lip with/without cleft palate (nsCL/P) and nonsyndromic cleft palate only (nsCPO) are the most frequent subphenotypes of orofacial clefts. A common syndromic form of orofacial clefting is Van der Woude syndrome (VWS) where individuals have CL/P or CPO, often but not always associated with lower lip pits. Recently, ∼5% of VWS-affected individuals were identified with mutations in the grainy head-like 3 gene (GRHL3). To investigate GRHL3 in nonsyndromic clefting, we sequenced its coding region in 576 Europeans with nsCL/P and 96 with nsCPO. Most strikingly, nsCPO-affected individuals had a higher minor allele frequency for rs41268753 (0.099) than control subjects (0.049; p = 1.24 × 10⁻²). This association was replicated in nsCPO/control cohorts from Latvia, Yemen, and the UK (p_combined = 2.63 × 10⁻⁵; OR_allelic = 2.46 [95% CI 1.6–3.7]) and reached genome-wide significance in combination with imputed data from a GWAS in nsCPO triads (p = 2.73 × 10⁻⁹). Notably, rs41268753 is not associated with nsCL/P (p = 0.45). rs41268753 encodes the highly conserved p.Thr454Met (c.1361C>T) (GERP = 5.3), which prediction programs denote as deleterious, has a CADD score of 29.6, and increases protein binding capacity in silico. Sequencing also revealed four novel truncating GRHL3 mutations including two that were de novo in four families, where all nine individuals harboring mutations had nsCPO. This is important for genetic counseling: given that VWS is rare compared to nsCPO, our data suggest that dominant GRHL3 mutations are more likely to cause nonsyndromic than syndromic CPO. Thus, with rare dominant mutations and a common risk variant in the coding region, we have identified an important contribution for GRHL3 in nsCPO. PMID:27018475
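
    The minor allele frequencies quoted above can be converted into an allelic odds ratio with simple arithmetic. The following is a minimal sketch, assuming the two reported frequencies (0.099 in nsCPO cases, 0.049 in controls) are used directly; the function name is hypothetical. The value it gives (about 2.13) describes the discovery sample only and should not be expected to equal the combined replication estimate of 2.46 reported in the abstract.

```python
def allelic_odds_ratio(case_maf, control_maf):
    """Odds of carrying the minor allele in cases divided by the odds in controls."""
    case_odds = case_maf / (1.0 - case_maf)
    control_odds = control_maf / (1.0 - control_maf)
    return case_odds / control_odds


# Frequencies for rs41268753 as quoted in the abstract (discovery sample).
discovery_or = allelic_odds_ratio(0.099, 0.049)
print(round(discovery_or, 2))
```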

  7. Post-common envelope binaries from SDSS - VII. A catalogue of white dwarf-main sequence binaries

    NASA Astrophysics Data System (ADS)

    Rebassa-Mansergas, A.; Gänsicke, B. T.; Schreiber, M. R.; Koester, D.; Rodríguez-Gil, P.

    2010-02-01

    We present a catalogue of 1602 white-dwarf-main-sequence (WDMS) binaries from the spectroscopic Sloan Digital Sky Survey Data Release 6 (SDSS DR6). Among these, we identify 440 as new WDMS binaries. We select WDMS binary candidates by template fitting all 1.27 million DR6 spectra, using combined constraints in both χ² and signal-to-noise ratio. In addition, we use Galaxy Evolution Explorer (GALEX) and UKIRT Infrared Sky Survey (UKIDSS) magnitudes to search for objects in which one of the two components dominates the SDSS spectrum. We use a decomposition/fitting technique to measure the effective temperatures, surface gravities, masses and distances to the white dwarfs, as well as the spectral types and distances to the companions in our catalogue. Distributions and density maps obtained from these stellar parameters are then used to study both the general properties and the selection effects of WDMS binaries in the SDSS. A comparison between the distances measured to the white dwarfs and the main-sequence companions shows d_sec > d_wd for approximately one-fifth of the systems, a tendency already found in our previous work. The hypothesis that magnetic activity raises the temperature of the inter-spot regions in active stars that are heavily covered by cool spots, leading to a bluer optical colour compared to inactive stars, remains the best explanation for this behaviour. We also make use of SDSS-GALEX-UKIDSS magnitudes to investigate the distribution of WDMS binaries, as well as their white-dwarf effective temperatures and companion star spectral types, in ultraviolet to infrared colour space. We show that WDMS binaries can be very efficiently separated from single main-sequence stars and white dwarfs when using a combined ultraviolet, optical and infrared colour selection. Finally, we also provide radial velocities for 1068 systems measured from the NaI λλ8183.27, 8194.81 absorption doublet and/or the Hα emission line. Among the systems with multiple SDSS

  8. A systematic computational analysis of the rRNA–3′ UTR sequence complementarity suggests a regulatory mechanism influencing post-termination events in metazoan translation

    PubMed Central

    Pánek, Josef; Kolář, Michal; Herrmannová, Anna; Valášek, Leoš Shivaya

    2016-01-01

    Nucleic acid sequence complementarity underlies many fundamental biological processes. Although first noticed a long time ago, sequence complementarity between mRNAs and ribosomal RNAs still lacks a meaningful biological interpretation. Here we used statistical analysis of large-scale sequence data sets and high-throughput computing to explore complementarity between 18S and 28S rRNAs and mRNA 3′ UTR sequences. By the analysis of 27,646 full-length 3′ UTR sequences from 14 species covering both protozoans and metazoans, we show that the computed 18S rRNA complementarity creates an evolutionarily conserved localization pattern centered around the ribosomal mRNA entry channel, suggesting its biological relevance and functionality. Based on this specific pattern and earlier data showing that post-termination 80S ribosomes are not stably anchored at the stop codon and can migrate in both directions to codons that are cognate to the P-site deacylated tRNA, we propose that the 18S rRNA–mRNA complementarity selectively stabilizes post-termination ribosomal complexes to facilitate ribosome recycling. We thus demonstrate that the complementarity between 18S rRNA and 3′ UTRs has a non-random nature and very likely carries information with a regulatory potential for translational control. PMID:27190231
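
    The notion of rRNA-mRNA complementarity can be made concrete with a toy scan. This is only a sketch under strong assumptions: it reports perfect, antiparallel Watson-Crick matches of a fixed length k, ignores G-U wobble pairs, and does not reproduce the statistical analysis described in the abstract. The sequences and function names are hypothetical.

```python
# Translation table for RNA Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def complementary_kmers(rrna, utr, k):
    """Return (rrna_start, utr_start) pairs whose k-mers can pair perfectly."""
    # Index every k-mer of the UTR by its start positions.
    index = {}
    for j in range(len(utr) - k + 1):
        index.setdefault(utr[j:j + k], []).append(j)
    pairs = []
    for i in range(len(rrna) - k + 1):
        # A UTR k-mer pairs with this rRNA k-mer if it equals its reverse complement.
        target = reverse_complement(rrna[i:i + k])
        for j in index.get(target, []):
            pairs.append((i, j))
    return pairs


# Toy sequences, not real 18S rRNA or 3' UTR data.
pairs = complementary_kmers("AUCACCUCC", "UUGGAGGUGAUAA", k=6)
print(pairs)
```

    Indexing the UTR k-mers first keeps the scan linear in the combined sequence length, which matters when many UTRs are screened against one rRNA.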

  9. Functional equivalence of common and unique sequences in the 3' untranslated regions of alfalfa mosaic virus RNAs 1, 2, and 3.

    PubMed Central

    van Rossum, C M; Brederode, F T; Neeleman, L; Bol, J F

    1997-01-01

    The 3' untranslated regions (UTRs) of alfalfa mosaic virus (AMV) RNAs 1, 2, and 3 consist of a common 3'-terminal sequence of 145 nucleotides (nt) and upstream sequences of 18 to 34 nt that are unique for each RNA. The common sequence can be folded into five stem-loop structures, A to E, despite the occurrence of 22 nt differences between the three RNAs in this region. Exchange of the common sequences or full-length UTRs between the three genomic RNAs did not affect the replication of these RNAs in vivo, indicating that the UTRs are functionally equivalent. Mutations that disturbed base pairing in the stem of hairpin E reduced or abolished RNA replication, whereas compensating mutations restored RNA replication. In vitro, the 3' UTRs of the three RNAs were recognized with similar efficiencies by the AMV RNA-dependent RNA polymerase (RdRp). A deletion analysis of template RNAs indicated that a 3'-terminal sequence of 127 nt in each of the three AMV RNAs was not sufficient for recognition by the RdRp. Previously, it has been shown that this 127-nt sequence is sufficient for coat protein binding. Apparently, sequences required for recognition of AMV RNAs by the RdRp are longer than sequences required for CP binding. PMID:9094656

  10. TTS Mapping: integrative WEB tool for analysis of triplex formation target DNA Sequences, G-quadruplets and non-protein coding regulatory DNA elements in the human genome

    PubMed Central

    2009-01-01

    Background DNA triplexes can naturally occur, co-localize and interact with many other regulatory DNA elements (e.g. G-quadruplex (G4) DNA motifs), specific DNA-binding proteins (e.g. transcription factors (TFs)), and micro-RNA (miRNA) precursors. Specific genome localizations of triplex target DNA sites (TTSs) may cause abnormalities in a double-helix DNA structure and can be directly involved in some human diseases. However, genome localization of specific TTSs, their interconnection with regulatory DNA elements and physiological roles in a cell are poorly defined. Therefore, it is important to identify a comprehensive and reliable catalogue of specific potential TTSs (pTTSs) and their co-localization patterns with other regulatory DNA elements in the human genome. Results "TTS mapping" database is a web-based search engine developed here, which is aimed to find and annotate pTTSs within a region of interest of the human genome. The engine provides descriptive statistics of pTTSs in a given region and its sequence context. Different annotation tracks of TTS-overlapping gene region(s), G4 motifs, CpG Island, miRNA precursors, miRNA targets, transcription factor binding sites (TFBSs), Single Nucleotide Polymorphisms (SNPs), small nucleolar RNAs (snoRNA), and repeat elements are also mapped based on a sequence location provided by UCSC genome browser, G4 database http://www.quadruplex.org and several other datasets. The results pages provide links to UCSC genome browser annotation tracks and relative DBs. BLASTN program was included to check the uniqueness of a given pTTS in the human genome. Recombination- and mutation-prone genes (e.g. EVI-1, MYC) were found to be significantly enriched by TTSs and multiple co-occurring with our regulatory DNA elements. TTS mapping reveals that a high-complementary and evolutionarily conserved polypurine and polypyrimidine DNA sequence pair linked by a non-conserved short DNA sequence can form miR-483 transcribed from intron 2 of

  11. TTS mapping: integrative WEB tool for analysis of triplex formation target DNA sequences, G-quadruplets and non-protein coding regulatory DNA elements in the human genome.

    PubMed

    Jenjaroenpun, Piroon; Kuznetsov, Vladimir A

    2009-12-03

    DNA triplexes can naturally occur, co-localize and interact with many other regulatory DNA elements (e.g. G-quadruplex (G4) DNA motifs), specific DNA-binding proteins (e.g. transcription factors (TFs)), and micro-RNA (miRNA) precursors. Specific genome localizations of triplex target DNA sites (TTSs) may cause abnormalities in a double-helix DNA structure and can be directly involved in some human diseases. However, genome localization of specific TTSs, their interconnection with regulatory DNA elements and physiological roles in a cell are poorly defined. Therefore, it is important to identify a comprehensive and reliable catalogue of specific potential TTSs (pTTSs) and their co-localization patterns with other regulatory DNA elements in the human genome. "TTS mapping" database is a web-based search engine developed here, which is aimed to find and annotate pTTSs within a region of interest of the human genome. The engine provides descriptive statistics of pTTSs in a given region and its sequence context. Different annotation tracks of TTS-overlapping gene region(s), G4 motifs, CpG Island, miRNA precursors, miRNA targets, transcription factor binding sites (TFBSs), Single Nucleotide Polymorphisms (SNPs), small nucleolar RNAs (snoRNA), and repeat elements are also mapped based on a sequence location provided by UCSC genome browser, G4 database http://www.quadruplex.org and several other datasets. The results pages provide links to UCSC genome browser annotation tracks and relative DBs. BLASTN program was included to check the uniqueness of a given pTTS in the human genome. Recombination- and mutation-prone genes (e.g. EVI-1, MYC) were found to be significantly enriched by TTSs and multiple co-occurring with our regulatory DNA elements. TTS mapping reveals that a high-complementary and evolutionarily conserved polypurine and polypyrimidine DNA sequence pair linked by a non-conserved short DNA sequence can form miR-483 transcribed from intron 2 of IGF2 gene and bound

  12. Drosophila melanogaster is polymorphic for a specific repeated (CATA) sequence in the regulatory region of hsp23.

    PubMed

    Frydenberg, J; Pierpaoli, M; Loeschcke, V

    1999-08-20

    To identify sequence variation associated with a selection response for heat tolerance in Drosophila melanogaster, we sequenced 1400 bp of the heat shock protein 23 gene (hsp23) promoter region in four heat-selected and two control lines. The region was found to be variable for a specific (CATA) repeated sequence, and the sequence CTT seems to be a hot spot for mutation. The repeated tetranucleotide sequence was located in several short repeats scattered throughout the entire region. Similar variable repeats are also located downstream of the hsp23 gene in the intergenic region between hsp23 and hsp27. We detected nine different hsp23 alleles. Their frequencies in the selection and control lines seemed to be mainly determined by genetic drift. The function of the CATA repeats is not yet known, though these regions have homology to SAR elements located in the intergenic region between two hsp70 genes, suggesting a similar function.
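
    The repeat scan described in the abstract above can be illustrated with a minimal sketch. It assumes that a "short repeat" means at least two tandem CATA units; the threshold `min_units`, the function name and the example sequence are hypothetical and are not taken from the study.

```python
import re

def find_cata_repeats(seq, min_units=2):
    """Return (start, unit_count) for each tandem run of CATA in seq.

    min_units is an illustrative threshold, not a value from the abstract.
    """
    pattern = re.compile(r"(?:CATA){%d,}" % min_units)
    return [(m.start(), len(m.group()) // 4) for m in pattern.finditer(seq.upper())]

# Hypothetical promoter fragment, made up for this illustration.
example = "GGCATACATACATATTCTTGCATACATAGG"
repeats = find_cata_repeats(example)
print(repeats)
```

    Comparing the unit counts at each position across lines is one simple way such a length polymorphism could be tabulated.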

  13. Two distinct nuclear factors bind the conserved regulatory sequences of a rabbit major histocompatibility complex class II gene.

    PubMed Central

    Sittisombut, N

    1988-01-01

    The constitutive coexpression of the major histocompatibility complex (MHC) class II genes in B lymphocytes requires positive, trans-acting transcriptional factors. The need for these trans-acting factors has been suggested by the reversion of the MHC class II-negative phenotype of rare B-lymphocyte mutants through somatic cell fusion with B cells or T-cell lines. The mechanism by which the trans-acting factors exert their effect on gene transcription is unknown. The possibility that two highly conserved DNA sequences, located 90 to 100 base pairs (bp) (the A sequence) and 60 to 70 bp (the B sequence) upstream of the transcription start site of the class II genes, are recognized by the trans-acting factors was investigated in this study. By using the gel electrophoresis retardation assay, a minimum of two proteins which specifically bound the conserved A or B sequence of a rabbit DP beta gene were identified in murine nuclear extracts of a B-lymphoma cell line, A20-2J. Fractionation of nuclear extract through a heparin-agarose column allowed the identification of one protein, designated NF-MHCIIB, which bound an oligonucleotide containing the B sequence and protected the entire B sequence in the DNase I protection analysis. Another protein, designated NF-MHCIIA, which bound an oligonucleotide containing the A sequence and partially protected the 3' half of this sequence, was also identified. NF-MHCIIB did not protect a CCAAT sequence located 17 bp downstream of the B sequence. The possible relationship between these DNA-binding factors and the trans-acting factors identified in the cell fusion experiments is discussed. PMID:3133552

  14. Single-cell whole genome sequencing reveals no evidence for common aneuploidy in normal and Alzheimer's disease neurons.

    PubMed

    van den Bos, Hilda; Spierings, Diana C J; Taudt, Aaron S; Bakker, Bjorn; Porubský, David; Falconer, Ester; Novoa, Carolina; Halsema, Nancy; Kazemier, Hinke G; Hoekstra-Wakker, Karina; Guryev, Victor; den Dunnen, Wilfred F A; Foijer, Floris; Tatché, Maria Colomé; Boddeke, Hendrikus W G M; Lansdorp, Peter M

    2016-05-31

    Alzheimer's disease (AD) is a neurodegenerative disease of the brain and the most common form of dementia in the elderly. Aneuploidy, a state in which cells have an abnormal number of chromosomes, has been proposed to play a role in neurodegeneration in AD patients. Several studies using fluorescence in situ hybridization have shown that the brains of AD patients contain an increased number of aneuploid cells. However, because the reported rate of aneuploidy in neurons ranges widely, a more sensitive method is needed to establish a possible role of aneuploidy in AD pathology. In the current study, we used a novel single-cell whole genome sequencing (scWGS) approach to assess aneuploidy in isolated neurons from the frontal cortex of normal control individuals (n = 6) and patients with AD (n = 10). The sensitivity and specificity of our method were shown by the presence of three copies of chromosome 21 in all analyzed neuronal nuclei of a Down's syndrome sample (n = 36). Very low levels of aneuploidy were found in the brains from control individuals (n = 589) and AD patients (n = 893). In contrast to other studies, we observe no selective gain of chromosomes 17 or 21 in neurons of AD patients. scWGS showed no evidence for common aneuploidy in normal and AD neurons. Therefore, our results do not support an important role for aneuploidy in neuronal cells in the pathogenesis of AD. This will need to be confirmed by future studies in larger cohorts.

  15. De novo sequencing of root transcriptome reveals complex cadmium-responsive regulatory networks in radish (Raphanus sativus L.).

    PubMed

    Xu, Liang; Wang, Yan; Liu, Wei; Wang, Jin; Zhu, Xianwen; Zhang, Keyun; Yu, Rugang; Wang, Ronghua; Xie, Yang; Zhang, Wei; Gong, Yiqin; Liu, Liwang

    2015-07-01

    Cadmium (Cd) is a nonessential metallic trace element that poses potential chronic toxicity to living organisms. To date, little is known about the Cd-responsive regulatory network in root vegetable crops including radish. In this study, 31,015 unigenes representing 66,552 assembled unique transcripts were isolated from radish root under Cd stress based on de novo transcriptome assembly. In all, 1496 differentially expressed genes (DEGs) consisting of 3579 transcripts were identified from Cd-free (CK) and Cd-treated (Cd200) libraries. Gene Ontology and pathway enrichment analysis indicated that the up- and down-regulated DEGs were predominantly involved in glucosinolate biosynthesis as well as cysteine and methionine-related pathways, respectively. RT-qPCR showed that the expression profiles of DEGs were consistent with results from RNA-Seq analysis. Several candidate genes encoding phytochelatin synthase (PCS), metallothioneins (MTs), glutathione (GSH), zinc iron permease (ZIPs) and ABC transporter were responsible for Cd uptake, accumulation, translocation and detoxification in radish. A schematic model of the DEGs and microRNAs involved in the Cd-responsive regulatory network was proposed. This study represents the first comprehensive transcriptome-based characterization of Cd-responsive DEGs in radish. These results could provide fundamental insight into complex Cd-responsive regulatory networks and facilitate further genetic manipulation of Cd accumulation in root vegetable crops. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  16. Molecular characterization of three common olive (Olea europaea L.) cultivars in Palestine, using simple sequence repeat (SSR) markers

    PubMed Central

    Obaid, Ramiz; Abu-Qaoud, Hassan; Arafeh, Rami

    2014-01-01

    Eight accessions of olive trees from three common varieties in Palestine, Nabali Baladi, Nabali Mohassan and Surri, were genetically evaluated using five simple sequence repeat (SSR) markers. A total of 17 alleles from 5 loci were observed, of which 15 (88.2%) were polymorphic and 2 (11.8%) were monomorphic. An average of 3.4 alleles per locus was found, ranging from 2.0 alleles with the primers GAPU-103 and DCA-9 to 5.0 alleles with U9932 and DCA-16. The smallest amplicon size observed was 50 bp with the primer DCA-16, whereas the largest (450 bp) was observed with the primer U9932. Cluster analysis with the unweighted pair group method with arithmetic average (UPGMA) showed three clusters: a cluster with four accessions from the ‘Nabali Baladi’ cultivar, another cluster with three accessions that represents the ‘Nabali Mohassan’ cultivar and finally the ‘Surri’ cultivar. The similarity coefficient for the eight olive tree samples ranged from a maximum of 100% between two accessions from Nabali Baladi and also in two other samples from Nabali Mohassan, to a minimum similarity coefficient (0.315) between the Surri and two Nabali Baladi accessions. The results in this investigation clearly highlight the genetic dissimilarity between the three main olive cultivars that have been misidentified and mixed up in the past, based on conventional morphological characters. PMID:26019564
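
    The summary figures quoted in the abstract above can be re-derived from the reported counts. The short sketch below only reproduces that arithmetic; the variable and function names are illustrative and nothing here comes from the authors' own analysis.

```python
# Counts as reported in the abstract.
total_alleles = 17
loci = 5
polymorphic = 15
monomorphic = total_alleles - polymorphic  # 2 alleles

def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100 * part / whole, 1)

polymorphic_pct = percent(polymorphic, total_alleles)
monomorphic_pct = percent(monomorphic, total_alleles)
mean_alleles_per_locus = total_alleles / loci

print(polymorphic_pct, monomorphic_pct, mean_alleles_per_locus)
```

    The values agree with the 88.2%, 11.8% and 3.4 alleles per locus stated in the abstract.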

  17. Deep sequencing of large library selections allows computational discovery of diverse sets of zinc fingers that bind common targets.

    PubMed

    Persikov, Anton V; Rowland, Elizabeth F; Oakes, Benjamin L; Singh, Mona; Noyes, Marcus B

    2014-02-01

    The Cys2His2 zinc finger (ZF) is the most frequently found sequence-specific DNA-binding domain in eukaryotic proteins. The ZF's modular protein-DNA interface has also served as a platform for genome engineering applications. Despite decades of intense study, a predictive understanding of the DNA-binding specificities of either natural or engineered ZF domains remains elusive. To help fill this gap, we developed an integrated experimental-computational approach to enrich and recover distinct groups of ZFs that bind common targets. To showcase the power of our approach, we built several large ZF libraries and demonstrated their excellent diversity. As proof of principle, we used one of these ZF libraries to select and recover thousands of ZFs that bind several 3-nt targets of interest. We were then able to computationally cluster these recovered ZFs to reveal several distinct classes of proteins, all recovered from a single selection, to bind the same target. Finally, for each target studied, we confirmed that one or more representative ZFs yield the desired specificity. In sum, the described approach enables comprehensive large-scale selection and characterization of ZF specificities and should be a great aid in furthering our understanding of the ZF domain.

  18. Single-cell sequencing analysis characterizes common and cell-lineage-specific mutations in a muscle-invasive bladder cancer

    PubMed Central

    2012-01-01

    Background Cancers arise through an evolutionary process in which cell populations are subjected to selection; however, to date, the evolutionary process of bladder cancer, one of the most common cancers in the world, remains unknown at the single-cell level. Results We carried out single-cell exome sequencing of 66 individual tumor cells from a muscle-invasive bladder transitional cell carcinoma (TCC). Analyses of the somatic mutant allele frequency spectrum and clonal structure revealed that the tumor cells were derived from a single ancestral cell, but that subsequent evolution occurred, leading to two distinct tumor cell subpopulations. By analyzing recurrently mutant genes in an additional cohort of 99 TCC tumors, we identified genes that might play roles in the maintenance of the ancestral clone and in the muscle-invasive capability of subclones of this bladder cancer, respectively. Conclusions This work provides a new approach to investigating the genetic details of bladder tumoral changes at the single-cell level and a new method for assessing bladder cancer evolution at a cell-population level. PMID:23587365

  19. Identification of an upstream regulatory sequence that mediates the transcription of mox genes in Methylobacterium extorquens AM1.

    PubMed

    Zhang, Meng; FitzGerald, Kelly A; Lidstrom, Mary E

    2005-11-01

    A multiple A-tract sequence has been identified in the promoter regions for the mxaF, pqqA, mxaW, mxbD and mxcQ genes involved in methanol oxidation in Methylobacterium extorquens AM1, a facultative methylotroph. Site-directed mutagenesis was exploited to delete or change this conserved sequence. Promoter-xylE transcriptional fusions were used to assess promoter activity in these mutants. A fiftyfold drop in the XylE activity was observed for the mxaF and pqqA promoters without this sequence, and a five- to sixfold drop in the XylE activity was observed for the mxbD and mxcQ promoters without this sequence. Mutants were generated in the chromosomal copies in which this sequence was either deleted or altered, and these mutants were unable to grow on methanol. When one of these sequences was added to Plac of Escherichia coli, which is a weak constitutive promoter in M. extorquens AM1, the activity increased two- to threefold. These results suggest that this sequence is essential for normal expression of these genes in M. extorquens AM1, and may serve as a general enhancer element for genetic constructs in this bacterium.

  20. Deciphering the molecular mechanisms underlying the binding of the TWIST1/E12 complex to regulatory E-box sequences

    PubMed Central

    Bouard, Charlotte; Terreux, Raphael; Honorat, Mylène; Manship, Brigitte; Ansieau, Stéphane; Vigneron, Arnaud M.; Puisieux, Alain; Payen, Léa

    2016-01-01

    The TWIST1 bHLH transcription factor controls embryonic development and cancer processes. Although molecular and genetic analyses have provided a wealth of data on the role of bHLH transcription factors, very little is known about the molecular mechanisms underlying their binding affinity to the E-box sequence of the promoter. Here, we used an in silico model of the TWIST1/E12 (TE) heterocomplex and performed molecular dynamics (MD) simulations of its binding to specific (TE-box) and modified E-box sequences. We focused on (i) active E-box and inactive E-box sequences, on (ii) modified active E-box sequences, as well as on (iii) two box sequences with modified adjacent bases, the AT- and TA-boxes. Our in silico models were supported by functional in vitro binding assays. This exploration highlighted the predominant role of protein side-chain residues, close to the heart of the complex, in anchoring the dimer to DNA sequences, and unveiled a shift towards adjacent ((-1) and (-1*)) bases and conserved bases of modified E-box sequences. In conclusion, our study provides proof of the predictive value of these MD simulations, which may contribute to the characterization of specific inhibitors by docking approaches, and their use in pharmacological therapies by blocking the tumoral TWIST1/E12 function in cancers. PMID:27151200

  1. Deciphering the molecular mechanisms underlying the binding of the TWIST1/E12 complex to regulatory E-box sequences.

    PubMed

    Bouard, Charlotte; Terreux, Raphael; Honorat, Mylène; Manship, Brigitte; Ansieau, Stéphane; Vigneron, Arnaud M; Puisieux, Alain; Payen, Léa

    2016-06-20

    The TWIST1 bHLH transcription factor controls embryonic development and cancer processes. Although molecular and genetic analyses have provided a wealth of data on the role of bHLH transcription factors, very little is known about the molecular mechanisms underlying their binding affinity to the E-box sequence of the promoter. Here, we used an in silico model of the TWIST1/E12 (TE) heterocomplex and performed molecular dynamics (MD) simulations of its binding to specific (TE-box) and modified E-box sequences. We focused on (i) active E-box and inactive E-box sequences, on (ii) modified active E-box sequences, as well as on (iii) two box sequences with modified adjacent bases, the AT- and TA-boxes. Our in silico models were supported by functional in vitro binding assays. This exploration highlighted the predominant role of protein side-chain residues, close to the heart of the complex, in anchoring the dimer to DNA sequences, and unveiled a shift towards adjacent ((-1) and (-1*)) bases and conserved bases of modified E-box sequences. In conclusion, our study provides proof of the predictive value of these MD simulations, which may contribute to the characterization of specific inhibitors by docking approaches, and their use in pharmacological therapies by blocking the tumoral TWIST1/E12 function in cancers.

  2. The positive regulatory function of the 5'-proximal open reading frames in GCN4 mRNA can be mimicked by heterologous, short coding sequences.

    PubMed Central

    Williams, N P; Mueller, P P; Hinnebusch, A G

    1988-01-01

    Translational control of GCN4 expression in the yeast Saccharomyces cerevisiae is mediated by multiple AUG codons present in the leader of GCN4 mRNA, each of which initiates a short open reading frame of only two or three codons. Upstream AUG codons 3 and 4 are required to repress GCN4 expression in normal growth conditions; AUG codons 1 and 2 are needed to overcome this repression in amino acid starvation conditions. We show that the regulatory function of AUG codons 1 and 2 can be qualitatively mimicked by the AUG codons of two heterologous upstream open reading frames (URFs) containing the initiation regions of the yeast genes PGK and TRP1. These AUG codons inhibit GCN4 expression when present singly in the mRNA leader; however, they stimulate GCN4 expression in derepressing conditions when inserted upstream from AUG codons 3 and 4. This finding supports the idea that AUG codons 1 and 2 function in the control mechanism as translation initiation sites and further suggests that suppression of the inhibitory effects of AUG codons 3 and 4 is a general consequence of the translation of URF 1 and 2 sequences upstream. Several observations suggest that AUG codons 3 and 4 are efficient initiation sites; however, these sequences do not act as positive regulatory elements when placed upstream from URF 1. This result suggests that efficient translation is only one of the important properties of the 5' proximal URFs in GCN4 mRNA. We propose that a second property is the ability to permit reinitiation following termination of translation and that URF 1 is optimized for this regulatory function. PMID:3065626

  3. Integration of small RNAs, degradome and transcriptome sequencing in hyperaccumulator Sedum alfredii uncovers a complex regulatory network and provides insights into cadmium phytoremediation.

    PubMed

    Han, Xiaojiao; Yin, Hengfu; Song, Xixi; Zhang, Yunxing; Liu, Mingying; Sang, Jiang; Jiang, Jing; Li, Jihong; Zhuo, Renying

    2016-06-01

    The hyperaccumulating ecotype of Sedum alfredii Hance is a cadmium (Cd)/zinc/lead co-hyperaccumulating species of Crassulaceae. It is a promising phytoremediation candidate accumulating substantial heavy metal ions without obvious signs of poisoning. However, few studies have focused on the regulatory roles of miRNAs and their targets in the hyperaccumulating ecotype of S. alfredii. Here, we combined analyses of the transcriptomics, sRNAs and the degradome to generate a comprehensive resource focused on identifying key regulatory miRNA-target circuits under Cd stress. A total of 87 721 unigenes and 356 miRNAs were identified by deep sequencing, and 79 miRNAs were differentially expressed under Cd stress. Furthermore, 754 target genes of 194 miRNAs were validated by degradome sequencing. A gene ontology (GO) enrichment analysis of differential miRNA targets revealed that auxin, redox-related secondary metabolism and metal transport pathways responded to Cd stress. An integrated analysis uncovered 39 pairs of miRNA targets that displayed negatively correlated expression profiles. Ten miRNA-target pairs also exhibited negative correlations according to a real-time quantitative PCR analysis. Moreover, a coexpression regulatory network was constructed based on profiles of differentially expressed genes. Two hub genes, ARF4 (auxin response factor 4) and AAP3 (amino acid permease 3), which might play central roles in the regulation of Cd-responsive genes, were uncovered. These results suggest that comprehensive analyses of the transcriptomics, sRNAs and the degradome provided a useful platform for investigating Cd hyperaccumulation in S. alfredii, and may provide new insights into the genetic engineering of phytoremediation.

  4. Common and distinguishing regulatory and expression characteristics of the highly related KorB proteins of streptomycete plasmids pIJ101 and pSB24.2.

    PubMed

    Ducote, Matthew J; Pettis, Gregg S

    2003-07-01

    The conjugative plasmid pIJ101 of the spore-forming bacterium Streptomyces lividans contains a regulatory gene, korB, whose product is required to repress potentially lethal expression of the pIJ101 kilB gene. The KorB protein also autoregulates korB gene expression and may be involved in control of pIJ101 copy number. KorB (pIJ101) is expressed as a 10-kDa protein in S. lividans that is immediately processed to a mature 6-kDa repressor molecule. The conjugative Streptomyces cyanogenus plasmid pSB24.1 is deleted upon entry into S. lividans to form pSB24.2, a nonconjugative derivative that contains a korB gene nearly identical to that of pIJ101. Previous evidence that korB of pSB24.2 is capable of overriding pIJ101 kilB-associated lethality supported the notion that pIJ101 and pSB24.2 encode highly related, perhaps even identical conjugation systems. Here we show that KorB (pIJ101) and KorB (pSB24.2) repress transcription from the pIJ101 kilB promoter equally well, although differences exist with respect to their interactions with kilB promoter sequences. Despite high sequence and functional similarities, KorB (pSB24.2) was found to exist as multiple stable forms ranging in size from 10 to 6 kDa both in S. lividans and S. cyanogenus. Immediate processing of KorB (pIJ101) exclusively to the 6-kDa repressor form meanwhile was conserved between the two species. A feature common to both proteins was a marked increase in expression or accumulation upon sporulation, an occurrence that may indicate a particular need for increased quantities of this regulatory protein upon spore germination and resumption of active growth of plasmid-containing cells.

  5. Common and Distinguishing Regulatory and Expression Characteristics of the Highly Related KorB Proteins of Streptomycete Plasmids pIJ101 and pSB24.2

    PubMed Central

    Ducote, Matthew J.; Pettis, Gregg S.

    2003-01-01

    The conjugative plasmid pIJ101 of the spore-forming bacterium Streptomyces lividans contains a regulatory gene, korB, whose product is required to repress potentially lethal expression of the pIJ101 kilB gene. The KorB protein also autoregulates korB gene expression and may be involved in control of pIJ101 copy number. KorB (pIJ101) is expressed as a 10-kDa protein in S. lividans that is immediately processed to a mature 6-kDa repressor molecule. The conjugative Streptomyces cyanogenus plasmid pSB24.1 is deleted upon entry into S. lividans to form pSB24.2, a nonconjugative derivative that contains a korB gene nearly identical to that of pIJ101. Previous evidence that korB of pSB24.2 is capable of overriding pIJ101 kilB-associated lethality supported the notion that pIJ101 and pSB24.2 encode highly related, perhaps even identical conjugation systems. Here we show that KorB (pIJ101) and KorB (pSB24.2) repress transcription from the pIJ101 kilB promoter equally well, although differences exist with respect to their interactions with kilB promoter sequences. Despite high sequence and functional similarities, KorB (pSB24.2) was found to exist as multiple stable forms ranging in size from 10 to 6 kDa both in S. lividans and S. cyanogenus. Immediate processing of KorB (pIJ101) exclusively to the 6-kDa repressor form meanwhile was conserved between the two species. A feature common to both proteins was a marked increase in expression or accumulation upon sporulation, an occurrence that may indicate a particular need for increased quantities of this regulatory protein upon spore germination and resumption of active growth of plasmid-containing cells. PMID:12813071

  6. Differentiated evolutionary relationships among chordates from comparative alignments of multiple sequences of MyoD and MyoG myogenic regulatory factors.

    PubMed

    Oliani, L C; Lidani, K C F; Gabriel, J E

    2015-10-16

    MyoD and MyoG are transcription factors that have essential roles in myogenic lineage determination and muscle differentiation. The purpose of this study was to compare multiple amino acid sequences of myogenic regulatory proteins to infer evolutionary relationships among chordates. Protein sequences from Mus musculus (P10085 and P12979), human Homo sapiens (P15172 and P15173), bovine Bos taurus (Q7YS82 and Q7YS81), wild pig Sus scrofa (P49811 and P49812), quail Coturnix coturnix (P21572 and P34060), chicken Gallus gallus (P16075 and P17920), rat Rattus norvegicus (Q02346 and P20428), domestic water buffalo Bubalus bubalis (D2SP11 and A7L034), and sheep Ovis aries (Q90477 and D3YKV7) were searched from a non-redundant protein sequence database UniProtKB/Swiss-Prot, and subsequently analyzed using the Mega6.0 software. MyoD evolutionary analyses revealed the presence of three main clusters with all mammals branched in one cluster, members of the order Rodentia (mouse and rat) in a second branch linked to the first, and birds of the order Galliformes (chicken and quail) remaining isolated in a third. MyoG evolutionary analyses aligned sequences in two main clusters, all mammalian specimens grouped in different sub-branches, and birds clustered in a second branch. These analyses suggest that the evolution of MyoD and MyoG was driven by different pathways.

  7. DNA sequence of Rhizobium trifolii nodulation genes reveals a reiterated and potentially regulatory sequence preceding nodABC and nodFE.

    PubMed Central

    Schofield, P R; Watson, J M

    1986-01-01

    The Rhizobium trifolii nod genes required for host-specific nodulation of clovers are located on 14 kb of Sym (symbiotic) plasmid DNA. Analysis of the nucleotide sequence of a 3.7 kb portion of this region has revealed open reading frames corresponding to the nodABCDEF genes. A DNA sequencing technique, using primer extension from within Tn5, has been used to determine the precise locations of Tn5 mutations within the nod genes, and the phenotypes of the corresponding mutants correlate with their mapped locations. The predicted nodA and nodB genes overlap by four nucleotides and the nodF and nodE genes overlap by a single nucleotide, suggesting that translational coupling may ensure the synthesis of equimolar amounts of these gene products. The nodABC and nodFE genes constitute separate transcriptional units and each is preceded by a conserved 76-bp sequence which may be involved in the regulation of expression of these genes. PMID:3008100

  8. Epidemiological study on the penicillin resistance of clinical Streptococcus pneumoniae isolates identified as the common sequence types.

    PubMed

    Wei, Gao; Wei, Shi; Changhui, Chen; Denian, Wen; Jin, Tian; Kaihu, Yao

    2016-10-20

    There are some limitations in the current interpretation of the penicillin resistance mechanism of clinical Streptococcus pneumoniae isolates at the strain level. To explore the possibility of studying the mechanism based on the sequence types (STs) of this bacterium, 488 isolates collected in Beijing from 1997 to 2014 and 88 isolates collected in Youyang County, Chongqing and Zhongjiang County, Sichuan in 2015 were analyzed by penicillin minimum inhibitory concentration (MIC) distribution and annual distribution. The results showed that the penicillin MICs of all isolates belonging to a given ST in Beijing fell within a defined range, either <0.25 mg/L or ≥0.25 mg/L, except for ST342. The isolates with penicillin MIC <0.25 mg/L were mainly collected before 2001, after which isolates with MIC ≥0.25 mg/L appeared and gradually became the major population. This temporal pattern, however, was not obvious for any specific ST. The isolates belonging to any given ST could show different penicillin MICs in the first few years after the ST was identified. The penicillin MIC of isolates identified as common STs and collected in Youyang County, Chongqing and Zhongjiang County, Sichuan, including ST271, ST320 and ST81, was around 0.25~2 mg/L (≥0.25 mg/L). Our study revealed the epidemiological distribution of penicillin MICs of the given STs determined in clinical S. pneumoniae isolates, suggesting that it is reasonable to research the penicillin resistance mechanism based on the STs of this bacterium.

  9. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico

    PubMed Central

    2014-01-01

    Performing genetic studies in multiple human populations can identify disease risk alleles that are common in one population but rare in others [1], with the potential to illuminate pathophysiology, health disparities, and the population genetic origins of disease alleles. We analyzed 9.2 million single nucleotide polymorphisms (SNPs) in each of 8,214 Mexicans and Latin Americans: 3,848 with type 2 diabetes (T2D) and 4,366 non-diabetic controls. In addition to replicating previous findings [2–4], we identified a novel locus associated with T2D at genome-wide significance spanning the solute carriers SLC16A11 and SLC16A13 (P = 3.9 × 10(-13); odds ratio (OR) = 1.29). The association was stronger in younger, leaner people with T2D, and replicated in independent samples (P = 1.1 × 10(-4); OR = 1.20). The risk haplotype carries four amino acid substitutions, all in SLC16A11; it is present at ≈50% frequency in Native American samples and ≈10% in East Asian, but rare in European and African samples. Analysis of an archaic genome sequence indicated the risk haplotype introgressed into modern humans via admixture with Neandertals. The SLC16A11 mRNA is expressed in liver, and V5-tagged SLC16A11 protein localizes to the endoplasmic reticulum. Expression of SLC16A11 in heterologous cells alters lipid metabolism, most notably causing an increase in intracellular triacylglycerol levels. Despite T2D having been well studied by genome-wide association studies (GWAS) in other populations, analysis in Mexican and Latin American individuals identified SLC16A11 as a novel candidate gene for T2D with a possible role in triacylglycerol metabolism. PMID:24390345

  10. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico.

    PubMed

    Williams, Amy L; Jacobs, Suzanne B R; Moreno-Macías, Hortensia; Huerta-Chagoya, Alicia; Churchhouse, Claire; Márquez-Luna, Carla; García-Ortíz, Humberto; Gómez-Vázquez, María José; Burtt, Noël P; Aguilar-Salinas, Carlos A; González-Villalpando, Clicerio; Florez, Jose C; Orozco, Lorena; Haiman, Christopher A; Tusié-Luna, Teresa; Altshuler, David

    2014-02-06

    Performing genetic studies in multiple human populations can identify disease risk alleles that are common in one population but rare in others, with the potential to illuminate pathophysiology, health disparities, and the population genetic origins of disease alleles. Here we analysed 9.2 million single nucleotide polymorphisms (SNPs) in each of 8,214 Mexicans and other Latin Americans: 3,848 with type 2 diabetes and 4,366 non-diabetic controls. In addition to replicating previous findings, we identified a novel locus associated with type 2 diabetes at genome-wide significance spanning the solute carriers SLC16A11 and SLC16A13 (P = 3.9 × 10(-13); odds ratio (OR) = 1.29). The association was stronger in younger, leaner people with type 2 diabetes, and replicated in independent samples (P = 1.1 × 10(-4); OR = 1.20). The risk haplotype carries four amino acid substitutions, all in SLC16A11; it is present at ~50% frequency in Native American samples and ~10% in east Asian, but is rare in European and African samples. Analysis of an archaic genome sequence indicated that the risk haplotype introgressed into modern humans via admixture with Neanderthals. The SLC16A11 messenger RNA is expressed in liver, and V5-tagged SLC16A11 protein localizes to the endoplasmic reticulum. Expression of SLC16A11 in heterologous cells alters lipid metabolism, most notably causing an increase in intracellular triacylglycerol levels. Despite type 2 diabetes having been well studied by genome-wide association studies in other populations, analysis in Mexican and Latin American individuals identified SLC16A11 as a novel candidate gene for type 2 diabetes with a possible role in triacylglycerol metabolism.

  11. RNA sequencing of laser-capture microdissected compartments of the maize kernel identifies regulatory modules associated with endosperm cell differentiation.

    PubMed

    Zhan, Junpeng; Thakare, Dhiraj; Ma, Chuang; Lloyd, Alan; Nixon, Neesha M; Arakaki, Angela M; Burnett, William J; Logan, Kyle O; Wang, Dongfang; Wang, Xiangfeng; Drews, Gary N; Yadegari, Ramin

    2015-03-01

    Endosperm is an absorptive structure that supports embryo development or seedling germination in angiosperms. The endosperm of cereals is a main source of food, feed, and industrial raw materials worldwide. However, the genetic networks that regulate endosperm cell differentiation remain largely unclear. As a first step toward characterizing these networks, we profiled the mRNAs in five major cell types of the differentiating endosperm and in the embryo and four maternal compartments of the maize (Zea mays) kernel. Comparisons of these mRNA populations revealed the diverged gene expression programs between filial and maternal compartments and an unexpected close correlation between embryo and the aleurone layer of endosperm. Gene coexpression network analysis identified coexpression modules associated with single or multiple kernel compartments including modules for the endosperm cell types, some of which showed enrichment of previously identified temporally activated and/or imprinted genes. Detailed analyses of a coexpression module highly correlated with the basal endosperm transfer layer (BETL) identified a regulatory module activated by MRP-1, a regulator of BETL differentiation and function. These results provide a high-resolution atlas of gene activity in the compartments of the maize kernel and help to uncover the regulatory modules associated with the differentiation of the major endosperm cell types. © 2015 American Society of Plant Biologists. All rights reserved.

  12. High-Throughput Sequencing Reveals H2O2 Stress-Associated MicroRNAs and a Potential Regulatory Network in Brachypodium distachyon Seedlings

    PubMed Central

    Lv, Dong-Wen; Zhen, Shoumin; Zhu, Geng-Rui; Bian, Yan-Wei; Chen, Guan-Xing; Han, Cai-Xia; Yu, Zi-Tong; Yan, Yue-Ming

    2016-01-01

    Oxidative stress in plants can be triggered by many environmental stress factors, such as drought and salinity. Brachypodium distachyon is a model organism for the study of biofuel plants and crops, such as wheat. Although recent studies have found many oxidative stress response-related proteins, the mechanism of microRNA (miRNA)-mediated oxidative stress response is still unclear. Using next generation high-throughput sequencing technology, the small RNAs were sequenced from the model plant B. distachyon 21 (Bd21) under H2O2 stress and normal growth conditions. In total, 144 known B. distachyon miRNAs and 221 potential new miRNAs were identified. Further analysis of potential new miRNAs suggested that 36 could be clustered into known miRNA families, while the remaining 185 were identified as B. distachyon-specific new miRNAs. Differential analysis of miRNAs from the normal and H2O2 stress libraries identified 31 known and 30 new H2O2 stress responsive miRNAs. The expression patterns of seven representative miRNAs were verified by reverse transcription quantitative polymerase chain reaction (RT-qPCR) analysis, which produced results consistent with those of the deep sequencing method. We also performed RT-qPCR analysis to verify the expression levels of 13 target genes, and the cleavage sites of 5 target genes of known or novel miRNAs were validated experimentally by 5′ RACE. Additionally, a miRNA-mediated gene regulatory network for H2O2 stress response was constructed. Our study identifies a set of H2O2-responsive miRNAs and their target genes and reveals the mechanism of oxidative stress response and defense at the post-transcriptional regulatory level. PMID:27812362

  13. Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing

    PubMed Central

    Crampton, Mollee; Sripathi, Venkateswara R.; Hossain, Khwaja; Kalavacharla, Venu

    2016-01-01

    Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays methylation patterns similar to those of other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation; however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were higher in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar (“Sierra”) using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation. PMID:27199997
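
    The CG, CHG, and CHH contexts named in this abstract follow the usual convention in which H stands for A, C, or T. As a rough illustration only (not the authors' pipeline), the sketch below classifies each cytosine on the forward strand of a sequence by its context. The function names and the example sequence are hypothetical, and reverse-strand cytosines are ignored for brevity.

```python
from collections import Counter

H_BASES = "ACT"  # IUPAC code H: any base except G


def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at index i of a 5'->3' sequence as CG, CHG, or CHH."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    nxt = seq[i + 1 : i + 3]  # up to two downstream bases
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) == 2 and nxt[0] in H_BASES and nxt[1] == "G":
        return "CHG"
    if len(nxt) == 2 and nxt[0] in H_BASES and nxt[1] in H_BASES:
        return "CHH"
    return "unknown"  # too close to the sequence end, or an ambiguous base


def count_contexts(seq: str) -> Counter:
    """Tally forward-strand cytosine contexts (hypothetical helper)."""
    return Counter(
        cytosine_context(seq, i) for i, base in enumerate(seq.upper()) if base == "C"
    )


# Hypothetical example sequence, not taken from the study.
counts = count_contexts("CGCAGCTTC")
```

    Per-context methylation levels such as those quoted above would then be computed separately within each class, given methylated and unmethylated read counts per cytosine, which this sketch does not model.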

  14. Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing.

    PubMed

    Crampton, Mollee; Sripathi, Venkateswara R; Hossain, Khwaja; Kalavacharla, Venu

    2016-01-01

    Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays methylation patterns similar to those of other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation; however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were higher in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar ("Sierra") using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation.

  15. Developmental appearance of factors that bind specifically to cis-regulatory sequences of a gene expressed in the sea urchin embryo.

    PubMed

    Calzone, F J; Thézé, N; Thiebaud, P; Hill, R L; Britten, R J; Davidson, E H

    1988-09-01

    Previous gene-transfer experiments have identified a 2500-nucleotide 5' domain of the CyIIIa cytoskeletal actin gene, which contains cis-regulatory sequences that are necessary and sufficient for spatial and temporal control of CyIIIa gene expression during embryogenesis. This gene is activated in late cleavage, exclusively in aboral ectoderm cell lineages. In this study, we focus on interactions demonstrated in vitro between sequences of the regulatory domain and proteins present in crude extracts derived from sea urchin embryo nuclei and from unfertilized eggs. Quantitative gel-shift measurements are utilized to estimate minimum numbers of factor molecules per embryo at 24 hr postfertilization, when the CyIIIa gene is active, at 7 hr, when it is still silent, and in the unfertilized egg. We also estimate the binding affinity preferences (Kr) of the various factors for their respective sites, relative to their affinity for synthetic DNA competitors. At least 14 different specific interactions occur within the regulatory regions, some of which produce multiple DNA-protein complexes. Values of Kr range from approximately 2 x 10^4 to approximately 2 x 10^6 for these factors under the conditions applied. With one exception, the minimum factor prevalences that we measured in the 400-cell 24-hr embryo nuclear extracts fell within the range of 2 x 10^5 to 2 x 10^6 molecules per embryo, i.e., a few hundred to a few thousand molecules per nucleus. Three developmental patterns were observed with respect to factor prevalence: Factors reacting at one site were found in unfertilized egg cytoplasm at about the same level per egg or embryo as in 24-hr embryo nuclei; factors reacting with five other regions of the regulatory domain are not detectable in egg cytoplasm but in 7-hr mid-cleavage-stage embryo nuclei are already at or close to their concentrations in the 24-hr embryo nuclei; and factors reacting with five additional regions are not detectable in egg cytoplasm and
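
    The per-nucleus figure quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only and assumes one nucleus per cell in the 400-cell embryo and an even distribution of factor molecules across nuclei; neither assumption is stated in the abstract, and the names are hypothetical.

```python
NUCLEI_PER_EMBRYO = 400  # the abstract describes a 400-cell 24-hr embryo


def per_nucleus(molecules_per_embryo: float, nuclei: int = NUCLEI_PER_EMBRYO) -> float:
    """Average molecules per nucleus, assuming an even split across nuclei."""
    return molecules_per_embryo / nuclei


low = per_nucleus(2e5)   # lower bound of the reported per-embryo range
high = per_nucleus(2e6)  # upper bound of the reported per-embryo range
print(f"{low:.0f} to {high:.0f} molecules per nucleus")
```

    Under these assumptions the range works out to 500 to 5000 molecules per nucleus, consistent with the abstract's "a few hundred to a few thousand".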

  16. A re-sequencing based assessment of genomic heterogeneity and fast neutron-induced deletions in a common bean cultivar

    USDA-ARS's Scientific Manuscript database

    A small fast neutron mutant population has been established from Phaseolus vulgaris cv. Red Hawk. We leveraged the available P. vulgaris genome sequence and high throughput next generation DNA sequencing to examine the genomic structure of five Phaseolus vulgaris cv. Red Hawk fast neutron mutants wi...

  17. T Cell Receptor CDR3 Sequence but Not Recognition Characteristics Distinguish Autoreactive Effector and Foxp3+ Regulatory T Cells

    PubMed Central

    Liu, Xin; Nguyen, Phuong; Liu, Wei; Cheng, Cheng; Steeves, Meredith; Obenauer, John C.; Ma, Jing; Geiger, Terrence L.

    2010-01-01

    SUMMARY The source, specificity, and plasticity of the forkhead box transcription factor 3 (Foxp3)+ regulatory T (Treg) and conventional T (Tconv) cell populations active at sites of autoimmune pathology are not well characterized. To evaluate this, we combined global repertoire analyses and functional assessments of isolated T cell receptors (TCR) from TCRα retrogenic mice with autoimmune encephalomyelitis. Treg and Tconv cell TCR repertoires were distinct, and autoantigen-specific Treg and Tconv cells were enriched in diseased tissue. Autoantigen sensitivity and fine specificity of these cells intersected, implying that differences in responsiveness were not responsible for lineage specification. Notably, autoreactive Treg and Tconv cells could be fully distinguished by an acidic versus aliphatic variation at a single TCR CDR3 residue. Our results imply that ontogenically distinct Treg and Tconv cell repertoires with convergent specificities for autoantigen respond during autoimmunity and argue against more than limited plasticity between Treg and Tconv cells during autoimmune inflammation. PMID:20005134

  18. De novo sequencing and assembly of Centella asiatica leaf transcriptome for mapping of structural, functional and regulatory genes with special reference to secondary metabolism.

    PubMed

    Sangwan, Rajender S; Tripathi, Sandhya; Singh, Jyoti; Narnoliya, Lokesh K; Sangwan, Neelam S

    2013-08-01

    Centella asiatica (L.) Urban is an important medicinal plant and has been used since ancient times in traditional systems of medicine. C. asiatica mainly contains ursane skeleton based triterpenoid sapogenins and saponins predominantly in its leaves. This investigation employed Illumina next generation sequencing (NGS) strategy on a pool of three cDNAs from expanding leaf of C. asiatica and developed an assembled transcriptome sequence resource of the plant. The short transcript reads (STRs) generated and assembled into contigs and singletons, representing majority of the genes expressed in C. asiatica, were termed as 'tentative unique transcripts' (TUTs). The TUT dataset was analyzed with the objectives of (i) development of a transcriptome assembly of C. asiatica, and (ii) classification/characterization of the genes into categories like structural, functional, regulatory etc. based on their function. Overall, 68.49% of the 46,171,131 reads generated in the NGS process could be assembled into a total of 79,041 contigs. Gene ontology and functional annotation of sequences resulted into the identification of genes related to different sets of cellular functions including identification of genes related to primary and secondary metabolism. The wet lab validation of seventeen assembled gene sequences identified to be involved in secondary metabolic pathways and control of reactive oxygen species (ROS) was established by semi-quantitative and real time PCR (qRT-PCR). The validation also included sequencing/size matching of a set of semi-quantitative PCR amplicons with their in silico assembled contig/gene. This confirmed the appropriateness of assembling the reads and contigs. Thus, the present study constitutes the largest report to date on C. asiatica transcriptome based gene resource that may contribute substantially to the understanding of the basal biological functions and biochemical pathways of secondary metabolites as well as the transcriptional regulatory

  19. Generation of glucocorticoid-responsive Moloney murine leukemia virus by insertion of regulatory sequences from murine mammary tumor virus into the long terminal repeat.

    PubMed Central

    Overhauser, J; Fan, H

    1985-01-01

    The glucocorticoid-regulatory sequences from the murine mammary tumor virus long terminal repeat (MMTV LTR) were introduced into the LTR of Moloney murine leukemia virus (M-MuLV) by recombinant DNA techniques. The site of insertion was in the M-MuLV LTR U3 region at -150 base pairs with respect to the RNA cap site. Infectious M-MuLVs carrying the altered LTRs (Mo + MMTV M-MuLVs) were recovered by transfection of proviral clones into NIH-3T3 cells. The Mo + MMTV M-MuLVs were hormonally responsive in that infection was 3 logs more efficient when performed in the presence of dexamethasone, irrespective of the orientation of the inserted MMTV sequences. However, even in the presence of hormone, the Mo + MMTV M-MuLVs were less infectious than wild-type M-MuLV. In contrast to the large effect on infectivity, dexamethasone induced virus-specific RNA levels in chronically Mo + MMTV M-MuLV-infected cells only two- to fourfold. Fusion plasmids between the altered LTRs and the bacterial chloramphenicol acetyltransferase gene allowed the investigation of LTR promoter strength by the transient chloramphenicol acetyltransferase expression assay. The chloramphenicol acetyltransferase assays indicated that the insertion of MMTV sequences into the M-MuLV LTR reduced promoter activity in the absence of glucocorticoids but that promoter activity could be induced two- to fivefold by dexamethasone. The Mo + MMTV M-MuLVs were also tested for the possibility that viral DNA synthesis or integration during initial infection was enhanced by dexamethasone. However, no significant difference was detected between cultures infected in the presence or absence of hormone. The insertion of MMTV sequences into an M-MuLV LTR deleted of its enhancer sequences did not yield infectious virus or active promoters, even in the presence of dexamethasone. PMID:2983110

  20. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions

    PubMed Central

    Nolte-’t Hoen, Esther N. M.; Buermans, Henk P. J.; Waasdorp, Maaike; Stoorvogel, Willem; Wauben, Marca H. M.; ’t Hoen, Peter A. C.

    2012-01-01

    Cells release RNA-carrying vesicles and membrane-free RNA/protein complexes into the extracellular milieu. Horizontal vesicle-mediated transfer of such shuttle RNA between cells allows dissemination of genetically encoded messages, which may modify the function of target cells. Other studies used array analysis to establish the presence of microRNAs and mRNA in cell-derived vesicles from many sources. Here, we used an unbiased approach by deep sequencing of small RNA released by immune cells. We found a large variety of small non-coding RNA species representing pervasive transcripts or RNA cleavage products overlapping with protein coding regions, repeat sequences or structural RNAs. Many of these RNAs were enriched relative to cellular RNA, indicating that cells destine specific RNAs for extracellular release. Among the most abundant small RNAs in shuttle RNA were sequences derived from vault RNA, Y-RNA and specific tRNAs. Many of the highly abundant small non-coding transcripts in shuttle RNA are evolutionarily well conserved and have previously been associated with gene regulatory functions. These findings allude to a wider range of biological effects that could be mediated by shuttle RNA than previously expected. Moreover, the data present leads for unraveling how cells modify the function of other cells via transfer of specific non-coding RNA species. PMID:22821563

  1. A site-specific, single-copy transgenesis strategy to identify 5' regulatory sequences of the mouse testis-determining gene Sry.

    PubMed

    Quinn, Alexander; Kashimada, Kenichi; Davidson, Tara-Lynne; Ng, Ee Ting; Chawengsaksophak, Kallayanee; Bowles, Josephine; Koopman, Peter

    2014-01-01

    The Y-chromosomal gene SRY acts as the primary trigger for male sex determination in mammalian embryos. Correct regulation of SRY is critical: aberrant timing or level of Sry expression is known to disrupt testis development in mice and we hypothesize that mutations that affect regulation of human SRY may account for some of the many cases of XY gonadal dysgenesis that currently remain unexplained. However, the cis-sequences involved in regulation of Sry have not been identified, precluding a test of this hypothesis. Here, we used a transgenic mouse approach aimed at identifying mouse Sry 5' flanking regulatory sequences within 8 kb of the Sry transcription start site (TSS). To avoid problems associated with conventional pronuclear injection of transgenes, we used a published strategy designed to yield single-copy transgene integration at a defined, transcriptionally open, autosomal locus, Col1a1. None of the Sry transgenes tested was expressed at levels compatible with activation of Sox9 or XX sex reversal. Our findings indicate either that the Col1a1 locus does not provide an appropriate context for the correct expression of Sry transgenes, or that the cis-sequences required for Sry expression in the developing gonads lie beyond 8 kb 5' of the TSS.

  2. ChIP and Re-ChIP assays: investigating interactions between regulatory proteins, histone modifications, and the DNA sequences to which they bind.

    PubMed

    Truax, Agnieszka D; Greer, Susanna F

    2012-01-01

    Chromatin immunoprecipitation (ChIP) assays were developed in order to comprehensively describe physiological interactions between DNA sequences, transcriptional regulators, and the modification status of associated chromatin. In ChIP assays, living cells are treated with chemical cross-linkers to covalently bind proteins to each other and to their DNA targets. Once cross-linked to associated proteins, chromatin is extracted and fragmented by sonication and protein-DNA complexes are isolated using specific antibodies against a target protein. The cross-links that bind proteins to DNA are then reversed, and purified DNA fragments are analyzed by qPCR to determine if a specific sequence is present. As DNA regulatory elements frequently rely on the interaction of multiple transcription factors and cofactors to regulate gene expression, Re-ChIP methods were developed to allow for the identification of multiple (concurrently binding) proteins on a single DNA sequence. Re-ChIP assays have enabled the analysis of multiple, simultaneous, posttranslational modifications to histones in order to determine the combinatorial pattern of modifications associated with transcriptional status of a gene. Together, ChIP and Re-ChIP have contributed to the elucidation of the epigenetic code-regulating gene expression and have enhanced our understanding of physiological binding of proteins to DNA targets. The protocols that follow describe general strategies used to perform ChIP and Re-ChIP assays for the study of specific protein-DNA interactions.

  3. Structural analysis of the regulatory elements of the type-II procollagen gene. Conservation of promoter and first intron sequences between human and mouse.

    PubMed Central

    Vikkula, M; Metsäranta, M; Syvänen, A C; Ala-Kokko, L; Vuorio, E; Peltonen, L

    1992-01-01

    Transcription of the type-II procollagen gene (COL2A1) is very specifically restricted to a limited number of tissues, particularly cartilages. In order to identify transcription-control motifs we have sequenced the promoter region and the first intron of the human and mouse COL2A1 genes. With the assumption that these motifs should be well conserved during evolution, we have searched for potential elements important for the tissue-specific transcription of the COL2A1 gene by aligning the two sequences with each other and with the available rat type-II procollagen sequence for the promoter. With this approach we could identify specific evolutionarily well-conserved motifs in the promoter area. On the other hand, several suggested regulatory elements in the promoter region did not show evolutionary conservation. In the middle of the first intron we found a cluster of well-conserved transcription-control elements and we conclude that these conserved motifs most probably possess a significant function in the control of the tissue-specific transcription of the COL2A1 gene. We also describe locations of additional, highly conserved nucleotide stretches, which are good candidate regions in the search for binding sites of yet-uncharacterized cartilage-specific transcription regulators of the COL2A1 gene. PMID:1637314

  4. Identifying Distal cis-acting Gene-Regulatory Sequences by Expressing BACs Functionalized with loxP-Tn10 Transposons in Zebrafish.

    PubMed

    Chatterjee, Pradeep K; Shakes, Leighcraft A; Wolf, Hope M; Mujalled, Mohammad A; Zhou, Constance; Hatcher, Charles; Norford, Derek C

    2013-06-21

    Bacterial Artificial Chromosomes (BACs) are large pieces of DNA from the chromosomes of organisms propagated faithfully in bacteria as large extra-chromosomal plasmids. Expression of genes contained in BACs can be monitored after functionalizing the BAC DNA with reporter genes and other sequences that allow stable maintenance and propagation of the DNA in the new host organism. The DNA in BACs can be altered within its bacterial host in several ways. Here we discuss one such approach, using Tn10 mini-transposons, to introduce exogenous sequences into BACs for a variety of purposes. The largely random insertions of Tn10 transposons carrying lox sites have been used to position mammalian cell-selectable antibiotic resistance genes, enhancer-traps and inverted repeat ends of the vertebrate transposon Tol2 precisely at the ends of the genomic DNA insert in BACs. These modified BACs are suitable for expression in zebrafish or mouse, and have been used to functionally identify important long-range gene regulatory sequences in both species. Enhancer-trapping using BACs should prove uniquely useful in analyzing multiple discontinuous DNA domains that act in concert to regulate expression of a gene, and is not limited by genome accessibility issues of traditional enhancer-trapping methods.

  5. On universal common ancestry, sequence similarity, and phylogenetic structure: the sins of P-values and the virtues of Bayesian evidence

    PubMed Central

    2011-01-01

    Background The universal common ancestry (UCA) of all known life is a fundamental component of modern evolutionary theory, supported by a wide range of qualitative molecular evidence. Nevertheless, recently both the status and nature of UCA has been questioned. In earlier work I presented a formal, quantitative test of UCA in which model selection criteria overwhelmingly choose common ancestry over independent ancestry, based on a dataset of universally conserved proteins. These model-based tests are founded in likelihoodist and Bayesian probability theory, in opposition to classical frequentist null hypothesis tests such as Karlin-Altschul E-values for sequence similarity. In a recent comment, Koonin and Wolf (K&W) claim that the model preference for UCA is "a trivial consequence of significant sequence similarity". They support this claim with a computational simulation, derived from universally conserved proteins, which produces similar sequences lacking phylogenetic structure. The model selection tests prefer common ancestry for this artificial data set. Results For the real universal protein sequences, hierarchical phylogenetic structure (induced by genealogical history) is the overriding reason for why the tests choose UCA; sequence similarity is a relatively minor factor. First, for cases of conflicting phylogenetic structure, the tests choose independent ancestry even with highly similar sequences. Second, certain models, like star trees and K&W's profile model (corresponding to their simulation), readily explain sequence similarity yet lack phylogenetic structure. However, these are extremely poor models for the real proteins, even worse than independent ancestry models, though they explain K&W's artificial data well. Finally, K&W's simulation is an implementation of a well-known phylogenetic model, and it produces sequences that mimic homologous proteins. Therefore the model selection tests work appropriately with the artificial data. Conclusions For K

  6. On universal common ancestry, sequence similarity, and phylogenetic structure: the sins of P-values and the virtues of Bayesian evidence.

    PubMed

    Theobald, Douglas L

    2011-11-24

    The universal common ancestry (UCA) of all known life is a fundamental component of modern evolutionary theory, supported by a wide range of qualitative molecular evidence. Nevertheless, recently both the status and nature of UCA has been questioned. In earlier work I presented a formal, quantitative test of UCA in which model selection criteria overwhelmingly choose common ancestry over independent ancestry, based on a dataset of universally conserved proteins. These model-based tests are founded in likelihoodist and Bayesian probability theory, in opposition to classical frequentist null hypothesis tests such as Karlin-Altschul E-values for sequence similarity. In a recent comment, Koonin and Wolf (K&W) claim that the model preference for UCA is "a trivial consequence of significant sequence similarity". They support this claim with a computational simulation, derived from universally conserved proteins, which produces similar sequences lacking phylogenetic structure. The model selection tests prefer common ancestry for this artificial data set. For the real universal protein sequences, hierarchical phylogenetic structure (induced by genealogical history) is the overriding reason for why the tests choose UCA; sequence similarity is a relatively minor factor. First, for cases of conflicting phylogenetic structure, the tests choose independent ancestry even with highly similar sequences. Second, certain models, like star trees and K&W's profile model (corresponding to their simulation), readily explain sequence similarity yet lack phylogenetic structure. However, these are extremely poor models for the real proteins, even worse than independent ancestry models, though they explain K&W's artificial data well. Finally, K&W's simulation is an implementation of a well-known phylogenetic model, and it produces sequences that mimic homologous proteins. Therefore the model selection tests work appropriately with the artificial data. For K&W's artificial protein data

  7. Characterization of the cis elements in the proximal promoter regions of the anthocyanin pathway genes reveals a common regulatory logic that governs pathway regulation

    PubMed Central

    Zhu, Zhixin; Wang, Hailong; Wang, Yiting; Guan, Shan; Wang, Fang; Tang, Jingyu; Zhang, Ruijuan; Xie, Lulu; Lu, Yingqing

    2015-01-01

    Cellular activities such as compound synthesis often require the transcriptional activation of an entire pathway; however, the molecular mechanisms underlying pathway activation have rarely been explained. Here, the cis regulatory architecture of the anthocyanin pathway genes targeted by the transcription factor (TF) complex including MYB, bHLH, and WDR was systematically analysed in one species and the findings extended to others. In Ipomoea purpurea, the IpMYB1-IpbHLH2-IpWDR1 (IpMBW) complex was found to be orthologous to the PAP1-GL3-TTG1 (AtPGT) complex of Arabidopsis thaliana, and interacted with a 7-bp MYB-recognizing element (MRE) and a 6-bp bHLH-recognizing element (BRE) at the proximal promoter region of the pathway genes. There was little transcription of the gene in the absence of the MRE or BRE. The cis elements identified experimentally converged on two syntaxes, ANCNNCC for MREs and CACN(A/C/T)(G/T) for BREs, and our bioinformatic analysis showed that these were present within anthocyanin gene promoters in at least 35 species, including both gymnosperms and angiosperms. For the anthocyanin pathway, IpMBW and AtPGT recognized the interspecific promoters of both early and later genes. In A. thaliana, the seed-specific TF complex (TT2, TT8, and TTG1) may regulate all the anthocyanin pathway genes, in addition to the proanthocyanidin-specific BAN. When multiple TF complexes in the anthocyanin pathway were compared, the cis architecture played a role larger than the TF complex in determining the variation in promoter activity. Collectively, a cis logic common to the pathway gene promoters was found, and this logic is essential for the trans factors to regulate the pathway. PMID:25911741
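
    The two consensus syntaxes reported in this abstract, ANCNNCC for MREs and CACN(A/C/T)(G/T) for BREs, translate directly into regular expressions. The sketch below is a minimal illustration, not the authors' bioinformatic method: it scans only the forward strand of an unambiguous A/C/G/T sequence, uses a lookahead so that overlapping sites are all reported, and runs on a made-up promoter fragment. All identifiers are hypothetical.

```python
import re

# N is expanded to [ACGT]; the lookahead (?=...) lets overlapping sites match.
MRE = re.compile(r"(?=(A[ACGT]C[ACGT]{2}CC))")   # ANCNNCC, 7 bp
BRE = re.compile(r"(?=(CAC[ACGT][ACT][GT]))")    # CACN(A/C/T)(G/T), 6 bp


def find_sites(pattern: re.Pattern, seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) pairs on the forward strand."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical promoter fragment, invented for illustration.
promoter = "TTACCTACCGGCACGTGAA"
mre_sites = find_sites(MRE, promoter)
bre_sites = find_sites(BRE, promoter)
```

    A fuller scan would also search the reverse complement and restrict hits to the proximal promoter region, since the abstract places both elements there.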

  8. Retinoic acid-induced down-regulation of the interleukin-2 promoter via cis-regulatory sequences containing an octamer motif.

    PubMed Central

    Felli, M P; Vacca, A; Meco, D; Screpanti, I; Farina, A R; Maroder, M; Martinotti, S; Petrangeli, E; Frati, L; Gulino, A

    1991-01-01

    Retinoic acid (RA) is known to influence the proliferation and differentiation of a wide variety of transformed and developing cells. We found that RA and the specific RA receptor (RAR) ligand Ch55 inhibited the phorbol ester and calcium ionophore-induced expression of the T-cell growth factor interleukin-2 (IL-2) gene. Expression of transiently transfected chloramphenicol acetyltransferase vectors containing the 5'-flanking region of the IL-2 gene was also inhibited by RA. RA-induced down-regulation of the IL-2 enhancer is mediated by RAR, since overexpression of transfected RARs increased RA sensitivity of the IL-2 promoter. Functional analysis of chloramphenicol acetyltransferase vectors containing either internal deletion mutants of the region from -317 to +47 bp of the IL-2 enhancer or multimerized cis-regulatory elements showed that the RA-responsive element in the IL-2 promoter mapped to sequences containing an octamer motif. RAR also inhibited the transcriptional activity of the octamer motif of the immunoglobulin heavy chain enhancer. In spite of the transcriptional inhibition of the IL-2 octamer motif, RA did not decrease the in vitro DNA-binding capability of octamer-1 protein. These results identify a regulatory pathway within the IL-2 promoter which involves the octamer motif and RAR. PMID:1652063

  9. Nucleotide sequence of the Escherichia coli regulatory gene mprA and construction and characterization of mprA-deficient mutants.

    PubMed Central

    del Castillo, I; González-Pastor, J E; San Millán, J L; Moreno, F

    1991-01-01

    In high copy number, the Escherichia coli mprA gene reduces the synthesis of peptide microcins B17 and C7 (MccB17 and MccC7) and blocks the osmoinduction of the proU operon at the transcriptional level. mprA has been sequenced and shown to encode a polypeptide of 176 amino acids (Mr, 20,563). Insertion and deletion mutant mprA alleles were constructed and then transferred to the chromosome by allelic replacement. In these mutants, expression of two mcb-lacZ fusions was fivefold derepressed, indicating a negative regulatory role of mprA on the mcb operon (MccB17). In contrast, no effect of the MprA- mutations on the expression of mcc operon (MccC7) or on the osmoinduction of proU operon was observed. PMID:1840583

  10. Structure analysis of two Toxoplasma gondii and Neospora caninum satellite DNA families and evolution of their common monomeric sequence.

    PubMed

    Clemente, Marina; de Miguel, Natalia; Lia, Veronica V; Matrajt, Mariana; Angel, Sergio O

    2004-05-01

    A family of repetitive DNA elements of approximately 350 bp (Sat350) that are members of Toxoplasma gondii satellite DNA was further analyzed. Sequence analysis identified at least three distinct repeat types within this family, called types A, B, and C. B repeats were divided into the subtypes B1 and B2. A search for internal repetitions within this family permitted the identification of conserved regions and the design of PCR primers that amplify almost all these repetitive elements. These primers amplified the expected 350-bp repeats and a novel 680-bp repetitive element (Sat680) related to this family. Two additional tandemly repeated high-order structures corresponding to this satellite DNA family were found by searching the Toxoplasma genome database with these sequences. These studies were confirmed by sequence analysis and identified (1) an arrangement of AB1CB2 350-bp repeats and (2) an arrangement of two 350-bp-like repeats, resulting in a 680-bp monomer. Sequence comparison and phylogenetic analysis indicated that both high-order structures may have originated from the same ancestral 350-bp repeat. PCR amplification, sequence analysis and Southern blot showed that similar high-order structures were also found in the Toxoplasma-sister taxon Neospora caninum. The Toxoplasma genome database (http://ToxoDB.org) permitted the assembly of a contig harboring Sat350 elements at one end and a long nonrepetitive DNA sequence flanking this satellite DNA. The region bordering the Sat350 repeats contained two differentially expressed sequence-related regions and interstitial telomeric sequences.

  11. Plasmid pKM101 encodes two nonhomologous antirestriction proteins (ArdA and ArdB) whose expression is controlled by homologous regulatory sequences.

    PubMed Central

    Belogurov, A A; Delver, E P; Rodzevich, O V

    1993-01-01

    The IncN plasmid pKM101 (a derivative of R46) encodes the antirestriction protein ArdB (alleviation of restriction of DNA) in addition to another antirestriction protein, ArdA, described previously. The relevant gene, ardB, was located in the leading region of pKM101, about 7 kb from oriT. The nucleotide sequence of ardB was determined, and an appropriate polypeptide was identified in maxicells of Escherichia coli. Like ArdA, ArdB efficiently inhibits restriction by members of the three known families of type I systems of E. coli and only slightly affects the type II enzyme, EcoRI. However, in contrast to ArdA, ArdB is ineffective against the modification activity of the type I (EcoK) system. Comparison of deduced amino acid sequences of ArdA and ArdB revealed only one small region of similarity (nine residues), suggesting that this region may be somehow involved in the interaction with the type I restriction systems. We also found that the expression of both ardA and ardB genes is controlled jointly by two pKM101-encoded proteins, ArdK and ArdR, with molecular weights of about 15,000 and 20,000, respectively. The finding that the sequences immediately upstream of ardA and ardB share about 94% identity over 218 bp suggests that their expression may be controlled by ArdK and ArdR at the transcriptional level. Deletion studies and promoter probe analysis of these sequences revealed the regions responsible for the action of ArdK and ArdR as regulatory proteins. We propose that both types of antirestriction proteins may play a pivotal role in overcoming the host restriction barrier by self-transmissible broad-host-range plasmids. It seems likely that the ardKR-dependent regulatory system serves in this case as a genetic switch that controls the expression of plasmid-encoded antirestriction functions during mating. PMID:8393008

  12. Preliminary whole-exome sequencing reveals mutations that imply common tumorigenicity pathways in multiple endocrine neoplasia type 1 patients

    PubMed Central

    Arenas, Minerva Angélica Romero; Fowler, Richard G.; Lucas, F. Anthony San; Shen, Jie; Rich, Thereasa A.; Grubbs, Elizabeth G.; Lee, Jeffrey E.; Scheet, Paul; Perrier, Nancy D.; Zhao, Hua

    2016-01-01

    Background Whole-exome sequencing studies have not established definitive somatic mutation patterns among patients with sporadic hyperparathyroidism (HPT). No sequencing has evaluated multiple endocrine neoplasia type 1 (MEN1)-related HPT. We sought to perform whole-exome sequencing in HPT patients to identify somatic mutations and associated biological pathways and tumorigenic networks. Methods Whole-exome sequencing was performed on blood and tissue from HPT patients (MEN1 and sporadic) and somatic single nucleotide variants (SNVs) were identified. Stop-gain and stop-loss SNVs were analyzed with Ingenuity Pathways Analysis (IPA). Loss of heterozygosity (LOH) was also assessed. Results Sequencing was performed on 4 MEN1 and 10 sporadic cases. Eighteen stop-gain/stop-loss SNV mutations were identified in 3 MEN1 patients. One complex network was identified on IPA: Cellular function and maintenance, tumor morphology, and cardiovascular disease (IPA score = 49). A nonsynonymous SNV of TP53 (lysine-to-glutamic acid change at codon 81) identified in a MEN1 patient was suggested to be a driver mutation (Cancer-specific High-throughput Annotation of Somatic Mutations; P = .002). All MEN1 and 3/10 sporadic specimens demonstrated LOH of chromosome 11. Conclusion Whole-exome sequencing revealed somatic mutations in MEN1 associated with a single tumorigenic network, whereas sporadic pathogenesis seemed to be more diverse. A somatic TP53 mutation was also identified. LOH of chromosome 11 was seen in all MEN1 and 3 of 10 sporadic patients. PMID:25456907

  13. High frequency of HMW-GS sequence variation through somatic hybridization between Agropyron elongatum and common wheat.

    PubMed

    Gao, Xin; Liu, Shu Wei; Sun, Qun; Xia, Guang Min

    2010-01-01

    A symmetric somatic hybridization was performed to combine the protoplasts of tall wheatgrass (Agropyron elongatum) and bread wheat (Triticum aestivum). Fertile regenerants were obtained which were morphologically similar to tall wheatgrass, but which contained some introgression segments from wheat. An SDS-PAGE analysis showed that a number of non-parental high-molecular weight glutenin subunits (HMW-GS) were present in the symmetric somatic hybridization derivatives. These sequences were amplified, cloned and sequenced, to deliver 14 distinct HMW-GS coding sequences, eight of which were of the y-type (Hy1-Hy8) and six x-type (Hx1-Hx6). Five of the cloned HMW-GS sequences were successfully expressed in E. coli. The analysis of their deduced peptide sequences showed that they all possessed the typical HMW-GS primary structure. Sequence alignments indicated that Hx5 and Hy1 were probably derived from the tall wheatgrass genes Aex5 and Aey6, while Hy2, Hy3, Hx1 and Hy6 may have resulted from slippage in the replication of a related biparental gene. We found that both symmetric and asymmetric somatic hybridization could promote the emergence of novel alleles. We discuss the origin of allelic variation of HMW-GS genes in somatic hybridization, which might result from the response to genomic shock triggered by the merger and interaction of the biparental genomes.

  14. Identification of regulatory networks and hub genes controlling soybean seed set and size using RNA sequencing analysis.

    PubMed

    Du, Juan; Wang, Shoudong; He, Cunman; Zhou, Bin; Ruan, Yong-Ling; Shou, Huixia

    2017-04-01

    To understand the gene expression networks controlling soybean seed set and size, transcriptome analyses were performed in three early seed developmental stages, using two genotypes with contrasting seed size. The two-dimensional data set provides a comprehensive and systems-level view on dynamic gene expression networks underpinning soybean seed set and subsequent development. Using pairwise comparisons and weighted gene coexpression network analyses, we identified modules of coexpressed genes and hub genes for each module. Of particular importance are the discoveries of specific modules for the large seed size variety and for seed developmental stages. A large number of candidate regulators for seed size, including those involved in hormonal signaling pathways and transcription factors, were transiently and specifically induced in the early developmental stages. The soybean homologs of a brassinosteroid signaling receptor kinase, a brassinosteroid-signaling kinase, were identified as hub genes operating in the seed coat network in the early seed maturation stage. Overexpression of a candidate seed size regulatory gene, GmCYP78A5, in transgenic soybean resulted in increased seed size and seed weight. Together, these analyses identified a large number of potential key regulators controlling soybean seed set, seed size, and, consequently, yield potential, thereby providing new insights into the molecular networks underlying soybean seed development. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.
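
    The idea behind picking hub genes from a weighted coexpression network can be illustrated with a small sketch. This is a simplified stand-in, not the authors' analysis: WGCNA itself is an R package that additionally uses topological overlap and hierarchical clustering to define modules. Here the adjacency between two genes is taken as |r|^beta (the unsigned soft-threshold form), the gene names and expression values are invented, and beta = 6 is an arbitrary illustrative choice.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def connectivity(expr, beta=6):
    """Sum of |r|**beta between each gene and every other gene.

    expr maps a gene name to its expression values across samples.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += a
            k[h] += a
    return k


# Hypothetical expression values for four genes across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 5.1],
    "geneC": [2.0, 4.1, 6.0, 8.2, 9.9],
    "geneD": [5.0, 1.0, 4.0, 2.0, 3.0],
}

k = connectivity(expr)
hub = max(k, key=k.get)  # the most connected gene is the hub candidate
print(k, hub)
```

    In this toy data set geneA, geneB and geneC rise together and therefore have high connectivity, while geneD is only weakly correlated with the rest and contributes almost nothing after soft thresholding.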

  15. Transposable Elements and DNA Methylation Create in Embryonic Stem Cells Human-Specific Regulatory Sequences Associated with Distal Enhancers and Noncoding RNAs

    PubMed Central

    Glinsky, Gennadi V.

    2015-01-01

    Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in Neanderthals’ genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed that there was 49.7-fold acceleration of creation rates of NANOG-binding sites in genomes of Chimpanzees compared with the mouse genomes and further 5.7-fold acceleration in genomes of Modern Humans compared with the Chimpanzees genomes. Preliminary estimates suggest that emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions

  16. Heterozygous triplication of upstream regulatory sequences leads to dysregulation of matrix metalloproteinase 19 in patients with cavitary optic disc anomaly.

    PubMed

    Hazlewood, Ralph J; Roos, Benjamin R; Solivan-Timpe, Frances; Honkanen, Robert A; Jampol, Lee M; Gieser, Stephen C; Meyer, Kacie J; Mullins, Robert F; Kuehn, Markus H; Scheetz, Todd E; Kwon, Young H; Alward, Wallace L M; Stone, Edwin M; Fingert, John H

    2015-03-01

    Patients with a congenital optic nerve disease, cavitary optic disc anomaly (CODA), are born with profound excavation of the optic nerve resembling glaucoma. We previously mapped the gene that causes autosomal-dominant CODA in a large pedigree to a chromosome 12q locus. Using comparative genomic hybridization and quantitative PCR analysis of this pedigree, we report identifying a 6-Kbp heterozygous triplication upstream of the matrix metalloproteinase 19 (MMP19) gene, present in all 17 affected family members and no normal members. Moreover, the triplication was not detected in 78 control subjects or in the Database of Genomic Variants. We further detected the same 6-Kbp triplication in one of 24 unrelated CODA patients and in none of 172 glaucoma patients. Analysis with a Luciferase assay showed that the 6-Kbp sequence has transcription enhancer activity. A 773-bp fragment of the 6-Kbp DNA segment increased downstream gene expression eightfold, suggesting that triplication of this sequence may lead to dysregulation of the downstream gene, MMP19, in CODA patients. Lastly, immunohistochemical analysis of human donor eyes revealed strong expression of MMP19 in optic nerve head. These data strongly suggest that triplication of an enhancer may lead to overexpression of MMP19 in the optic nerve that causes CODA. © 2015 WILEY PERIODICALS, INC.

  17. Changes in the six most common sequence types of Neisseria gonorrhoeae, including ST4378, identified by surveillance of antimicrobial resistance in northern Taiwan from 2006 to 2013.

    PubMed

    Cheng, Ching-Wai; Li, Lan-Hui; Su, Chen-Yi; Li, Shu-Ying; Yen, Muh-Yong

    2016-10-01

    There has been no longitudinal study of drug susceptibility in Neisseria gonorrhoeae in Taiwan since 2006. We collected 1090 gonococcal isolates from Taipei City Hospital, Taiwan from April 2006 to August 2013. We used a disk diffusion assay to determine the susceptibility to five antibiotics and an E-test to determine the minimum inhibitory concentrations for cefixime and ceftriaxone in isolates with resistance. Neisseria gonorrhoeae-multi Antigen Sequence Typing and DNA sequencing of the por and tbpB genes were used to identify sequence types. Among the 1090 isolates, the resistances to penicillin, ciprofloxacin, cefpodoxime, cefixime, and ceftriaxone were 61.01%, 83.39%, 9.63%, 6.70%, and 2.39%, respectively. The highest minimum inhibitory concentrations of cefixime and ceftriaxone were 0.19 mg/L and 0.50 mg/L, respectively. There were 327 sequence types. The four most common sequence types in homosexuals were ST4378, ST359, ST4654, and ST547; the two most common sequence types in heterosexuals were ST421 and ST419. Each of these sequence types had more than 25 isolates. There were significant differences in the sequence types in patients with different sexual orientations (p < 0.001). Oral cefixime or ceftriaxone injections were used as first-line drugs for the treatment of gonorrhea from 2006 to 2013 because gonorrhea isolates had low minimum inhibitory concentrations for these two drugs. The abrupt emergence of ST4378 (closely related to the notorious ST1407) since 2009 is a cause for alarm. Changes in sexual behavior, including an increase in sexual activity without the use of condoms, may have contributed to the peak in gonorrhea in 2010. Further molecular epidemiological investigations are required. Copyright © 2014. Published by Elsevier B.V.
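
    The resistance percentages above can be converted back into approximate isolate counts. The sketch below assumes every percentage uses all 1090 isolates as the denominator, which the abstract implies but does not state outright; the resulting counts are derived for illustration and are not figures reported by the authors.

```python
N_ISOLATES = 1090

# Resistance rates as given in the abstract (percent of isolates).
resistance_pct = {
    "penicillin": 61.01,
    "ciprofloxacin": 83.39,
    "cefpodoxime": 9.63,
    "cefixime": 6.70,
    "ceftriaxone": 2.39,
}


def implied_count(pct, n):
    """Nearest whole number of isolates corresponding to pct percent of n."""
    return round(pct / 100 * n)


counts = {drug: implied_count(p, N_ISOLATES)
          for drug, p in resistance_pct.items()}

for drug, c in counts.items():
    # Round-trip check: the implied count reproduces the stated percentage.
    back = 100 * c / N_ISOLATES
    print(f"{drug}: about {c} of {N_ISOLATES} isolates ({back:.2f}%)")
```

    Each implied count reproduces the stated percentage to two decimal places, which is consistent with a common denominator of 1090.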

  18. Constitutive androstane receptor transcriptionally activates human CYP1A1 and CYP1A2 genes through a common regulatory element in the 5'-flanking region.

    PubMed

    Yoshinari, Kouichi; Yoda, Noriaki; Toriyabe, Takayoshi; Yamazoe, Yasushi

    2010-01-15

    Phenobarbital has long been known to increase cellular levels of CYP1A1 and CYP1A2 possibly through a pathway(s) independent of aryl hydrocarbon receptor. We have investigated the role of constitutive androstane receptor (CAR), a xenobiotic-responsive nuclear receptor, in the transactivation of human CYP1A1 and CYP1A2. These genes are located in a head-to-head orientation, sharing a 5'-flanking region. Reporter assays were thus performed with dual-reporter constructs, containing the whole or partially deleted human CYP1A promoter between two different reporter genes. In this system, human CAR (hCAR) enhanced the transcription of both genes through common promoter regions from -461 to -554 and from -18089 to -21975 of CYP1A1. With reporter assays using additional deleted and mutated constructs, electrophoresis mobility shift assays and chromatin immunoprecipitation assays, an ER8 motif (everted repeat separated by eight nucleotides), located at around -520 of CYP1A1, was identified as an hCAR-responsive element and a binding motif of hCAR/human retinoid X receptor alpha heterodimer. hCAR enhanced the transcription of both genes also in the presence of an aryl hydrocarbon receptor ligand. Finally, hCAR activation increased CYP1A1 and CYP1A2 mRNA levels in cultured human hepatocytes. Our results indicate that CAR transactivates human CYP1A1 and CYP1A2 in human hepatocytes through the common cis-element ER8. Interestingly, the ER8 motif is highly conserved in the CYP1A1 proximal promoter sequences of various species, suggesting a fundamental role of CAR in the xenobiotic-induced expression of CYP1A1 and CYP1A2 independent of aryl hydrocarbon receptor.

  19. RNA sequencing and functional analysis implicate the regulatory role of long non-coding RNAs in tomato fruit ripening.

    PubMed

    Zhu, Benzhong; Yang, Yongfang; Li, Ran; Fu, Daqi; Wen, Liwei; Luo, Yunbo; Zhu, Hongliang

    2015-08-01

    Recently, long non-coding RNAs (lncRNAs) have been shown to play critical regulatory roles in model plants, such as Arabidopsis, rice, and maize. However, the presence of lncRNAs and how they function in fleshy fruit ripening are still largely unknown because fleshy fruit ripening is not present in the above model plants. Tomato is the model system for fruit ripening studies due to its dramatic ripening process. To investigate further the role of lncRNAs in fruit ripening, it is necessary and urgent to discover and identify novel lncRNAs and understand the function of lncRNAs in tomato fruit ripening. Here it is reported that 3679 lncRNAs were discovered from wild-type tomato and ripening mutant fruit. The lncRNAs are transcribed from all tomato chromosomes, 85.1% of which came from intergenic regions. Tomato lncRNAs are shorter and have fewer exons than protein-coding genes, a situation reminiscent of lncRNAs from other model plants. It was also observed that 490 lncRNAs were significantly up-regulated in ripening mutant fruits, and 187 lncRNAs were down-regulated, indicating that lncRNAs could be involved in the regulation of fruit ripening. In line with this, silencing of two novel tomato intergenic lncRNAs, lncRNA1459 and lncRNA1840, resulted in an obvious delay of ripening of wild-type fruit. Overall, the results indicated that lncRNAs might be essential regulators of tomato fruit ripening, which sheds new light on the regulation of fruit ripening. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  20. RNA sequencing and functional analysis implicate the regulatory role of long non-coding RNAs in tomato fruit ripening

    PubMed Central

    Zhu, Benzhong; Yang, Yongfang; Li, Ran; Fu, Daqi; Wen, Liwei; Luo, Yunbo; Zhu, Hongliang

    2015-01-01

    Recently, long non-coding RNAs (lncRNAs) have been shown to play critical regulatory roles in model plants, such as Arabidopsis, rice, and maize. However, the presence of lncRNAs and how they function in fleshy fruit ripening are still largely unknown because fleshy fruit ripening is not present in the above model plants. Tomato is the model system for fruit ripening studies due to its dramatic ripening process. To investigate further the role of lncRNAs in fruit ripening, it is necessary and urgent to discover and identify novel lncRNAs and understand the function of lncRNAs in tomato fruit ripening. Here it is reported that 3679 lncRNAs were discovered from wild-type tomato and ripening mutant fruit. The lncRNAs are transcribed from all tomato chromosomes, 85.1% of which came from intergenic regions. Tomato lncRNAs are shorter and have fewer exons than protein-coding genes, a situation reminiscent of lncRNAs from other model plants. It was also observed that 490 lncRNAs were significantly up-regulated in ripening mutant fruits, and 187 lncRNAs were down-regulated, indicating that lncRNAs could be involved in the regulation of fruit ripening. In line with this, silencing of two novel tomato intergenic lncRNAs, lncRNA1459 and lncRNA1840, resulted in an obvious delay of ripening of wild-type fruit. Overall, the results indicated that lncRNAs might be essential regulators of tomato fruit ripening, which sheds new light on the regulation of fruit ripening. PMID:25948705

  1. Propionibacterium acnes: Disease-Causing Agent or Common Contaminant? Detection in Diverse Patient Samples by Next-Generation Sequencing

    PubMed Central

    Friis-Nielsen, Jens; Vinner, Lasse; Hansen, Thomas Arn; Richter, Stine Raith; Fridholm, Helena; Herrera, Jose Alejandro Romero; Lund, Ole; Brunak, Søren; Izarzugaza, Jose M. G.; Mourier, Tobias; Nielsen, Lars Peter

    2016-01-01

    Propionibacterium acnes is the most abundant bacterium on human skin, particularly in sebaceous areas. P. acnes is suggested to be an opportunistic pathogen involved in the development of diverse medical conditions but is also a proven contaminant of human clinical samples and surgical wounds. Its significance as a pathogen is consequently a matter of debate. In the present study, we investigated the presence of P. acnes DNA in 250 next-generation sequencing data sets generated from 180 samples of 20 different sample types, mostly of cancerous origin. The samples were subjected to either microbial enrichment, involving nuclease treatment to reduce the amount of host nucleic acids, or shotgun sequencing. We detected high proportions of P. acnes DNA in enriched samples, particularly skin tissue-derived and other tissue samples, with the levels being higher in enriched samples than in shotgun-sequenced samples. P. acnes reads were detected in most samples analyzed, though the proportions in most shotgun-sequenced samples were low. Our results show that P. acnes can be detected in practically all sample types when molecular methods, such as next-generation sequencing, are employed. The possibility of contamination from the patient or other sources, including laboratory reagents or environment, should therefore always be considered carefully when P. acnes is detected in clinical samples. We advocate that detection of P. acnes always be accompanied by experiments validating the association between this bacterium and any clinical condition. PMID:26818667

  2. Activation of the major immediate early gene of human cytomegalovirus by cis-acting elements in the promoter-regulatory sequence and by virus-specific trans-acting components.

    PubMed Central

    Stinski, M F; Roehr, T J

    1985-01-01

    Upstream of the major immediate early gene of human cytomegalovirus (Towne) is a strong promoter-regulatory region that promotes the synthesis of 1.95-kilobase mRNA (D. R. Thomsen, R. M. Stenberg, W. F. Goins, and M. F. Stinski, Proc. Natl. Acad. Sci. U.S.A. 81:659-663, 1984; M. F. Stinski, D. R. Thomsen, R. M. Stenberg, and L. C. Goldstein, J. Virol. 46:1-14, 1983). The wild-type promoter-regulatory region as well as deletions within this region were ligated upstream of the thymidine kinase, chloramphenicol acetyltransferase, or ovalbumin genes. These gene chimeras were constructed to investigate the role of the regulatory sequences in enhancing downstream expression. The regulatory region extends to approximately 465 nucleotides upstream of the cap site for the initiation of transcription. The extent and type of regulatory sequences upstream of the promoter influences the level of in vitro transcription as well as the amount of in vivo expression of the downstream gene. The regulatory elements for cis-activation appear to be repeated several times within the regulatory region. A direct correlation was established between the distribution of the 19 (5' CCCCAGTTGACGTCAATGGG 3')- and 18 (5' CACTAACGGGACTTTCCAA 3')-nucleotide repeats and the level of downstream expression. In contrast, the 16 (5' CTTGGCAGTACATCAA 3')-nucleotide repeat is not necessary for the enhancement of downstream expression. In a domain associated with the 19- or 18-nucleotide repeats are elements that can be activated in trans by a human cytomegalovirus-specified component but not a herpes simplex virus-specified component. Therefore, the regulatory sequences of the major immediate early gene of human cytomegalovirus have an important role in interacting with cellular and virus-specific factors of the transcription complex to enhance downstream expression of this critical viral gene. PMID:2991567

  3. Analysis of sequences involved in IE2 transactivation of a baculovirus immediate-early gene promoter and identification of a new regulatory motif.

    PubMed

    Shippam-Brett, C E; Willis, L G; Theilmann, D A

    2001-05-01

    Opep-2 is a unique baculovirus early gene that has only been identified in the Orgyia pseudotsugata multiple capsid nucleopolyhedrovirus (OpMNPV). Previous analyses have shown this gene is expressed at very early times post-infection (p.i.) but is shut down by 36-48 h p.i. The promoter of opep-2 therefore, represents a class of early genes that is temporally regulated. In this study, a detailed analysis of the opep-2 promoter is performed to analyze the role individual motifs play in early gene expression. A new 13 base pair regulatory element was identified and shown to be essential in controlling high-level expression of this gene. In addition, mutational analysis revealed that GATA and CACGTG motifs, which have been shown to bind cellular factors in Sf9 and Ld652Y cells, played minor roles in influencing opep-2 expression in the absence of other viral factors. The OpMNPV transactivator IE2 causes a significant activation of the opep-2 promoter. Cotransfection of an extensive number of promoter deletions and mutations did not show any sequence specificity for IE2 transactivation. This is the first detailed analysis of the sequence requirements for IE2 transactivation, and these results suggest that IE2 does not bind directly to specific elements in the opep-2 promoter.

  4. The mouse p97 (CDC48) gene. Genomic structure, definition of transcriptional regulatory sequences, gene expression, and characterization of a pseudogene.

    PubMed

    Müller, J M; Meyer, H H; Ruhrberg, C; Stamp, G W; Warren, G; Shima, D T

    1999-04-09

    Here we present the first description of the genomic organization, transcriptional regulatory sequences, and adult and embryonic gene expression for the mouse p97(CDC48) AAA ATPase. Clones representing two distinct p97 genes were isolated in a genomic library screen, one of them likely representing a non-functional processed pseudogene. The coding region of the gene encoding the functional mRNA is interrupted by 16 introns and encompasses 20.4 kilobase pairs. Definition of the transcriptional initiation site and sequence analysis showed that the gene contains a TATA-less, GC-rich promoter region with an initiator element spanning the transcription start site. Cis-acting elements necessary for basal transcription activity reside within 410 base pairs of the flanking region as determined by transient transfection assays. In immunohistological analyses, p97 was widely expressed in embryos and adults, but protein levels were tightly controlled in a cell type- and cell differentiation-dependent manner. A remarkable heterogeneity in p97 immunostaining was found on a cellular level within a given tissue, and protein amounts in the cytoplasm and nucleus varied widely, suggesting a highly regulated and intermittent function for p97. This study provides the basis for a detailed analysis of the complex regulation of p97 and the reagents required for assessing its functional significance using targeted gene manipulation in the mouse.

  5. Nucleotide sequence analysis reveals linked N-acetyl hydrolase, thioesterase, transport, and regulatory genes encoded by the bialaphos biosynthetic gene cluster of Streptomyces hygroscopicus.

    PubMed Central

    Raibaud, A; Zalacain, M; Holt, T G; Tizard, R; Thompson, C J

    1991-01-01

    Nucleotide sequence analysis of a 5,000-bp region of the bialaphos antibiotic production (bap) gene cluster defined five open reading frames (ORFs) which predicted structural genes in the order bah, ORF1, ORF2, and ORF3 followed by the regulatory gene, brpA (H. Anzai, T. Murakami, S. Imai, A. Satoh, K. Nagaoka, and C.J. Thompson, J. Bacteriol. 169:3482-3488, 1987). The four structural genes were translationally coupled and apparently cotranscribed from an undefined promoter(s) under the positive control of the brpA gene product. S1 mapping experiments indicated that brpA was transcribed by two promoters (brpAp1 and brpAp2) which initiate transcription 150 and 157 bp upstream of brpA within an intergenic region and at least one promoter further upstream within the bap gene cluster (brpAp3). All three transcripts were present at low levels during exponential growth and increased just before the stationary phase. The levels of the brpAp3 band continued to increase at the onset of stationary phase, whereas brpAp1- and brpAp2-protected fragments showed no further change. BrpA contained a possible helix-turn-helix motif at its C terminus which was similar to the C-terminal regulatory motif found in the receiver component of a family of two-component transcriptional activator proteins. This motif was not associated with the N-terminal domain conserved in other members of the family. The structural gene cluster sequenced began with bah, encoding a bialaphos acetylhydrolase which removes the N-acetyl group from bialaphos as one of the final steps in the biosynthetic pathway. The observation that Bah was similar to a rat and to a bacterial (Acinetobacter calcoaceticus) lipase probably reflects the fact that the ester bonds of triglycerides and the amide bond linking acetate to phosphinothricin are similar and hydrolysis is catalyzed by structurally related enzymes. This was followed by two regions encoding ORF1 and ORF2 which were similar to each other (48% nucleotide

  6. Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model.

    PubMed

    Kogelman, Lisette J A; Cirera, Susanna; Zhernakova, Daria V; Fredholm, Merete; Franke, Lude; Kadarmideen, Haja N

    2014-09-30

    Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways enlightening the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and 34.58). Moreover, detection

  7. Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model

    PubMed Central

    2014-01-01

    Background Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. Methods We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. Results WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways enlightening the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and

  8. Full Mitochondrial Genome Sequence of the Sugar Beet Wireworm Limonius californicus (Coleoptera: Elateridae), a Common Agricultural Pest

    PubMed Central

    New, Daniel D.; Robison, Barrie D.; Rashed, Arash; Hohenlohe, Paul; Forney, Larry; Rashidi, Mahnaz; Wilson, Cathy M.; Settles, Matthew L.

    2016-01-01

    We report here the full mitochondrial genome sequence of Limonius californicus, a species of click beetle that is an agricultural pest in its larval form. The circular genome is 16.5 kb and contains 13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes. PMID:26798113

  9. Functional homology between the yeast regulatory proteins GAL4 and LAC9: LAC9-mediated transcriptional activation in Kluyveromyces lactis involves protein binding to a regulatory sequence homologous to the GAL4 protein-binding site.

    PubMed Central

    Breunig, K D; Kuger, P

    1987-01-01

    As shown previously, the beta-galactosidase gene of Kluyveromyces lactis is transcriptionally regulated via an upstream activation site (UASL) which contains a sequence homologous to the GAL4 protein-binding site in Saccharomyces cerevisiae (M. Ruzzi, K.D. Breunig, A.G. Ficca, and C.P. Hollenberg, Mol. Cell. Biol. 7:991-997, 1987). Here we demonstrate that the region of homology specifically binds a K. lactis regulatory protein. The binding activity was detectable in protein extracts from wild-type cells enriched for DNA-binding proteins by heparin affinity chromatography. These extracts could be used directly for DNase I and exonuclease III protection experiments. A lac9 deletion strain, which fails to induce the beta-galactosidase gene, did not contain the binding factor. The homology of LAC9 protein with GAL4 (J.M. Salmeron and S.A. Johnston, Nucleic Acids Res. 14:7767-7781, 1986) strongly suggests that LAC9 protein binds directly to UASL and plays a role similar to that of GAL4 in regulating transcription. PMID:2830492

  10. Analysis of the coding-complete genomic sequence of groundnut ringspot virus suggests a common ancestor with tomato chlorotic spot virus.

    PubMed

    de Breuil, Soledad; Cañizares, Joaquín; Blanca, José Miguel; Bejerman, Nicolás; Trucco, Verónica; Giolitti, Fabián; Ziarsolo, Peio; Lenardon, Sergio

    2016-08-01

    Groundnut ringspot virus (GRSV) and tomato chlorotic spot virus (TCSV) share biological and serological properties, so their identification is carried out by molecular methods. Their genomes consist of three segmented RNAs: L, M and S. The finding of a reassortant between these two viruses may complicate correct virus identification and requires the characterization of the complete genome. Therefore, we present for the first time the complete sequences of all the genes encoded by a GRSV isolate. The high level of sequence similarity between GRSV and TCSV (over 90% identity) observed in the genes and proteins encoded in the M RNA supports previous results indicating that these viruses probably have a common ancestor.

  11. A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N(1)-methyladenosine modification.

    PubMed

    Cenik, Can; Chua, Hon Nian; Singh, Guramrit; Akef, Abdalla; Snyder, Michael P; Palazzo, Alexander F; Moore, Melissa J; Roth, Frederick P

    2017-03-01

    Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions ("5IM" transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5' proximal positions. Finally, N(1)-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N(1)-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC. © 2017 Cenik et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  12. Targeted sequence capture and resequencing implies a predominant role of regulatory regions in the divergence of a sympatric lake whitefish species pair (Coregonus clupeaformis).

    PubMed

    Hebert, Francois Olivier; Renaut, Sébastien; Bernatchez, Louis

    2013-10-01

    Latest technological developments in evolutionary biology bring new challenges in documenting the intricate genetic architecture of species in the process of divergence. Sympatric populations of lake whitefish represent one of the key systems to investigate this issue. Despite the value of random genotype-by-sequencing methods and decreasing cost of sequencing technologies, it remains challenging to investigate variation in coding regions, especially in the case of recently duplicated genomes as in salmonids, as this greatly complicates whole genome resequencing. We thus designed a sequence capture array targeting 2773 annotated genes to document the nature and the extent of genomic divergence between sympatric dwarf and normal whitefish. Among the 2728 genes successfully captured, a total of 2182 coding and 10,415 noncoding putative single-nucleotide polymorphisms (SNPs) were identified after applying a first set of basic filters. A genome scan with a quality-refined selection of 2203 SNPs identified 267 outlier SNPs in 210 candidate genes located in genomic regions potentially involved in whitefish divergence and reproductive isolation. We found highly heterogeneous FST estimates among SNP loci. There was an overall low level of coding polymorphism, with a predominance of noncoding mutations among outliers. The heterogeneous patterns of divergence among loci confirm the porous nature of genomes during speciation with gene flow. Considering that few protein-coding mutations were identified as highly divergent, our results, along with previous transcriptomic studies, imply that changes in regulatory regions most likely had a greater role in the process of whitefish population divergence than protein-coding mutations. This study is the first to demonstrate the efficiency of large-scale targeted resequencing for a nonmodel species with such a large and unsequenced genome.
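
    The per-locus FST estimates mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which estimator the authors used, so the code below assumes the textbook definition FST = (HT - HS) / HT for a single biallelic SNP in two equally weighted populations; the function name fst_biallelic and the example allele frequencies are hypothetical.

```python
def fst_biallelic(p1, p2):
    """Textbook FST for one biallelic SNP (assumed estimator, not the study's).

    p1, p2: frequency of the same allele in population 1 and population 2.
    Both populations are weighted equally.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        # Locus is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at three SNPs in the two populations.
snps = {"snp_a": (0.50, 0.50), "snp_b": (0.20, 0.80), "snp_c": (1.00, 0.00)}
per_locus = {name: fst_biallelic(p1, p2) for name, (p1, p2) in snps.items()}
```

    Under this definition, loci with identical frequencies in both populations give 0 and loci fixed for alternative alleles give 1, which is the kind of locus-to-locus heterogeneity a genome scan for outliers looks for.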

  13. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans

    PubMed Central

    Khachatoorian, Careen; Judelson, Howard S.

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies. PMID:26716454

  14. HLA-F coding and regulatory segments variability determined by massively parallel sequencing procedures in a Brazilian population sample.

    PubMed

    Lima, Thálitta Hetamaro Ayala; Buttura, Renato Vidal; Donadi, Eduardo Antônio; Veiga-Castelli, Luciana Caricati; Mendes-Junior, Celso Teixeira; Castelli, Erick C

    2016-10-01

    Human Leucocyte Antigen F (HLA-F) is a non-classical HLA class I gene distinguished from its classical counterparts by low allelic polymorphism and distinctive expression patterns. Its exact function remains unknown. It is believed that HLA-F has tolerogenic and immune modulatory properties. Currently, there is little information regarding the HLA-F allelic variation among human populations and the available studies have evaluated only a fraction of the HLA-F gene segment and/or have searched for known alleles only. Here we present a strategy to evaluate the complete HLA-F variability including its 5' upstream, coding and 3' downstream segments by using massively parallel sequencing procedures. HLA-F variability was surveyed on 196 individuals from the Brazilian Southeast. The results indicate that the HLA-F gene is indeed conserved at the protein level, where thirty coding haplotypes or coding alleles were detected, encoding only four different HLA-F full-length protein molecules. Moreover, the same protein molecule is encoded by 82.45% of all coding alleles detected in this Brazilian population sample. However, the HLA-F nucleotide and haplotype variability is much higher than current knowledge suggests, both in Brazilians and considering the 1000 Genomes Project data. This protein conservation is probably a consequence of the key role of HLA-F in the immune system physiology.

  15. Complete genome sequence and transcriptomics analyses reveal pigment biosynthesis and regulatory mechanisms in an industrial strain, Monascus purpureus YY-1.

    PubMed

    Yang, Yue; Liu, Bin; Du, Xinjun; Li, Ping; Liang, Bin; Cheng, Xiaozhen; Du, Liangcheng; Huang, Di; Wang, Lei; Wang, Shuo

    2015-02-09

    Monascus has been used to produce natural colorants and food supplements for more than one thousand years, and more than one billion people eat Monascus-fermented products in their daily lives. In this study, using next-generation sequencing and optical mapping approaches, a 24.1-Mb complete genome of an industrial strain, Monascus purpureus YY-1, was obtained. This genome consists of eight chromosomes and 7,491 genes. Phylogenetic analysis at the genome level provides convincing evidence for the evolutionary position of M. purpureus. We provide the first comprehensive prediction of the biosynthetic pathway for Monascus pigment. Comparative genomic analyses show that the genome of M. purpureus is 13.6-40% smaller than those of closely related filamentous fungi and has undergone significant gene losses, most of which likely occurred during its specialized adaptation to starch-based foods. Comparative transcriptome analysis reveals that carbon starvation stress, resulting from the use of relatively low-quality carbon sources, contributes to the high yield of pigments by repressing central carbon metabolism and augmenting the acetyl-CoA pool. Our work provides important insights into the evolution of this economically important fungus and lays a foundation for future genetic manipulation and engineering of this strain.

  16. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans.

    PubMed

    Poidevin, Laetitia; Andreeva, Kalina; Khachatoorian, Careen; Judelson, Howard S

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies.

  17. Identification and sequence analysis of two regulatory genes involved in anaerobic toluene metabolism by strain T1.

    PubMed Central

    Coschigano, P W; Young, L Y

    1997-01-01

    T1 is a denitrifying bacterium isolated for its ability to grow with toluene serving as the sole carbon source. Mutants of this strain that have defects in the toluene utilization pathway have been isolated and have been separated into classes based on growth phenotypes. A cosmid clone has been isolated by complementing the tutB16 (for toluene utilization) mutation. The complementing gene has been localized to a 3.3-kb DNA fragment. An additional open reading frame upstream of the tutB gene has also been identified and is designated tutC. The nucleotide sequence and the predicted amino acid translation of the 6.4-kb DNA fragment that contains these genes are presented. The tutB and tutC gene products of strain T1 have homology to members of the two-component sensor-regulator family and are proposed to play a role in the regulation of toluene metabolic genes of strain T1. To our knowledge, this is the first published report of the isolation of mutants defective in anaerobic aromatic hydrocarbon degradation. Additionally, we report for the first time the cloning of genes involved in an anaerobic aromatic hydrocarbon degradation pathway. PMID:9023943

  18. Existence of microsatellites in expressed sequence tags of common carp (Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Hu, Jingjie; Wang, Xiaolong; Hu, Xiaoli; Bao, Zhenmin

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  19. Existence of microsatellites in expressed sequence tags of common carp (Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Jingjie, Hu; Xiaolong, Wang; Xiaoli, Hu; Zhenmin, Bao

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.
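
    A search for 2-6 nucleotide microsatellites of the kind described in the two records above can be sketched with a regular expression. This is an illustration only: the abstract does not state the minimum repeat count or the software used, so the function name find_ssrs and the default min_repeats=5 are assumptions, and only perfect (uninterrupted) repeats are reported.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect SSRs with 2-6 nt motifs.

    min_repeats is an assumed threshold; the source abstract gives none.
    """
    seq = seq.upper()
    # Lazy {2,6}? tries the shortest motif first, so (CA)n is reported
    # as "CA" rather than "CACA".
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAAAAAA.
            continue
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Hypothetical EST fragments, not taken from dbEST.
dinucleotide_hits = find_ssrs("GGCACACACACACATT")
trinucleotide_hits = find_ssrs("ttAGCAGCAGCAGCAGCgg")

# Consistency check on the counts quoted in the abstract.
share_of_total = round(513 / 10612 * 100, 1)  # SSR-containing ESTs as a percentage
unique_ests = 27 + 440                        # clustered groups plus independent ESTs
```

    The two arithmetic lines simply reproduce the abstract's own figures: 513 of 10612 sequences is about 4.8%, and 27 groups plus 440 independent ESTs gives the 467 unique SSR-containing ESTs.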

  20. Rearrangements of the transcriptional regulatory networks of metabolic pathways in fungi.

    PubMed

    Lavoie, Hugo; Hogues, Hervé; Whiteway, Malcolm

    2009-12-01

    Growing evidence suggests that transcriptional regulatory networks in many organisms are highly flexible. Here, we discuss the evolution of transcriptional regulatory networks governing the metabolic machinery of sequenced ascomycetes. In particular, recent work has shown that transcriptional rewiring is common in regulons controlling processes such as production of ribosome components and metabolism of carbohydrates and lipids. We note that dramatic rearrangements of the transcriptional regulatory components of metabolic functions have occurred among ascomycete species.

  1. A Distinct Regulatory Sequence Is Essential for the Expression of a Subset of nle Genes in Attaching and Effacing Escherichia coli

    PubMed Central

    García-Angulo, Víctor A.; Martínez-Santos, Verónica I.; Villaseñor, Tomás; Santana, Francisco J.; Huerta-Saquero, Alejandro; Martínez, Luary C.; Jiménez, Rafael; Lara-Ochoa, Cristina; Téllez-Sosa, Juan; Bustamante, Víctor H.

    2012-01-01

    Enteropathogenic Escherichia coli uses a type III secretion system (T3SS), encoded in the locus of enterocyte effacement (LEE) pathogenicity island, to translocate a wide repertoire of effector proteins into the host cell in order to subvert cell signaling cascades and promote bacterial colonization and survival. Genes encoding type III-secreted effectors are located in the LEE and scattered throughout the chromosome. While LEE gene regulation is better understood, the conditions and factors involved in the expression of effectors encoded outside the LEE are just starting to be elucidated. Here, we identified a highly conserved sequence containing a 13-bp inverted repeat (IR), located upstream of a subset of genes coding for different non-LEE-encoded effectors in A/E pathogens. Site-directed mutagenesis and deletion analysis of the nleH1 and nleB2 regulatory regions revealed that this IR is essential for the transcriptional activation of both genes. Growth conditions that favor the expression of LEE genes also facilitate the activation of nleH1 and nleB2; however, their expression is independent of the LEE-encoded positive regulators Ler and GrlA but is repressed by GrlR and the global regulator H-NS. In contrast, GrlA and Ler are required for nleA expression, while H-NS silences it. Consistent with their role in the regulation of nleA, purified Ler and H-NS bound to the regulatory region of nleA upstream of its promoter. This work shows that at least two modes of regulation control the expression of effector genes in attaching and effacing (A/E) pathogens, suggesting that a subset of effector functions may be coordinately expressed in a particular niche or time during infection. PMID:22904277

  2. A distinct regulatory sequence is essential for the expression of a subset of nle genes in attaching and effacing Escherichia coli.

    PubMed

    García-Angulo, Víctor A; Martínez-Santos, Verónica I; Villaseñor, Tomás; Santana, Francisco J; Huerta-Saquero, Alejandro; Martínez, Luary C; Jiménez, Rafael; Lara-Ochoa, Cristina; Téllez-Sosa, Juan; Bustamante, Víctor H; Puente, José L

    2012-10-01

    Enteropathogenic Escherichia coli uses a type III secretion system (T3SS), encoded in the locus of enterocyte effacement (LEE) pathogenicity island, to translocate a wide repertoire of effector proteins into the host cell in order to subvert cell signaling cascades and promote bacterial colonization and survival. Genes encoding type III-secreted effectors are located in the LEE and scattered throughout the chromosome. While LEE gene regulation is better understood, the conditions and factors involved in the expression of effectors encoded outside the LEE are just starting to be elucidated. Here, we identified a highly conserved sequence containing a 13-bp inverted repeat (IR), located upstream of a subset of genes coding for different non-LEE-encoded effectors in A/E pathogens. Site-directed mutagenesis and deletion analysis of the nleH1 and nleB2 regulatory regions revealed that this IR is essential for the transcriptional activation of both genes. Growth conditions that favor the expression of LEE genes also facilitate the activation of nleH1 and nleB2; however, their expression is independent of the LEE-encoded positive regulators Ler and GrlA but is repressed by GrlR and the global regulator H-NS. In contrast, GrlA and Ler are required for nleA expression, while H-NS silences it. Consistent with their role in the regulation of nleA, purified Ler and H-NS bound to the regulatory region of nleA upstream of its promoter. This work shows that at least two modes of regulation control the expression of effector genes in attaching and effacing (A/E) pathogens, suggesting that a subset of effector functions may be coordinately expressed in a particular niche or time during infection.

  3. Contrasting Frequencies and Effects of cis- and trans-Regulatory Mutations Affecting Gene Expression

    PubMed Central

    Metzger, Brian P. H.; Duveau, Fabien; Yuan, David C.; Tryban, Stephen; Yang, Bing; Wittkopp, Patricia J.

    2016-01-01

    Heritable differences in gene expression are caused by mutations in DNA sequences encoding cis-regulatory elements and trans-regulatory factors. These two classes of regulatory change differ in their relative contributions to expression differences in natural populations because of the combined effects of mutation and natural selection. Here, we investigate how new mutations create the regulatory variation upon which natural selection acts by quantifying the frequencies and effects of hundreds of new cis- and trans-acting mutations altering activity of the TDH3 promoter in the yeast Saccharomyces cerevisiae in the absence of natural selection. We find that cis-regulatory mutations have larger effects on expression than trans-regulatory mutations and that while trans-regulatory mutations are more common overall, cis- and trans-regulatory changes in expression are equally abundant when only the largest changes in expression are considered. In addition, we find that cis-regulatory mutations are skewed toward decreased expression while trans-regulatory mutations are skewed toward increased expression. We also measure the effects of cis- and trans-regulatory mutations on the variability in gene expression among genetically identical cells, a property of gene expression known as expression noise, finding that trans-regulatory mutations are much more likely to decrease expression noise than cis-regulatory mutations. Because new mutations are the raw material upon which natural selection acts, these differences in the frequencies and effects of cis- and trans-regulatory mutations should be considered in models of regulatory evolution. PMID:26782996

  4. A Common Missense Variant in the Glucokinase Regulatory Protein Gene (GCKR) Is Associated with Increased Plasma Triglyceride and C-Reactive Protein but Lower Fasting Glucose Concentrations

    USDA-ARS's Scientific Manuscript database

    OBJECTIVE: Using the genome-wide-association approach, we recently identified the glucokinase regulatory protein gene (GCKR, rs780094) region as a novel quantitative trait locus for plasma triglyceride concentration in Europeans. Here, we sought to study the association of GCKR variants with metaboli...

  5. Characterization of the bovine pregnancy-associated glycoprotein gene family--analysis of gene sequences, regulatory regions within the promoter and expression of selected genes.

    PubMed

    Telugu, Bhanu Prakash V L; Walker, Angela M; Green, Jonathan A

    2009-04-24

    The Pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family is comprised of at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown that the PAG family segregates into 'ancient' and 'modern' groupings. Along with sequence differences between family members, there are clear distinctions in their spatio-temporal distribution and in their relative level of expression. In this report, 1) we performed an in silico analysis of the bovine genome to further characterize the PAG gene family, 2) we scrutinized proximal promoter sequences of the PAG genes to evaluate the evolution pressures operating on them and to identify putative regulatory regions, 3) we determined relative transcript abundance of selected PAGs during pregnancy and, 4) we performed preliminary characterization of the putative regulatory elements for one of the candidate PAGs, bovine (bo) PAG-2. From our analysis of the bovine genome, we identified 18 distinct PAG genes and 14 pseudogenes. We observed that the first 500 base pairs upstream of the translational start site contained multiple regions that are conserved among all boPAGs. However, a preponderance of conserved regions, that harbor recognition sites for putative transcriptional factors (TFs), were found to be unique to the modern boPAG grouping, but not the ancient boPAGs. We gathered evidence by means of Q-PCR and screening of EST databases to show that boPAG-2 is the most abundant of all boPAG transcripts. Finally, we provided preliminary evidence for the role of ETS- and DDVL-related TFs in the regulation of the boPAG-2 gene. PAGs represent a relatively large gene family in the bovine genome. The proximal promoter regions of these genes display differences in putative TF binding sites, likely contributing to observed differences in spatial and temporal expression. We also

  6. Characterization of the bovine pregnancy-associated glycoprotein gene family – analysis of gene sequences, regulatory regions within the promoter and expression of selected genes

    PubMed Central

    Telugu, Bhanu Prakash VL; Walker, Angela M; Green, Jonathan A

    2009-01-01

    Background The Pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family is comprised of at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown that the PAG family segregates into 'ancient' and 'modern' groupings. Along with sequence differences between family members, there are clear distinctions in their spatio-temporal distribution and in their relative level of expression. In this report, 1) we performed an in silico analysis of the bovine genome to further characterize the PAG gene family, 2) we scrutinized proximal promoter sequences of the PAG genes to evaluate the evolution pressures operating on them and to identify putative regulatory regions, 3) we determined relative transcript abundance of selected PAGs during pregnancy and, 4) we performed preliminary characterization of the putative regulatory elements for one of the candidate PAGs, bovine (bo) PAG-2. Results From our analysis of the bovine genome, we identified 18 distinct PAG genes and 14 pseudogenes. We observed that the first 500 base pairs upstream of the translational start site contained multiple regions that are conserved among all boPAGs. However, a preponderance of conserved regions, that harbor recognition sites for putative transcriptional factors (TFs), were found to be unique to the modern boPAG grouping, but not the ancient boPAGs. We gathered evidence by means of Q-PCR and screening of EST databases to show that boPAG-2 is the most abundant of all boPAG transcripts. Finally, we provided preliminary evidence for the role of ETS- and DDVL-related TFs in the regulation of the boPAG-2 gene. Conclusion PAGs represent a relatively large gene family in the bovine genome. The proximal promoter regions of these genes display differences in putative TF binding sites, likely contributing to observed differences in spatial

  7. Cis-regulatory mutations in human disease

    PubMed Central

    2009-01-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases, including diabetes, autism, Crohn's disease, colorectal cancer, and asthma, to name a few. PMID:19641089

  8. Cis-regulatory mutations in human disease.

    PubMed

    Epstein, Douglas J

    2009-07-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases, including diabetes, autism, Crohn's disease, colorectal cancer, and asthma, to name a few.

  9. Uncommon HLA alleles identified by hemizygous ultra-high Sanger sequencing: haplotype associations and reconsideration of their assignment in the Common and Well-Documented catalogue.

    PubMed

    Voorter, Christina E M; Groeneweg, Mathijs; Groeneveld, Lisette; Tilanus, Marcel G J

    2016-02-01

    Although the number of HLA alleles is still increasing, many of them have been reported as uncommon. This is partly due to a lack of full-length gene sequencing, especially for alleles belonging to an allele ambiguity in which the first discovered allele has been assigned as the most frequent one. As members of the working group on Common and Well Documented (CWD) alleles, and since we have implemented full-length group-specific sequencing as our routine standard method, we investigated the presence of presumably rare alleles in our collection of HLA typing data. We identified 50 alleles that had not previously been encountered as Common or Well Documented. Sixteen of them should be added to the CWD catalogue, since we encountered them in 5 or more unrelated individuals. Another 11 could be added, based upon our results and the data present in the IMGT database and the rare allele section of the allele frequencies database. Furthermore, tight associations were observed between several different alleles, even at the level of synonymous and non-coding sequences. In addition, in several cases the uncommon allele was found to be more frequent than its common counterpart.

  10. Paleoecology of the Devonian-Mississippian black-shale sequence in eastern Kentucky with an atlas of some common fossils

    SciTech Connect

    Barron, L.S.; Ettensohn, F.R.

    1981-04-01

    The Devonian-Mississippian black-shale sequence of eastern North America is a distinctive stratigraphic interval generally characterized by low clastic influx, high organic production in the water column, anaerobic bottom conditions, and the relative absence of fossil evidence for biologic activity. The laminated black shales which constitute most of the black-shale sequence are broken by two major sequences of interbedded greenish-gray, clayey shales which contain bioturbation and pyritized micromorph invertebrates. The black shales contain abundant evidence of life from upper parts of the water column such as fish fossils, conodonts, algae and other phytoplankton; however, there is a lack of evidence of benthic life. The rare brachiopods, crinoids, and molluscs that occur in the black shales were probably epiplanktic. A significant physical distinction between the environment in which the black sediments were deposited and that in which the greenish-gray sediments were deposited was the level of dissolved oxygen. The laminated black shales point to anaerobic conditions and the bioturbated greenish-gray shales suggest dysaerobic to marginally aerobic-dysaerobic conditions. A paleoenvironmental model in which quasi-estuarine circulation complements and enhances the effect of a stratified water column can account for both depletion of dissolved oxygen in the bottom environments and the absence of oxygen replenishment during black-shale deposition. Periods of abundant clastic influx from fluvial environments to the east probably account for the abundance of clays in the greenish-gray shale as well as the small amounts of oxygen necessary to support the depauperate, opportunistic, benthic faunas found there. These pulses of greenish-gray clastics were short-lived and eventually were replaced by anaerobic conditions and low rates of clastic sedimentation which characterized most of black-shale deposition.

  11. PrimerSNP: a web tool for whole-genome selection of allele-specific and common primers of phylogenetically-related bacterial genomic sequences

    PubMed Central

    Yao, Jiqiang; Lin, Hong; Van Deynze, Allen; Doddapaneni, Harshavardhan; Francis, Martha; Lemos, Eliana Gertrudes Macedo; Civerolo, Edwin L

    2008-01-01

    Background The increasing number of genomic sequences of bacteria makes it possible to select unique SNPs of a particular strain/species at the whole genome level and thus design specific primers based on the SNPs. The high similarity of genomic sequences among phylogenetically-related bacteria requires the identification of the few loci in the genome that can serve as unique markers for strain differentiation. PrimerSNP attempts to identify reliable strain-specific markers, on which specific primers are designed for pathogen detection purpose. Results PrimerSNP is an online tool to design primers based on strain specific SNPs for multiple strains/species of microorganisms at the whole genome level. The allele-specific primers could distinguish query sequences of one strain from other homologous sequences by standard PCR reaction. Additionally, PrimerSNP provides a feature for designing common primers that can amplify all the homologous sequences of multiple strains/species of microorganisms. PrimerSNP is freely available at . Conclusion PrimerSNP is a high-throughput specific primer generation tool for the differentiation of phylogenetically-related strains/species. Experimental validation showed that this software had a successful prediction rate of 80.4 – 100% for strain specific primer design. PMID:18937861

  12. Comparison of Muscle Onset Activation Sequences between a Golf or Tennis Swing and Common Training Exercises Using Surface Electromyography: A Pilot Study

    PubMed Central

    Shultz, Rebecca; Fredericson, Michael

    2016-01-01

    Aim. The purpose of this pilot study is to use surface electromyography to determine an individual athlete's typical muscle onset activation sequence when performing a golf or tennis forward swing and to use the method to assess to what degree the sequence is reproduced with common conditioning exercises and a machine designed for this purpose. Methods. Data for 18 healthy male subjects were collected for 15 muscles of the trunk and lower extremities. Data were filtered and processed to determine the average onset of muscle activation for each motion. A Spearman correlation estimated congruence of activation order between the swing and each exercise. Correlations of each group were pooled with 95% confidence intervals using a random effects meta-analytic strategy. Results. The averaged sequences differed among each athlete tested, but pooled correlations demonstrated a positive association between each exercise and the participants' natural muscle onset activation sequence. Conclusion. The selected training exercises and Turning Point™ device all partially reproduced our athletes' averaged muscle onset activation sequences for both sports. The results support consideration of a larger, adequately powered study using this method to quantify to what degree each of the selected exercises is appropriate for use in both golf and tennis. PMID:27403454
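
    The congruence measure named in the abstract above, a Spearman correlation between the muscle activation order in the swing and in an exercise, can be sketched as follows. This is a minimal illustration and not the authors' analysis pipeline: the onset times, the five-muscle example, and the function names `ranks` and `spearman` are all hypothetical, and the pooling across subjects with a random effects model is not shown.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical mean onset times (ms) for the same five muscles,
# once during the swing and once during a training exercise.
swing_onsets = [12.0, 30.0, 45.0, 60.0, 85.0]
exercise_onsets = [15.0, 28.0, 70.0, 55.0, 90.0]

rho = spearman(swing_onsets, exercise_onsets)
print(f"Spearman rho = {rho:.2f}")
```

    A rho near 1 would indicate that the exercise activates the muscles in nearly the same order as the swing; only the order matters here, not the absolute onset times.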

  13. [Discrimination of psychoactive fungi (commonly called "magic mushrooms") based on the DNA sequence of the internal transcribed spacer region].

    PubMed

    Maruyama, Takuro; Shirota, Osamu; Kawahara, Nobuo; Yokoyama, Kazumasa; Makino, Yukiko; Goda, Yukihiro

    2003-02-01

    'Magic mushrooms' (MMs) are psychoactive fungi containing the hallucinogenic compounds, psilocin (1) and psilocybin (2). Since June 6, 2002, these fungi have been regulated by the Narcotics and Psychotropics Control Law in Japan. Because there are many kinds of MMs and they are sold even as dry powders in local markets, it is very difficult to identify the original species of the MMs by morphological observation. Therefore, we investigated the internal transcribed spacer (ITS) region in the ribosomal RNA gene of MMs obtained in Japanese markets to classify them by a genetic approach. Based on the size and nucleotide sequence of the ITS region amplified by PCR, tested MMs were classified into 6 groups. Furthermore, a comparison of the DNA sequences of the MMs with those of authentic samples or with those found in the databases (GenBank, EMBL and DDBJ) made it possible to identify the species of tested MMs. Analysis by LC revealed that psilocin (1) was contained at the highest level in Panaeolus cyanescens among the MMs, but was absent in the Amanita species.

  14. Identification of the transcriptional regulatory sequences of human calponin promoter and their use in targeting a conditionally replicating herpes vector to malignant human soft tissue and bone tumors.

    PubMed

    Yamamura, H; Hashio, M; Noguchi, M; Sugenoya, Y; Osakada, M; Hirano, N; Sasaki, Y; Yoden, T; Awata, N; Araki, N; Tatsuta, M; Miyatake, S I; Takahashi, K

    2001-05-15

    The calponin (basic or h1) gene, normally expressed in maturated smooth muscle cells, is aberrantly expressed in a variety of human soft tissue and bone tumors. In this study, we show that expression of the calponin gene in human soft tissue and bone tumor cells is regulated at the transcriptional level by the sequence between positions -260 and -219 upstream of the translation initiation site. A novel conditionally replicating herpes simplex virus-1 vector (d12.CALP) in which the calponin promoter drives expression of ICP4, a major trans-activating factor for viral genes was constructed and tested as an experimental treatment for malignant human soft tissue and bone tumors. In cell culture, d12.CALP at low multiplicity of infection (0.001 plaque-forming unit/cell) selectively killed calponin-positive human synovial sarcoma, leiomyosarcoma, and osteosarcoma cells. For in vivo studies, 10 animals harboring SK-LMS-1 human leiomyosarcoma cells were randomly divided and treated twice on days 0 and 9 intraneoplastically with either 1 × 10⁷ plaque-forming units of d12.CALP/100 mm³ of tumor volume or with medium alone. The viral treatment group showed stable and significant inhibition of tumorigenicity with apparent cure in four of five mice by day 35. Replication of viral DNA demonstrated by PCR amplification and expression of the inserted LacZ gene visualized by 5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside histochemistry was associated with oncolysis of d12.CALP-treated tumors, while sparing normal vascular smooth muscle cells. In mice harboring two SK-LMS-1 tumors, replication of d12.CALP was detected in a nontreated tumor distant from the site of virus inoculation. These results indicate that replication-competent virus vectors controlled by the calponin transcriptional regulatory sequence may be a new therapeutic strategy for treatment of malignant human soft tissue and bone tumors.

  15. Sequencing of SCN5A identifies rare and common variants associated with cardiac conduction: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.

    PubMed

    Magnani, Jared W; Brody, Jennifer A; Prins, Bram P; Arking, Dan E; Lin, Honghuang; Yin, Xiaoyan; Liu, Ching-Ti; Morrison, Alanna C; Zhang, Feng; Spector, Tim D; Alonso, Alvaro; Bis, Joshua C; Heckbert, Susan R; Lumley, Thomas; Sitlani, Colleen M; Cupples, L Adrienne; Lubitz, Steven A; Soliman, Elsayed Z; Pulit, Sara L; Newton-Cheh, Christopher; O'Donnell, Christopher J; Ellinor, Patrick T; Benjamin, Emelia J; Muzny, Donna M; Gibbs, Richard A; Santibanez, Jireh; Taylor, Herman A; Rotter, Jerome I; Lange, Leslie A; Psaty, Bruce M; Jackson, Rebecca; Rich, Stephen S; Boerwinkle, Eric; Jamshidi, Yalda; Sotoodehnia, Nona

    2014-06-01

    The cardiac sodium channel SCN5A regulates atrioventricular and ventricular conduction. Genetic variants in this gene are associated with PR and QRS intervals. We sought to characterize further the contribution of rare and common coding variation in SCN5A to cardiac conduction. In Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study, we performed targeted exonic sequencing of SCN5A (n=3699, European ancestry individuals) and identified 4 common (minor allele frequency >1%) and 157 rare variants. Common and rare SCN5A coding variants were examined for association with PR and QRS intervals through meta-analysis of European ancestry participants from CHARGE, National Heart, Lung, and Blood Institute's Exome Sequencing Project (n=607), and the UK10K (n=1275) and by examining Exome Sequencing Project African ancestry participants (n=972). Rare coding SCN5A variants in aggregate were associated with PR interval in European and African ancestry participants (P=1.3×10⁻³). Three common variants were associated with PR and QRS interval duration among European ancestry participants and one among African ancestry participants. These included 2 well-known missense variants: rs1805124 (H558R) was associated with PR and QRS shortening in European ancestry participants (P=6.25×10⁻⁴ and P=5.2×10⁻³, respectively) and rs7626962 (S1102Y) was associated with PR shortening in those of African ancestry (P=2.82×10⁻³). Among European ancestry participants, 2 novel synonymous variants, rs1805126 and rs6599230, were associated with cardiac conduction. Our top signal, rs1805126 was associated with PR and QRS lengthening (P=3.35×10⁻⁷ and P=2.69×10⁻⁴, respectively) and rs6599230 was associated with PR shortening (P=2.67×10⁻⁵). By sequencing SCN5A, we identified novel common and rare coding variants associated with cardiac conduction. © 2014 American Heart Association, Inc.

  16. Vision from next generation sequencing: multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease.

    PubMed

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-05-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. Published by Elsevier Ltd.

  17. Vision from next generation sequencing: Multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease

    PubMed Central

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-01-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of “gene” itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  18. Establishment of quantitative sequencing and filter contact vial bioassay for monitoring pyrethroid resistance in the common bed bug, Cimex lectularius.

    PubMed

    Seong, Keon Mook; Lee, Da-Young; Yoon, Kyong Sup; Kwon, Deok Ho; Kim, Heung Chul; Klein, Terry A; Clark, J Marshall; Lee, Si Hyeock

    2010-07-01

    Two point mutations (V419L and L925I) in the voltage-sensitive sodium channel alpha-subunit gene have been identified in deltamethrin-resistant bed bugs. A quantitative sequencing (QS) protocol was developed to establish a population-based genotyping method as a molecular resistance-monitoring tool based on the frequency of the two mutations. The nucleotide signal ratio at each mutation site was generated from sequencing chromatograms and plotted against the corresponding resistance allele frequency. Frequency prediction equations were generated from the plots by linear regression, and the signal ratios were shown to highly correlate with resistance allele frequencies (r² > 0.9928). As determined by QS, neither mutation was found in a bed bug population collected in 1993. Populations collected in recent years (2007-2009), however, exhibited completely or nearly saturating L925I mutation frequencies and highly variable frequencies of the V419L mutation. In addition to QS, the filter contact vial bioassay (FCVB) method was established and used to determine the baseline susceptibility and resistance of bed bugs to deltamethrin and lambda-cyhalothrin. A pyrethroid-resistant strain showed >9,375- and 6,990-fold resistance to deltamethrin and lambda-cyhalothrin, respectively. Resistance allele frequencies in different bed bug populations predicted by QS correlated well with the FCVB results, confirming the roles of the two mutations in pyrethroid resistance. Taken together, employment of QS in conjunction with FCVB should greatly facilitate the detection and monitoring of pyrethroid-resistant bed bugs in the field. The advantages of FCVB as an on-site resistance-monitoring tool are discussed.
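
    The calibration step described in the abstract above, fitting a line to signal ratio versus known resistance allele frequency and then using it to predict frequency in an unknown sample, might look like the sketch below. It rests on stated assumptions: the calibration values are invented for illustration, the signal ratio is taken as the dependent variable and the fitted line is inverted for prediction (the abstract does not say which way the regression was run), and the names `fit_line` and `predict_frequency` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_frequency(signal_ratio, slope, intercept):
    """Invert the calibration line and clamp the result to [0, 1]."""
    freq = (signal_ratio - intercept) / slope
    return min(1.0, max(0.0, freq))


# Hypothetical calibration standards: DNA mixtures with known
# resistance allele frequencies and their measured signal ratios.
known_freqs = [0.0, 0.25, 0.5, 0.75, 1.0]
signal_ratios = [0.02, 0.27, 0.52, 0.77, 1.02]

slope, intercept = fit_line(known_freqs, signal_ratios)

# Predict the allele frequency of an unknown population sample.
estimate = predict_frequency(0.52, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} estimate={estimate:.2f}")
```

    Clamping to the interval from 0 to 1 is a design choice of this sketch, so that measurement noise near the extremes cannot yield an impossible frequency.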

  19. Expression profile of small RNAs in Acacia mangium secondary xylem tissue with contrasting lignin content - potential regulatory sequences in monolignol biosynthetic pathway

    PubMed Central

    2011-01-01

    Background Lignin, after cellulose, is the second most abundant biopolymer, accounting for approximately 15-35% of the dry weight of wood. As an important component during wood formation, lignin is indispensable for plant structure and defense. However, it is an undesirable component in the pulp and paper industry. Removal of lignin from cellulose is a costly and environmentally hazardous process. Tremendous efforts have been devoted to understanding the role of enzymes and genes in controlling the amount and composition of lignin to be deposited in the cell wall. However, studies on the impact of downregulation and overexpression of monolignol biosynthesis genes in model species on lignin content, plant fitness and viability have been inconsistent. Recently, non-coding RNAs have been discovered to play an important role in regulating the entire monolignol biosynthesis pathway. As small RNAs have critical functions in various biological processes during wood formation, small RNA profiling is an important tool for the identification of the complete set of differentially expressed small RNAs between low lignin and high lignin secondary xylem. Results In line with this, we have generated two small RNA libraries from samples with contrasting lignin content using an Illumina GAII sequencer. About 10 million sequence reads were obtained in secondary xylem of Am48 with high lignin content (41%) and a corresponding 14 million sequence reads were obtained in secondary xylem of Am54 with low lignin content (21%). Our results suggested that A. mangium small RNAs are composed of a set of 12 highly conserved miRNA families found in the plant miRNA database, 82 novel miRNAs and a large proportion of non-conserved small RNAs with low expression levels. The predicted target genes of those differentially expressed conserved and non-conserved miRNAs include transcription factors associated with regulation of the lignin biosynthetic pathway genes. 
Some of these small RNAs play an important role in

  20. Expression profile of small RNAs in Acacia mangium secondary xylem tissue with contrasting lignin content - potential regulatory sequences in monolignol biosynthetic pathway.

    PubMed

    Ong, Seong Siang; Wickneswari, Ratnam

    2011-11-30

    Lignin, after cellulose, is the second most abundant biopolymer, accounting for approximately 15-35% of the dry weight of wood. As an important component during wood formation, lignin is indispensable for plant structure and defense. However, it is an undesirable component in the pulp and paper industry. Removal of lignin from cellulose is a costly and environmentally hazardous process. Tremendous efforts have been devoted to understanding the role of enzymes and genes in controlling the amount and composition of lignin to be deposited in the cell wall. However, studies on the impact of downregulation and overexpression of monolignol biosynthesis genes in model species on lignin content, plant fitness and viability have been inconsistent. Recently, non-coding RNAs have been discovered to play an important role in regulating the entire monolignol biosynthesis pathway. As small RNAs have critical functions in various biological processes during wood formation, small RNA profiling is an important tool for the identification of the complete set of differentially expressed small RNAs between low lignin and high lignin secondary xylem. In line with this, we have generated two small RNA libraries from samples with contrasting lignin content using an Illumina GAII sequencer. About 10 million sequence reads were obtained in secondary xylem of Am48 with high lignin content (41%) and a corresponding 14 million sequence reads were obtained in secondary xylem of Am54 with low lignin content (21%). Our results suggested that A. mangium small RNAs are composed of a set of 12 highly conserved miRNA families found in the plant miRNA database, 82 novel miRNAs and a large proportion of non-conserved small RNAs with low expression levels. The predicted target genes of those differentially expressed conserved and non-conserved miRNAs include transcription factors associated with regulation of the lignin biosynthetic pathway genes. 
Some of these small RNAs play an important role in epigenetic silencing

  1. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean

    SciTech Connect

    Song, Qijian; Jia, Gaofeng; Hyten, David L.; Jenkins, Jerry; Hwang, Eun-Young; Schroeder, Steven G.; Osorno, Juan M.; Schmutz, Jeremy; Jackson, Scott A.; McClean, Phillip E.; Cregan, Perry B.

    2015-08-28

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad.

  2. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean.

    PubMed

    Song, Qijian; Jia, Gaofeng; Hyten, David L; Jenkins, Jerry; Hwang, Eun-Young; Schroeder, Steven G; Osorno, Juan M; Schmutz, Jeremy; Jackson, Scott A; McClean, Phillip E; Cregan, Perry B

    2015-08-28

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad. Copyright © 2015 Song et al.

  3. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean

    PubMed Central

    Song, Qijian; Jia, Gaofeng; Hyten, David L.; Jenkins, Jerry; Hwang, Eun-Young; Schroeder, Steven G.; Osorno, Juan M.; Schmutz, Jeremy; Jackson, Scott A.; McClean, Phillip E.; Cregan, Perry B.

    2015-01-01

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad. PMID:26318155

  4. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean

    DOE PAGES

    Song, Qijian; Jia, Gaofeng; Hyten, David L.; ...

    2015-08-28

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad.

  5. The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes

    PubMed Central

    Pruitt, Kim D.; Harrow, Jennifer; Harte, Rachel A.; Wallin, Craig; Diekhans, Mark; Maglott, Donna R.; Searle, Steve; Farrell, Catherine M.; Loveland, Jane E.; Ruef, Barbara J.; Hart, Elizabeth; Suner, Marie-Marthe; Landrum, Melissa J.; Aken, Bronwen; Ayling, Sarah; Baertsch, Robert; Fernandez-Banet, Julio; Cherry, Joshua L.; Curwen, Val; DiCuccio, Michael; Kellis, Manolis; Lee, Jennifer; Lin, Michael F.; Schuster, Michael; Shkeda, Andrew; Amid, Clara; Brown, Garth; Dukhanina, Oksana; Frankish, Adam; Hart, Jennifer; Maidak, Bonnie L.; Mudge, Jonathan; Murphy, Michael R.; Murphy, Terence; Rajan, Jeena; Rajput, Bhanu; Riddick, Lillian D.; Snow, Catherine; Steward, Charles; Webb, David; Weber, Janet A.; Wilming, Laurens; Wu, Wenyu; Birney, Ewan; Haussler, David; Hubbard, Tim; Ostell, James; Durbin, Richard; Lipman, David

    2009-01-01

    Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions. PMID:19498102

  6. India Allele Finder: a web-based annotation tool for identifying common alleles in next-generation sequencing data of Indian origin.

    PubMed

    Zhang, Jimmy F; James, Francis; Shukla, Anju; Girisha, Katta M; Paciorkowski, Alex R

    2017-06-27

    We built India Allele Finder, an online searchable database and command line tool, that gives researchers access to variant frequencies of Indian Telugu individuals, using publicly available fastq data from the 1000 Genomes Project. Access to appropriate population-based genomic variant annotation can accelerate the interpretation of genomic sequencing data. In particular, exome analysis of individuals of Indian descent will identify population variants not reflected in European exomes, complicating genomic analysis for such individuals. India Allele Finder offers improved ease-of-use to investigators seeking to identify and annotate sequencing data from Indian populations. We describe the use of India Allele Finder to identify common population variants in a disease quartet whole exome dataset, reducing the number of candidate single nucleotide variants from 84 to 7. India Allele Finder is freely available to investigators to annotate genomic sequencing data from Indian populations. Use of India Allele Finder allows efficient identification of population variants in genomic sequencing data, and is an example of a population-specific annotation tool that simplifies analysis and encourages international collaboration in genomics research.
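
    The candidate reduction described above (84 single nucleotide variants down to 7) amounts to discarding variants that are common in the reference population. The following is a minimal sketch of that filtering idea only; it does not use India Allele Finder's actual interface. The function name, the variant key format, the frequency table and the 1% cutoff are all hypothetical.

```python
def filter_common_variants(candidates, population_af, max_af=0.01):
    """Keep candidates whose population allele frequency is at most max_af.

    candidates    -- list of variant keys (hypothetical "chrom:pos:ref>alt" strings)
    population_af -- dict mapping variant key -> allele frequency in the population
    max_af        -- assumed cutoff; the abstract does not state the one used

    Variants missing from the frequency table are treated as unobserved
    (frequency 0.0) and are therefore kept.
    """
    return [v for v in candidates if population_af.get(v, 0.0) <= max_af]


# Toy data, not taken from the paper.
candidates = ["chr1:100:A>G", "chr2:200:C>T", "chr3:300:G>A"]
population_af = {"chr1:100:A>G": 0.25, "chr2:200:C>T": 0.001}
rare = filter_common_variants(candidates, population_af)
```

    Treating absent variants as rare is a design choice of this sketch; a stricter pipeline might flag them for manual review instead.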

  7. ICO amplicon NGS data analysis: a Web tool for variant detection in common high-risk hereditary cancer genes analyzed by amplicon GS Junior next-generation sequencing.

    PubMed

    Lopez-Doriga, Adriana; Feliubadaló, Lídia; Menéndez, Mireia; Lopez-Doriga, Sergio; Morón-Duran, Francisco D; del Valle, Jesús; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Campos, Olga; Gómez, Carolina; Pineda, Marta; González, Sara; Moreno, Victor; Capellá, Gabriel; Lázaro, Conxi

    2014-03-01

    Next-generation sequencing (NGS) has revolutionized genomic research and is set to have a major impact on genetic diagnostics thanks to the advent of benchtop sequencers and flexible kits for targeted libraries. Among the main hurdles in NGS are the difficulty of performing bioinformatic analysis of the huge volume of data generated and the high number of false positive calls that could be obtained, depending on the NGS technology and the analysis pipeline. Here, we present the development of a free and user-friendly Web data analysis tool that detects and filters sequence variants, provides coverage information, and allows the user to customize some basic parameters. The tool has been developed to provide accurate genetic analysis of targeted sequencing of common high-risk hereditary cancer genes using amplicon libraries run in a GS Junior System. The Web resource is linked to our own mutation database, to assist in the clinical classification of identified variants. We believe that this tool will greatly facilitate the use of the NGS approach in routine laboratories.

  8. Lung Parenchymal Signal Intensity in MRI: A Technical Review with Educational Aspirations Regarding Reversible Versus Irreversible Transverse Relaxation Effects in Common Pulse Sequences.

    PubMed

    Mulkern, Robert; Haker, Steven; Mamata, Hatsuho; Lee, Edward; Mitsouras, Dimitrios; Oshio, Koichi; Balasubramanian, Mukund; Hatabu, Hiroto

    2014-03-01

    Lung parenchyma is challenging to image with proton MRI. The large air space results in ~1/5th as many signal-generating protons compared to other organs. Air/tissue magnetic susceptibility differences lead to strong magnetic field gradients throughout the lungs and to broad frequency distributions, much broader than within other organs. Such distributions have been the subject of experimental and theoretical analyses which may reveal aspects of lung microarchitecture useful for diagnosis. Their most immediate relevance to current imaging practice is to cause rapid signal decays, commonly discussed in terms of short T2* values of 1 ms or lower at typical imaging field strengths. Herein we provide a brief review of previous studies describing and interpreting proton lung spectra. We then link these broad frequency distributions to rapid signal decays, though not necessarily the exponential decays generally used to define T2* values. We examine how these decays influence observed signal intensities and spatial mapping features associated with the most prominent torso imaging sequences, including spoiled gradient and spin echo sequences. Effects of imperfect refocusing pulses on the multiple echo signal decays in single shot fast spin echo (SSFSE) sequences and effects of broad frequency distributions on balanced steady state free precession (bSSFP) sequence signal intensities are also provided. The theoretical analyses are based on the concept of explicitly separating the effects of reversible and irreversible transverse relaxation processes, thus providing a somewhat novel and more general framework from which to estimate lung signal intensity behavior in modern imaging practice.

  9. Genetic divergence analysis of the Common Barn Owl Tyto alba (Scopoli, 1769) and the Short-eared Owl Asio flammeus (Pontoppidan, 1763) from southern Chile using COI sequence.

    PubMed

    Colihueque, Nelson; Gantz, Alberto; Rau, Jaime Ricardo; Parraguez, Margarita

    2015-01-01

    In this paper new mitochondrial COI sequences of Common Barn Owl Tyto alba (Scopoli, 1769) and Short-eared Owl Asio flammeus (Pontoppidan, 1763) from southern Chile are reported and compared with sequences from other parts of the World. The intraspecific genetic divergence (mean p-distance) was 4.6 to 5.5% for the Common Barn Owl in comparison with specimens from northern Europe and Australasia and 3.1% for the Short-eared Owl with respect to samples from north America, northern Europe and northern Asia. Phylogenetic analyses revealed three distinctive groups for the Common Barn Owl: (i) South America (Chile and Argentina) plus Central and North America, (ii) northern Europe and (iii) Australasia, and two distinctive groups for the Short-eared Owl: (i) South America (Chile and Argentina) and (ii) north America plus northern Europe and northern Asia. The level of genetic divergence observed in both species exceeds the upper limit of intraspecific comparisons reported previously for Strigiformes. Therefore, this suggests that further research is needed to assess the taxonomic status, particularly for the Chilean populations that, to date, have been identified as belonging to these species through traditional taxonomy.
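
    The divergence figures above are mean p-distances, that is, the proportion of compared sites at which two aligned sequences differ, averaged over sequence pairs. A minimal sketch follows, assuming the sequences are already aligned to equal length and that positions with gaps or ambiguity codes are skipped. The function names and the treatment of gaps are assumptions of this sketch, not details from the paper.

```python
def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous bases rather than counting them as differences.
        if a not in valid or b not in valid:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_p_distance(group_a, group_b):
    """Average p-distance over every pair drawn from two groups of sequences."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(distances) / len(distances)
```

    For example, two 10-base sequences differing at one site give a p-distance of 0.1, i.e. 10%.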

  10. Genetic divergence analysis of the Common Barn Owl Tyto alba (Scopoli, 1769) and the Short-eared Owl Asio flammeus (Pontoppidan, 1763) from southern Chile using COI sequence

    PubMed Central

    Colihueque, Nelson; Gantz, Alberto; Rau, Jaime Ricardo; Parraguez, Margarita

    2015-01-01

    In this paper new mitochondrial COI sequences of Common Barn Owl Tyto alba (Scopoli, 1769) and Short-eared Owl Asio flammeus (Pontoppidan, 1763) from southern Chile are reported and compared with sequences from other parts of the World. The intraspecific genetic divergence (mean p-distance) was 4.6 to 5.5% for the Common Barn Owl in comparison with specimens from northern Europe and Australasia and 3.1% for the Short-eared Owl with respect to samples from north America, northern Europe and northern Asia. Phylogenetic analyses revealed three distinctive groups for the Common Barn Owl: (i) South America (Chile and Argentina) plus Central and North America, (ii) northern Europe and (iii) Australasia, and two distinctive groups for the Short-eared Owl: (i) South America (Chile and Argentina) and (ii) north America plus northern Europe and northern Asia. The level of genetic divergence observed in both species exceeds the upper limit of intraspecific comparisons reported previously for Strigiformes. Therefore, this suggests that further research is needed to assess the taxonomic status, particularly for the Chilean populations that, to date, have been identified as belonging to these species through traditional taxonomy. PMID:26668551

  11. Genome wide re-sequencing of newly developed Rice Lines from common wild rice (Oryza rufipogon Griff.) for the identification of NBS-LRR genes.

    PubMed

    Liu, Wen; Ghouri, Fozia; Yu, Hang; Li, Xiang; Yu, Shuhong; Shahid, Muhammad Qasim; Liu, Xiangdong

    2017-01-01

    Common wild rice (Oryza rufipogon Griff.) is an important germplasm for rice breeding, which contains many resistance genes. Re-sequencing provides an unprecedented opportunity to explore the abundant useful genes at whole genome level. Here, we identified the nucleotide-binding site leucine-rich repeat (NBS-LRR) encoding genes by re-sequencing of two wild rice lines (i.e. Huaye 1 and Huaye 2) that were developed from common wild rice. We obtained 128 to 147 million reads with approximately 32.5-fold coverage depth, and uniquely covered more than 89.6% (≥1-fold) of reference genomes. Two wild rice lines showed high SNP (single-nucleotide polymorphisms) variation rate in 12 chromosomes against the reference genomes of Nipponbare (japonica cultivar) and 93-11 (indica cultivar). InDels (insertion/deletion polymorphisms) count-length distribution exhibited normal distribution in the two lines, and most of the InDels were ranged from -5 to 5 bp. With reference to the Nipponbare genome sequence, we detected a total of 1,209,308 SNPs, 161,117 InDels and 4,192 SVs (structural variations) in Huaye 1, and 1,387,959 SNPs, 180,226 InDels and 5,305 SVs in Huaye 2. A total of 44.9% and 46.9% genes exhibited sequence variations in two wild rice lines compared to the Nipponbare and 93-11 reference genomes, respectively. Analysis of NBS-LRR mutant candidate genes showed that they were mainly distributed on chromosome 11, and NBS domain was more conserved than LRR domain in both wild rice lines. NBS genes depicted higher levels of genetic diversity in Huaye 1 than that found in Huaye 2. Furthermore, protein-protein interaction analysis showed that NBS genes mostly interacted with the cytochrome C protein (Os05g0420600, Os01g0885000 and BGIOSGA038922), while some NBS genes interacted with heat shock protein, DNA-binding activity, Phosphoinositide 3-kinase and a coiled coil region. We explored abundant NBS-LRR encoding genes in two common wild rice lines through genome wide re-sequencing

  12. Common methods for fecal sample storage in field studies yield consistent signatures of individual identity in microbiome sequencing data

    PubMed Central

    Blekhman, Ran; Tang, Karen; Archie, Elizabeth A.; Barreiro, Luis B.; Johnson, Zachary P.; Wilson, Mark E.; Kohn, Jordan; Yuan, Michael L.; Gesquiere, Laurence; Grieneisen, Laura E.; Tung, Jenny

    2016-01-01

    Field studies of wild vertebrates are frequently associated with extensive collections of banked fecal samples—unique resources for understanding ecological, behavioral, and phylogenetic effects on the gut microbiome. However, we do not understand whether sample storage methods confound the ability to investigate interindividual variation in gut microbiome profiles. Here, we extend previous work on storage methods for gut microbiome samples by comparing immediate freezing, the gold standard of preservation, to three methods commonly used in vertebrate field studies: lyophilization, storage in ethanol, and storage in RNAlater. We found that the signature of individual identity consistently outweighed storage effects: alpha diversity and beta diversity measures were significantly correlated across methods, and while samples often clustered by donor, they never clustered by storage method. Provided that all analyzed samples are stored the same way, banked fecal samples therefore appear highly suitable for investigating variation in gut microbiota. Our results open the door to a much-expanded perspective on variation in the gut microbiome across species and ecological contexts. PMID:27528013
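
    The abstract compares alpha diversity and beta diversity measures across storage methods without naming the indices. As an illustration only, the sketch below computes one common choice for each: the Shannon index (alpha diversity within a sample) and Bray-Curtis dissimilarity (beta diversity between two samples). Both choices and the function names are assumptions of this sketch, not the measures confirmed by the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa order."""
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must cover the same taxa")
    denominator = sum(a + b for a, b in zip(counts_a, counts_b))
    if denominator == 0:
        raise ValueError("both samples are empty")
    numerator = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    return numerator / denominator
```

    A sample dominated by a single taxon has a Shannon index of zero, and two samples sharing no taxa have a Bray-Curtis dissimilarity of one.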

  13. Development and cross-validation of sequencing-based assays for genotyping common polymorphisms of the CXCL5 gene.

    PubMed

    Zineh, Issam; Welder, Gregory J; Langaee, Taimour Y

    2006-08-01

    Epithelial neutrophil activating peptide (ENA-78) is encoded by the polymorphic CXCL5 gene and is a recruiter and activator of neutrophils. Furthermore, ENA-78 may be involved in pathological inflammatory processes and variable drug responses. To facilitate future disease-gene and pharmacogenetic investigation of ENA-78, we developed and cross-validated medium- to high-throughput genotyping assays for 2 commonly occurring CXCL5 polymorphisms (rs352046 and rs425535). Furthermore, we compared allele and genotype frequencies in a U.S. population with those of a previously studied European population. There was 100% genotype concordance between the 2 methods used (Pyrosequencing and TaqMan). Variant allele frequencies for rs352046 were consistent between the U.S. (16%) and European (16%) populations, while the rs425535 variant allele was more than twice as high in the European cohort (38% vs. 16%). There was complete linkage of genotypes at both loci in our population. The distribution of variant alleles for the 2 polymorphisms studied should be further evaluated in other populations. In addition, our data highlight the importance of assay validation using multiple platforms.
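
    The variant allele frequencies and the 100% genotype concordance reported above follow from simple counting. A minimal sketch, assuming a biallelic SNP in a diploid population; the genotype counts and call lists used here are hypothetical and are not data from the study.

```python
def variant_allele_frequency(n_hom_ref, n_het, n_hom_var):
    """Variant allele frequency from genotype counts at a biallelic locus.

    Each individual carries two alleles; heterozygotes contribute one
    variant allele and variant homozygotes contribute two.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_var)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    return (n_het + 2 * n_hom_var) / total_alleles


def genotype_concordance(calls_a, calls_b):
    """Fraction of samples given the same genotype call by two assays."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return matches / len(calls_a)


# Hypothetical counts: 70 reference homozygotes, 28 heterozygotes, 2 variant homozygotes.
vaf = variant_allele_frequency(70, 28, 2)
```

    With these made-up counts, 32 of 200 alleles are variant, giving a frequency of 0.16.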

  14. Common methods for fecal sample storage in field studies yield consistent signatures of individual identity in microbiome sequencing data.

    PubMed

    Blekhman, Ran; Tang, Karen; Archie, Elizabeth A; Barreiro, Luis B; Johnson, Zachary P; Wilson, Mark E; Kohn, Jordan; Yuan, Michael L; Gesquiere, Laurence; Grieneisen, Laura E; Tung, Jenny

    2016-08-16

    Field studies of wild vertebrates are frequently associated with extensive collections of banked fecal samples, which are unique resources for understanding ecological, behavioral, and phylogenetic effects on the gut microbiome. However, we do not understand whether sample storage methods confound the ability to investigate interindividual variation in gut microbiome profiles. Here, we extend previous work on storage methods for gut microbiome samples by comparing immediate freezing, the gold standard of preservation, to three methods commonly used in vertebrate field studies: lyophilization, storage in ethanol, and storage in RNAlater. We found that the signature of individual identity consistently outweighed storage effects: alpha diversity and beta diversity measures were significantly correlated across methods, and while samples often clustered by donor, they never clustered by storage method. Provided that all analyzed samples are stored the same way, banked fecal samples therefore appear highly suitable for investigating variation in gut microbiota. Our results open the door to a much-expanded perspective on variation in the gut microbiome across species and ecological contexts.

  15. A common sequence motif determines the Cajal body-specific localization of box H/ACA scaRNAs.

    PubMed

    Richard, Patricia; Darzacq, Xavier; Bertrand, Edouard; Jády, Beáta E; Verheggen, Céline; Kiss, Tamás

    2003-08-15

    Post-transcriptional synthesis of 2'-O-methylated nucleotides and pseudouridines in Sm spliceosomal small nuclear RNAs takes place in the nucleoplasmic Cajal bodies and it is directed by guide RNAs (scaRNAs) that are structurally and functionally indistinguishable from small nucleolar RNAs (snoRNAs) directing rRNA modification in the nucleolus. The scaRNAs are synthesized in the nucleoplasm and specifically targeted to Cajal bodies. Here, mutational analysis of the human U85 box C/D-H/ACA scaRNA, followed by in situ localization, demonstrates that box H/ACA scaRNAs share a common Cajal body-specific localization signal, the CAB box. Two copies of the evolutionarily conserved CAB consensus (UGAG) are located in the terminal loops of the 5' and 3' hairpins of the box H/ACA domains of mammalian, Drosophila and plant scaRNAs. Upon alteration of the CAB boxes, mutant scaRNAs accumulate in the nucleolus. In turn, authentic snoRNAs can be targeted into Cajal bodies by addition of exogenous CAB box motifs. Our results indicate that scaRNAs represent an ancient group of small nuclear RNAs which are localized to Cajal bodies by an evolutionarily conserved mechanism.
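
    The CAB consensus given above (UGAG) is short enough that locating candidate copies in an RNA sequence is a plain substring scan. The sketch below does only that; it does not check that a match sits in the terminal loop of a hairpin, which is where the abstract places the functional copies. The function name and the toy sequence are hypothetical.

```python
def find_motif(rna, motif="UGAG"):
    """Return 0-based start positions of every (possibly overlapping) motif match.

    DNA-style input is accepted: T is converted to U before searching.
    """
    rna = rna.upper().replace("T", "U")
    motif = motif.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping matches are not missed.
        start = rna.find(motif, start + 1)
    return positions


# Toy sequence, not a real scaRNA.
hits = find_motif("GGUGAGCCUGAGAA")
```

    A four-nucleotide motif is expected by chance roughly once every 256 positions, so hits from a scan like this are candidates only and need structural context to interpret.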

  16. Targeted next-generation sequencing of commonly mutated genes in esophageal adenocarcinoma patients with long-term survival.

    PubMed

    Visser, E; Franken, I A; Brosens, L A A; de Leng, W W J; Strengman, E; Offerhaus, J A; Ruurda, J P; van Hillegersberg, R

    2017-09-01

    Survival of patients with esophageal adenocarcinoma remains poor and individual differences in prognosis remain unexplained. This study investigated whether gene mutations can explain why patients with high-risk (pT3-4, pN+) esophageal adenocarcinoma survive past 5 years after esophagectomy. Six long-term survivors (LTS) (≥5 years survival without recurrence) and six short-term survivors (STS) (<2 years survival due to recurrence) who underwent resection without neoadjuvant therapy for high-risk esophageal adenocarcinoma were included. Targeted next-generation sequencing of 16 genes related to esophageal adenocarcinoma was performed. Mutations were compared between the LTS and STS and described in comparison with literature. A total of 48 mutations in 10 genes were identified. In the LTS, the median number of mutated genes per sample was 5 (range: 0-5) and the samples together harbored 22 mutations in 8 genes: APC (n = 1), CDH11 (n = 2), CDKN2A (n = 2), FAT4 (n = 5), KRAS (n = 1), PTPRD (n = 1), TLR4 (n = 8), and TP53 (n = 2). The median number of mutated genes per sample in the STS was 4 (range: 1-8) and in total 26 mutations were found in six genes: CDH11 (n = 5), FAT4 (n = 7), SMAD4 (n = 1), SMARCA4 (n = 1), TLR4 (n = 7), and TP53 (n = 5). CDH11, CDKN2A, FAT4, TLR4, and TP53 were mutated in at least 2 LTS or STS, exceeding mutation rates in literature. Mutations across the LTS and STS were found in 10 of the 16 genes. The results warrant future studies to investigate a larger range of genes in a larger sample size. This may result in a panel with prognostic genes, to predict individual prognosis and to select effective individualized therapy for patients with esophageal adenocarcinoma. © The Authors 2017. Published by Oxford University Press on behalf of International Society for Diseases of the Esophagus. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  17. Epithelial and endothelial expression of the green fluorescent protein reporter gene under the control of bovine prion protein (PrP) gene regulatory sequences in transgenic mice

    NASA Astrophysics Data System (ADS)

    Lemaire-Vieille, Catherine; Schulze, Tobias; Podevin-Dimster, Valérie; Follet, Jérome; Bailly, Yannick; Blanquet-Grossard, Françoise; Decavel, Jean-Pierre; Heinen, Ernst; Cesbron, Jean-Yves

    2000-05-01

    The expression of the cellular form of the prion protein (PrPc) gene is required for prion replication and neuroinvasion in transmissible spongiform encephalopathies. The identification of the cell types expressing PrPc is necessary to understanding how the agent replicates and spreads from peripheral sites to the central nervous system. To determine the nature of the cell types expressing PrPc, a green fluorescent protein reporter gene was expressed in transgenic mice under the control of 6.9 kb of the bovine PrP gene regulatory sequences. It was shown that the bovine PrP gene is expressed as two populations of mRNA differing by alternative splicing of one 115-bp 5' untranslated exon in 17 different bovine tissues. The analysis of transgenic mice showed reporter gene expression in some cells that have been identified as expressing PrP, such as cerebellar Purkinje cells, lymphocytes, and keratinocytes. In addition, expression of green fluorescent protein was observed in the plexus of the enteric nervous system and in a restricted subset of cells not yet clearly identified as expressing PrP: the epithelial cells of the thymic medullary and the endothelial cells of both the mucosal capillaries of the intestine and the renal capillaries. These data provide valuable information on the distribution of PrPc at the cellular level and argue for roles of the epithelial and endothelial cells in the spread of infection from the periphery to the brain. Moreover, the transgenic mice described in this paper provide a model that will allow for the study of the transcriptional activity of the PrP gene promoter in response to scrapie infection.

  18. ‘Default’ generated neonatal regulatory T cells are hypomethylated at conserved non-coding sequence 2 and promote long-term cardiac allograft survival

    PubMed Central

    Cheng, Chao; Wang, Sihua; Ye, Ping; Huang, Xiaofan; Liu, Zheng; Wu, Jie; Sun, Yuan; Xie, Aini; Wang, Guohua; Xia, Jiahong

    2014-01-01

    Regulatory T (Treg) cells play an important role in the maintenance of immune self-tolerance and homeostasis. We previously reported that neonatal CD4+ T cells have an intrinsic ‘default’ mechanism to become Treg (neoTreg) cells in response to T-cell receptor (TCR) stimulation. However, the underlying mechanisms are unclear and the effects of neoTreg cells on regulating immune responses remain unknown. Due to their involvement in Foxp3 regulation, we examined the role of DNA methyltransferase 1 (DNMT1) and DNMT3b during the induction of neoTreg cells in the Foxp3gfp mice. The function of neoTreg cells was assessed in an acute allograft rejection model established in RAG2−/− mice with allograft cardiac transplantation and transferred with syngeneic CD4+ effector T cells. Following ex vivo TCR stimulation, the DNMT activity was increased threefold in adult CD4+ T cells, but not significantly increased in neonatal cells. However, adoptively transferred neoTreg cells significantly prolonged cardiac allograft survival (mean survival time 47 days, P < 0·001) and maintained Foxp3 expression similar to natural Treg cells. The neoTreg cells were hypomethylated at the conserved non-coding DNA sequence 2 locus of Foxp3 compared with adult Treg cells. The DNMT antagonist 5-aza-2′-deoxycytidine (5-Aza) induced increased Foxp3 expression in mature CD4+ T cells. 5-Aza-inducible Treg cells combined with continuous 5-Aza treatment prolonged graft survival. These results indicate that the ‘default’ pathway of neoTreg cell differentiation is associated with reduced DNMT1 and DNMT3b response to TCR stimulus. The neoTreg cells may be a strategy to alleviate acute allograft rejection. PMID:24944101

  19. Analysis of transcriptional and upstream regulatory sequence activity of two environmental stress-inducible genes, NBS-Str1 and BLEC-Str8, of rice.

    PubMed

    Ray, Swatismita; Kapoor, Sanjay; Tyagi, Akhilesh K

    2012-04-01

    Two abiotic stress-inducible upstream regulatory sequences (URSs) from rice have been identified and functionally characterized in rice. NBS-Str1 and BLEC-Str8 genes have been identified, by analysing the transcriptome data of cold, salt and desiccation stress-treated 7-day-old rice (Oryza sativa L. var. IR64) seedling, to be preferentially responsive to desiccation and salt stress, respectively. NBS-Str1 and BLEC-Str8 genes code for putative NBS (nucleotide binding site)-LRR (leucine rich repeat) and β-lectin domain protein, respectively. NBS-Str1 URS is induced in root tissue, preferentially in vascular bundle, during 3 and 24 h of desiccation stress condition in transgenic 7-day-old rice seedling. In mature transgenic plants, this URS shows induction in root and shoot tissue under desiccation stress as well as under prolonged (1 and 2 day) salt stress. BLEC-Str8 URS shows basal activity under un-stressed condition, however, it is inducible under salt stress condition in both root and leaf tissues in young seedling and mature plants. Activity of BLEC-Str8 URS has been found to be vascular tissue preferential, however, under salt stress condition its activity is also found in the mesophyll tissue. NBS-Str1 and BLEC-Str8 URSs are inducible by heavy metal, copper and manganese. Interestingly, both the URSs have been found to be non responsive to ABA treatment, implying them to be part of ABA-independent abiotic stress response pathway. These URSs could prove useful for expressing a transgene in a stress responsive manner for development of stress tolerant transgenic systems.

  20. Expression of Alternatively Spliced Human T-Cell Leukemia Virus Type 1 mRNAs Is Influenced by Mitosis and by a Novel cis-Acting Regulatory Sequence

    PubMed Central

    Cavallari, Ilaria; Rende, Francesca; Bona, Marion K.; Sztuba-Solinska, Joanna; Silic-Benussi, Micol; Tognon, Martina; Franchini, Genoveffa; D'Agostino, Donna M.

    2015-01-01

    Human T-cell leukemia virus type 1 (HTLV-1) expression depends on the concerted action of Tax, which drives transcription of the viral genome, and Rex, which favors expression of incompletely spliced mRNAs and determines a 2-phase temporal pattern of viral expression. In the present study, we investigated the Rex dependence of the complete set of alternatively spliced HTLV-1 mRNAs. Analyses of cells transfected with Rex–wild-type and Rex-knockout HTLV-1 molecular clones using splice site-specific quantitative reverse transcription (qRT)-PCR revealed that mRNAs encoding the p30Tof, p13, and p12/8 proteins were Rex dependent, while the p21rex mRNA was Rex independent. These findings provide a rational explanation for the intermediate-late temporal pattern of expression of the p30tof, p13, and p12/8 mRNAs described in previous studies. All the Rex-dependent mRNAs contained a 75-nucleotide intronic region that increased the nuclear retention and degradation of a reporter mRNA in the absence of other viral sequences. Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) analysis revealed that this sequence formed a stable hairpin structure. Cell cycle synchronization experiments indicated that mitosis partially bypasses the requirement for Rex to export Rex-dependent HTLV-1 transcripts. These findings indicate a link between the cycling properties of the host cell and the temporal pattern of viral expression/latency that might influence the ability of the virus to spread and evade the immune system. IMPORTANCE HTLV-1 is a complex retrovirus that causes two distinct pathologies termed adult T-cell leukemia/lymphoma and tropical spastic paraparesis/HTLV-1-associated myelopathy in about 5% of infected individuals. Expression of the virus depends on the concerted action of Tax, which drives transcription of the viral genome, and Rex, which favors expression of incompletely spliced mRNAs and determines a 2-phase temporal pattern of virus

  1. Expression of Alternatively Spliced Human T-Cell Leukemia Virus Type 1 mRNAs Is Influenced by Mitosis and by a Novel cis-Acting Regulatory Sequence.

    PubMed

    Cavallari, Ilaria; Rende, Francesca; Bona, Marion K; Sztuba-Solinska, Joanna; Silic-Benussi, Micol; Tognon, Martina; LeGrice, Stuart F J; Franchini, Genoveffa; D'Agostino, Donna M; Ciminale, Vincenzo

    2015-11-18

    Human T-cell leukemia virus type 1 (HTLV-1) expression depends on the concerted action of Tax, which drives transcription of the viral genome, and Rex, which favors expression of incompletely spliced mRNAs and determines a 2-phase temporal pattern of viral expression. In the present study, we investigated the Rex dependence of the complete set of alternatively spliced HTLV-1 mRNAs. Analyses of cells transfected with Rex-wild-type and Rex-knockout HTLV-1 molecular clones using splice site-specific quantitative reverse transcription (qRT)-PCR revealed that mRNAs encoding the p30Tof, p13, and p12/8 proteins were Rex dependent, while the p21rex mRNA was Rex independent. These findings provide a rational explanation for the intermediate-late temporal pattern of expression of the p30tof, p13, and p12/8 mRNAs described in previous studies. All the Rex-dependent mRNAs contained a 75-nucleotide intronic region that increased the nuclear retention and degradation of a reporter mRNA in the absence of other viral sequences. Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) analysis revealed that this sequence formed a stable hairpin structure. Cell cycle synchronization experiments indicated that mitosis partially bypasses the requirement for Rex to export Rex-dependent HTLV-1 transcripts. These findings indicate a link between the cycling properties of the host cell and the temporal pattern of viral expression/latency that might influence the ability of the virus to spread and evade the immune system. HTLV-1 is a complex retrovirus that causes two distinct pathologies termed adult T-cell leukemia/lymphoma and tropical spastic paraparesis/HTLV-1-associated myelopathy in about 5% of infected individuals. Expression of the virus depends on the concerted action of Tax, which drives transcription of the viral genome, and Rex, which favors expression of incompletely spliced mRNAs and determines a 2-phase temporal pattern of virus expression. The findings reported

  2. The ribosomal transcription units of Haplorchis pumilio and H. taichui and the use of 28S rDNA sequences for phylogenetic identification of common heterophyids in Vietnam.

    PubMed

    Le, Thanh Hoa; Nguyen, Khue Thi; Nguyen, Nga Thi Bich; Doan, Huong Thi Thanh; Dung, Do Trung; Blair, David

    2017-01-09

    Heterophyidiasis is now a major public health threat in many tropical countries. Species in the trematode family Heterophyidae infecting humans include Centrocestus formosanus, Haplorchis pumilio, H. taichui, H. yokogawai, Procerovum varium and Stellantchasmus falcatus. For molecular phylogenetic and systematic studies on trematodes, we need more prospective markers for taxonomic identification and classification. This study provides near-complete ribosomal transcription units (rTU) from Haplorchis pumilio and H. taichui and demonstrates the use of 28S rDNA sequences for identification and phylogenetic analysis. The near-complete ribosomal transcription units (rTU), consisting of 18S, ITS1, 5.8S, ITS2 and 28S rRNA genes and spacers, from H. pumilio and H. taichui from human hosts in Vietnam, were determined and annotated. Sequence analysis revealed tandem repetitive elements in ITS1 in H. pumilio and in ITS2 in H. taichui. A phylogenetic tree inferred from 28S rDNA sequences of 40 trematode strains/species, including 14 Vietnamese heterophyid individuals, clearly confirmed the status of each of the Vietnamese species: Centrocestus formosanus, Haplorchis pumilio, H. taichui, H. yokogawai, Procerovum varium and Stellantchasmus falcatus. However, the family Heterophyidae was clearly not monophyletic, with some genera apparently allied with other families within the superfamily Opisthorchioidea (i.e. Cryptogonimidae and Opisthorchiidae). These families and their constituent genera require substantial re-evaluation using a combination of morphological and molecular data. Our new molecular data will assist in such studies. The 28S rDNA sequences are conserved among individuals within a species but varied between genera. Based on analysis of 40 28S rDNA sequences representing 19 species in the superfamily Opisthorchioidea and an outgroup taxon (Alaria alata, family Diplostomidae), six common human pathogenic heterophyids were identified and clearly resolved. The

  3. Risk-based assessment applied to QA GLP audits. How to fulfill regulatory requirements while making the best use of our common sense, knowledge, talents, and resources?

    PubMed

    Piton, Alain

    2008-01-01

    For ages the standard plan of internal good laboratory practice (GLP) audits has been designed according to the study critical phases concept. A decade ago the concept of facility-based and process-based audits was adopted, mostly under the influence of short-term and in vitro study design. For unclear reasons, the quarterly inspection scheme has been the prevailing rule. Nowadays, the emerging concept of risk management reaches the field of GLP. In this context, the following items are discussed: i) nature of risks associated with the GLP principles and GLP studies; ii) risk in a GLP environment and criteria used to characterize a risk in laboratory and in an environment of research and development; iii) quality and integrity of data, study results and scientific conclusions; iv) risks associated with the processes and those associated with the products; v) worker safety; vi) consumer safety; vii) variety of tools available for the assessment of the above specific risks; viii) principles of risk assessment (the five-step approach); ix) standard and specific risk assessment tools; x) required level of accuracy; xi) use of risk assessment results for the elaboration of audit plans; xii) nature of information obtained; xiii) prioritization; xiv) intrinsic risk versus available resources; xv) potential caveats from a regulatory standpoint; xvi) compatibility of risk approach with the GLP regulatory requirements; xvii) how to demonstrate that the GLP goals are fulfilled although some of the GLP-specific requirements may not be; xviii) benefits of this approach for audit efficiency and quality system improvement; xix) what the risk approach provides to the organization; xx) how the risk approach's efficiency compares to standard efficacy; xxi) use of metrics for continuous improvement.

  4. Common CYP2D6 polymorphisms affecting alternative splicing and transcription: long-range haplotypes with two regulatory variants modulate CYP2D6 activity.

    PubMed

    Wang, Danxin; Poi, Ming J; Sun, Xiaochun; Gaedigk, Andrea; Leeder, J Steven; Sadee, Wolfgang

    2014-01-01

    Cytochrome P450 2D6 (CYP2D6) is involved in the metabolism of 25% of clinically used drugs. Genetic polymorphisms cause substantial variation in CYP2D6 activity and serve as biomarkers guiding drug therapy. However, genotype-phenotype relationships remain ambiguous except for poor metabolizers carrying null alleles, suggesting the presence of yet unknown genetic variants. Searching for regulatory CYP2D6 polymorphisms, we find that a SNP defining the CYP2D6*2 allele, rs16947 [R296C, 17-60% minor allele frequency (MAF)], previously thought to convey normal activity, alters exon 6 splicing, thereby reducing CYP2D6 expression at least 2-fold. In addition, two completely linked SNPs (rs5758550/rs133333, MAF 13-42%), located in a distant downstream enhancer region (>100 kb) that interacts with the CYP2D6 promoter, increase CYP2D6 transcription more than 2-fold. In high linkage disequilibrium (LD) with each other, rs16947 and the enhancer SNPs form haplotypes that affect CYP2D6 enzyme activity in vivo. In a pediatric cohort of 164 individuals, rs16947 alone (minor haplotype frequency 28%) was associated with reduced CYP2D6 metabolic activity (measured as dextromethorphan/metabolite ratios), whereas rs5758550/rs133333 alone (frequency 3%) resulted in increased CYP2D6 activity, while haplotypes containing both rs16947 and rs5758550/rs133333 were similar to the wild-type. Other alleles used in biomarker panels carrying these variants such as CYP2D6*41 require re-evaluation of independ