High throughput platforms for structural genomics of integral membrane proteins.
Mancia, Filippo; Love, James
2011-08-01
Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on actively ongoing structural genomics of membrane protein initiatives, with a focus on those aimed at implementing interesting techniques aimed at increasing our rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.
Hamilton, Eileen P; Kapusta, Aurélie; Huvos, Piroska E; Bidwell, Shelby L; Zafar, Nikhat; Tang, Haibao; Hadjithomas, Michalis; Krishnakumar, Vivek; Badger, Jonathan H; Caler, Elisabet V; Russ, Carsten; Zeng, Qiandong; Fan, Lin; Levin, Joshua Z; Shea, Terrance; Young, Sarah K; Hegarty, Ryan; Daza, Riza; Gujja, Sharvari; Wortman, Jennifer R; Birren, Bruce W; Nusbaum, Chad; Thomas, Jainy; Carey, Clayton M; Pritham, Ellen J; Feschotte, Cédric; Noto, Tomoko; Mochizuki, Kazufumi; Papazyan, Romeo; Taverna, Sean D; Dear, Paul H; Cassidy-Hanley, Donna M; Xiong, Jie; Miao, Wei; Orias, Eduardo; Coyne, Robert S
2016-01-01
The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena’s germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum. DOI: http://dx.doi.org/10.7554/eLife.19090.001 PMID:27892853
Lee, Chi-Ching; Chen, Yi-Ping Phoebe; Yao, Tzu-Jung; Ma, Cheng-Yu; Lo, Wei-Cheng; Lyu, Ping-Chiang; Tang, Chuan Yi
2013-04-10
Sequencing of microbial genomes is important because of microbial-carrying antibiotic and pathogenetic activities. However, even with the help of new assembling software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for ongoing sequencing genomes. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of the ongoing genome projects in contigs or scaffolds can be submitted to our Web server, and it provides the functional annotation and highly probable GI-predicting results. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information, including possible GIs, coding/non-coding sequences and functional analysis, from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project. Copyright © 2012 Elsevier B.V. All rights reserved.
Active Transposition in Genomes
Huang, Cheng Ran Lisa; Burns, Kathleen H.; Boeke, Jef D.
2013-01-01
Transposons are DNA sequences capable of moving in genomes. Early evidence showed their accumulation in many species and suggested their continued activity in at least isolated organisms. In the past decade, with the development of various genomic technologies, it has become abundantly clear that ongoing activity is the rule rather than the exception. Active transposons of various classes are observed throughout plants and animals, including humans. They continue to create new insertions, have an enormous variety of structural and functional impact on genes and genomes, and play important roles in genome evolution. Transposon activities have been identified and measured by employing various strategies. Here, we summarize evidence of current transposon activity in various plant and animal genomes. PMID:23145912
Refining the structure and content of clinical genomic reports.
Dorschner, Michael O; Amendola, Laura M; Shirts, Brian H; Kiedrowski, Lesli; Salama, Joseph; Gordon, Adam S; Fullerton, Stephanie M; Tarczy-Hornoch, Peter; Byers, Peter H; Jarvik, Gail P
2014-03-01
To effectively articulate the results of exome and genome sequencing we refined the structure and content of molecular test reports. To communicate results of a randomized control trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results, and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. © 2014 Wiley Periodicals, Inc.
Refining the Structure and Content of Clinical Genomic Reports
DORSCHNER, MICHAEL O.; AMENDOLA, LAURA M.; SHIRTS, BRIAN H.; KIEDROWSKI, LESLI; SALAMA, JOSEPH; GORDON, ADAM S.; FULLERTON, STEPHANIE M.; TARCZY-HORNOCH, PETER; BYERS, PETER H.; JARVIK, GAIL P.
2014-01-01
To effectively articulate the results of exome and genome sequencing we refined the structure and content of molecular test reports. To communicate results of a randomized control trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results, and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. PMID:24616401
Akhunov, Eduard D.; Sehgal, Sunish; Liang, Hanquan; Wang, Shichen; Akhunova, Alina R.; Kaur, Gaganpreet; Li, Wanlong; Forrest, Kerrie L.; See, Deven; Šimková, Hana; Ma, Yaqin; Hayden, Matthew J.; Luo, Mingcheng; Faris, Justin D.; Doležel, Jaroslav; Gill, Bikram S.
2013-01-01
Cycles of whole-genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied by comparing the patterns of gene structure changes, alternative splicing (AS), and codon substitution rates among wheat and model grass genomes. In orthologous gene sets, significantly more acquired and lost exonic sequences were detected in wheat than in model grasses. In wheat, 35% of these gene structure rearrangements resulted in frame-shift mutations and premature termination codons. An increased codon mutation rate in the wheat lineage compared with Brachypodium distachyon was found for 17% of orthologs. The discovery of premature termination codons in 38% of expressed genes was consistent with ongoing pseudogenization of the wheat genome. The rates of AS within the individual wheat subgenomes (21%–25%) were similar to diploid plants. However, we uncovered a high level of AS pattern divergence between the duplicated homeologous copies of genes. Our results are consistent with the accelerated accumulation of AS isoforms, nonsynonymous mutations, and gene structure rearrangements in the wheat lineage, likely due to genetic redundancy created by WGDs. Whereas these processes mostly contribute to the degeneration of a duplicated genome and its diploidization, they have the potential to facilitate the origin of new functional variations, which, upon selection in the evolutionary lineage, may play an important role in the origin of novel traits. PMID:23124323
Retrotransposons as regulators of gene expression
Elbarbary, Reyad A.; Lucas, Bronwyn A.; Maquat, Lynne E.
2016-01-01
Transposable elements (TEs) are both a boon and a bane to eukaryotic organisms, depending on where they integrate into the genome and how their sequences function once integrated. We focus on two types of TEs: long interspersed elements (LINEs) and short interspersed elements (SINEs). LINEs and SINEs are retrotransposons; that is, they transpose via an RNA intermediate. We discuss how LINEs and SINEs have expanded in eukaryotic genomes and contribute to genome evolution. An emerging body of evidence indicates that LINEs and SINEs function to regulate gene expression by affecting chromatin structure, gene transcription, pre-mRNA processing, or aspects of mRNA metabolism. We also describe how adenosine-to-inosine editing influences SINE function and how ongoing retrotransposition is countered by the body’s defense mechanisms. PMID:26912865
Bradley, Anthony R; Echalier, Aude; Fairhead, Michael; Strain-Damerell, Claire; Brennan, Paul; Bullock, Alex N; Burgess-Brown, Nicola A; Carpenter, Elisabeth P; Gileadi, Opher; Marsden, Brian D; Lee, Wen Hwa; Yue, Wyatt; Bountra, Chas; von Delft, Frank
2017-11-08
The ongoing explosion in genomics data has long since outpaced the capacity of conventional biochemical methodology to verify the large number of hypotheses that emerge from the analysis of such data. In contrast, it is still a gold-standard for early phenotypic validation towards small-molecule drug discovery to use probe molecules (or tool compounds), notwithstanding the difficulty and cost of generating them. Rational structure-based approaches to ligand discovery have long promised the efficiencies needed to close this divergence; in practice, however, this promise remains largely unfulfilled, for a host of well-rehearsed reasons and despite the huge technical advances spearheaded by the structural genomics initiatives of the noughties. Therefore the current, fourth funding phase of the Structural Genomics Consortium (SGC), building on its extensive experience in structural biology of novel targets and design of protein inhibitors, seeks to redefine what it means to do structural biology for drug discovery. We developed the concept of a Target Enabling Package (TEP) that provides, through reagents, assays and data, the missing link between genetic disease linkage and the development of usefully potent compounds. There are multiple prongs to the ambition: rigorously assessing targets' genetic disease linkages through crowdsourcing to a network of collaborating experts; establishing a systematic approach to generate the protocols and data that comprise each target's TEP; developing new, X-ray-based fragment technologies for generating high quality chemical matter quickly and cheaply; and exploiting a stringently open access model to build multidisciplinary partnerships throughout academia and industry. By learning how to scale these approaches, the SGC aims to make structures finally serve genomics, as originally intended, and demonstrate how 3D structures systematically allow new modes of druggability to be discovered for whole classes of targets. 
© 2017 The Author(s).
A Workshop Report on Wheat Genome Sequencing
Gill, Bikram S.; Appels, Rudi; Botha-Oberholster, Anna-Maria; Buell, C. Robin; Bennetzen, Jeffrey L.; Chalhoub, Boulos; Chumley, Forrest; Dvořák, Jan; Iwanaga, Masaru; Keller, Beat; Li, Wanlong; McCombie, W. Richard; Ogihara, Yasunari; Quetier, Francis; Sasaki, Takuji
2004-01-01
Sponsored by the National Science Foundation and the U.S. Department of Agriculture, a wheat genome sequencing workshop was held November 10–11, 2003, in Washington, DC. It brought together 63 scientists of diverse research interests and institutions, including 45 from the United States and 18 from a dozen foreign countries (see list of participants at http://www.ksu.edu/igrow). The objectives of the workshop were to discuss the status of wheat genomics, obtain feedback from ongoing genome sequencing projects, and develop strategies for sequencing the wheat genome. The purpose of this report is to convey the information discussed at the workshop and provide the basis for an ongoing dialogue, bringing forth comments and suggestions from the genetics community. PMID:15514080
Retrotransposons as regulators of gene expression.
Elbarbary, Reyad A; Lucas, Bronwyn A; Maquat, Lynne E
2016-02-12
Transposable elements (TEs) are both a boon and a bane to eukaryotic organisms, depending on where they integrate into the genome and how their sequences function once integrated. We focus on two types of TEs: long interspersed elements (LINEs) and short interspersed elements (SINEs). LINEs and SINEs are retrotransposons; that is, they transpose via an RNA intermediate. We discuss how LINEs and SINEs have expanded in eukaryotic genomes and contribute to genome evolution. An emerging body of evidence indicates that LINEs and SINEs function to regulate gene expression by affecting chromatin structure, gene transcription, pre-mRNA processing, or aspects of mRNA metabolism. We also describe how adenosine-to-inosine editing influences SINE function and how ongoing retrotransposition is countered by the body's defense mechanisms. Copyright © 2016, American Association for the Advancement of Science.
A whole-genome, radiation hybrid map of wheat
USDA-ARS's Scientific Manuscript database
Generating a reference sequence of bread wheat (Triticum aestivum L.) is a challenging task because of its large, highly repetitive and allopolyploid genome. Ordering of BAC- and NGS-based contigs in ongoing wheat genome-sequencing projects primarily uses recombination and comparative genomics-base...
GenColors: annotation and comparative genomics of prokaryotes made easy.
Romualdi, Alessandro; Felder, Marius; Rose, Dominic; Gausmann, Ulrike; Schilhabel, Markus; Glöckner, Gernot; Platzer, Matthias; Sühnel, Jürgen
2007-01-01
GenColors (gencolors.fli-leibniz.de) is a new web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. A variety of export/import filters manages an effective data flow from sequence assembly and manipulation programs (e.g., GAP4) to GenColors and back as well as to standard GenBank file(s). The genome comparison tools include best bidirectional hits, gene conservation, syntenies, and gene core sets. Precomputed UniProt matches allow annotation and analysis in an effective manner. In addition to these analysis options, base-specific quality data (coverage and confidence) can also be handled if available. The GenColors system can be used both for annotation purposes in ongoing genome projects and as an analysis tool for finished genomes. GenColors comes in two types, as dedicated genome browsers and as the Jena Prokaryotic Genome Viewer (JPGV). Dedicated genome browsers contain genomic information on a set of related genomes and offer a large number of options for genome comparison. The system has been efficiently used in the genomic sequencing of Borrelia garinii and is currently applied to various ongoing genome projects on Borrelia, Legionella, Escherichia, and Pseudomonas genomes. One of these dedicated browsers, the Spirochetes Genome Browser (sgb.fli-leibniz.de) with Borrelia, Leptospira, and Treponema genomes, is freely accessible. The others will be released after finalization of the corresponding genome projects. JPGV (jpgv.fli-leibniz.de) offers information on almost all finished bacterial genomes, as compared to the dedicated browsers with reduced genome comparison functionality, however. 
As of January 2006, this viewer includes 632 genomic elements (e.g., chromosomes and plasmids) of 293 species. The system provides versatile quick and advanced search options for all currently known prokaryotic genomes and generates circular and linear genome plots. Gene information sheets contain basic gene information, database search options, and links to external databases. GenColors is also available on request for local installation.
78 FR 18680 - Genomic Medicine Program Advisory Committee, Notice of Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-27
... DEPARTMENT OF VETERANS AFFAIRS Genomic Medicine Program Advisory Committee, Notice of Meeting The..., that the Genomic Medicine Program Advisory Committee will meet on April 11, 2013, in Suite 1000 at the... ongoing Million Veteran Program, as well as the clinical Genomic Medicine Service. The emerging...
2012-07-01
compared between wild type and mutant plants via chromatin immunoprecipitation (ChIP). Additionally, differences in centromere structure between wild...specific focus on non-CpG contexts. The proposed work is ongoing, and so far the major accomplishments include creation of relevant plant lines...laboratories that study topics related to breast cancer and epigenetics 1. Monthly journal club meetings at the Center for Vertebrate Genomics (CVG) which
Crystal structure of enterococcus faecalis sly A-like transcriptional factor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, R.; Zhang, R.; Zagnitko, O.
2003-05-30
The crystal structure of a SlyA transcriptional regulator at 1.6 Å resolution is presented, and structural relationships between members of the MarR/SlyA family are discussed. The SlyA family, which includes SlyA, Rap, Hor, and RovA proteins, is widely distributed in bacterial and archaeal genomes. Current evidence suggests that SlyA-like factors act as repressors, activators, and modulators of gene transcription. These proteins have been shown to up-regulate the expression of molecular chaperones, acid-resistance proteins, and cytolysin, and down-regulate several biosynthetic enzymes. The structure of SlyA from Enterococcus faecalis, determined as a part of an ongoing structural genomics initiative (www.mcsg.anl.gov), revealed the same winged helix DNA-binding motif that was recently found in the MarR repressor from Escherichia coli and the MexR repressor from Pseudomonas aeruginosa, a sequence homologue of MarR. Phylogenetic analysis of the MarR/SlyA family suggests that Sly is placed between the SlyA and MarR subfamilies and shows significant sequence similarity to members of both subfamilies.
McEwen, Jean E; Boyer, Joy T; Sun, Kathie Y; Rothenberg, Karen H; Lockhart, Nicole C; Guyer, Mark S
2014-01-01
For more than 20 years, the Ethical, Legal, and Social Implications (ELSI) Program of the National Human Genome Research Institute has supported empirical and conceptual research to anticipate and address the ethical, legal, and social implications of genomics. As a component of the agency that funds much of the underlying science, the program has always been an experiment. The ever-expanding number of issues the program addresses and the relatively low level of commitment on the part of other funding agencies to support such research make setting priorities especially challenging. Program-supported studies have had a significant impact on the conduct of genomics research, the implementation of genomic medicine, and broader public policies. The program's influence is likely to grow as ELSI research, genomics research, and policy development activities become increasingly integrated. Achieving the benefits of increased integration while preserving the autonomy, objectivity, and intellectual independence of ELSI investigators presents ongoing challenges and new opportunities.
USDA-ARS's Scientific Manuscript database
Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...
Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K
2016-04-18
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.
NASA Astrophysics Data System (ADS)
Gusev, Oleg; Sugimoto, Manabu; Novikova, Nataliya; Sychev, Vladimir; Okuda, Takashi; Kikawada, Takahiro
2012-07-01
Anhydrobiotic chironomid larvae of Polypedilum vanderplanki (Diptera) can withstand prolonged complete desiccation as well as other external stresses including ionizing radiation. Recent experiments showed that this insect is able to survive long-term exposure to real outer space. At the same time, we found that dehydration causes alterations in chromatin structure and a severe fragmentation of nuclear DNA in the cells of the larvae despite successful anhydrobiosis. Analysis of several remote populations of the chironomid in Africa suggests that desiccation-related DNA damage might be a driving genetic force for rapid radiation within the species. First results of an ongoing genome project suggest that the origin and evolution of anhydrobiosis in this single insect species are related to rapid duplication of the genes coding for late embryogenesis abundant (LEA) proteins and other molecular agents directly involved in desiccation resistance in the cells. Analysis of genome-wide mRNA expression profiles in the larvae subjected to desiccation shows that joint activity of large multi-gene coding regions in the genome is involved in the control of anhydrobiosis-related molecular adaptations in the chironomid.
Patterns of admixture and population structure in native populations of Northwest North America.
Verdu, Paul; Pemberton, Trevor J; Laurent, Romain; Kemp, Brian M; Gonzalez-Oliver, Angelica; Gorodezky, Clara; Hughes, Cris E; Shattuck, Milena R; Petzelt, Barbara; Mitchell, Joycelynn; Harry, Harold; William, Theresa; Worl, Rosita; Cybulski, Jerome S; Rosenberg, Noah A; Malhi, Ripan S
2014-08-01
The initial contact of European populations with indigenous populations of the Americas produced diverse admixture processes across North, Central, and South America. Recent studies have examined the genetic structure of indigenous populations of Latin America and the Caribbean and their admixed descendants, reporting on the genomic impact of the history of admixture with colonizing populations of European and African ancestry. However, relatively little genomic research has been conducted on admixture in indigenous North American populations. In this study, we analyze genomic data at 475,109 single-nucleotide polymorphisms sampled in indigenous peoples of the Pacific Northwest in British Columbia and Southeast Alaska, populations with a well-documented history of contact with European and Asian traders, fishermen, and contract laborers. We find that the indigenous populations of the Pacific Northwest have higher gene diversity than Latin American indigenous populations. Among the Pacific Northwest populations, interior groups provide more evidence for East Asian admixture, whereas coastal groups have higher levels of European admixture. In contrast with many Latin American indigenous populations, the variance of admixture is high in each of the Pacific Northwest indigenous populations, as expected for recent and ongoing admixture processes. The results reveal some similarities but notable differences between admixture patterns in the Pacific Northwest and those in Latin America, contributing to a more detailed understanding of the genomic consequences of European colonization events throughout the Americas.
Patterns of Admixture and Population Structure in Native Populations of Northwest North America
Verdu, Paul; Pemberton, Trevor J.; Laurent, Romain; Kemp, Brian M.; Gonzalez-Oliver, Angelica; Gorodezky, Clara; Hughes, Cris E.; Shattuck, Milena R.; Petzelt, Barbara; Mitchell, Joycelynn; Harry, Harold; William, Theresa; Worl, Rosita; Cybulski, Jerome S.; Rosenberg, Noah A.; Malhi, Ripan S.
2014-01-01
The initial contact of European populations with indigenous populations of the Americas produced diverse admixture processes across North, Central, and South America. Recent studies have examined the genetic structure of indigenous populations of Latin America and the Caribbean and their admixed descendants, reporting on the genomic impact of the history of admixture with colonizing populations of European and African ancestry. However, relatively little genomic research has been conducted on admixture in indigenous North American populations. In this study, we analyze genomic data at 475,109 single-nucleotide polymorphisms sampled in indigenous peoples of the Pacific Northwest in British Columbia and Southeast Alaska, populations with a well-documented history of contact with European and Asian traders, fishermen, and contract laborers. We find that the indigenous populations of the Pacific Northwest have higher gene diversity than Latin American indigenous populations. Among the Pacific Northwest populations, interior groups provide more evidence for East Asian admixture, whereas coastal groups have higher levels of European admixture. In contrast with many Latin American indigenous populations, the variance of admixture is high in each of the Pacific Northwest indigenous populations, as expected for recent and ongoing admixture processes. The results reveal some similarities but notable differences between admixture patterns in the Pacific Northwest and those in Latin America, contributing to a more detailed understanding of the genomic consequences of European colonization events throughout the Americas. PMID:25122539
Crystal structure of chorismate mutase from Burkholderia thailandensis.
Asojo, Oluwatoyin A; Dranow, David M; Serbzhinskiy, Dmitry; Subramanian, Sandhya; Staker, Bart; Edwards, Thomas E; Myler, Peter J
2018-05-01
Burkholderia thailandensis is often used as a model for more virulent members of this genus of proteobacteria that are highly antibiotic-resistant and are potential agents of biological warfare that are infective by inhalation. As part of ongoing efforts to identify potential targets for the development of rational therapeutics, the structures of enzymes that are absent in humans, including that of chorismate mutase from B. thailandensis, have been determined by the Seattle Structural Genomics Center for Infectious Disease. The high-resolution structure of chorismate mutase from B. thailandensis was determined in the monoclinic space group P2₁ with three homodimers per asymmetric unit. The overall structure of each protomer has the prototypical AroQγ topology and shares conserved binding-cavity residues with other chorismate mutases, including those with which it has no appreciable sequence identity.
Griffin, Philippa C.; Hoffmann, Ary A.
2014-01-01
Background and Aims: While molecular approaches can often accurately reconstruct species relationships, taxa that are incompletely differentiated pose a challenge even with extensive data. Such taxa are functionally differentiated, but may be genetically differentiated only at small and/or patchy regions of the genome. This issue is considered here in Poa tussock grass species that dominate grassland and herbfields in the Australian alpine zone. Methods: Previously reported tetraploidy was confirmed in all species by sequencing seven nuclear regions and five microsatellite markers. A Bayesian approach was used to co-estimate nuclear and chloroplast gene trees with an overall dated species tree. The resulting species tree was used to examine species structure and recent hybridization, and intertaxon fertility was tested by experimental crosses. Key Results: Species tree estimation revealed Poa gunnii, a Tasmanian endemic species, as sister to the rest of the Australian alpine Poa. The taxa have radiated in the last 0·5–1·2 million years and the non-gunnii taxa are not supported as genetically distinct. Recent hybridization following past species divergence was also not supported. Ongoing gene flow is suggested, with some broad-scale geographic structure within the group. Conclusions: The Australian alpine Poa species are not genetically distinct despite being distinguishable phenotypically, suggesting recent adaptive divergence with ongoing intertaxon gene flow. This highlights challenges in using conventional molecular taxonomy to infer species relationships in recent, rapid radiations. PMID:24607721
van de Guchte, M; Penaud, S; Grimaldi, C; Barbe, V; Bryson, K; Nicolas, P; Robert, C; Oztas, S; Mangenot, S; Couloux, A; Loux, V; Dervyn, R; Bossy, R; Bolotin, A; Batto, J-M; Walunas, T; Gibrat, J-F; Bessières, P; Weissenbach, J; Ehrlich, S D; Maguin, E
2006-06-13
Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is a representative of the group of lactic acid-producing bacteria, mainly known for its worldwide application in yogurt production. The genome sequence of this bacterium has been determined and shows the signs of ongoing specialization, with a substantial number of pseudogenes and incomplete metabolic pathways and relatively few regulatory functions. Several unique features of the L. bulgaricus genome support the hypothesis that the genome is in a phase of rapid evolution. (i) Exceptionally high numbers of rRNA and tRNA genes with regard to genome size may indicate that the L. bulgaricus genome has known a recent phase of important size reduction, in agreement with the observed high frequency of gene inactivation and elimination; (ii) a much higher GC content at codon position 3 than expected on the basis of the overall GC content suggests that the composition of the genome is evolving toward a higher GC content; and (iii) the presence of a 47.5-kbp inverted repeat in the replication termination region, an extremely rare feature in bacterial genomes, may be interpreted as a transient stage in genome evolution. The results indicate the adaptation of L. bulgaricus from a plant-associated habitat to the stable protein and lactose-rich milk environment through the loss of superfluous functions and protocooperation with Streptococcus thermophilus.
The first genome sequences of human bocaviruses from Vietnam
Thanh, Tran Tan; Van, Hoang Minh Tu; Hong, Nguyen Thi Thu; Nhu, Le Nguyen Truc; Anh, Nguyen To; Tuan, Ha Manh; Hien, Ho Van; Tuong, Nguyen Manh; Kien, Trinh Trung; Khanh, Truong Huu; Nhan, Le Nguyen Thanh; Hung, Nguyen Thanh; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H. Rogier; Tan, Le Van
2017-01-01
As part of an ongoing effort to generate complete genome sequences of hand, foot and mouth disease-causing enteroviruses directly from clinical specimens, two complete coding sequences and two partial genomic sequences of human bocavirus 1 (n=3) and 2 (n=1) were co-amplified and sequenced, representing the first genome sequences of human bocaviruses from Vietnam. The sequences may aid future studies aimed at understanding the evolution of the virus. PMID:28090592
2017-01-01
We investigated the spatiotemporal dynamics of HSV genome transport during the initiation of infection using viruses containing bioorthogonal traceable precursors incorporated into their genomes (HSVEdC). In vitro assays revealed a structural alteration in the capsid induced upon HSVEdC binding to solid supports that allowed coupling to external capture agents and demonstrated that the vast majority of individual virions contained bioorthogonally-tagged genomes. Using HSVEdC in vivo we reveal novel aspects of the kinetics, localisation, mechanistic entry requirements and morphological transitions of infecting genomes. Uncoating and nuclear import was observed within 30 min, with genomes in a defined compaction state (ca. 3-fold volume increase from capsids). Free cytosolic uncoated genomes were infrequent (7–10% of the total uncoated genomes), likely a consequence of subpopulations of cells receiving high particle numbers. Uncoated nuclear genomes underwent temporal transitions in condensation state and while ICP4 efficiently associated with condensed foci of initial infecting genomes, this relationship switched away from residual longer lived condensed foci to increasingly decondensed genomes as infection progressed. Inhibition of transcription had no effect on nuclear entry but in the absence of transcription, genomes persisted as tightly condensed foci. Ongoing transcription, in the absence of protein synthesis, revealed a distinct spatial clustering of genomes, which we have termed genome congregation, not seen with non-transcribing genomes. Genomes expanded to more decondensed forms in the absence of DNA replication indicating additional transitional steps. During full progression of infection, genomes decondensed further, with a diffuse low intensity signal dissipated within replication compartments, but frequently with tight foci remaining peripherally, representing unreplicated genomes or condensed parental strands of replicated DNA. 
Uncoating and nuclear entry was independent of proteasome function and resistant to inhibitors of nuclear export. Together with additional data our results reveal new insight into the spatiotemporal dynamics of HSV genome uncoating, transport and organisation. PMID:29121649
Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.
2016-01-01
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667
Genomic data-sharing: what will be our legacy?
Callier, Shawneequa; Husain, Rajah; Simpson, Rachel
2014-01-01
Prior to 1974, the Tuskegee Syphilis experiments, expansive use of the HeLa cells, and other blatant instances of research abuse pervaded the medical research field. Ongoing challenges to informed consent, privacy and data-sharing will influence the stories that research participants today share with future generations. This has significant implications for the advancement of genomic science, and the public's perception of genomic research. PMID:24634673
Sweet, Kevin; Gordon, Erynn S.; Sturm, Amy C.; Schmidlen, Tara J.; Manickam, Kandamurugu; Toland, Amanda Ewart; Keller, Margaret A.; Stack, Catharine B.; García-España, J. Felipe; Bellafante, Mark; Tayal, Neeraj; Embi, Peter; Binkley, Philip; Hershberger, Ray E.; Sadee, Wolfgang; Christman, Michael; Marsh, Clay
2014-01-01
We describe the development and implementation of a randomized controlled trial to investigate the impact of genomic counseling on a cohort of patients with heart failure (HF) or hypertension (HTN), managed at a large academic medical center, the Ohio State University Wexner Medical Center (OSUWMC). Our study is built upon the existing Coriell Personalized Medicine Collaborative (CPMC®). OSUWMC patient participants with chronic disease (CD) receive eight actionable complex disease and one pharmacogenomic test report through the CPMC® web portal. Participants are randomized to either the in-person post-test genomic counseling—active arm, versus web-based only return of results—control arm. Study-specific surveys measure: (1) change in risk perception; (2) knowledge retention; (3) perceived personal control; (4) health behavior change; and, for the active arm (5), overall satisfaction with genomic counseling. This ongoing partnership has spurred creation of both infrastructure and procedures necessary for the implementation of genomics and genomic counseling in clinical care and clinical research. This included creation of a comprehensive informed consent document and processes for prospective return of actionable results for multiple complex diseases and pharmacogenomics (PGx) through a web portal, and integration of genomic data files and clinical decision support into an EPIC-based electronic medical record. We present this partnership, the infrastructure, genomic counseling approach, and the challenges that arose in the design and conduct of this ongoing trial to inform subsequent collaborative efforts and best genomic counseling practices. PMID:24926413
Sperschneider, Jana; Gardiner, Donald M.; Thatcher, Louise F.; Lyons, Rebecca; Singh, Karam B.; Manners, John M.; Taylor, Jennifer M.
2015-01-01
Pathogens and hosts are in an ongoing arms race and genes involved in host–pathogen interactions are likely to undergo diversifying selection. Fusarium plant pathogens have evolved diverse infection strategies, but how they interact with their hosts in the biotrophic infection stage remains puzzling. To address this, we analyzed the genomes of three Fusarium plant pathogens for genes that are under diversifying selection. We found a two-speed genome structure both on the chromosome and gene group level. Diversifying selection acts strongly on the dispensable chromosomes in Fusarium oxysporum f. sp. lycopersici and on distinct core chromosome regions in Fusarium graminearum, all of which have associations with virulence. Members of two gene groups evolve rapidly, namely those that encode proteins with an N-terminal [SG]-P-C-[KR]-P sequence motif and proteins that are conserved predominantly in pathogens. Specifically, 29 F. graminearum genes are rapidly evolving, in planta induced and encode secreted proteins, strongly pointing toward effector function. In summary, diversifying selection in Fusarium is strongly reflected as genomic footprints and can be used to predict a small gene set likely to be involved in host–pathogen interactions for experimental verification. PMID:25994930
Applications of genomics to slow the spread of multidrug-resistant Neisseria gonorrhoeae.
Mortimer, Tatum D; Grad, Yonatan H
2018-06-06
Infections with Neisseria gonorrhoeae, a sexually transmitted pathogen that causes urethritis, cervicitis, and more severe complications, are increasing. Gonorrhea is typically treated with antibiotics; however, N. gonorrhoeae has rapidly acquired resistance to many antibiotic classes, and lineages with reduced susceptibility to the currently recommended therapies are emerging worldwide. In this review, we discuss the contributions of whole genome sequencing (WGS) to our understanding of resistant N. gonorrhoeae. Genomics has illuminated the evolutionary origins and population structure of N. gonorrhoeae and the magnitude of horizontal gene transfer within and between Neisseria species. WGS can be used to predict the susceptibility of N. gonorrhoeae based on known resistance determinants, track the spread of these determinants throughout the N. gonorrhoeae population, and identify novel loci contributing to resistance. WGS has also allowed more detailed epidemiological analysis of transmission of N. gonorrhoeae between individuals and populations than previously used typing methods. Ongoing N. gonorrhoeae genomics will complement other laboratory techniques to understand the biology and evolution of the pathogen, improve diagnostics and treatment in the clinic, and inform public health policies to limit the impact of antibiotic resistance. © 2018 New York Academy of Sciences.
Toward genome-enabled mycology.
Hibbett, David S; Stajich, Jason E; Spatafora, Joseph W
2013-01-01
Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data.
Pediatric Genomic Data Inventory (PGDI) Overview
About Pediatric cancer is a genetic disease that can largely differ from similar malignancies in an adult population. To fuel new discoveries and treatments specific to pediatric oncologies, the NCI Office of Cancer Genomics has developed a dynamic resource known as the Pediatric Genomic Data Inventory to allow investigators to more easily locate genomic datasets. This resource lists known ongoing and completed sequencing projects of pediatric cancer cohorts from the United States and other countries, along with some basic details and reference metadata.
Transcriptome assembly, gene annotation and tissue gene expression atlas of the rainbow trout
USDA-ARS's Scientific Manuscript database
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, we reported a transcriptome reference sequence using a 19X coverage of Sanger and 454-pyrosequencing dat...
Lee, Vivian C Y; Chow, Judy F C; Lau, Estella Y L; Yeung, William S B; Ho, P C; Ng, Ernest H Y
2015-02-01
To compare the pregnancy outcome of the fluorescent in-situ hybridisation and array comparative genomic hybridisation in preimplantation genetic diagnosis of translocation carriers. Historical cohort. A teaching hospital in Hong Kong. All preimplantation genetic diagnosis treatment cycles performed for translocation carriers from 2001 to 2013. Overall, 101 treatment cycles for preimplantation genetic diagnosis in translocation were included: 77 cycles for reciprocal translocation and 24 cycles for Robertsonian translocation. Fluorescent in-situ hybridisation and array comparative genomic hybridisation were used in 78 and 11 cycles, respectively. The ongoing pregnancy rate per initiated cycle after array comparative genomic hybridisation was significantly higher than that after fluorescent in-situ hybridisation in all translocation carriers (36.4% vs 9.0%; P=0.010). The miscarriage rate was comparable with both techniques. The testing method (array comparative genomic hybridisation or fluorescent in-situ hybridisation) was the only significant factor affecting the ongoing pregnancy rate after controlling for the women's age, type of translocation, and clinical information of the preimplantation genetic diagnosis cycles by logistic regression (odds ratio=1.875; P=0.023; 95% confidence interval, 1.090-3.226). This local retrospective study confirmed that comparative genomic hybridisation is associated with significantly higher pregnancy rates versus fluorescent in-situ hybridisation in translocation carriers. Array comparative genomic hybridisation should be the technique of choice in preimplantation genetic diagnosis cycles in translocation carriers.
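The reported odds ratio, confidence interval and P value can be checked against one another. The sketch below assumes a Wald-type interval computed on the log-odds scale, which the abstract does not state; the function name is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the standard error, z statistic and two-sided P value from an
    odds ratio and its 95% confidence interval.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return se, z, p_two_sided


# Values as reported in the abstract above
se, z, p = wald_p_from_or_ci(1.875, 1.090, 3.226)
print(f"SE={se:.3f}  z={z:.2f}  P={p:.3f}")
```

Under that assumption the recovered P value is close to the reported P=0.023, so the three figures are mutually consistent.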
Pilgrim, Brettney L; Perry, Robert C; Keefe, Donald G; Perry, Elizabeth A; Dawn Marshall, H
2012-01-01
In conservation genetics and management, it is important to understand the contribution of historical and contemporary processes to geographic patterns of genetic structure in order to characterize and preserve diversity. As part of a 10-year monitoring program by the Government of Newfoundland and Labrador, Canada, we measured the population genetic structure of the world's most northern native populations of brook trout (Salvelinus fontinalis) in Labrador to gather baseline data to facilitate monitoring of future impacts of the recently opened Trans-Labrador Highway. Six-locus microsatellite profiles were obtained from 1130 fish representing 32 populations from six local regions. Genetic diversity in brook trout populations in Labrador (average HE= 0.620) is within the spectrum of variability found in other brook trout across their northeastern range, with limited ongoing gene flow occurring between populations (average pairwise FST= 0.139). Evidence for some contribution of historical processes shaping genetic structure was inferred from an isolation-by-distance analysis, while dual routes of post-Wisconsinan recolonization were indicated by STRUCTURE analysis: K= 2 was the most likely number of genetic groups, revealing a separation between northern and west-central Labrador from all remaining populations. Our results represent the first data from the nuclear genome of brook trout in Labrador and emphasize the usefulness of microsatellite data for revealing the extent to which genetic structure is shaped by both historical and contemporary processes. PMID:22837834
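The expected heterozygosity (HE) quoted above is averaged over loci. A minimal sketch of the per-locus calculation follows, assuming the simple estimator HE = 1 - sum of squared allele frequencies without a small-sample correction; the abstract does not say which estimator was used, and the allele lists are toy data.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity at one locus: 1 - sum(p_i ** 2).

    `alleles` is a flat list of allele calls (two per diploid individual).
    This is the uncorrected estimator; no sample-size adjustment is applied.
    """
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def mean_heterozygosity(loci):
    """Average HE across loci, where each locus is a list of allele calls."""
    return sum(expected_heterozygosity(locus) for locus in loci) / len(loci)


# Toy data (hypothetical microsatellite allele sizes at two loci)
toy_loci = [
    [120, 120, 124, 124],       # two alleles at equal frequency
    [200, 200, 200, 200],       # monomorphic locus
]
print(mean_heterozygosity(toy_loci))
```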
Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome
Lan, Tianying; Renner, Tanya; Ibarra-Laclette, Enrique; Farr, Kimberly M.; Chang, Tien-Hao; Cervantes-Pérez, Sergio Alan; Zheng, Chunfang; Sankoff, David; Tang, Haibao; Purbojati, Rikky W.; Putra, Alexander; Drautz-Moses, Daniela I.; Schuster, Stephan C.; Herrera-Estrella, Luis; Albert, Victor A.
2017-01-01
Utricularia gibba, the humped bladderwort, is a carnivorous plant that retains a tiny nuclear genome despite at least two rounds of whole genome duplication (WGD) since common ancestry with grapevine and other species. We used a third-generation genome assembly with several complete chromosomes to reconstruct the two most recent lineage-specific ancestral genomes that led to the modern U. gibba genome structure. Patterns of subgenome dominance in the most recent WGD, both architectural and transcriptional, are suggestive of allopolyploidization, which may have generated genomic novelty and led to instantaneous speciation. Syntenic duplicates retained in polyploid blocks are enriched for transcription factor functions, whereas gene copies derived from ongoing tandem duplication events are enriched in metabolic functions potentially important for a carnivorous plant. Among these are tandem arrays of cysteine protease genes with trap-specific expression that evolved within a protein family known to be useful in the digestion of animal prey. Further enriched functions among tandem duplicates (also with trap-enhanced expression) include peptide transport (intercellular movement of broken-down prey proteins), ATPase activities (bladder-trap acidification and transmembrane nutrient transport), hydrolase and chitinase activities (breakdown of prey polysaccharides), and cell-wall dynamic components possibly associated with active bladder movements. Whereas independently polyploid Arabidopsis syntenic gene duplicates are similarly enriched for transcriptional regulatory activities, Arabidopsis tandems are distinct from those of U. gibba, while still metabolic and likely reflecting unique adaptations of that species. Taken together, these findings highlight the special importance of tandem duplications in the adaptive landscapes of a carnivorous plant genome. PMID:28507139
The Microprocessor controls the activity of mammalian retrotransposons
Heras, Sara R.; Macias, Sara; Plass, Mireya; Fernandez, Noemí; Cano, David; Eyras, Eduardo; Garcia-Perez, José L.; Cáceres, Javier F.
2013-01-01
More than half of the human genome is made of Transposable Elements. Their ongoing mobilization is a driving force in genetic diversity; however, little is known about how the host regulates their activity. Here, we show that the Microprocessor (Drosha-DGCR8), which is required for microRNA biogenesis, also recognizes and binds RNAs derived from human LINE-1 (Long INterspersed Element 1), Alu and SVA retrotransposons. Expression analyses demonstrate that cells lacking a functional Microprocessor accumulate LINE-1 mRNA and encoded proteins. Furthermore, we show that structured regions of the LINE-1 mRNA can be cleaved in vitro by Drosha. Additionally, we used a cell culture-based assay to show that the Microprocessor negatively regulates LINE-1 and Alu retrotransposition in vivo. Altogether, these data reveal a new role for the Microprocessor as a post-transcriptional repressor of mammalian retrotransposons acting as a defender of human genome integrity. PMID:23995758
The Microprocessor controls the activity of mammalian retrotransposons.
Heras, Sara R; Macias, Sara; Plass, Mireya; Fernandez, Noemí; Cano, David; Eyras, Eduardo; Garcia-Perez, José L; Cáceres, Javier F
2013-10-01
More than half of the human genome is made of transposable elements whose ongoing mobilization is a driving force in genetic diversity; however, little is known about how the host regulates their activity. Here, we show that the Microprocessor (Drosha-DGCR8), which is required for microRNA biogenesis, also recognizes and binds RNAs derived from human long interspersed element 1 (LINE-1), Alu and SVA retrotransposons. Expression analyses demonstrate that cells lacking a functional Microprocessor accumulate LINE-1 mRNA and encoded proteins. Furthermore, we show that structured regions of the LINE-1 mRNA can be cleaved in vitro by Drosha. Additionally, we used a cell culture-based assay to show that the Microprocessor negatively regulates LINE-1 and Alu retrotransposition in vivo. Altogether, these data reveal a new role for the Microprocessor as a post-transcriptional repressor of mammalian retrotransposons and a defender of human genome integrity.
Kirk, Maggie; Tonkin, Emma; Skirton, Heather
2014-01-01
KIRK M., TONKIN E. & SKIRTON H. (2014) An iterative consensus-building approach to revising a genetics/genomics competency framework for nurse education in the UK. Journal of Advanced Nursing 70(2), 405–420. doi: 10.1111/jan.12207 Aim To report a review of a genetics education framework using a consensus approach to agree on a contemporary and comprehensive revised framework. Background Advances in genomic health care have been significant since the first genetics education framework for nurses was developed in 2003. These, coupled with developments in policy and international efforts to promote nursing competence in genetics, indicated that review was timely. Design A structured, iterative, primarily qualitative approach, based on a nominal group technique. Method A meeting convened in 2010 involved stakeholders in UK nursing education, practice and management, including patient representatives (n = 30). A consensus approach was used to solicit participants' views on the individual/family needs identified from real-life stories of people affected by genetic conditions and the nurses' knowledge, skills and attitudes needed to meet those needs. Five groups considered the stories in iterative rounds, reviewing comments from previous groups. Omissions and deficiencies were identified by mapping resulting themes to the original framework. Anonymous voting captured views. Educators at a second meeting developed learning outcomes for the final framework. Findings Deficiencies in relation to Advocacy, Information management and Ongoing care were identified. All competencies of the original framework were revised, adding an eighth competency to make explicit the need for ongoing care of the individual/family. Conclusion Modifications to the framework reflect individual/family needs and are relevant to the nursing role. The approach promoted engagement in a complex issue and provides a framework to guide nurse education in genetics/genomics; however, nursing leadership is crucial to successful implementation. PMID:23879662
The population structure of Vibrio cholerae from the Chandigarh Region of Northern India.
Abd El Ghany, Moataz; Chander, Jagadish; Mutreja, Ankur; Rashid, Mamoon; Hill-Cawthorne, Grant A; Ali, Shahjahan; Naeem, Raeece; Thomson, Nicholas R; Dougan, Gordon; Pain, Arnab
2014-07-01
Cholera infection continues to be a threat to global public health. The current cholera pandemic associated with Vibrio cholerae El Tor has now been ongoing for over half a century. Thirty-eight V. cholerae El Tor isolates associated with a cholera outbreak in 2009 from the Chandigarh region of India were characterised by a combination of microbiology, molecular typing and whole-genome sequencing. The genomic analysis indicated that two clones of V. cholerae circulated in the region and caused disease during this time. These clones fell into two distinct sub-clades that map independently onto wave 3 of the phylogenetic tree of seventh pandemic V. cholerae El Tor. Sequence analyses of the cholera toxin gene, the Vibrio seventh Pandemic Island II (VSPII) and SXT element correlated with this phylogenetic position of the two clades on the El Tor tree. The clade 2 isolates, characterized by a drug-resistant profile and the expression of a distinct cholera toxin, are closely related to the recent V. cholerae isolated elsewhere, including Haiti, but fell on a distinct branch of the tree, showing they were independent outbreaks. Multi-Locus Sequence Typing (MLST) distinguished two sequence types among the 38 isolates that did not correspond to the clades defined by whole-genome sequencing. Multi-Locus Variable-length tandem-nucleotide repeat Analysis (MLVA) identified 16 distinct clusters. The use of whole-genome sequencing enabled the identification of two clones of V. cholerae that circulated during the 2009 Chandigarh outbreak. These clones harboured a similar structure of ICEVchHai1 but differed mainly in the structure of CTX phage and VSPII. The limited capacity of MLST and MLVA to discriminate between the clones that circulated in the 2009 Chandigarh outbreak highlights the value of whole-genome sequencing as a route to the identification of further genetic markers to subtype V. cholerae isolates.
Appropriate utilization of data from toxicogenomic studies is an ongoing concern of the regulated industries and the agencies charged with assessing safety or risk. An area of current interest is the possibility of toxicogenomics to enhance our ability to develop higher or high-...
EuCAP, a Eukaryotic Community Annotation Package, and its application to the rice genome
Thibaud-Nissen, Françoise; Campbell, Matthew; Hamilton, John P; Zhu, Wei; Buell, C Robin
2007-01-01
Background Despite the improvements of tools for automated annotation of genome sequences, manual curation at the structural and functional level can provide an increased level of refinement to genome annotation. The Institute for Genomic Research Rice Genome Annotation (hereafter named the Osa1 Genome Annotation) is the product of an automated pipeline and, for this reason, will benefit from the input of biologists with expertise in rice and/or particular gene families. Leveraging knowledge from a dispersed community of scientists is a demonstrated way of improving a genome annotation. This requires tools that facilitate 1) the submission of gene annotation to an annotation project, 2) the review of the submitted models by project annotators, and 3) the incorporation of the submitted models in the ongoing annotation effort. Results We have developed the Eukaryotic Community Annotation Package (EuCAP), an annotation tool, and have applied it to the rice genome. The primary level of curation by community annotators (CA) has been the annotation of gene families. Annotation can be submitted by email or through the EuCAP Web Tool. The CA models are aligned to the rice pseudomolecules and the coordinates of these alignments, along with functional annotation, are stored in the MySQL EuCAP Gene Model database. Web pages displaying the alignments of the CA models to the Osa1 Genome models are automatically generated from the EuCAP Gene Model database. The alignments are reviewed by the project annotators (PAs) in the context of experimental evidence. Upon approval by the PAs, the CA models, along with the corresponding functional annotations, are integrated into the Osa1 Genome Annotation. The CA annotations, grouped by family, are displayed on the Community Annotation pages of the project website , as well as in the Community Annotation track of the Genome Browser. Conclusion We have applied EuCAP to rice. 
As of July 2007, the structural and/or functional annotation of 1,094 genes representing 57 families have been deposited and integrated into the current gene set. All of the EuCAP components are open-source, thereby allowing the implementation of EuCAP for the annotation of other genomes. EuCAP is available at . PMID:17961238
Kirk, Maggie; Tonkin, Emma; Skirton, Heather
2014-02-01
To report a review of a genetics education framework using a consensus approach to agree on a contemporary and comprehensive revised framework. Advances in genomic health care have been significant since the first genetics education framework for nurses was developed in 2003. These, coupled with developments in policy and international efforts to promote nursing competence in genetics, indicated that review was timely. A structured, iterative, primarily qualitative approach, based on a nominal group technique. A meeting convened in 2010 involved stakeholders in UK nursing education, practice and management, including patient representatives (n = 30). A consensus approach was used to solicit participants' views on the individual/family needs identified from real-life stories of people affected by genetic conditions and the nurses' knowledge, skills and attitudes needed to meet those needs. Five groups considered the stories in iterative rounds, reviewing comments from previous groups. Omissions and deficiencies were identified by mapping resulting themes to the original framework. Anonymous voting captured views. Educators at a second meeting developed learning outcomes for the final framework. Deficiencies in relation to Advocacy, Information management and Ongoing care were identified. All competencies of the original framework were revised, adding an eighth competency to make explicit the need for ongoing care of the individual/family. Modifications to the framework reflect individual/family needs and are relevant to the nursing role. The approach promoted engagement in a complex issue and provides a framework to guide nurse education in genetics/genomics; however, nursing leadership is crucial to successful implementation. © 2013 The Authors. Journal of Advanced Nursing published by John Wiley & Sons Ltd.
Human centromere genomics: now it's personal.
Hayden, Karen E
2012-07-01
Advances in human genomics have accelerated studies in evolution, disease, and cellular regulation. However, centromere sequences, defining the chromosomal interface with spindle microtubules, remain largely absent from ongoing genomic studies and disconnected from functional, genome-wide analyses. This disparity results from the challenge of predicting the linear order of multi-megabase-sized regions that are composed almost entirely of near-identical satellite DNA. Acknowledging these challenges, the field of human centromere genomics possesses the potential to rapidly advance given the availability of individual, or personalized, genome projects matched with the promise of long-read sequencing technologies. Here I review the current genomic model of human centromeres in consideration of those studies involving functional datasets that examine the role of sequence in centromere identity.
Chow, J Fc; Yeung, W Sb; Lee, V Cy; Lau, E Yl; Ho, P C; Ng, E Hy
2017-04-01
Preimplantation genetic screening has been proposed to improve the in-vitro fertilisation outcome by screening for aneuploid embryos or blastocysts. This study aimed to report the outcome of 133 cycles of preimplantation genetic diagnosis and screening by array comparative genomic hybridisation. This case series was conducted in a tertiary assisted reproductive centre in Hong Kong. Patients who underwent preimplantation genetic diagnosis for chromosomal abnormalities or preimplantation genetic screening between 1 April 2012 and 30 June 2015 were included. They underwent in-vitro fertilisation and intracytoplasmic sperm injection. An embryo biopsy was performed on day-3 embryos and the blastomere was subjected to array comparative genomic hybridisation. Embryos with normal copy numbers were replaced. The ongoing pregnancy rate, implantation rate, and miscarriage rate were studied. During the study period, 133 cycles of preimplantation genetic diagnosis for chromosomal abnormalities or preimplantation genetic screening were initiated in 94 patients. Overall, 112 cycles proceeded to embryo biopsy and 65 cycles had embryo transfer. The ongoing pregnancy rate per transfer cycle after preimplantation genetic screening was 50.0% and that after preimplantation genetic diagnosis was 34.9%. The implantation rates after preimplantation genetic screening and diagnosis were 45.7% and 41.1%, respectively, and the miscarriage rates were 8.3% and 28.6%, respectively. There were 26 frozen-thawed embryo transfer cycles, in which vitrified and biopsied genetically transferrable embryos were replaced, resulting in an ongoing pregnancy rate of 36.4% in the screening group and 60.0% in the diagnosis group. The clinical outcomes of preimplantation genetic diagnosis and screening using comparative genomic hybridisation in our unit were comparable to those reported internationally. Genetically transferrable embryos replaced in a natural cycle may improve the ongoing pregnancy rate and implantation rate when compared with transfer in a stimulated cycle.
Benner, Christian; Havulinna, Aki S; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ripatti, Samuli; Pirinen, Matti
2017-10-05
During the past few years, various novel statistical methods have been developed for fine-mapping with the use of summary statistics from genome-wide association studies (GWASs). Although these approaches require information about the linkage disequilibrium (LD) between variants, there has not been a comprehensive evaluation of how estimation of the LD structure from reference genotype panels performs in comparison with that from the original individual-level GWAS data. Using population genotype data from Finland and the UK Biobank, we show here that a reference panel of 1,000 individuals from the target population is adequate for a GWAS cohort of up to 10,000 individuals, whereas smaller panels, such as those from the 1000 Genomes Project, should be avoided. We also show, both theoretically and empirically, that the size of the reference panel needs to scale with the GWAS sample size; this has important consequences for the application of these methods in ongoing GWAS meta-analyses and large biobank studies. We conclude by providing software tools and by recommending practices for sharing LD information to more efficiently exploit summary statistics in genetics research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Ambroggio, Xavier I; Dommer, Jennifer; Gopalan, Vivek; Dunham, Eleca J; Taubenberger, Jeffery K; Hurt, Darrell E
2013-06-18
Influenza A viruses possess RNA genomes that mutate frequently in response to immune pressures. The mutations in the hemagglutinin genes are particularly significant, as the hemagglutinin proteins mediate attachment and fusion to host cells, thereby influencing viral pathogenicity and species specificity. Large-scale influenza A genome sequencing efforts have been ongoing to understand past epidemics and pandemics and anticipate future outbreaks. Sequencing efforts thus far have generated nearly 9,000 distinct hemagglutinin amino acid sequences. Comparative models for all publicly available influenza A hemagglutinin protein sequences (8,769 to date) were generated using the Rosetta modeling suite. The C-alpha root mean square deviations between a randomly chosen test set of models and their crystallographic templates were less than 2 Å, suggesting that the modeling protocols yielded high-quality results. The models were compiled into an online resource, the Hemagglutinin Structure Prediction (HASP) server. The HASP server was designed as a scientific tool for researchers to visualize hemagglutinin protein sequences of interest in a three-dimensional context. With a built-in molecular viewer, hemagglutinin models can be compared side-by-side and navigated by a corresponding sequence alignment. The models and alignments can be downloaded for offline use and further analysis. The modeling protocols used in the HASP server scale well for large amounts of sequences and will keep pace with expanded sequencing efforts. The conservative approach to modeling and the intuitive search and visualization interfaces allow researchers to quickly analyze hemagglutinin sequences of interest in the context of the most highly related experimental structures, and allow them to directly compare hemagglutinin sequences to each other simultaneously in their two- and three-dimensional contexts. 
The models and methodology have shown utility in current research efforts and the ongoing aim of the HASP server is to continue to accelerate influenza A research and have a positive impact on global public health.
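The C-alpha root mean square deviation cited above as a quality check can be illustrated with a minimal sketch. It assumes the model and template coordinates are already paired residue by residue and already superimposed; the function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the HASP server or from Rosetta.

```python
import math

def ca_rmsd(model, template):
    """RMSD (in the input units, e.g. angstroms) between paired C-alpha
    coordinates. Assumes both structures are already superimposed."""
    if len(model) != len(template) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances over all paired atoms
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, template)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))

# Toy example (hypothetical coordinates): every atom is displaced by 1 unit in y
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
template = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
rmsd = ca_rmsd(model, template)
print(f"C-alpha RMSD: {rmsd:.2f}")
```

In practice a superposition step (for example the Kabsch algorithm) would normally precede this calculation; it is omitted here to keep the sketch short.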
USDA-ARS's Scientific Manuscript database
During ongoing proteomic analysis of the soybean (Glycine max (L.) Merr) germplasm collection, PI 603408 was identified as a landrace whose seeds lack accumulation of one of the major seed storage glycinin protein subunits. Whole genomic resequencing was used to identify a two-base deletion affectin...
Advances in Homology Protein Structure Modeling
Xiang, Zhexin
2007-01-01
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function. PMID:16787261
Luebert, Federico; Jacobs, Pit; Hilger, Hartmut H; Muller, Ludo A H
2014-01-01
The genetic structure of populations of closely related, sympatric species may hold the signature of the geographical mode of the speciation process. In fully allopatric speciation, it is expected that genetic differentiation between species is homogeneously distributed across the genome. In nonallopatric speciation, the genomes may remain undifferentiated to a large extent. In this article, we analyzed the genetic structure of five sympatric species from the plant genus Heliotropium in the Atacama Desert. We used amplified fragment length polymorphisms (AFLPs) to characterize the genetic structure of these species and evaluate their genetic differentiation as well as the number of loci subject to positive selection using divergence outlier analysis (DOA). The five species form distinguishable groups in the genetic space, with zones of overlap, indicating that they are possibly not completely isolated. Among-species differentiation accounts for 35% of the total genetic differentiation (FST = 0.35), and FST between species pairs is positively correlated with phylogenetic distance. DOA suggests that few loci are subject to positive selection, which is in line with a scenario of nonallopatric speciation. These results support the idea that sympatric species of Heliotropium sect. Cochranea are under an ongoing speciation process, characterized by a fluctuation of population ranges in response to pulses of arid and humid periods during Quaternary times. PMID:24558582
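As a rough illustration of what an FST value such as the 0.35 reported above expresses, the following sketch computes Wright's FST for a single biallelic locus from subpopulation allele frequencies. This is an assumption-laden simplification: it uses codominant allele frequencies and equally sized populations, whereas the study used dominant AFLP markers, so it is not the authors' estimation procedure. The function name `wright_fst` and the input frequencies are hypothetical.

```python
def wright_fst(freqs):
    """Wright's FST = (HT - HS) / HT for one biallelic locus.

    freqs: allele frequency of one allele in each subpopulation
           (subpopulations assumed equally sized).
    """
    if not freqs:
        raise ValueError("need at least one subpopulation frequency")
    # HS: mean expected heterozygosity within subpopulations
    hs = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # HT: expected heterozygosity of the pooled population
    p_bar = sum(freqs) / len(freqs)
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        raise ValueError("locus is monomorphic; FST is undefined")
    return (ht - hs) / ht

# Hypothetical frequencies in two subpopulations
fst = wright_fst([0.2, 0.8])
print(f"FST = {fst:.2f}")
```

A value near 0 indicates little differentiation among subpopulations, while values approaching 1 indicate nearly fixed differences.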
Luebert, Federico; Jacobs, Pit; Hilger, Hartmut H; Muller, Ludo A H
2014-02-01
The genetic structure of populations of closely related, sympatric species may hold the signature of the geographical mode of the speciation process. In fully allopatric speciation, it is expected that genetic differentiation between species is homogeneously distributed across the genome. In nonallopatric speciation, the genomes may remain undifferentiated to a large extent. In this article, we analyzed the genetic structure of five sympatric species from the plant genus Heliotropium in the Atacama Desert. We used amplified fragment length polymorphisms (AFLPs) to characterize the genetic structure of these species and evaluate their genetic differentiation as well as the number of loci subject to positive selection using divergence outlier analysis (DOA). The five species form distinguishable groups in the genetic space, with zones of overlap, indicating that they are possibly not completely isolated. Among-species differentiation accounts for 35% of the total genetic differentiation (FST = 0.35), and FST between species pairs is positively correlated with phylogenetic distance. DOA suggests that few loci are subject to positive selection, which is in line with a scenario of nonallopatric speciation. These results support the idea that sympatric species of Heliotropium sect. Cochranea are under an ongoing speciation process, characterized by a fluctuation of population ranges in response to pulses of arid and humid periods during Quaternary times.
Human-Specific Duplication and Mosaic Transcripts: The Recent Paralogous Structure of Chromosome 22
Bailey, Jeffrey A.; Yavor, Amy M.; Viggiano, Luigi; Misceo, Doriana; Horvath, Juliann E.; Archidiacono, Nicoletta; Schwartz, Stuart; Rocchi, Mariano; Eichler, Evan E.
2002-01-01
In recent decades, comparative chromosomal banding, chromosome painting, and gene-order studies have shown strong conservation of gross chromosome structure and gene order in mammals. However, findings from the human genome sequence suggest an unprecedented degree of recent (<35 million years ago) segmental duplication. This dynamism of segmental duplications has important implications in disease and evolution. Here we present a chromosome-wide view of the structure and evolution of the most highly homologous duplications (⩾1 kb and ⩾90%) on chromosome 22. Overall, 10.8% (3.7/33.8 Mb) of chromosome 22 is duplicated, with an average sequence identity of 95.4%. To organize the duplications into tractable units, intron-exon structure and well-defined duplication boundaries were used to define 78 duplicated modules (minimally shared evolutionary segments) with 157 copies on chromosome 22. Analysis of these modules provides evidence for the creation or modification of 11 novel transcripts. Comparative FISH analyses of human, chimpanzee, gorilla, orangutan, and macaque reveal qualitative and quantitative differences in the distribution of these duplications—consistent with their recent origin. Several duplications appear to be human specific, including a ∼400-kb duplication (99.4%–99.8% sequence identity) that transposed from chromosome 14 to the most proximal pericentromeric region of chromosome 22. Experimental and in silico data further support a pericentromeric gradient of duplications where the most recent duplications transpose adjacent to the centromere. Taken together, these data suggest that segmental duplications have been an ongoing process of primate genome evolution, contributing to recent gene innovation and the dynamic transformation of genome architecture within and among closely related species. PMID:11731936
The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide
Liolios, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Kyrpides, Nikos C.
2006-01-01
The Genomes On Line Database (GOLD) is a web resource for comprehensive access to information regarding complete and ongoing genome sequencing projects worldwide. The database currently incorporates information on over 1500 sequencing projects, of which 294 have been completed and the data deposited in the public databases. GOLD v.2 has been expanded to provide information related to organism properties such as phenotype, ecotype and disease. Furthermore, project relevance and availability information is now included. GOLD is available at . It is also mirrored at the Institute of Molecular Biology and Biotechnology, Crete, Greece at PMID:16381880
Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard
2014-01-01
Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Massive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quick and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manual reviewed and annotated up-to-date OS knowledge. 
It might be a useful resource especially for the bone tumor research community, as specific information about genes or microRNAs is quickly and easily accessible. Hence, this platform can support ongoing OS research and biomarker discovery. Database URL: http://osteosarcoma-db.uni-muenster.de. © The Author(s) 2014. Published by Oxford University Press.
Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard
2014-01-01
Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Massive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quick and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manual reviewed and annotated up-to-date OS knowledge. 
It might be a useful resource especially for the bone tumor research community, as specific information about genes or microRNAs is quickly and easily accessible. Hence, this platform can support ongoing OS research and biomarker discovery. Database URL: http://osteosarcoma-db.uni-muenster.de PMID:24865352
Kao, Damian; Lai, Alvina G; Stamataki, Evangelia; Rosic, Silvana; Konstantinides, Nikolaos; Jarvis, Erin; Di Donfrancesco, Alessia; Pouchkina-Stancheva, Natalia; Sémon, Marie; Grillo, Marco; Bruce, Heather; Kumar, Suyash; Siwanowicz, Igor; Le, Andy; Lemire, Andrew; Eisen, Michael B; Extavour, Cassandra; Browne, William E; Wolff, Carsten; Averof, Michalis; Patel, Nipam H; Sarkies, Peter; Pavlopoulos, Anastasios; Aboobaker, Aziz
2016-01-01
The amphipod crustacean Parhyale hawaiensis is a blossoming model system for studies of developmental mechanisms and more recently regeneration. We have sequenced the genome allowing annotation of all key signaling pathways, transcription factors, and non-coding RNAs that will enhance ongoing functional studies. Parhyale is a member of the Malacostraca clade, which includes crustacean food crop species. We analysed the immunity related genes of Parhyale as an important comparative system for these species, where immunity related aquaculture problems have increased as farming has intensified. We also find that Parhyale and other species within Multicrustacea contain the enzyme sets necessary to perform lignocellulose digestion ('wood eating'), suggesting this ability may predate the diversification of this lineage. Our data provide an essential resource for further development of Parhyale as an experimental model. The first malacostracan genome will underpin ongoing comparative work in food crop species and research investigating lignocellulose as an energy source. DOI: http://dx.doi.org/10.7554/eLife.20062.001 PMID:27849518
Metcalfe, Cushla J; Filée, Jonathan; Germon, Isabelle; Joss, Jean; Casane, Didier
2012-11-01
Haploid genomes greater than 25,000 Mb are rare; within the animals, only the lungfish and some of the salamanders and crustaceans are known to have genomes this large. There is very little data on the structure of genomes this size. It is known, however, that for animal genomes up to 3,000 Mb, there is in general a good correlation between genome size and the percent of the genome composed of repetitive sequence and that this repetitive component is highly dynamic. In this study, we sampled the Australian lungfish genome using three mini-genomic libraries and found that with very little sequence, the results converged on an estimate of 40% of the genome being composed of recognizable transposable elements (TEs), chiefly from the CR1 and L2 long interspersed nuclear element clades. We further characterized the CR1 and L2 elements in the lungfish genome and show that although most CR1 elements probably represent recent amplifications, the L2 elements are more diverse and are more likely the result of a series of amplifications. We suggest that our sampling method has probably underestimated the recognizable TE content. However, on the basis of the most likely sources of error, we suggest that this very large genome is not largely composed of recently amplified, undetected TEs but may instead include a large component of older degenerate TEs. Based on these estimates, and on Thomson's (Thomson K. 1972. An attempt to reconstruct evolutionary changes in the cellular DNA content of lungfish. J Exp Zool. 180:363-372) inference that in the lineage leading to the extant Australian lungfish, there was a massive increase in genome size between 350 and 200 mya, after which the size of the genome changed little, we speculate that the very large Australian lungfish genome may be the result of a massive amplification of TEs followed by a long period with a very low rate of sequence removal and some ongoing TE activity.
Genome-wide comparative analysis of four Indian Drosophila species.
Mohanty, Sujata; Khanna, Radhika
2017-12-01
Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.
Fan, Xiang-yu; Lin, Yan-ping; Liao, Guo-jian; Xie, Jian-ping
2015-12-01
Zinc finger nuclease, transcription activator-like effector nuclease, and clustered regularly interspaced short palindromic repeats/Cas9 nuclease are important targeted genome editing technologies. They have great significance in scientific research and applications on aspects of functional genomics research, species improvement, disease prevention and gene therapy. There are past or ongoing disputes over ownership of the intellectual property behind every technology. In this review, we summarize the patents on these three targeted genome editing technologies in order to provide some reference for developing genome editing technologies with self-owned intellectual property rights and some implications for current innovation and entrepreneurship education in universities.
Genomics Education for the Public: Perspectives of Genomic Researchers and ELSI Advisors
Jones, Sondra Smolek; Markey, Janell M.; Byerly, Katherine W.; Roberts, Megan C.
2014-01-01
Aims: For more than two decades genomic education of the public has been a significant challenge. As genomic information becomes integrated into daily life and routine clinical care, the need for public education is even more critical. We conducted a pilot study to learn how genomic researchers and ethical, legal, and social implications advisors who were affiliated with large-scale genomic variation studies have approached the issue of educating the public about genomics. Methods/Results: Semi-structured telephone interviews were conducted with researchers and advisors associated with the SNP/HAPMAP studies and the Cancer Genome Atlas Study. Respondents described approach(es) associated with educating the public about their study. Interviews were audio-recorded, transcribed, coded, and analyzed by team review. Although few respondents described formal educational efforts, most provided recommendations for what should/could be done, emphasizing the need for an overarching entity(s) to take responsibility to lead the effort to educate the public. Opposing views were described related to: who this should be; the overall goal of the educational effort; and the educational approach. Four thematic areas emerged: What is the rationale for educating the public about genomics?; Who is the audience?; Who should be responsible for this effort?; and What should the content be? Policy issues associated with these themes included the need to agree on philosophical framework(s) to guide the rationale, content, and target audiences for education programs; coordinate previous/ongoing educational efforts; and develop a centralized knowledge base. Suggestions for next steps are presented. Conclusion: A complex interplay of philosophical, professional, and cultural issues can create impediments to genomic education of the public. 
Many challenges, however, can be addressed by agreement on a guiding philosophical framework(s) and identification of a responsible entity(s) to provide leadership for developing/overseeing an appropriate infrastructure to support the coordination/integration/sharing and evaluation of educational efforts, benefiting consumers and professionals. PMID:24495163
Genomics education for the public: perspectives of genomic researchers and ELSI advisors.
Dressler, Lynn G; Jones, Sondra Smolek; Markey, Janell M; Byerly, Katherine W; Roberts, Megan C
2014-03-01
For more than two decades genomic education of the public has been a significant challenge. As genomic information becomes integrated into daily life and routine clinical care, the need for public education is even more critical. We conducted a pilot study to learn how genomic researchers and ethical, legal, and social implications advisors who were affiliated with large-scale genomic variation studies have approached the issue of educating the public about genomics. Semi-structured telephone interviews were conducted with researchers and advisors associated with the SNP/HAPMAP studies and the Cancer Genome Atlas Study. Respondents described approach(es) associated with educating the public about their study. Interviews were audio-recorded, transcribed, coded, and analyzed by team review. Although few respondents described formal educational efforts, most provided recommendations for what should/could be done, emphasizing the need for an overarching entity(s) to take responsibility to lead the effort to educate the public. Opposing views were described related to: who this should be; the overall goal of the educational effort; and the educational approach. Four thematic areas emerged: What is the rationale for educating the public about genomics?; Who is the audience?; Who should be responsible for this effort?; and What should the content be? Policy issues associated with these themes included the need to agree on philosophical framework(s) to guide the rationale, content, and target audiences for education programs; coordinate previous/ongoing educational efforts; and develop a centralized knowledge base. Suggestions for next steps are presented. A complex interplay of philosophical, professional, and cultural issues can create impediments to genomic education of the public. 
Many challenges, however, can be addressed by agreement on a guiding philosophical framework(s) and identification of a responsible entity(s) to provide leadership for developing/overseeing an appropriate infrastructure to support the coordination/integration/sharing and evaluation of educational efforts, benefiting consumers and professionals.
Altermann, Eric; Lu, Jingli; McCulloch, Alan
2017-01-01
Expert curated annotation remains one of the critical steps in achieving a reliable biological relevant annotation. Here we announce the release of GAMOLA2, a user friendly and comprehensive software package to process, annotate and curate draft and complete bacterial, archaeal, and viral genomes. GAMOLA2 represents a wrapping tool to combine gene model determination, functional Blast, COG, Pfam, and TIGRfam analyses with structural predictions including detection of tRNAs, rRNA genes, non-coding RNAs, signal protein cleavage sites, transmembrane helices, CRISPR repeats and vector sequence contaminations. GAMOLA2 has already been validated in a wide range of bacterial and archaeal genomes, and its modular concept allows easy addition of further functionality in future releases. A modified and adapted version of the Artemis Genome Viewer (Sanger Institute) has been developed to leverage the additional features and underlying information provided by the GAMOLA2 analysis, and is part of the software distribution. In addition to genome annotations, GAMOLA2 features, among others, supplemental modules that assist in the creation of custom Blast databases, annotation transfers between genome versions, and the preparation of Genbank files for submission via the NCBI Sequin tool. GAMOLA2 is intended to be run under a Linux environment, whereas the subsequent visualization and manual curation in Artemis is mobile and platform independent. The development of GAMOLA2 is ongoing and community driven. New functionality can easily be added upon user requests, ensuring that GAMOLA2 provides information relevant to microbiologists. The software is available free of charge for academic use. PMID:28386247
Family-Based Approaches to Cardiovascular Health Promotion.
Vedanthan, Rajesh; Bansilal, Sameer; Soto, Ana Victoria; Kovacic, Jason C; Latina, Jacqueline; Jaslow, Risa; Santana, Maribel; Gorga, Elio; Kasarskis, Andrew; Hajjar, Roger; Schadt, Eric E; Björkegren, Johan L; Fayad, Zahi A; Fuster, Valentin
2016-04-12
Cardiovascular disease is the leading cause of mortality in the world, and the increasing burden is largely a consequence of modifiable behavioral risk factors that interact with genomics and the environment. Continuous cardiovascular health promotion and disease prevention throughout the lifespan is critical, and the family is a central entity in this process. In this review, we describe the potential rationale and mechanisms that contribute to the importance of family for cardiovascular health promotion, focusing on: 1) mutual interdependence of the family system; 2) shared environment; 3) parenting style; 4) caregiver perceptions; and 5) genomics. We conclude that family-based approaches that target both caregivers and children, encourage communication among the family unit, and address the structural and environmental conditions in which families live and operate are likely to be the most effective approach to promote cardiovascular health. We describe lessons learned, future implications, and applications to ongoing and planned studies. Copyright © 2016 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
Quaedflieg, Conny W E M; Schwabe, Lars
2018-03-01
Stressful events have a major impact on memory. They modulate memory formation in a time-dependent manner, closely linked to the temporal profile of action of major stress mediators, in particular catecholamines and glucocorticoids. Shortly after stressor onset, rapidly acting catecholamines and fast, non-genomic glucocorticoid actions direct cognitive resources to the processing and consolidation of the ongoing threat. In parallel, control of memory is biased towards rather rigid systems, promoting habitual forms of memory allowing efficient processing under stress, at the expense of "cognitive" systems supporting memory flexibility and specificity. In this review, we discuss the implications of this shift in the balance of multiple memory systems for the dynamics of the memory trace. Specifically, stress appears to hinder the incorporation of contextual details into the memory trace, to impede the integration of new information into existing knowledge structures, to impair the flexible generalisation across past experiences, and to hamper the modification of memories in light of new information. Delayed, genomic glucocorticoid actions might reverse the control of memory, thus restoring homeostasis and "cognitive" control of memory again.
The Giardia genome project database.
McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L
2000-08-15
The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.
Genomic standards consortium projects.
Field, Dawn; Sterk, Peter; Kottmann, Renzo; De Smet, J Wim; Amaral-Zettler, Linda; Cochrane, Guy; Cole, James R; Davies, Neil; Dawyndt, Peter; Garrity, George M; Gilbert, Jack A; Glöckner, Frank Oliver; Hirschman, Lynette; Klenk, Hans-Peter; Knight, Rob; Kyrpides, Nikos; Meyer, Folker; Karsch-Mizrachi, Ilene; Morrison, Norman; Robbins, Robert; San Gil, Inigo; Sansone, Susanna; Schriml, Lynn; Tatusova, Tatiana; Ussery, Dave; Yilmaz, Pelin; White, Owen; Wooley, John; Caporaso, Gregory
2014-06-15
The Genomic Standards Consortium (GSC) is an open-membership community that was founded in 2005 to work towards the development, implementation and harmonization of standards in the field of genomics. Starting with the defined task of establishing a minimal set of descriptions, the GSC has evolved into an active standards-setting body that currently has 18 ongoing projects, with additional projects regularly proposed from within and outside the GSC. Here we describe our recently enacted policy for proposing new activities that are intended to be taken on by the GSC, along with the template for proposing such new activities.
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.
Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayana, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael
2017-01-01
The Encyclopedia of DNA Elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (to store genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata
Podduturi, Nikhil R.; Glick, David I.; Baymuradov, Ulugbek K.; Malladi, Venkat S.; Chan, Esther T.; Davidson, Jean M.; Gabdank, Idan; Narayana, Aditi K.; Onate, Kathrina C.; Hilton, Jason; Ho, Marcus C.; Lee, Brian T.; Miyasato, Stuart R.; Dreszer, Timothy R.; Sloan, Cricket A.; Strattan, J. Seth; Tanaka, Forrest Y.; Hong, Eurie L.; Cherry, J. Michael
2017-01-01
The Encyclopedia of DNA Elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (to store genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package. PMID:28403240
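As an illustration of what querying metadata through such an API might look like, the following is a minimal sketch that only builds a search URL and sends nothing over the network. The base address, the `/search/` path, and the `type`, `limit` and `format` parameters are assumptions based on the public ENCODE portal and are not taken from the abstract; the helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: base address of the public ENCODE portal (not stated in the abstract).
PORTAL = "https://www.encodeproject.org"


def build_search_url(object_type, limit=5, base=PORTAL):
    """Build a metadata search URL that asks for JSON output.

    Hypothetical helper: the parameter names are assumed, not confirmed
    by the abstract above.
    """
    params = {"type": object_type, "limit": limit, "format": "json"}
    return f"{base}/search/?{urlencode(params)}"


# Example: a URL that would list a handful of experiment records.
url = build_search_url("Experiment")
print(url)
```

The resulting URL could then be passed to any HTTP client; keeping URL construction separate from the request makes the query easy to inspect before it is sent.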
The Genome 10K Project: a way forward.
Koepfli, Klaus-Peter; Paten, Benedict; O'Brien, Stephen J
2015-01-01
The Genome 10K Project was established in 2009 by a consortium of biologists and genome scientists determined to facilitate the sequencing and analysis of the complete genomes of 10,000 vertebrate species. Since then the number of selected and initiated species has risen from ∼26 to 277 sequenced or ongoing with funding, an approximately tenfold increase in five years. Here we summarize the advances and commitments that have occurred by mid-2014 and outline the achievements and present challenges of reaching the 10,000-species goal. We summarize the status of known vertebrate genome projects, recommend standards for pronouncing a genome as sequenced or completed, and provide our present and future vision of the landscape of Genome 10K. The endeavor is ambitious, bold, expensive, and uncertain, but together the Genome 10K Consortium of Scientists and the worldwide genomics community are moving toward their goal of delivering to the coming generation the gift of genome empowerment for many vertebrate species.
The Genome 10K Project: A Way Forward
Koepfli, Klaus-Peter; Paten, Benedict; O’Brien, Stephen J.
2017-01-01
The Genome 10K Project was established in 2009 by a consortium of biologists and genome scientists determined to facilitate the sequencing and analysis of the complete genomes of 10,000 vertebrate species. Since then the number of selected and initiated species has risen from ~26 to 277 sequenced or ongoing with funding, an approximately tenfold increase in five years. Here we summarize the advances and commitments that have occurred by mid-2014 and outline the achievements and present challenges of reaching the 10,000-species goal. We summarize the status of known vertebrate genome projects, recommend standards for pronouncing a genome as sequenced or completed, and provide our present and future vision of the landscape of Genome 10K. The endeavor is ambitious, bold, expensive, and uncertain, but together the Genome 10K Consortium of Scientists and the worldwide genomics community are moving toward their goal of delivering to the coming generation the gift of genome empowerment for many vertebrate species. PMID:25689317
Evolving approaches to the ethical management of genomic data.
McEwen, Jean E; Boyer, Joy T; Sun, Kathie Y
2013-06-01
The ethical landscape in the field of genomics is rapidly shifting. Plummeting sequencing costs, along with ongoing advances in bioinformatics, now make it possible to generate an enormous volume of genomic data about vast numbers of people. The informational richness, complexity, and frequently uncertain meaning of these data, coupled with evolving norms surrounding the sharing of data and samples and persistent privacy concerns, have generated a range of approaches to the ethical management of genomic information. As calls increase for the expanded use of broad or even open consent, and as controversy grows about how best to handle incidental genomic findings, these approaches, informed by normative analysis and empirical data, will continue to evolve alongside the science. Published by Elsevier Ltd.
Evolving Approaches to the Ethical Management of Genomic Data
Boyer, Joy T.; Sun, Kathie Y.
2013-01-01
The ethical landscape in the field of genomics is rapidly shifting. Plummeting sequencing costs, along with ongoing advances in bioinformatics, now make it possible to generate an enormous volume of genomic data about vast numbers of people. The informational richness, complexity, and frequently uncertain meaning of these data, coupled with evolving norms surrounding the sharing of data and samples and persistent privacy concerns, have generated a range of approaches to the ethical management of genomic information. As calls increase for the expanded use of broad or even open consent, and as controversy grows about how best to handle incidental genomic findings, these approaches, informed by normative analysis and empirical data, will continue to evolve alongside the science. PMID:23453621
Eppig, Janan T; Smith, Cynthia L; Blake, Judith A; Ringwald, Martin; Kadin, James A; Richardson, Joel E; Bult, Carol J
2017-01-01
The Mouse Genome Informatics (MGI) resource (www.informatics.jax.org) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.
Halophiles and their enzymes: Negativity put to good use
DasSarma, Shiladitya; DasSarma, Priya
2015-01-01
Halophilic microorganisms possess stable enzymes that function in very high salinity, an extreme condition that leads to denaturation, aggregation, and precipitation of most other proteins. Genomic and structural analyses have established that the enzymes of halophilic Archaea and many halophilic Bacteria are negatively charged due to an excess of acidic over basic residues, and altered hydrophobicity, which enhance solubility and promote function in low water activity conditions. Here, we provide an update on recent bioinformatic analysis of predicted halophilic proteomes as well as experimental molecular studies on individual halophilic enzymes. Ongoing efforts on discovery and utilization of halophiles and their enzymes for biotechnology, including biofuel applications, are also considered. PMID:26066288
A Taste of Algal Genomes from the Joint Genome Institute
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuo, Alan; Grigoriev, Igor
Algae play profound roles in aquatic food chains and the carbon cycle, can impose health and economic costs through toxic blooms, provide models for the study of symbiosis, photosynthesis, and eukaryotic evolution, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). To date JGI has sequenced, assembled, annotated, and released to the public the genomes of 18 species and strains of algae, sampling almost all of the major clades of photosynthetic eukaryotes. With more algal genomes currently undergoing analysis, JGI continues its commitment to driving forward basic and applied algal science. Among these ongoing projects are the pan-genome of the dominant coccolithophore Emiliania huxleyi, the interrelationships between the 4 genomes in the nucleomorph-containing Bigelowiella natans and Guillardia theta, and the search for symbiosis genes of lichens.
Genetic characterisation of the recent foot-and-mouth disease virus subtype A/IRN/2005
Klein, Joern; Hussain, Manzoor; Ahmad, Munir; Normann, Preben; Afzal, Muhammad; Alexandersen, Soren
2007-01-01
Background According to the World Reference Laboratory for FMD, a new subtype of FMDV serotype A was detected in Iran in 2005. This subtype was designated A/IRN/2005, and rapidly spread throughout Iran and moved westwards into Saudi Arabia and Turkey where it was initially detected from August 2005 and subsequently caused major disease problems in the spring of 2006. The same subtype reached Jordan in 2007. As part of an ongoing project we have also detected this subtype in Pakistan with the first positive samples detected in April 2006. To characterise this subtype in detail, we have sequenced and analysed the complete coding sequence of three subtype A/IRN/2005 isolates collected in Pakistan in 2006, the complete coding sequence of one subtype A/IRN/2005 isolate collected during the first outbreak in Turkey in 2005 and, in addition, the partial 1D coding sequence derived from 4 epithelium samples and 34 swab-samples from Asian buffaloes or cattle subsequently found to be infected with the A/IRN/2005 subtype. Results The phylogenies of the genome regions encoding for the structural proteins, displayed, with the exception of 1A, distinct, serotype-specific clustering and an evolutionary relationship of the A/IRN/2005 sublineage with the A22 sublineage. Potential recombination events have been detected in parts of the genome region coding for the non-structural proteins of FMDV. In addition, amino acid substitutions have been detected in the deduced VP1 protein sequence, potentially related to clinical or subclinical outcome of FMD. Indications of differential susceptibility for developing a subclinical course of disease between Asian buffaloes and cattle have been detected. Furthermore, hitherto unknown insertions of 2 amino acids before the second start codon, as well as sublineage specific amino acids have been detected in the genome region encoding for the leader proteinase of A/IRN/2005 sublineage. 
Conclusion Our findings indicate that the A/IRN/2005 sublineage has undergone two different paths of evolution for the structural and non-structural genome regions. The structural genome regions have had their evolutionary starting point in the A22 sublineage. It can be assumed that, due to the quasispecies structure of FMDV populations and the error-prone replication process, advantageous mutations in a changed environment have been fixed and led to the occurrence of the new A/IRN/2005 sublineage. Together with this mechanism, recombination within the non-structural genome regions, potentially modifying the virulence of the virus, may be involved in the success of this new sublineage. The possible origin of this recombinant virus may be a co-infection with Asia1 and a serotype A precursor of the A/IRN/2005 sublineage, potentially within Asian buffaloes, as these appear to become infected relatively easily, but usually without developing clinical disease and consequently without showing a strong acute inflammatory immune response against a second FMDV infection. PMID:18001482
Nelson, Oranmiyan W.; Garrity, George M.
2011-01-01
The purpose of this table is to provide the community with a citable record of ongoing genome sequencing projects that have led to a publication in the scientific literature. While our goal is to make the list complete, there is no guarantee that we have not omitted one or more publications appearing in this time frame. Readers and authors who wish to have publications added to subsequent versions of this list are invited to provide the bibliometric data for such references to the SIGS editorial office.
Methanococcus jannaschii genome: revisited
NASA Technical Reports Server (NTRS)
Kyrpides, N. C.; Olsen, G. J.; Klenk, H. P.; White, O.; Woese, C. R.
1996-01-01
Analysis of genomic sequences is necessarily an ongoing process. Initial gene assignments tend (wisely) to be on the conservative side (Venter, 1996). The analysis of the genome then grows in an iterative fashion as additional data and more sophisticated algorithms are brought to bear on the data. The present report is an emendation of the original gene list of Methanococcus jannaschii (Bult et al., 1996). By using a somewhat more updated database and more relaxed (and operator-intensive) pattern matching methods, we were able to add significantly to, and in a few cases amend, the gene identification table originally published by Bult et al. (1996).
The dynamic evolutionary history of genome size in North American woodland salamanders.
Newman, Catherine E; Gregory, T Ryan; Austin, Christopher C
2017-04-01
The genus Plethodon is the most species-rich salamander genus in North America, and nearly half of its species face an uncertain future. It is also one of the most diverse families in terms of genome sizes, which range from 1C = 18.2 to 69.3 pg, or 5-20 times larger than the human genome. Large genome size in salamanders results in part from accumulation of transposable elements and is associated with various developmental and physiological traits. However, genome sizes have been reported for only 25% of the species of Plethodon (14 of 55). We collected genome size data for Plethodon serratus to supplement an ongoing phylogeographic study, reconstructed the evolutionary history of genome size in Plethodontidae, and inferred probable genome sizes for the 41 species missing empirical data. Results revealed multiple genome size changes in Plethodon: genomes of western Plethodon increased, whereas genomes of eastern Plethodon decreased, followed by additional decreases or subsequent increases. The estimated genome size of P. serratus was 21 pg. New understanding of variation in genome size evolution, along with genome size inferences for previously unstudied taxa, provide a foundation for future studies on the biology of plethodontid salamanders.
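The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and assumes a human haploid genome size of about 3.5 pg, a commonly cited value that does not appear in the abstract; all other numbers are taken from the text above.

```python
# Assumption (not from the abstract): human haploid (1C) genome size of ~3.5 pg.
HUMAN_1C_PG = 3.5


def fold_vs_human(genome_pg, human_pg=HUMAN_1C_PG):
    """Return how many times larger a genome is than the human genome."""
    return genome_pg / human_pg


# Values quoted in the abstract.
smallest_pg, largest_pg = 18.2, 69.3
species_total, species_measured = 55, 14

low_fold = fold_vs_human(smallest_pg)
high_fold = fold_vs_human(largest_pg)
coverage = species_measured / species_total
species_missing = species_total - species_measured

print(f"{low_fold:.1f}x to {high_fold:.1f}x the human genome")
print(f"{coverage:.0%} of species measured, {species_missing} without data")
```

Under that assumption the range works out to roughly 5 to 20 times the human genome, and 14 of 55 species is about 25%, leaving the 41 species for which genome sizes were inferred.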
Trindade, Inês B.; Fonseca, Bruno M.; Matias, Pedro M.; Louro, Ricardo O.; Moe, Elin
2016-01-01
Siderophore-binding proteins (SIPs) perform a key role in iron acquisition in multiple organisms. In the genome of the marine bacterium Shewanella frigidimarina NCIMB 400, the gene tagged as SFRI_RS12295 encodes a protein from this family. Here, the cloning, expression, purification and crystallization of this protein are reported, together with its preliminary X-ray crystallographic analysis to 1.35 Å resolution. The SIP crystals belonged to the monoclinic space group P21, with unit-cell parameters a = 48.04, b = 78.31, c = 67.71 Å, α = 90, β = 99.94, γ = 90°, and are predicted to contain two molecules per asymmetric unit. Structure determination by molecular replacement and the use of previously determined ∼2 Å resolution SIP structures with ∼30% sequence identity as templates are ongoing. PMID:27599855
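The reported unit-cell parameters allow a quick estimate of the cell volume, which is the starting point for judging how many molecules fit in the asymmetric unit. The sketch below uses the standard monoclinic volume formula V = a·b·c·sin β and the fact that space group P21 has two asymmetric units per cell. The protein's molecular mass is not given in the abstract, so the Matthews-coefficient helper takes it as a parameter and is not evaluated here; the function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (in cubic angstroms) of a monoclinic cell with alpha = gamma = 90 degrees."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, mass_da, molecules_per_cell):
    """Crystal volume per unit of protein mass (cubic angstroms per dalton).

    mass_da must be supplied by the user; it is not stated in the abstract.
    """
    return cell_volume / (mass_da * molecules_per_cell)


ASU_PER_CELL_P21 = 2   # general positions in space group P21
MOLECULES_PER_ASU = 2  # as predicted in the abstract

# Unit-cell parameters quoted in the abstract (angstroms and degrees).
cell_volume = monoclinic_cell_volume(48.04, 78.31, 67.71, 99.94)
volume_per_molecule = cell_volume / (ASU_PER_CELL_P21 * MOLECULES_PER_ASU)

print(f"cell volume: {cell_volume:.0f} cubic angstroms")
print(f"volume per molecule: {volume_per_molecule:.0f} cubic angstroms")
```

With these parameters the cell volume comes to about 2.5 × 10^5 Å^3, or roughly 6.3 × 10^4 Å^3 per molecule when four molecules occupy the cell.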
The history of Old World camelids in the light of molecular genetics.
Burger, Pamela Anna
2016-06-01
Old World camels have come into focus as sustainable livestock species, unique in their morphological and physiological characteristics and capable of providing vital products even under extreme environmental conditions. The evolutionary history of dromedary and Bactrian camels traces back to the middle Eocene (around 40 million years ago, mya), when the ancestors of Camelus emerged on the North American continent. While the genetic status of the two domestic species has long been established, the wild two-humped camel has only recently been recognized as a separate species, Camelus ferus, based on molecular genetic data. The demographic history established from genome drafts of Old World camels shows the independent development of the three species over the last 100,000 years with severe bottlenecks occurring during the last glacial period and in the recent past. Ongoing studies involve the immune system, relevant production traits, and the global population structure and domestication of Old World camels. Based on the now available whole genome drafts, specific metabolic pathways have been described, shedding new light on the camels' ability to adapt to desert environments. These new data will also provide the basis for genome-wide association studies to link economically relevant phenotypes to genotypes and to conserve the diverse genetic resources in Old World camelids.
Therkildsen, Nina Overgaard; Hemmer-Hansen, Jakob; Hedeholm, Rasmus Berg; Wisz, Mary S; Pampoulie, Christophe; Meldrup, Dorte; Bonanomi, Sara; Retzel, Anja; Olsen, Steffen Malskær; Nielsen, Einar Eg
2013-01-01
Accurate prediction of species distribution shifts in the face of climate change requires a sound understanding of population diversity and local adaptations. Previous modeling has suggested that global warming will lead to increased abundance of Atlantic cod (Gadus morhua) in the ocean around Greenland, but the dynamics of earlier abundance fluctuations are not well understood. We applied a retrospective spatiotemporal population genomics approach to examine the temporal stability of cod population structure in this region and to search for signatures of divergent selection over a 78-year period spanning major demographic changes. Analyzing >900 gene-associated single nucleotide polymorphisms in 847 individuals, we identified four genetically distinct groups that exhibited varying spatial distributions with considerable overlap and mixture. The genetic composition had remained stable over decades at some spawning grounds, whereas complete population replacement was evident at others. Observations of elevated differentiation in certain genomic regions are consistent with adaptive divergence between the groups, indicating that they may respond differently to environmental variation. Significantly increased temporal changes at a subset of loci also suggest that adaptation may be ongoing. These findings illustrate the power of spatiotemporal population genomics for revealing biocomplexity in both space and time and for informing future fisheries management and conservation efforts. PMID:23789034
Therkildsen, Nina Overgaard; Hemmer-Hansen, Jakob; Hedeholm, Rasmus Berg; Wisz, Mary S; Pampoulie, Christophe; Meldrup, Dorte; Bonanomi, Sara; Retzel, Anja; Olsen, Steffen Malskær; Nielsen, Einar Eg
2013-06-01
Accurate prediction of species distribution shifts in the face of climate change requires a sound understanding of population diversity and local adaptations. Previous modeling has suggested that global warming will lead to increased abundance of Atlantic cod (Gadus morhua) in the ocean around Greenland, but the dynamics of earlier abundance fluctuations are not well understood. We applied a retrospective spatiotemporal population genomics approach to examine the temporal stability of cod population structure in this region and to search for signatures of divergent selection over a 78-year period spanning major demographic changes. Analyzing >900 gene-associated single nucleotide polymorphisms in 847 individuals, we identified four genetically distinct groups that exhibited varying spatial distributions with considerable overlap and mixture. The genetic composition had remained stable over decades at some spawning grounds, whereas complete population replacement was evident at others. Observations of elevated differentiation in certain genomic regions are consistent with adaptive divergence between the groups, indicating that they may respond differently to environmental variation. Significantly increased temporal changes at a subset of loci also suggest that adaptation may be ongoing. These findings illustrate the power of spatiotemporal population genomics for revealing biocomplexity in both space and time and for informing future fisheries management and conservation efforts.
What Defines the "Kingdom" Fungi?
Richards, Thomas A; Leonard, Guy; Wideman, Jeremy G
2017-06-01
The application of environmental DNA techniques and increased genome sequencing of microbial diversity, combined with detailed study of cellular characters, has consistently led to the reexamination of our understanding of the tree of life. This has challenged many of the definitions of taxonomic groups, especially higher taxonomic ranks such as eukaryotic kingdoms. The Fungi is an example of a kingdom which, together with the features that define it and the taxa that are grouped within it, has been in a continual state of flux. In this article we aim to summarize multiple lines of data pertinent to understanding the early evolution and definition of the Fungi. These include ongoing cellular and genomic comparisons that, we will argue, have generally undermined all attempts to identify a synapomorphic trait that defines the Fungi. This article will also summarize ongoing work focusing on taxon discovery, combined with phylogenomic analysis, which has identified novel groups that lie proximate/adjacent to the fungal clade, wherever the boundary that defines the Fungi may be. Our hope is that, by summarizing these data in the form of a discussion, we can illustrate the ongoing efforts to understand what drove the evolutionary diversification of fungi.
Arar, Nedal; Knight, Sara J; Modell, Stephen M; Issa, Amalia M
2011-03-01
The main mission of the Genomic Applications in Practice and Prevention Network™ is to advance collaborative efforts involving partners from across the public health sector to realize the promise of genomics in healthcare and disease prevention. We introduce a new framework that supports the Genomic Applications in Practice and Prevention Network mission and leverages the characteristics of the complex adaptive systems approach. We call this framework the Genome-based Knowledge Management in Cycles model (G-KNOMIC). G-KNOMIC proposes that the collaborative work of multidisciplinary teams utilizing genome-based applications will enhance translating evidence-based genomic findings by creating ongoing knowledge management cycles. Each cycle consists of knowledge synthesis, knowledge evaluation, knowledge implementation and knowledge utilization. Our framework acknowledges that all the elements in the knowledge translation process are interconnected and continuously changing. It also recognizes the importance of feedback loops, and the ability of teams to self-organize within a dynamic system. We demonstrate how this framework can be used to improve the adoption of genomic technologies into practice using two case studies of genomic uptake.
Reference genome sequence of the model plant Setaria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao
We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
Reference genome sequence of the model plant Setaria.
Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M
2012-05-13
We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
An Approach to Using Toxicogenomic Data in US EPA Human ...
This draft report is a description of an approach to evaluate genomic data for use in risk assessment and a case study to illustrate the approach. The dibutyl phthalate (DBP) case study example focuses on male reproductive developmental effects and the qualitative application of the available genomic data. The case study presented in this draft document is a separate activity from any of the ongoing IRIS human health assessments for the phthalates.
Using cancer cell-line profiling, we established an ongoing resource to identify, as comprehensively as possible, the drug-targetable dependencies that specific genomic alterations impart on human cancers. We measured the sensitivity of hundreds of genetically characterized cancer cell lines to hundreds of small-molecule probes and drugs that have highly selective interactions with their targets, and that collectively modulate many distinct nodes in cancer cell circuitry.
Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C
2008-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the 'Minimum Information about a Genome Sequence' (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/
Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C.
2008-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the ‘Minimum Information about a Genome Sequence’ (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/ PMID:17981842
Human genomics projects and precision medicine.
Carrasco-Ramiro, F; Peiró-Pastor, R; Aguado, B
2017-09-01
The completion of the Human Genome Project (HGP) in 2001 opened the floodgates to a deeper understanding of medicine. Dozens of HGP-like projects, involving from a few tens to several million genomes, are currently in progress; they range from having specialized goals to taking a more general approach. However, data generation, storage, management and analysis in public and private cloud computing platforms have raised concerns about privacy and security. The knowledge gained from further research has changed the field of genomics and is now slowly permeating into clinical medicine. The new precision (personalized) medicine, where genome sequencing and data analysis are essential components, allows tailored diagnosis and treatment according to the information from the patient's own genome and specific environmental factors. P4 (predictive, preventive, personalized and participatory) medicine is introducing new concepts, challenges and opportunities. This review summarizes current sequencing technologies, concentrates on ongoing human genomics projects, and provides some examples in which precision medicine has already demonstrated clinical impact in diagnosis and/or treatment.
Precision medicine in pediatric oncology: Lessons learned and next steps.
Mody, Rajen J; Prensner, John R; Everett, Jessica; Parsons, D Williams; Chinnaiyan, Arul M
2017-03-01
The maturation of genomic technologies has enabled new discoveries in disease pathogenesis as well as new approaches to patient care. In pediatric oncology, patients may now receive individualized genomic analysis to identify molecular aberrations of relevance for diagnosis and/or treatment. In this context, several recent clinical studies have begun to explore the feasibility and utility of genomics-driven precision medicine. Here, we review the major developments in this field, discuss current limitations, and explore aspects of the clinical implementation of precision medicine, which lack consensus. Lastly, we discuss ongoing scientific efforts in this arena, which may yield future clinical applications. © 2016 Wiley Periodicals, Inc.
Precision medicine in pediatric oncology: Lessons learned and next steps
Mody, Rajen J.; Prensner, John R.; Everett, Jessica; Parsons, D. Williams; Chinnaiyan, Arul M.
2017-01-01
The maturation of genomic technologies has enabled new discoveries in disease pathogenesis as well as new approaches to patient care. In pediatric oncology, patients may now receive individualized genomic analysis to identify molecular aberrations of relevance for diagnosis and/or treatment. In this context, several recent clinical studies have begun to explore the feasibility and utility of genomics-driven precision medicine. Here, we review the major developments in this field, discuss current limitations, and explore aspects of the clinical implementation of precision medicine, which lack consensus. Lastly, we discuss ongoing scientific efforts in this arena, which may yield future clinical applications. PMID:27748023
Best, Megan; Newson, Ainsley J; Meiser, Bettina; Juraskova, Ilona; Goldstein, David; Tucker, Kathy; Ballinger, Mandy L; Hess, Dominique; Schlub, Timothy E; Biesecker, Barbara; Vines, Richard; Vines, Kate; Thomas, David; Young, Mary-Anne; Savard, Jacqueline; Jacobs, Chris; Butow, Phyllis
2018-04-23
Advances in genomics offer promise for earlier detection or prevention of cancer, by personalisation of medical care tailored to an individual's genomic risk status. However, genome sequencing can generate an unprecedented volume of results for the patient to process with potential implications for their families and reproductive choices. This paper describes a protocol for a study (PiGeOn) that aims to explore how patients and their blood relatives experience germline genomic sequencing, to help guide the appropriate future implementation of genome sequencing into routine clinical practice. We have designed a mixed-methods, prospective, cohort sub-study of a germline genomic sequencing study that targets adults with cancer suggestive of a genetic aetiology. One thousand probands and 2000 of their blood relatives will undergo germline genomic sequencing as part of the parent study in Sydney, Australia between 2016 and 2020. Test results are expected within 12-15 months of recruitment. For the PiGeOn sub-study, participants will be invited to complete surveys at baseline, three months and twelve months after baseline using self-administered questionnaires, to assess the experience of long waits for results (despite being informed that results may not be returned) and expectations of receiving them. Subsets of both probands and blood relatives will be purposively sampled and invited to participate in three semi-structured qualitative interviews (at baseline and each follow-up) to triangulate the data. Ethical themes identified in the data will be used to inform critical revisions of normative ethical concepts or frameworks. This will be one of the first studies internationally to follow the psychosocial impact on probands and their blood relatives who undergo germline genome sequencing, over time.
Study results will inform ongoing ethical debates on issues such as informed consent for genomic sequencing, and informing participants and their relatives of specific results. The study will also provide important outcome data concerning the psychological impact of prolonged waiting for germline genomic sequencing. These data are needed to ensure that when germline genomic sequencing is introduced into standard clinical settings, ethical concepts are embedded, and patients and their relatives are adequately prepared and supported during and after the testing process.
Plant Genome Size Research: A Field In Focus
BENNETT, M. D.; LEITCH, I. J.
2005-01-01
This Special Issue contains 18 papers arising from presentations at the Second Plant Genome Size Workshop and Discussion Meeting (hosted by the Royal Botanic Gardens, Kew, 8–12 September, 2003). This preface provides an overview of these papers, setting their key contents in the broad framework of this highly active field. It also highlights a few overarching issues with wide biological impact or interest, including (1) the need to unify terminology relating to C-value and genome size, (2) the ongoing quest for accurate gold standards for accurate plant genome size estimation, (3) how knowledge of species' DNA amounts has increased in recent years, (4) the existence, causes and significance of intraspecific variation, (5) recent progress in understanding the mechanisms and evolutionary patterns of genome size change, and (6) the impact of genome size knowledge on related biological activities such as genetic fingerprinting and quantitative genetics. The paper offers a vision of how increased knowledge and understanding of genome size will contribute to holistic genomic studies in both plants and animals in the next decade. PMID:15596455
Genomic signals of selection predict climate-driven population declines in a migratory bird.
Bay, Rachael A; Harrigan, Ryan J; Underwood, Vinh Le; Gibbs, H Lisle; Smith, Thomas B; Ruegg, Kristen
2018-01-05
The ongoing loss of biodiversity caused by rapid climatic shifts requires accurate models for predicting species' responses. Despite evidence that evolutionary adaptation could mitigate climate change impacts, evolution is rarely integrated into predictive models. Integrating population genomics and environmental data, we identified genomic variation associated with climate across the breeding range of the migratory songbird, yellow warbler (Setophaga petechia). Populations requiring the greatest shifts in allele frequencies to keep pace with future climate change have experienced the largest population declines, suggesting that failure to adapt may have already negatively affected populations. Broadly, our study suggests that the integration of genomic adaptation can increase the accuracy of future species distribution models and ultimately guide more effective mitigation efforts. Copyright © 2018, American Association for the Advancement of Science.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennetzen, Jeffrey L; Yang, Xiaohan; Ye, Chuyu
We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
Epigenetics, chromatin and genome organization: recent advances from the ENCODE project.
Siggens, L; Ekwall, K
2014-09-01
The organization of the genome into functional units, such as enhancers and active or repressed promoters, is associated with distinct patterns of DNA and histone modifications. The Encyclopedia of DNA Elements (ENCODE) project has advanced our understanding of the principles of genome, epigenome and chromatin organization, identifying hundreds of thousands of potential regulatory regions and transcription factor binding sites. Part of the ENCODE consortium, GENCODE, has annotated the human genome with novel transcripts including new noncoding RNAs and pseudogenes, highlighting transcriptional complexity. Many disease variants identified in genome-wide association studies are located within putative enhancer regions defined by the ENCODE project. Understanding the principles of chromatin and epigenome organization will help to identify new disease mechanisms, biomarkers and drug targets, particularly as ongoing epigenome mapping projects generate data for primary human cell types that play important roles in disease. © 2014 The Association for the Publication of the Journal of Internal Medicine.
Liolios, Konstantinos; Chen, I-Min A; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor M; Kyrpides, Nikos C
2010-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2009, GOLD contains information for more than 5800 sequencing projects, of which 1100 have been completed and their sequence data deposited in a public repository. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification. GOLD is available at: http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece, at: http://gold.imbb.forth.gr/
Liolios, Konstantinos; Chen, I-Min A.; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor M.; Kyrpides, Nikos C.
2010-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2009, GOLD contains information for more than 5800 sequencing projects, of which 1100 have been completed and their sequence data deposited in a public repository. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification. GOLD is available at: http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece, at: http://gold.imbb.forth.gr/ PMID:19914934
Pathways and Mechanisms that Prevent Genome Instability in Saccharomyces cerevisiae
Putnam, Christopher D.; Kolodner, Richard D.
2017-01-01
Genome rearrangements result in mutations that underlie many human diseases, and ongoing genome instability likely contributes to the development of many cancers. The tools for studying genome instability in mammalian cells are limited, whereas model organisms such as Saccharomyces cerevisiae are more amenable to these studies. Here, we discuss the many genetic assays developed to measure the rate of occurrence of Gross Chromosomal Rearrangements (called GCRs) in S. cerevisiae. These genetic assays have been used to identify many types of GCRs, including translocations, interstitial deletions, and broken chromosomes healed by de novo telomere addition, and have identified genes that act in the suppression and formation of GCRs. Insights from these studies have contributed to the understanding of pathways and mechanisms that suppress genome instability and how these pathways cooperate with each other. Integrated models for the formation and suppression of GCRs are discussed. PMID:28684602
The Power and Potential of Genomics in Weed Biology and Management.
Ravet, Karl; Patterson, Eric L; Krähmer, Hansjörg; Hamouzová, Kateřina; Fan, Longjiang; Jasieniuk, Marie; Lawton-Rauh, Amy; Malone, Jenna M; Scott McElroy, J; Merotto, Aldo; Westra, Philip; Preston, Christopher; Vila-Aiub, Martin M; Busi, Roberto; Tranel, Patrick J; Reinhardt, Carl; Saski, Christopher; Beffa, Roland; Neve, Paul; Gaines, Todd A
2018-04-24
There have been previous calls for, and efforts focused on, realizing the power and potential of weed genomics for better understanding of weeds. Sustained advances in genome sequencing and assembly technologies now make it possible for individual research groups to generate reference genomes for multiple weed species at reasonable costs. Here, we present the outcomes from several meetings, discussions, and workshops focused on establishing an International Weed Genomics Consortium (IWGC) for a coordinated international effort in weed genomics. We review the 'state of the art' in genomics and weed genomics, including technologies, applications, and on-going weed genome projects. We also report the outcomes from a workshop and a global survey of the weed science community to identify priority species, key biological questions, and weed management applications that can be addressed through greater availability of, and access to, genomic resources. Major focus areas include the evolution of herbicide resistance and weedy traits, the development of molecular diagnostics, and the identification of novel targets and approaches for weed management. There is increasing interest in, and need for, weed genomics, and the establishment of the IWGC will provide the necessary global platform for communication and coordination of weed genomics research. This article is protected by copyright. All rights reserved.
Aoki, Koh; Yano, Kentaro; Suzuki, Ayako; Kawamura, Shingo; Sakurai, Nozomu; Suda, Kunihiro; Kurabayashi, Atsushi; Suzuki, Tatsuya; Tsugane, Taneaki; Watanabe, Manabu; Ooga, Kazuhide; Torii, Maiko; Narita, Takanori; Shin-I, Tadasu; Kohara, Yuji; Yamamoto, Naoki; Takahashi, Hideki; Watanabe, Yuichiro; Egusa, Mayumi; Kodama, Motoichiro; Ichinose, Yuki; Kikuchi, Mari; Fukushima, Sumire; Okabe, Akiko; Arie, Tsutomu; Sato, Yuko; Yazawa, Katsumi; Satoh, Shinobu; Omura, Toshikazu; Ezura, Hiroshi; Shibata, Daisuke
2010-03-30
The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom background, such as ESTs and mutagenized lines, have been established by an international alliance. To accelerate the progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. By checking redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Classification of functions of proteins predicted from the coding sequences demonstrated that nrFLcDNAs covered a broad range of functions. A comparison of nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies.
The nrFLcDNA sequences will help annotation of the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom http://www.pgb.kazusa.or.jp/kaftom/ via the website of the National Bioresource Project Tomato http://tomato.nbrp.jp.
Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.
2015-01-01
The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features, including the implementation of a four-level (meta)genome project classification system and a simplified, intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402
Positive selection on sociobiological traits in invasive fire ants.
Privman, Eyal; Cohen, Pnina; Cohanim, Amir B; Riba-Grognuz, Oksana; Shoemaker, DeWayne; Keller, Laurent
2018-06-19
The fire ant Solenopsis invicta and its close relatives are highly invasive. Enhanced social cooperation may facilitate invasiveness in these and other invasive ant species. We investigated whether invasiveness in Solenopsis fire ants was accompanied by positive selection on sociobiological traits by applying a phylogenomics approach to infer ancient selection, and a population genomics approach to infer recent and ongoing selection in both native and introduced S. invicta populations. A combination of whole-genome sequencing of 40 haploid males and reduced-representation genomic sequencing of 112 diploid workers identified 1,758,116 and 169,682 polymorphic markers, respectively. The resulting high-resolution maps of genomic polymorphism provide high inference power to test for positive selection. Our analyses provide evidence of positive selection on putative ion channel genes, which are implicated in neurological functions, and on vitellogenin, which is a key regulator of development and caste determination. Furthermore, molecular functions implicated in pheromonal signaling have experienced recent positive selection. Genes with signatures of positive selection were significantly more often those over-expressed in workers compared with queens and males, suggesting that worker traits are under stronger selection than queen and male traits. These results provide insights into selection pressures and ongoing adaptation in an invasive social insect and support the hypothesis that sociobiological traits are under more positive selection than traits related to non-social traits in such invasive species. This article is protected by copyright. All rights reserved.
Candidate gene association studies in syndromic and non-syndromic cleft lip and palate
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daack-Hirsch, S.; Basart, A.; Frischmeyer, P.
1994-09-01
Using ongoing case ascertainment through a birth defects registry, we have collected 219 nuclear families with non-syndromic cleft lip and/or palate and 111 families with a collection of syndromic forms. Syndromic cases include 24 with recognized forms and 72 with unrecognized syndromes. Candidate gene studies as well as genome-wide searches for evidence of microdeletions and isodisomy are currently being carried out. Candidate gene association studies, to date, have made use of PCR-based polymorphisms for TGFA, MSX1, CLPG13 (a CA repeat associated with a human homologue of a locus that results in craniofacial dysmorphogenesis in the mouse) and an STRP found in a Van der Woude syndrome microdeletion. Control tetranucleotide repeats, which ensure that population-based differences are not responsible for any observed associations, are also tested. Studies of the syndromic cases have included the same list of candidate genes searching for evidence of microdeletions and a genome-wide search using tri- and tetranucleotide polymorphic markers to search for isodisomy or structural rearrangements. Significant associations have previously been identified for TGFA, and, in this report, identified for MSX1 and nonsyndromic cleft palate only (p = 0.04, uncorrected). Preliminary results of the genome-wide scan for isodisomy have returned no true positives and there has been no evidence for microdeletion cases.
Disruption of MBD5 contributes to a spectrum of psychopathology and neurodevelopmental abnormalities
Hodge, Jennelle C.; Mitchell, Elyse; Pillalamarri, Vamsee; Toler, Tomi L.; Bartel, Frank; Kearney, Hutton M.; Zou, Ying S.; Tan, Wen-Hann; Hanscom, Carrie; Kirmani, Salman; Hanson, Rae R.; Skinner, Steven A.; Rogers, Curtis; Everman, David B.; Boyd, Ellen; Mullegama, Sureni V.; Keelean-Fuller, Debra; Powell, Cynthia M.; Elsea, Sarah H.; Morton, Cynthia C.; Gusella, James F.; DuPont, Barbara; Chaubey, Alka; Lin, Angela E.; Talkowski, Michael E.
2016-01-01
Microdeletions of chromosomal region 2q23.1 that disrupt MBD5 contribute to a spectrum of neurodevelopmental phenotypes, however the impact of this locus in human psychopathology has not been described. To characterize the structural variation landscape of MBD5 disruptions and the associated psychopathology, 22 individuals with genomic disruption of MBD5 (translocation, point mutation, and deletion) were identified through whole-genome sequencing or cytogenomic microarray at 11 molecular diagnostic centers. The genomic impact ranged from a single base pair to 5.4 Mb. Parents were available for 11 cases, all of which confirmed the rearrangement arose de novo. Phenotypes were largely indistinguishable between patients with full-segment 2q23.1 deletions and those with intragenic MBD5 rearrangements, including alterations confined entirely to the 5′UTR, confirming the critical impact of non-coding sequence at this locus. We found heterogeneous, multi-system pathogenic effects of MBD5 disruption and characterized the associated spectrum of psychopathology, which includes sensory integration disorder, anxiety, self-hugging, bipolar disorder and others. Importantly, unique features of the oldest assessed patient were early-onset dementia and behavioral regression. Analyses also revealed phenotypes that distinguish MBD5 disruptions from seven well-established syndromes with significant diagnostic overlap. This study indicates that haploinsufficiency of MBD5 causes diverse phenotypes, yields insight into the spectrum of resulting neurodevelopmental and behavioral psychopathology, and provides clinical context for interpretation of MBD5 structural variations. Empirical evidence also suggests that disruption of non-coding MBD5 regulatory regions is sufficient for clinical manifestation, highlighting the limitations of exon-focused assessments. 
These results suggest an ongoing perturbation of neurological function throughout the lifespan, including risks for neurobehavioral regression and early-onset dementia. PMID:23587880
Determination and analysis of the full-length chicken parvovirus genome.
USDA-ARS's Scientific Manuscript database
Viral enteric disease in poultry is an ongoing problem in many parts of the world. Many enteric viruses have been identified in turkeys and chickens, including avian astroviruses, rotaviruses, reoviruses, and coronaviruses. Through the application of a molecular screening method targeting particle-a...
Deciphering the origin of mito-nuclear discordance in two sibling caddisfly species.
Weigand, Hannah; Weiss, Martina; Cai, Huimin; Li, Yongping; Yu, Lili; Zhang, Christine; Leese, Florian
2017-10-01
An increasing number of phylogenetic studies have reported discordances among nuclear and mitochondrial markers. These discrepancies are highly relevant to widely used biodiversity assessment approaches, such as DNA barcoding, that rely almost exclusively on mitochondrial markers. Although the theoretical causes of mito-nuclear discordances are well understood, it is often extremely challenging to determine the principal underlying factor in a given study system. In this study, we uncovered significant mito-nuclear discordances in a pair of sibling caddisfly species. Application of genome sequencing, ddRAD and DNA barcoding revealed ongoing hybridization, as well as historical hybridization in Pleistocene refugia, leading us to identify introgression as the ultimate cause of the observed discordance pattern. Our novel genomic data, the discovery of a European-wide hybrid zone and the availability of established techniques for laboratory breeding make this species pair an ideal model system for studying species boundaries with ongoing gene flow. © 2017 John Wiley & Sons Ltd.
Best, Megan; Newson, Ainsley J; Meiser, Bettina; Juraskova, Ilona; Goldstein, David; Tucker, Kathy; Ballinger, Mandy L; Hess, Dominique; Schlub, Timothy E; Biesecker, Barbara; Vines, Richard; Vines, Kate; Thomas, David; Young, Mary-Anne; Savard, Jacqueline; Jacobs, Chris; Butow, Phyllis
2018-04-05
Genomic sequencing in cancer (both tumour and germline), and development of therapies targeted to tumour genetic status, hold great promise for improvement of patient outcomes. However, the imminent introduction of genomics into clinical practice calls for better understanding of how patients value, experience, and cope with this novel technology and its often complex results. Here we describe a protocol for a novel mixed-methods, prospective study (PiGeOn) that aims to examine patients' psychosocial, cognitive, affective and behavioural responses to tumour genomic profiling and to integrate a parallel critical ethical analysis of returning results. This is a cohort sub-study of a parent tumour genomic profiling programme enrolling patients with advanced cancer. One thousand patients will be recruited for the parent study in Sydney, Australia from 2016 to 2019. They will be asked to complete surveys at baseline, three, and five months. Primary outcomes are: knowledge, preferences, attitudes and values. A purposively sampled subset of patients will be asked to participate in three semi-structured interviews (at each time point) to provide deeper data interpretation. Relevant ethical themes will be critically analysed to iteratively develop or refine normative ethical concepts or frameworks currently used in the return of genetic information. This will be the first Australian study to collect longitudinal data on cancer patients' experience of tumour genomic profiling. Findings will be used to inform ongoing ethical debates on issues such as how to effectively obtain informed consent for genomic profiling, return results, distinguish between research and clinical practice and manage patient expectations. The combination of quantitative and qualitative methods will provide comprehensive and critical data on how patients cope with 'actionable' and 'non-actionable' results.
This information is needed to ensure that when tumour genomic profiling becomes part of routine clinical care, ethical considerations are embedded, and patients are adequately prepared and supported during and after receiving results. Not required for this sub-study, parent trial registration ACTRN12616000908437.
Brizuela, Leonardo; Richardson, Aaron; Marsischky, Gerald; Labaer, Joshua
2002-01-01
Thanks to the results of the multiple completed and ongoing genome sequencing projects and to the newly available recombination-based cloning techniques, it is now possible to build gene repositories with no precedent in their composition, formatting, and potential. This new type of gene repository is necessary to address the challenges imposed by the post-genomic era, i.e., experimentation on a genome-wide scale. We are building the FLEXGene (Full Length EXpression-ready) repository. This unique resource will contain clones representing the complete ORFeome of different organisms, including Homo sapiens as well as several pathogens and model organisms. It will consist of a comprehensive, characterized (sequence-verified), and arrayed gene repository. This resource will allow full exploitation of the genomic information by enabling genome-wide scale experimentation at the level of functional/phenotypic assays as well as at the level of protein expression, purification, and analysis. Here we describe the rationale and construction of this resource and focus on the data obtained from the Saccharomyces cerevisiae project.
Kelly, Laura J; Renny-Byfield, Simon; Pellicer, Jaume; Macas, Jiří; Novák, Petr; Neumann, Pavel; Lysak, Martin A; Day, Peter D; Berger, Madeleine; Fay, Michael F; Nichols, Richard A; Leitch, Andrew R; Leitch, Ilia J
2015-10-01
Plants exhibit an extraordinary range of genome sizes, varying by > 2000-fold between the smallest and largest recorded values. In the absence of polyploidy, changes in the amount of repetitive DNA (transposable elements and tandem repeats) are primarily responsible for genome size differences between species. However, there is ongoing debate regarding the relative importance of amplification of repetitive DNA versus its deletion in governing genome size. Using data from 454 sequencing, we analysed the most repetitive fraction of some of the largest known genomes for diploid plant species, from members of Fritillaria. We revealed that genomic expansion has not resulted from the recent massive amplification of just a handful of repeat families, as shown in species with smaller genomes. Instead, the bulk of these immense genomes is composed of highly heterogeneous, relatively low-abundance repeat-derived DNA, supporting a scenario where amplified repeats continually accumulate due to infrequent DNA removal. Our results indicate that a lack of deletion and low turnover of repetitive DNA are major contributors to the evolution of extremely large genomes and show that their size cannot simply be accounted for by the activity of a small number of high-abundance repeat families. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Current Status and Future Prospects of Marine Natural Products (MNPs) as Antimicrobials.
Choudhary, Alka; Naughton, Lynn M; Montánchez, Itxaso; Dobson, Alan D W; Rai, Dilip K
2017-08-28
The marine environment is a rich source of chemically diverse, biologically active natural products, and serves as an invaluable resource in the ongoing search for novel antimicrobial compounds. Recent advances in extraction and isolation techniques, and in state-of-the-art technologies involved in organic synthesis and chemical structure elucidation, have accelerated the numbers of antimicrobial molecules originating from the ocean moving into clinical trials. The chemical diversity associated with these marine-derived molecules is immense, varying from simple linear peptides and fatty acids to complex alkaloids, terpenes and polyketides, etc. Such an array of structurally distinct molecules performs functionally diverse biological activities against many pathogenic bacteria and fungi, making marine-derived natural products valuable commodities, particularly in the current age of antimicrobial resistance. In this review, we have highlighted several marine-derived natural products (and their synthetic derivatives), which have gained recognition as effective antimicrobial agents over the past five years (2012-2017). These natural products have been categorized based on their chemical structures and the structure-activity mediated relationships of some of these bioactive molecules have been discussed. Finally, we have provided an insight into how genome mining efforts are likely to expedite the discovery of novel antimicrobial compounds.
TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS
Jones, Matthew R.; Good, Jeffrey M.
2016-01-01
The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993
Kiselev, O I; Vasin, A V; Shevyryova, M P; Deeva, E G; Sivak, K V; Egorov, V V; Tsvetkov, V B; Egorov, A Yu; Romanovskaya-Romanko, E A; Stepanova, L A; Komissarov, A B; Tsybalova, L M; Ignatjev, G M
2015-01-01
The Ebola hemorrhagic fever (EHF) epidemic currently ongoing in West Africa is not the first of the numerous epidemics on the continent. Yet, given its extremely large scale and rapid spread in the population, it appears to be the worst EHF outbreak caused by Ebola virus Zaire since 1976. Experiments to study the agent have continued for more than 20 years. The EHF virus has a relatively simple genome with seven genes and an additional reading frame resulting from RNA editing. Despite its relatively low genetic capacity, the virus can be regarded as a benchmark for pathogenicity, with a near-perfect ability to evade the host immune response. The EHF virus has similarities with retroviruses but belongs to the (-)RNA viruses of nonretroviral origin. Genetic elements of the virus, NIRV, were detected in animal and human genomes. EHF virus glycoprotein (GP) is a class I fusion protein whose tertiary structure shows more similarities than differences with the SIV and HIV gp41 proteins and even influenza virus hemagglutinin. EHF is an unusual infectious disease, and studying the molecular basis of its pathogenesis may contribute to new findings in the therapy of severe conditions leading to a fatal outcome.
The Pisum Genus: Getting out of Pea Soup!
USDA-ARS's Scientific Manuscript database
Pea (Pisum sativum L.) has long been a model for plant genetics and is a widely grown pulse crop producing protein-rich seeds in a sustainable manner. However, many questions remain open about (sub)species relationships in the Pisum genus. The ongoing pea genome sequencing project and the recent geno...
USDA-ARS's Scientific Manuscript database
Tepary bean (Phaseolus acutifolius A. Gray) is adapted to high temperature arid agroecological zones. In light of the ongoing and rapid changes in the world climate, the evaluation and development of alternate grain legume species that have similar nutritional and culinary characteristics as common ...
How could disclosing incidental information from whole-genome sequencing affect patient behavior?
Christensen, Kurt D; Green, Robert C
2013-01-01
In this article, we argue that disclosure of incidental findings from whole-genome sequencing has the potential to motivate individuals to change health behaviors through psychological mechanisms that differ from typical risk assessment interventions. Their ability to do so, however, is likely to be highly contingent upon the nature of the incidental findings and how they are disclosed, the context of the disclosure and the characteristics of the patient. Moreover, clinicians need to be aware that behavioral responses may occur in unanticipated ways. This article argues for commentators and policy makers to take a cautious but optimistic perspective while empirical evidence is collected through ongoing research involving whole-genome sequencing and the disclosure of incidental information. PMID:24319470
Genome characteristics dictate poly-R-(3)-hydroxyalkanoate production in Cupriavidus necator H16.
Kutralam-Muniasamy, Gurusamy; Peréz-Guevara, Fermín
2018-05-24
Cupriavidus necator H16 is a well-recognized production organism with efficient cellular machinery for producing diverse polymers of the polyhydroxyalkanoate (PHA) family. Its genome features, including PHA machinery proteins and fatty acid metabolism, have informed engineering strategies to enhance PHA production. This progress has prompted us to present an exhaustive examination of the ongoing research, addressing the great potential of designing genome features for PHA production; furthermore, we show how this acquired knowledge has been exploited in other biotechnological applications. This updated review concludes that the combination of optimal strain selection, suitable metabolic engineering and large-scale fermentation on oil substrates is critical to endow this organism with the ability to incorporate mcl-PHA monomers.
How could disclosing incidental information from whole-genome sequencing affect patient behavior?
Christensen, Kurt D; Green, Robert C
2013-06-01
In this article, we argue that disclosure of incidental findings from whole-genome sequencing has the potential to motivate individuals to change health behaviors through psychological mechanisms that differ from typical risk assessment interventions. Their ability to do so, however, is likely to be highly contingent upon the nature of the incidental findings and how they are disclosed, the context of the disclosure and the characteristics of the patient. Moreover, clinicians need to be aware that behavioral responses may occur in unanticipated ways. This article argues for commentators and policy makers to take a cautious but optimistic perspective while empirical evidence is collected through ongoing research involving whole-genome sequencing and the disclosure of incidental information.
The three-dimensional genome organization of Drosophila melanogaster through data integration.
Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank
2017-07-31
Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.
Genetics and Genomics of Coronary Artery Disease.
Pjanic, Milos; Miller, Clint L; Wirka, Robert; Kim, Juyong B; DiRenzo, Daniel M; Quertermous, Thomas
2016-10-01
Coronary artery disease (or coronary heart disease) is the leading cause of mortality in many developing as well as developed countries of the world. Cholesterol-enriched plaques in the heart's blood vessels, combined with inflammation, lead to lesion expansion, narrowing of blood vessels, and reduced blood flow, and may subsequently cause lesion rupture and a heart attack. Even though several environmental risk factors have been established, such as high LDL-cholesterol, diabetes, and high blood pressure, the underlying genetic composition may substantially modify the disease risk; hence, genome composition and gene-environment interactions may be critical for disease progression. Ongoing scientific efforts have seen substantial advancements related to the fields of genetics and genomics, with the major breakthroughs yet to come. As genomics is the most rapidly advancing field in the life sciences, it is important to present a comprehensive overview of current efforts. Here, we present a summary of various genetic and genomics assays and approaches applied to coronary artery disease research.
The Metamorphosis of Amphibian Toxicogenomics
Helbing, Caren C.
2012-01-01
Amphibians are important vertebrates in toxicology often representing both aquatic and terrestrial forms within the life history of the same species. Of the thousands of species, only two have substantial genomics resources: the recently published genome of the Pipid, Xenopus (Silurana) tropicalis, and transcript information (and ongoing genome sequencing project) of Xenopus laevis. However, many more species representative of regional ecological niches and life strategies are used in toxicology worldwide. Since Xenopus species diverged from the most populous frog family, the Ranidae, ~200 million years ago, there are notable differences between them and the even more distant Caudates (salamanders) and Caecilians. These differences include genome size, gene composition, and extent of polyploidization. Application of toxicogenomics to amphibians requires the mobilization of resources and expertise to develop de novo sequence assemblies and analysis strategies for a broader range of amphibian species. The present mini-review will present the advances in toxicogenomics as pertains to amphibians with particular emphasis upon the development and use of genomic techniques (inclusive of transcriptomics, proteomics, and metabolomics) and the challenges inherent therein. PMID:22435070
Shaffer, Christopher D.; Alvarez, Consuelo; Bailey, Cheryl; Barnard, Daron; Bhalla, Satish; Chandrasekaran, Chitra; Chandrasekaran, Vidya; Chung, Hui-Min; Dorer, Douglas R.; Du, Chunguang; Eckdahl, Todd T.; Poet, Jeff L.; Frohlich, Donald; Goodman, Anya L.; Gosser, Yuying; Hauser, Charles; Hoopes, Laura L.M.; Johnson, Diana; Jones, Christopher J.; Kaehler, Marian; Kokan, Nighat; Kopp, Olga R.; Kuleck, Gary A.; McNeil, Gerard; Moss, Robert; Myka, Jennifer L.; Nagengast, Alexis; Morris, Robert; Overvoorde, Paul J.; Shoop, Elizabeth; Parrish, Susan; Reed, Kelynne; Regisford, E. Gloria; Revie, Dennis; Rosenwald, Anne G.; Saville, Ken; Schroeder, Stephanie; Shaw, Mary; Skuse, Gary; Smith, Christopher; Smith, Mary; Spana, Eric P.; Spratt, Mary; Stamm, Joyce; Thompson, Jeff S.; Wawersik, Matthew; Wilson, Barbara A.; Youngblom, Jim; Leung, Wilson; Buhler, Jeremy; Mardis, Elaine R.; Lopatto, David
2010-01-01
Genomics is not only essential for students to understand biology but also provides unprecedented opportunities for undergraduate research. The goal of the Genomics Education Partnership (GEP), a collaboration between a growing number of colleges and universities around the country and the Department of Biology and Genome Center of Washington University in St. Louis, is to provide such research opportunities. Using a versatile curriculum that has been adapted to many different class settings, GEP undergraduates undertake projects to bring draft-quality genomic sequence up to high quality and/or participate in the annotation of these sequences. GEP undergraduates have improved more than 2 million bases of draft genomic sequence from several species of Drosophila and have produced hundreds of gene models using evidence-based manual annotation. Students appreciate their ability to make a contribution to ongoing research, and report increased independence and a more active learning approach after participation in GEP projects. They show knowledge gains on pre- and postcourse quizzes about genes and genomes and in bioinformatic analysis. Participating faculty also report professional gains, increased access to genomics-related technology, and an overall positive experience. We have found that using a genomics research project as the core of a laboratory course is rewarding for both faculty and students. PMID:20194808
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reddy, Tatiparthi B. K.; Thomas, Alex D.; Stamatis, Dimitri
The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.
Clinical providers' experiences with returning results from genomic sequencing: an interview study.
Wynn, Julia; Lewis, Katie; Amendola, Laura M; Bernhardt, Barbara A; Biswas, Sawona; Joshi, Manasi; McMullen, Carmit; Scollon, Sarah
2018-05-08
Current medical practice includes the application of genomic sequencing (GS) in clinical and research settings. Despite expanded use of this technology, the process of disclosure of genomic results to patients and research participants has not been thoroughly examined and there are no established best practices. We conducted semi-structured interviews with 21 genetic and non-genetic clinicians returning results of GS as part of the NIH funded Clinical Sequencing Exploratory Research (CSER) Consortium projects. Interviews focused on the logistics of sessions, participant/patient reactions and factors influencing them, how the sessions changed with experience, and resources and training recommended to return genomic results. The length of preparation and disclosure sessions varied depending on the type and number of results and their implications. Internal and external databases, online resources and result review meetings were used to prepare. Respondents reported that participants' reactions were variable and ranged from enthusiasm and relief to confusion and disappointment. Factors influencing reactions were types of results, expectations and health status. A recurrent challenge was managing inflated expectations about GS. Other challenges included returning multiple, unanticipated and/or uncertain results and navigating a rare diagnosis. Methods to address these challenges included traditional genetic counseling techniques and modifying practice over time in order to provide anticipatory guidance and modulate expectations. Respondents made recommendations to improve access to genomic resources and genetic referrals to prepare future providers as the uptake of GS increases in both genetic and non-genetic settings. These findings indicate that returning genomic results is similar to return of results in traditional genetic testing but is magnified by the additional complexity and potential uncertainty of the results. 
Managing patient expectations, initially identified in studies of informed consent, remains an ongoing challenge and highlights the need to address this issue throughout the testing process. The results of this study will help to guide future providers in the disclosure of genomic results and highlight educational needs and resources necessary to prepare providers. Future research on the patient experience, understanding and follow-up of recommendations is needed to more fully understand the disclosure process.
Concise classification of the genomic porcine endogenous retroviral gamma1 load to defined lineages.
Klymiuk, Nikolai; Wolf, Eckhard; Aigner, Bernhard
2008-02-05
We investigated the infection history of porcine endogenous retroviruses (PERV) gamma1 by analyzing published env and LTR sequences. PERV sequences from various breeds, porcine cell lines and infected human primary cells were included in the study. We identified a considerable number of retroviral lineages indicating multiple independent colonization events of the porcine genome. A recent boost of the proviral load in an isolated pig herd and exclusive occurrence of distinct lineages in single studies indicated the ongoing colonization of the porcine genome with endogenous retroviruses. Retroviral recombination between co-packaged genomes was a general factor for PERV gamma1 diversity which indicated the simultaneous expression of different proviral loci over a period of time. In total, our detailed description of endogenous retroviral lineages is the prerequisite for breeding approaches to minimize the infectious potential of porcine tissues for the subsequent use in xenotransplantation.
Pagani, Ioanna; Liolios, Konstantinos; Jansson, Jakob; Chen, I-Min A.; Smirnova, Tatyana; Nosrat, Bahador; Markowitz, Victor M.; Kyrpides, Nikos C.
2012-01-01
The Genomes OnLine Database (GOLD, http://www.genomesonline.org/) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2011, GOLD, now on version 4.0, contains information for 11 472 sequencing projects, of which 2907 have been completed and their sequence data has been deposited in a public repository. Out of these complete projects, 1918 are finished and 989 are permanent drafts. Moreover, GOLD contains information for 340 metagenome studies associated with 1927 metagenome samples. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about any (x) Sequence specification and beyond. PMID:22135293
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daum, Christopher; Zane, Matthew; Han, James
2011-01-31
The U.S. Department of Energy (DOE) Joint Genome Institute's (JGI) Production Sequencing group is committed to the generation of high-quality genomic DNA sequence to support the mission areas of renewable energy generation, global carbon management, and environmental characterization and clean-up. Within the JGI's Production Sequencing group, a robust Illumina Genome Analyzer and HiSeq pipeline has been established. Optimization of the sequencer pipelines has been ongoing with the aim of continual process improvement of the laboratory workflow, reducing operational costs and project cycle times to increase sample throughput, and improving the overall quality of the sequence generated. A sequence QC analysis pipeline has been implemented to automatically generate read and assembly level quality metrics. The foremost of these optimization projects, along with sequencing and operational strategies, throughput numbers, and sequencing quality results will be presented.
A Molecular Phylogeny of Living Primates
Perelman, Polina; Johnson, Warren E.; Roos, Christian; Seuánez, Hector N.; Horvath, Julie E.; Moreira, Miguel A. M.; Kessing, Bailey; Pontius, Joan; Roelke, Melody; Rumpler, Yves; Schneider, Maria Paula C.; Silva, Artur; O'Brien, Stephen J.; Pecon-Slattery, Jill
2011-01-01
Comparative genomic analyses of primates offer considerable potential to define and understand the processes that mold, shape, and transform the human genome. However, primate taxonomy is both complex and controversial, with marginal unifying consensus of the evolutionary hierarchy of extant primate species. Here we provide new genomic sequence (∼8 Mb) from 186 primates representing 61 (∼90%) of the described genera, and we include outgroup species from Dermoptera, Scandentia, and Lagomorpha. The resultant phylogeny is exceptionally robust and illuminates events in primate evolution from ancient to recent, clarifying numerous taxonomic controversies and providing new data on human evolution. Ongoing speciation, reticulate evolution, ancient relic lineages, unequal rates of evolution, and disparate distributions of insertions/deletions among the reconstructed primate lineages are uncovered. Our resolution of the primate phylogeny provides an essential evolutionary framework with far-reaching applications including: human selection and adaptation, global emergence of zoonotic diseases, mammalian comparative genomics, primate taxonomy, and conservation of endangered species. PMID:21436896
Pagani, Ioanna; Liolios, Konstantinos; Jansson, Jakob; Chen, I-Min A; Smirnova, Tatyana; Nosrat, Bahador; Markowitz, Victor M; Kyrpides, Nikos C
2012-01-01
The Genomes OnLine Database (GOLD, http://www.genomesonline.org/) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2011, GOLD, now on version 4.0, contains information for 11,472 sequencing projects, of which 2907 have been completed and their sequence data has been deposited in a public repository. Out of these complete projects, 1918 are finished and 989 are permanent drafts. Moreover, GOLD contains information for 340 metagenome studies associated with 1927 metagenome samples. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about any (x) Sequence specification and beyond.
Butler, J B; Vaillancourt, R E; Potts, B M; Lee, D J; King, G J; Baten, A; Shepherd, M; Freeman, J S
2017-05-22
Previous studies suggest genome structure is largely conserved between Eucalyptus species. However, it is unknown if this conservation extends to more divergent eucalypt taxa. We performed comparative genomics between the eucalypt genera Eucalyptus and Corymbia. Our results will facilitate transfer of genomic information between these important taxa and provide further insights into the rate of structural change in tree genomes. We constructed three high density linkage maps for two Corymbia species (Corymbia citriodora subsp. variegata and Corymbia torelliana) which were used to compare genome structure between both species and Eucalyptus grandis. Genome structure was highly conserved between the Corymbia species. However, the comparison of Corymbia and E. grandis suggests large (from 1-13 MB) intra-chromosomal rearrangements have occurred on seven of the 11 chromosomes. Most rearrangements were supported through comparisons of the three independent Corymbia maps to the E. grandis genome sequence, and to other independently constructed Eucalyptus linkage maps. These are the first large scale chromosomal rearrangements discovered between eucalypts. Nonetheless, in the general context of plants, the genomic structure of the two genera was remarkably conserved; adding to a growing body of evidence that conservation of genome structure is common amongst woody angiosperms.
2014-01-01
Background: The advent of human genome sequencing project has led to a spurt in the number of protein sequences in the databanks. Success of structure based drug discovery severely hinges on the availability of structures. Despite significant progresses in the area of experimental protein structure determination, the sequence-structure gap is continually widening. Data driven homology based computational methods have proved successful in predicting tertiary structures for sequences sharing medium to high sequence similarities. With dwindling similarities of query sequences, advanced homology/ab initio hybrid approaches are being explored to solve structure prediction problem. Here we describe Bhageerath-H, a homology/ab initio hybrid software/server for predicting protein tertiary structures with advancing drug design attempts as one of the goals. Results: Bhageerath-H web-server was validated on 75 CASP10 targets which showed TM-scores ≥0.5 in 91% of the cases and Cα RMSDs ≤5Å from the native in 58% of the targets, which is well above the CASP10 water mark. Comparison with some leading servers demonstrated the uniqueness of the hybrid methodology in effectively sampling conformational space, scoring best decoys and refining low resolution models to high and medium resolution. Conclusion: Bhageerath-H methodology is web enabled for the scientific community as a freely accessible web server. The methodology is fielded in the on-going CASP11 experiment. PMID:25521245
Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.
Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan
2016-01-01
Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. Copyright © 2016 by the Genetics Society of America.
Characterization of Transposable Elements in Laccaria bicolor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle
2012-01-01
Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copies elements distributed within 172 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs are ancient except some terminal inverted repeats (TIRs), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TEs expansion in L. bicolor; the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 500,000 years ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis represents an initial characterization of TEs in the L. bicolor genome, contributes to genome assembly and to a greater understanding of the role TEs played in genome organization and evolution, and provides a valuable resource for the ongoing Laccaria Pan-Genome project supported by the U.S.-DOE Joint Genome Institute.
Functional Genomics of Allergen Gene Families in Fruits
Maghuly, Fatemeh; Marzban, Gorji; Laimer, Margit
2009-01-01
Fruit consumption is encouraged for health reasons; however, fruits may harbour a series of allergenic proteins that may cause discomfort or even represent serious threats to certain individuals. Thus, the identification and characterization of allergens in fruits requires novel approaches involving genomic and proteomic tools. Since avoidance of fruits also negatively affects the quality of patients’ lives, biotechnological interventions are ongoing to produce low allergenic fruits by down regulating specific genes. In this respect, the control of proteins associated with allergenicity could be achieved by fine tuning the spatial and temporal expression of the relevant genes. PMID:22253972
Postdoctoral Fellow | Center for Cancer Research
One postdoctoral position is available immediately to join the ongoing laboratory research program aimed at defining the mechanism that ensures chromosome stability in normal cells, stem cells as well as in pre-cancerous cells. This research project aims to provide critical insight into the molecular pathways that cause genome instability and promote tumorigenesis. The ideal
Li, Jian; Harris, R. Alan; Cheung, Sau Wai; Coarfa, Cristian; Jeong, Mira; Goodell, Margaret A.; White, Lisa D.; Patel, Ankita; Kang, Sung-Hae; Shaw, Chad; Chinault, A. Craig; Gambin, Tomasz; Gambin, Anna; Lupski, James R.; Milosavljevic, Aleksandar
2012-01-01
The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR–mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease. PMID:22615578
Dron, M; Hartmann, C; Rode, A; Sevignac, M
1985-01-01
We have characterized a 1.7 kb sequence, containing a tRNA Leu2 gene shared by the ct and mt genomes of Brassica oleracea. The two sequences are completely homologous except in two short regions where two distinct gene conversion events have occurred between two sets of direct repeats leading to the insertion of 5 bp in the T loop of the mt copy of the ct gene. This is the first evidence that gene conversion represents the initial evolutionary step in inactivation of transferred ct genes in the mt genome. We also indicate that organelle DNA transfer by organelle fusion is an ongoing process which could be useful in genetic engineering. PMID:4080548
Families of transposable elements, population structure and the origin of species.
Jurka, Jerzy; Bao, Weidong; Kojima, Kenji K
2011-09-19
Eukaryotic genomes harbor diverse families of repetitive DNA derived from transposable elements (TEs) that are able to replicate and insert into genomic DNA. The biological role of TEs remains unclear, although they have profound mutagenic impact on eukaryotic genomes and the origin of repetitive families often correlates with speciation events. We present a new hypothesis to explain the observed correlations based on classical concepts of population genetics. The main thesis presented in this paper is that the TE-derived repetitive families originate primarily by genetic drift in small populations derived mostly by subdivisions of large populations into subpopulations. We outline the potential impact of the emerging repetitive families on genetic diversification of different subpopulations, and discuss implications of such diversification for the origin of new species. Several testable predictions of the hypothesis are examined. First, we focus on the prediction that the number of diverse families of TEs fixed in a representative genome of a particular species positively correlates with the cumulative number of subpopulations (demes) in the historical metapopulation from which the species has emerged. Furthermore, we present evidence indicating that human AluYa5 and AluYb8 families might have originated in separate proto-human subpopulations. We also revisit prior evidence linking the origin of repetitive families to mammalian phylogeny and present additional evidence linking repetitive families to speciation based on mammalian taxonomy. Finally, we discuss evidence that mammalian orders represented by the largest numbers of species may be subject to relatively recent population subdivisions and speciation events. The hypothesis implies that subdivision of a population into small subpopulations is the major step in the origin of new families of TEs as well as of new species. 
The origin of new subpopulations is likely to be driven by the availability of new biological niches, consistent with the hypothesis of punctuated equilibria. The hypothesis also has implications for the ongoing debate on the role of genetic drift in genome evolution.
Contrasting Patterns of rDNA Homogenization within the Zygosaccharomyces rouxii Species Complex
Chand Dakal, Tikam; Giudici, Paolo; Solieri, Lisa
2016-01-01
Arrays of repetitive ribosomal DNA (rDNA) sequences are generally expected to evolve as a coherent family, where repeats within such a family are more similar to each other than to orthologs in related species. The continuous homogenization of repeats within individual genomes is a recombination process termed concerted evolution. Here, we investigated the extent and the direction of concerted evolution in 43 yeast strains of the Zygosaccharomyces rouxii species complex (Z. rouxii, Z. sapae, Z. mellis), by analyzing two portions of the 35S rDNA cistron, namely the D1/D2 domains at the 5’ end of the 26S rRNA gene and the segment including the internal transcribed spacers (ITS) 1 and 2 (ITS regions). We demonstrate that intra-genomic rDNA sequence variation is unusually frequent in this clade and that rDNA arrays in single genomes consist of an intermixing of Z. rouxii, Z. sapae and Z. mellis-like sequences, putatively evolved by reticulate evolutionary events that involved repeated hybridization between lineages. The levels and distribution of sequence polymorphisms vary across rDNA repeats in different individuals, reflecting four patterns of rDNA evolution: I) rDNA repeats that are homogeneous within a genome but are chimeras derived from two parental lineages via recombination: Z. rouxii in the ITS region and Z. sapae in the D1/D2 region; II) intra-genomic rDNA repeats that retain polymorphisms only in ITS regions; III) rDNA repeats that vary only in their D1/D2 domains; IV) heterogeneous rDNA arrays that have both polymorphic ITS and D1/D2 regions. We argue that an ongoing process of homogenization following allodiplodization or incomplete lineage sorting gave rise to divergent evolutionary trajectories in different strains, depending upon temporal, structural and functional constraints. We discuss the consequences of these findings for Zygosaccharomyces species delineation and, more in general, for yeast barcoding. PMID:27501051
Kurath, G.; Dodds, J.A.
1995-01-01
The high level of genetic diversity and rapid evolution of viral RNA genomes are well documented, but few studies have characterized the rate and nature of ongoing genetic change over time under controlled experimental conditions, especially in plant hosts. The RNA genome of satellite tobacco mosaic virus (STMV) was used as an effective model for such studies because of advantageous features of its genome structure and because the extant genetic heterogeneity of STMV has been characterized previously. In the present study, the process of genetic change over time was studied by monitoring multiple serial passage lines of STMV populations for changes in their consensus sequences. A total of 42 passage lines were initiated by inoculation of tobacco plants with a helper tobamovirus and one of four STMV RNA inocula that were transcribed from full-length infectious STMV clones or extracted from purified STMV type strain virions. Ten serial passages were carried out for each line and the consensus genotypes of progeny STMV populations were assessed for genetic change by RNase protection analyses of the entire 1,059-nt STMV genome. Three different types of genetic change were observed, including the fixation of novel mutations in 9 of 42 lines, mutation at the major heterogeneity site near nt 751 in 5 of the 19 lines inoculated with a single genotype, and selection of a single major genotype in 6 of the 23 lines inoculated with mixed genotypes. Sequence analyses showed that the majority of mutations were single base substitutions. The distribution of mutation sites included three clusters in which mutations occurred at or very near the same site, suggesting hot spots of genetic change in the STMV genome. The diversity of genetic changes in sibling lines is clear evidence for the important role of chance and random sampling events in the process of genetic diversification of STMV virus populations.
A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6
2011-01-01
Background The fermented dried seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated to be 3 million tons in 2010 with an annual estimated average growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. In order to address these issues, genome-sequencing efforts have been initiated recently to facilitate identification of genetic markers and genes that could be utilized to accelerate the release of robust T. cacao cultivars. However, problems inherent with assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions, have been unavoidably encountered with the sequencing strategies selected. Results Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao cultivar Matina 1-6 which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each comprised of 10× coverage, were constructed and fingerprinted. 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The unanchored contigs (105) span 67.4 Mbp and therefore the estimated genome size of T. cacao is 374.6 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. 
A comparison between the two T. cacao cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, yet rearrangements were also observed. Conclusions The results presented in this study are a stand-alone resource for functional exploitation and enhancement of Theobroma cacao but are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the T. cacao genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays. PMID:21846342
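The genome-size estimate in this abstract is the sum of the anchored and unanchored contig spans. The short sketch below simply reproduces that arithmetic from the figures quoted in the abstract. The constant and function names are illustrative only.

```python
# Figures quoted in the abstract above.
ANCHORED_CONTIGS = 49
UNANCHORED_CONTIGS = 105
ANCHORED_SPAN_MBP = 307.2
UNANCHORED_SPAN_MBP = 67.4


def physical_map_summary():
    """Recompute the totals implied by the reported contig counts and spans."""
    total_span = round(ANCHORED_SPAN_MBP + UNANCHORED_SPAN_MBP, 1)
    return {
        "total_contigs": ANCHORED_CONTIGS + UNANCHORED_CONTIGS,
        "estimated_genome_mbp": total_span,
        # Share of the estimated genome that is ordered on chromosomes.
        "anchored_fraction": ANCHORED_SPAN_MBP / total_span,
    }
```

The totals agree with the 154 contigs and 374.6 Mbp stated in the abstract, and they imply that roughly 82% of the estimated genome is genetically anchored.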
A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6.
Saski, Christopher A; Feltus, Frank A; Staton, Margaret E; Blackmon, Barbara P; Ficklin, Stephen P; Kuhn, David N; Schnell, Raymond J; Shapiro, Howard; Motamayor, Juan Carlos
2011-08-16
The fermented dried seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated to be 3 million tons in 2010 with an annual estimated average growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. In order to address these issues, genome-sequencing efforts have been initiated recently to facilitate identification of genetic markers and genes that could be utilized to accelerate the release of robust T. cacao cultivars. However, problems inherent with assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions, have been unavoidably encountered with the sequencing strategies selected. Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao cultivar Matina 1-6 which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each comprised of 10× coverage, were constructed and fingerprinted. 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The unanchored contigs (105) span 67.4 Mbp and therefore the estimated genome size of T. cacao is 374.6 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. A comparison between the two T. 
cacao cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, yet rearrangements were also observed. The results presented in this study are a stand-alone resource for functional exploitation and enhancement of Theobroma cacao but are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the T. cacao genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays.
Issa, Amalia M; Hutchinson, Janis F; Tufail, Waqas; Fletcher, Erica; Ajike, Roseline; Tenorio, Jose
2011-07-01
Several novel pharmacogenomic diagnostic tests are commercially available for breast and colorectal cancer, and are increasingly being used in clinical practice for improving treatment decisions. However, there is little evidence evaluating the value of these new genomic technologies from the perspective of patients. As part of an ongoing effort to understand the continuum of the process of adoption of genomic diagnostics, our aim in this study was to examine the value of genomic diagnostics to breast and colorectal cancer patients, and their willingness to adopt and use genomic diagnostics. We conducted six focus groups of breast and colorectal cancer patients from the oncology clinics at The Methodist Hospital, Houston, TX, USA. An adapted Q-sort instrument was also administered to focus group participants. The majority of breast and colorectal cancer patients are interested in using novel genomic diagnostics for deciding about treatment options. Most participants in our study expressed a willingness to pay out-of-pocket for genomic testing (z = 0.736). Reliability and validity of genomic testing were of significant concern (z = 1.32) for the majority of breast and colorectal cancer patients. Participants identified several facilitators and barriers within health systems that might either facilitate or impede the widespread adoption and use of genomic diagnostics in healthcare delivery. This study demonstrates breast and colorectal cancer patients' willingness to adopt and pay for novel genomic diagnostics, as well as identifies several salient factors associated with patient preferences for genomic diagnostics.
Ultra-Structure database design methodology for managing systems biology data and analyses
Maier, Christopher W; Long, Jeffrey G; Hemminger, Bradley M; Giddings, Morgan C
2009-01-01
Background Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogenous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping). Results We transitioned our proteogenomic mapping information system from a traditional entity-relationship design to one based on Ultra-Structure. Our system integrates tandem mass spectrum data, genomic annotation sets, and spectrum/peptide mappings, all within a small, general framework implemented within a standard relational database system. General software procedures driven by user-modifiable rules can perform tasks such as logical deduction and location-based computations. The system is not tied specifically to proteogenomic research, but is rather designed to accommodate virtually any kind of biological research. 
Conclusion We find Ultra-Structure offers substantial benefits for biological information systems, the largest being the integration of diverse information sources into a common framework. This facilitates systems biology research by integrating data from disparate high-throughput techniques. It also enables us to readily incorporate new data types, sources, and domain knowledge with no change to the database structure or associated computer code. Ultra-Structure may be a significant step towards solving the hard problem of data management and integration in the systems biology era. PMID:19691849
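The idea of storing processes as rules in database tables, so that behavior changes by editing rows and not code, can be illustrated with a small sketch. The schema, table names, and sample features below are hypothetical and are not the authors' actual Ultra-Structure implementation. The sketch assumes a single generic overlap procedure whose scope is controlled entirely by the rows of a rule table.

```python
import sqlite3


def make_db():
    """Build an in-memory database holding both data and rules (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE feature (name TEXT, kind TEXT, start INTEGER, stop INTEGER);
        CREATE TABLE overlap_rule (query_kind TEXT, target_kind TEXT);
    """)
    con.executemany("INSERT INTO feature VALUES (?, ?, ?, ?)", [
        ("pep1", "peptide", 120, 150),
        ("geneA", "gene", 100, 400),
        ("orfX", "orf", 140, 300),
    ])
    # One rule: peptides should be mapped onto genes by location.
    con.execute("INSERT INTO overlap_rule VALUES ('peptide', 'gene')")
    return con


def apply_overlap_rules(con):
    """Generic location-based procedure: which pairs it joins is decided by rule rows."""
    return con.execute("""
        SELECT q.name, t.name
        FROM overlap_rule r
        JOIN feature q ON q.kind = r.query_kind
        JOIN feature t ON t.kind = r.target_kind
        WHERE q.start <= t.stop AND t.start <= q.stop
        ORDER BY q.name, t.name
    """).fetchall()
```

Inserting one more row into `overlap_rule`, for example `('peptide', 'orf')`, makes the same procedure also report peptide/ORF overlaps, with no change to the code or the schema.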
NASA Astrophysics Data System (ADS)
Dick, G. J.; Andersson, A.; Banfield, J. F.
2007-12-01
Our understanding of environmental microbiology has been greatly enhanced by community genome sequencing of DNA recovered directly from the environment. Community genomics provides insights into the diversity, community structure, metabolic function, and evolution of natural populations of uncultivated microbes, thereby revealing dynamics of how microorganisms interact with each other and their environment. Recent studies have demonstrated the potential for reconstructing near-complete genomes from natural environments while highlighting the challenges of analyzing community genomic sequence, especially from diverse environments. A major challenge of shotgun community genome sequencing is identification of DNA fragments from minor community members for which only low coverage of genomic sequence is present. We analyzed community genome sequence retrieved from biofilms in an acid mine drainage (AMD) system in the Richmond Mine at Iron Mountain, CA, with an emphasis on identification and assembly of DNA fragments from low-abundance community members. The Richmond mine hosts an extensive, relatively low diversity subterranean chemolithoautotrophic community that is sustained entirely by oxidative dissolution of pyrite. The activity of these microorganisms greatly accelerates the generation of AMD. Previous and ongoing work in our laboratory has focused on reconstructing genomes of dominant community members, including several bacteria and archaea. We binned contigs from several samples (including one new sample and two that had been previously analyzed) by tetranucleotide frequency with clustering by Self-Organizing Maps (SOM). The binning, evaluated by comparison with information from the manually curated assembly of the dominant organisms, was found to be very effective: fragments were correctly assigned with 95% accuracy. Improperly assigned fragments often contained sequences that are either evolutionarily constrained (e.g. 16S rRNA genes) or mobile elements that are not expected to reflect the tetranucleotide frequency signature of the host genome. Four unknown tetranucleotide frequency clusters with significant sequence (6 Mb total) were noted and analyzed further. Based on phylogenetic markers and BLAST results, these clusters represent low-abundance bacteria including Actinobacteria, Firmicutes, and Proteobacteria. Functional analysis of these clusters revealed that the low-abundance bacteria harbor genes that could potentially encode important ecosystem functions such as sulfur utilization (e.g. polysulfide reductase) and polymer degradation (e.g. chitinase and glycoside hydrolase). We conclude that ESOM clustering of tetranucleotide frequency patterns is an effective method for rapidly binning shotgun community genomic sequences and a valuable tool for analyzing minor community members, which despite their low abundance may play crucial ecological roles.
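The tetranucleotide frequency signature that this abstract uses for binning can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline. It counts overlapping 4-mers on one strand only, skips windows containing ambiguous bases, and does not include the SOM/ESOM clustering step. The function names are hypothetical.

```python
from itertools import product
import math

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_freqs(seq):
    """Return the 256-element vector of overlapping 4-mer frequencies of seq."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[w] / total for w in TETRAMERS]


def euclidean(u, v):
    """Distance between two signatures; small values suggest a similar composition."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

In a binning workflow, each contig would be reduced to such a vector and the vectors then clustered. Short contigs give noisy signatures, which is one reason a minimum contig length is commonly applied before clustering.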
2009-01-01
Background Insertional mutagenesis is an effective method for functional genomic studies in various organisms. It can rapidly generate easily tractable mutations. A large-scale insertional mutagenesis with the piggyBac (PB) transposon is currently being performed in mice at the Institute of Developmental Biology and Molecular Medicine (IDM), Fudan University in Shanghai, China. This project is carried out via collaborations among multiple groups overseeing interconnected experimental steps and continuously generates a large volume of experimental data. Therefore, the project calls for an efficient database system for recording, management, statistical analysis, and information exchange. Results This paper presents a database application called MP-PBmice (insertional mutation mapping system of PB Mutagenesis Information Center), which was developed to serve the on-going large-scale PB insertional mutagenesis project. A lightweight enterprise-level development framework, Struts-Spring-Hibernate, is used here to ensure constructive and flexible support to the application. The MP-PBmice database system has three major features: strict access-control, efficient workflow control, and good expandability. It supports the collaboration among different groups that enter data and exchange information on a daily basis, and is capable of providing real-time progress reports for the whole project. MP-PBmice can be easily adapted for other large-scale insertional mutation mapping projects and the source code of this software is freely available at http://www.idmshanghai.cn/PBmice. Conclusion MP-PBmice is a web-based application for large-scale insertional mutation mapping onto the mouse genome, implemented with the widely used framework Struts-Spring-Hibernate. This system is already in use by the on-going genome-wide PB insertional mutation mapping project at IDM, Fudan University. PMID:19958505
Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine
2013-01-01
Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).
Improved tank car design development : ongoing studies on sandwich structures
DOT National Transportation Integrated Search
2009-03-02
The Government and industry have a common interest in improving the safety performance of railroad tank cars carrying hazardous materials. Research is ongoing to develop strategies to maintain the structural integrity of railroad tank cars carr...
A second-generation anchored genetic linkage map of the tammar wallaby (Macropus eugenii)
2011-01-01
Background The tammar wallaby, Macropus eugenii, a small kangaroo used for decades for studies of reproduction and metabolism, is the model Australian marsupial for genome sequencing and genetic investigations. The production of a more comprehensive cytogenetically-anchored genetic linkage map will significantly contribute to the deciphering of the tammar wallaby genome. It has great value as a resource to identify novel genes and for comparative studies, and is vital for the ongoing genome sequence assembly and gene ordering in this species. Results A second-generation anchored tammar wallaby genetic linkage map has been constructed based on a total of 148 loci. The linkage map contains the original 64 loci included in the first-generation map, plus an additional 84 microsatellite loci that were chosen specifically to increase coverage and assist with the anchoring and orientation of linkage groups to chromosomes. These additional loci were derived from (a) sequenced BAC clones that had been previously mapped to tammar wallaby chromosomes by fluorescence in situ hybridization (FISH), (b) End sequence from BACs subsequently FISH-mapped to tammar wallaby chromosomes, and (c) tammar wallaby genes orthologous to opossum genes predicted to fill gaps in the tammar wallaby linkage map as well as three X-linked markers from a published study. Based on these 148 loci, eight linkage groups were formed. These linkage groups were assigned (via FISH-mapped markers) to all seven autosomes and the X chromosome. The sex-pooled map size is 1402.4 cM, which is estimated to provide 82.6% total coverage of the genome, with an average interval distance of 10.9 cM between adjacent markers. The overall ratio of female/male map length is 0.84, which is comparable to the ratio of 0.78 obtained for the first-generation map. 
Conclusions Construction of this second-generation genetic linkage map is a significant step towards complete coverage of the tammar wallaby genome and considerably extends that of the first-generation map. It will be a valuable resource for ongoing tammar wallaby genetic research and assembling the genome sequence. The sex-pooled map is available online at http://compldb.angis.org.au/. PMID:21854616
A second-generation anchored genetic linkage map of the tammar wallaby (Macropus eugenii).
Wang, Chenwei; Webley, Lee; Wei, Ke-jun; Wakefield, Matthew J; Patel, Hardip R; Deakin, Janine E; Alsop, Amber; Marshall Graves, Jennifer A; Cooper, Desmond W; Nicholas, Frank W; Zenger, Kyall R
2011-08-19
The tammar wallaby, Macropus eugenii, a small kangaroo used for decades for studies of reproduction and metabolism, is the model Australian marsupial for genome sequencing and genetic investigations. The production of a more comprehensive cytogenetically-anchored genetic linkage map will significantly contribute to the deciphering of the tammar wallaby genome. It has great value as a resource to identify novel genes and for comparative studies, and is vital for the ongoing genome sequence assembly and gene ordering in this species. A second-generation anchored tammar wallaby genetic linkage map has been constructed based on a total of 148 loci. The linkage map contains the original 64 loci included in the first-generation map, plus an additional 84 microsatellite loci that were chosen specifically to increase coverage and assist with the anchoring and orientation of linkage groups to chromosomes. These additional loci were derived from (a) sequenced BAC clones that had been previously mapped to tammar wallaby chromosomes by fluorescence in situ hybridization (FISH), (b) end sequences from BACs subsequently FISH-mapped to tammar wallaby chromosomes, and (c) tammar wallaby genes orthologous to opossum genes predicted to fill gaps in the tammar wallaby linkage map as well as three X-linked markers from a published study. Based on these 148 loci, eight linkage groups were formed. These linkage groups were assigned (via FISH-mapped markers) to all seven autosomes and the X chromosome. The sex-pooled map size is 1402.4 cM, which is estimated to provide 82.6% total coverage of the genome, with an average interval distance of 10.9 cM between adjacent markers. The overall ratio of female/male map length is 0.84, which is comparable to the ratio of 0.78 obtained for the first-generation map.
Construction of this second-generation genetic linkage map is a significant step towards complete coverage of the tammar wallaby genome and considerably extends that of the first-generation map. It will be a valuable resource for ongoing tammar wallaby genetic research and assembling the genome sequence. The sex-pooled map is available online at http://compldb.angis.org.au/.
Global Organization of a Positive-strand RNA Virus Genome
Wu, Baodong; Grigull, Jörg; Ore, Moriam O.; Morin, Sylvie; White, K. Andrew
2013-01-01
The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements is as local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV) contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2′-hydroxyl acylation analysed by primer extension (i.e. SHAPE), which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context. PMID:23717202
Auch, Alexander F; Klenk, Hans-Peter; Göker, Markus
2010-01-28
DNA-DNA hybridization (DDH) is a widely applied wet-lab technique to obtain an estimate of the overall similarity between the genomes of two organisms. Microbiologists chose to base the species concept for prokaryotes ultimately on DDH as a pragmatic approach to deciding whether to recognize novel species; this choice also allowed a relatively high degree of standardization compared to other areas of taxonomy. However, DDH is tedious and error-prone and, first and foremost, cannot be used to incrementally establish a comparative database. Recent studies have shown that in-silico methods for the comparison of genome sequences can be used to replace DDH. Considering the ongoing rapid technological progress of sequencing methods, genome-based prokaryote taxonomy is coming into reach. However, calculating distances between genomes depends on multiple choices of software and program settings. We here provide an overview of the modifications that can be applied to distance methods based on high-scoring segment pairs (HSPs) or maximally unique matches (MUMs) and that need to be documented. General recommendations on determining HSPs using BLAST or other algorithms are also provided. As a reference implementation, we introduce the GGDC web server (http://ggdc.gbdp.org).
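To make the idea of an HSP-based genome distance concrete, the following is a minimal sketch in Python. It is not the GGDC server's implementation: the `HSP` record, the function names, and the two formulas (one minus the fraction of the genome covered by HSPs, and one minus the fraction of identical bases within HSPs) are illustrative assumptions. The abstract's point is precisely that such choices vary between tools and need to be documented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSP:
    """Hypothetical minimal record for one high-scoring segment pair."""
    q_start: int     # 1-based inclusive start on the query genome
    q_end: int       # 1-based inclusive end on the query genome
    identities: int  # identical base pairs within the aligned segment


def covered_length(hsps):
    """Query positions covered by at least one HSP (overlaps counted once)."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((h.q_start, h.q_end) for h in hsps):
        if cur_end is None or start > cur_end + 1:
            # Close the previous merged interval, then open a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def coverage_distance(hsps, genome_length):
    """Assumed formula: 1 - (genome fraction covered by HSPs)."""
    return 1.0 - covered_length(hsps) / genome_length


def identity_distance(hsps):
    """Assumed formula: 1 - (identical bases / total HSP length)."""
    aligned = sum(h.q_end - h.q_start + 1 for h in hsps)
    if aligned == 0:
        return 1.0  # no HSPs at all: treat the genomes as maximally distant
    return 1.0 - sum(h.identities for h in hsps) / aligned
```

The two functions treat overlapping HSPs differently (merged for coverage, counted separately for identity), which is one example of a setting that changes the resulting distance and should therefore be reported.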
Beldade, P; McMillan, W O; Papanicolaou, A
2008-02-01
Technological and conceptual advances of the last decade have led to an explosion of genomic data and the emergence of new research avenues. Evolutionary and ecological functional genomics, with its focus on the genes that affect ecological success and adaptation in natural populations, benefits immensely from a phylogenetically widespread sampling of biological patterns and processes. Among those organisms outside established model systems, butterflies offer exceptional opportunities for multidisciplinary research on the processes generating and maintaining variation in ecologically relevant traits. Here we highlight research on wing color pattern variation in two groups of Nymphalid butterflies, the African species Bicyclus anynana (subfamily Satyrinae) and species of the South American genus Heliconius (subfamily Heliconiinae), which are emerging as important systems for studying the nature and origins of functional diversity. Growing genomic resources including genomic and cDNA libraries, dense genetic maps, high-density gene arrays, and genetic transformation techniques are extending current gene mapping and expression profiling analysis and enabling the next generation of research questions linking genes, development, form, and fitness. Efforts to develop such resources in Bicyclus and Heliconius underscore the general challenges facing the larger research community and highlight the need for a community-wide effort to extend ongoing functional genomic research on butterflies.
Schully, Sheri D; Lam, Tram Kim; Dotson, W David; Chang, Christine Q; Aronson, Naomi; Birkeland, Marian L; Brewster, Stephanie Jo; Boccia, Stefania; Buchanan, Adam H; Calonge, Ned; Calzone, Kathleen; Djulbegovic, Benjamin; Goddard, Katrina A B; Klein, Roger D; Klein, Teri E; Lau, Joseph; Long, Rochelle; Lyman, Gary H; Morgan, Rebecca L; Palmer, Christina G S; Relling, Mary V; Rubinstein, Wendy S; Swen, Jesse J; Terry, Sharon F; Williams, Marc S; Khoury, Muin J
2015-01-01
With the accelerated implementation of genomic medicine, health-care providers will depend heavily on professional guidelines and recommendations. Because genomics affects many diseases across the life span, no single professional group covers the entirety of this rapidly developing field. To pursue a discussion of the minimal elements needed to develop evidence-based guidelines in genomics, the Centers for Disease Control and Prevention and the National Cancer Institute jointly held a workshop to engage representatives from 35 organizations with interest in genomics (13 of which make recommendations). The workshop explored methods used in evidence synthesis and guideline development and initiated a dialogue to compare these methods and to assess whether they are consistent with the Institute of Medicine report "Clinical Practice Guidelines We Can Trust." The participating organizations that develop guidelines or recommendations all had policies to manage guideline development and group membership, and processes to address conflicts of interests. However, there was wide variation in the reliance on external reviews, regular updating of recommendations, and use of systematic reviews to assess the strength of scientific evidence. Ongoing efforts are required to establish criteria for guideline development in genomic medicine as proposed by the Institute of Medicine.
USDA-ARS's Scientific Manuscript database
The ongoing genome sequencing effort in peanut will result in numerous molecular markers that can be applied to the diverse collection of recently purified mini-core germplasm. This will provide an opportunity to mine valuable genes for peanut cultivar improvement. Association mapping based on linka...
Current Status and Future Prospects of Marine Natural Products (MNPs) as Antimicrobials
Choudhary, Alka; Naughton, Lynn M.; Montánchez, Itxaso
2017-01-01
The marine environment is a rich source of chemically diverse, biologically active natural products, and serves as an invaluable resource in the ongoing search for novel antimicrobial compounds. Recent advances in extraction and isolation techniques, and in state-of-the-art technologies involved in organic synthesis and chemical structure elucidation, have accelerated the numbers of antimicrobial molecules originating from the ocean moving into clinical trials. The chemical diversity associated with these marine-derived molecules is immense, varying from simple linear peptides and fatty acids to complex alkaloids, terpenes and polyketides, etc. Such an array of structurally distinct molecules performs functionally diverse biological activities against many pathogenic bacteria and fungi, making marine-derived natural products valuable commodities, particularly in the current age of antimicrobial resistance. In this review, we have highlighted several marine-derived natural products (and their synthetic derivatives), which have gained recognition as effective antimicrobial agents over the past five years (2012–2017). These natural products have been categorized based on their chemical structures and the structure-activity mediated relationships of some of these bioactive molecules have been discussed. Finally, we have provided an insight into how genome mining efforts are likely to expedite the discovery of novel antimicrobial compounds. PMID:28846659
ERIC Educational Resources Information Center
Wasley, Patricia A.; Lear, Richard J.
2001-01-01
Small school size (fewer than 400 students) makes possible success-enhancing structures and practices: strong, ongoing student/adult and home/school relationships; flat organizational structure; concentration on a few goals; ongoing, site-specific professional development; a respectful culture; and community engagement. Implementation barriers are…
Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.
Pauling, Josch; Klipp, Edda
2016-12-22
Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.
The genomic applications in practice and prevention network.
Khoury, Muin J; Feero, W Gregory; Reyes, Michele; Citrin, Toby; Freedman, Andrew; Leonard, Debra; Burke, Wylie; Coates, Ralph; Croyle, Robert T; Edwards, Karen; Kardia, Sharon; McBride, Colleen; Manolio, Teri; Randhawa, Gurvaneet; Rasooly, Rebekah; St Pierre, Jeannette; Terry, Sharon
2009-07-01
The authors describe the rationale and initial development of a new collaborative initiative, the Genomic Applications in Practice and Prevention Network. The network convened by the Centers for Disease Control and Prevention and the National Institutes of Health includes multiple stakeholders from academia, government, health care, public health, industry and consumers. The premise of Genomic Applications in Practice and Prevention Network is that there is an unaddressed chasm between gene discoveries and demonstration of their clinical validity and utility. This chasm is due to the lack of readily accessible information about the utility of most genomic applications and the lack of necessary knowledge by consumers and providers to implement what is known. The mission of Genomic Applications in Practice and Prevention Network is to accelerate and streamline the effective integration of validated genomic knowledge into the practice of medicine and public health, by empowering and sponsoring research, evaluating research findings, and disseminating high quality information on candidate genomic applications in practice and prevention. Genomic Applications in Practice and Prevention Network will develop a process that links ongoing collection of information on candidate genomic applications to four crucial domains: (1) knowledge synthesis and dissemination for new and existing technologies, and the identification of knowledge gaps, (2) a robust evidence-based recommendation development process, (3) translation research to evaluate validity, utility and impact in the real world and how to disseminate and implement recommended genomic applications, and (4) programs to enhance practice, education, and surveillance.
CoryneBase: Corynebacterium Genomic Resources and Analysis Tools at Your Fingertips
Tan, Mui Fern; Jakubovics, Nick S.; Wee, Wei Yee; Mutha, Naresh V. R.; Wong, Guat Jah; Ang, Mia Yang; Yazdi, Amir Hessam; Choo, Siew Woh
2014-01-01
Corynebacteria are used for a wide variety of industrial purposes but some species are associated with human diseases. With increasing number of corynebacterial genomes having been sequenced, comparative analysis of these strains may provide better understanding of their biology, phylogeny, virulence and taxonomy that may lead to the discoveries of beneficial industrial strains or contribute to better management of diseases. To facilitate the ongoing research of corynebacteria, a specialized central repository and analysis platform for the corynebacterial research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. Here we present CoryneBase, a genomic database for Corynebacterium with diverse functionality for the analysis of genomes aimed to provide: (1) annotated genome sequences of Corynebacterium where 165,918 coding sequences and 4,180 RNAs can be found in 27 species; (2) access to comprehensive Corynebacterium data through the use of advanced web technologies for interactive web interfaces; and (3) advanced bioinformatic analysis tools consisting of standard BLAST for homology search, VFDB BLAST for sequence homology search against the Virulence Factor Database (VFDB), Pairwise Genome Comparison (PGC) tool for comparative genomic analysis, and a newly designed Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomic analysis. CoryneBase offers the access of a range of Corynebacterium genomic resources as well as analysis tools for comparative genomics and pathogenomics. It is publicly available at http://corynebacterium.um.edu.my/. PMID:24466021
Comparative regulatory approaches for groups of new plant breeding techniques.
Lusser, Maria; Davies, Howard V
2013-06-25
This manuscript provides insights into ongoing debates on the regulatory issues surrounding groups of biotechnology-driven 'New Plant Breeding Techniques' (NPBTs). It presents the outcomes of preliminary discussions and in some cases the initial decisions taken by regulators in the following countries: Argentina, Australia, Canada, EU, Japan, South Africa and USA. In the light of these discussions we suggest in this manuscript a structured approach to make the evaluation more consistent and efficient. The issue appears to be complex as these groups of new technologies vary widely in both the technologies deployed and their impact on heritable changes in the plant genome. An added complication is that the legislation, definitions and regulatory approaches for biotechnology-derived crops differ significantly between these countries. There are therefore concerns that this situation will lead to non-harmonised regulatory approaches and asynchronous development and marketing of such crops resulting in trade disruptions. Copyright © 2013 Elsevier B.V. All rights reserved.
Continually emerging mechanistic complexity of the multi-enzyme cellulosome complex.
Smith, Steven P; Bayer, Edward A; Czjzek, Mirjam
2017-06-01
The robust plant cell wall polysaccharide-degrading properties of anaerobic bacteria are harnessed within elegant, macromolecular assemblages called cellulosomes, in which proteins of complementary activities amass on scaffold protein networks. Research efforts have focused and continue to focus on providing detailed mechanistic insights into cellulosomal complex assembly, topology, and function. The accumulated information is expanding our fundamental understanding of the lignocellulosic biomass decomposition process and enhancing the potential of engineered cellulosomal systems for biotechnological purposes. Ongoing biochemical studies continue to reveal unexpected functional diversity within traditional cellulase families. Genomic, proteomic, and functional analyses have uncovered unanticipated cellulosomal proteins that augment the function of the native and designer cellulosomes. In addition, complementary structural and computational methods are continuing to provide much needed insights on the influence of cellulosomal interdomain linker regions on cellulosomal assembly and activity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Relating hybrid advantage and genome replacement in unisexual salamanders.
Charney, Noah D
2012-05-01
Unisexual vertebrates are model systems for understanding the evolution of sex. Many predominantly clonal lineages allow occasional genetic recombination, which may be sufficient to avoid the accumulation of deleterious mutations and parasites. Introgression of paternal DNA into an all-female lineage represents a one-way flow of genetic material. Over many generations, this could result in complete replacement of the unisexual genomes by those of the donor species. The process of genome replacement may be counteracted by contemporary dispersal or by positive selection on hybrid nuclear genomes in ecotones. I present a conceptual model that relates nuclear genome replacement, positive selection on hybrids and biogeography in unisexual systems. I execute an individual-based simulation of the fate of hybrid genotypes in contact with a single host species. I parameterize these models for unisexual salamanders in the Ambystoma genus, for which the frequency of genome replacement has been a source of ongoing debate. I find that, if genome replacement occurs at a rate greater than 1/10,000 in Ambystoma, then there must be compensating positive selection in order to maintain observed levels of hybrid nuclei. Future researchers studying unisexual systems may use this framework as a guide to evaluating the hybrid superiority hypothesis. © 2011 The Author. Evolution © 2011 The Society for the Study of Evolution.
Zhao, Qiang; Yue, Shengjie; Bilal, Muhammad; Hu, Hongbo; Wang, Wei; Zhang, Xuehong
2017-12-31
Bacteria belonging to the genera Sphingomonas and Sphingobium are known for their ability to catabolize aromatic compounds. In this study, we analyzed the whole genome sequences of 26 strains in the genera Sphingomonas and Sphingobium to gain insight into dissemination of bioremediation capabilities, biodegradation potential, central pathways and genome plasticity. Phylogenetic analysis revealed that both Sphingomonas sp. strain BHC-A and Sphingomonas paucimobilis EPA505 should be placed in the genus Sphingobium. The bph and xyl gene cluster was found in 6 polycyclic aromatic hydrocarbons-degrading strains. Transposase and IS coding genes were found in the 6 gene clusters, suggesting the mobility of bph and xyl gene clusters. β-ketoadipate and homogentisate pathways were the main central pathways in Sphingomonas and Sphingobium strains. A large number of oxygenase coding genes were predicted in the 26 genomes, indicating a huge biodegradation potential of the Sphingomonas and Sphingobium strains. Horizontal gene transfer related genes and prophages were predicted in the analyzed strains, suggesting the ongoing evolution and shaping of the genomes. Analysis of the 26 genomes in this work contributes to the understanding of dispersion of bioremediation capabilities, bioremediation potential and genome plasticity in strains belonging to the genera Sphingomonas and Sphingobium. Copyright © 2017 Elsevier B.V. All rights reserved.
Structural Genomics: Correlation Blocks, Population Structure, and Genome Architecture
Hu, Xin-Sheng; Yeh, Francis C.; Wang, Zhiquan
2011-01-01
An integration of the pattern of genome-wide inter-site associations with evolutionary forces is important for gaining insights into the genomic evolution in natural or artificial populations. Here, we assess the inter-site correlation blocks and their distributions along chromosomes. A correlation block is broadly defined as a DNA segment within which strong correlations exist between genetic diversities at any two sites. We bring together the population genetic structure and the genomic diversity structure that have been independently built on different scales and synthesize the existing theories and methods for characterizing genomic structure at the population level. We discuss how population structure could shape correlation blocks and their patterns within and between populations. Effects of evolutionary forces (selection, migration, genetic drift, and mutation) on the pattern of genome-wide correlation blocks are discussed. We also briefly discuss the associations between the pattern of correlation blocks and genome assembly features in eukaryotic organisms, including the impacts of multigene families, the perturbation of transposable elements, and the repetitive nongenic sequences and GC-rich isochores. Our review suggests that the observable pattern of correlation blocks can refine our understanding of the ecological and evolutionary processes underlying the genomic evolution at the population level. PMID:21886455
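As a rough illustration of how correlation blocks might be delimited along a chromosome, the sketch below partitions sites greedily so that every pair of sites within a block is strongly correlated. Several things here are assumptions rather than the review's method: the use of the squared allelic correlation (r²) between biallelic sites as the correlation measure, the 0/1 haplotype coding, the 0.8 threshold, and the function names.

```python
def r_squared(site_a, site_b):
    """Squared correlation between two biallelic sites coded 0/1 per haplotype."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic site has no defined correlation; use 0
    d = p_ab - p_a * p_b  # deviation from independence between the two sites
    return d * d / denom


def correlation_blocks(sites, threshold=0.8):
    """Greedy left-to-right partition of site indices into blocks.

    A site joins the current block only if its r^2 with every site
    already in the block is at least `threshold` (an assumed cutoff).
    """
    result = []
    current = []
    for idx, site in enumerate(sites):
        if all(r_squared(sites[j], site) >= threshold for j in current):
            current.append(idx)
        else:
            result.append(current)
            current = [idx]
    if current:
        result.append(current)
    return result


# Four sites scored on four haplotypes: sites 0 and 1 are identical,
# as are sites 2 and 3, while the two pairs are uncorrelated.
example_sites = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
]
example_blocks = correlation_blocks(example_sites)
```

Requiring every pair in a block to pass the threshold mirrors the "any two sites" wording of the definition; a looser rule (for example, only adjacent sites) would generally produce longer blocks.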
Assaying gene function by growth competition experiment.
Merritt, Joshua; Edwards, Jeremy S
2004-07-01
High-throughput screening and analysis is one of the emerging paradigms in biotechnology. In particular, high-throughput methods are essential in the field of functional genomics because of the vast amount of data generated in recent and ongoing genome sequencing efforts. In this report we discuss integrated functional analysis methodologies which incorporate both a growth competition component and a highly parallel assay used to quantify results of the growth competition. Several applications of the two most widely used technologies in the field, i.e., transposon mutagenesis and deletion strain library growth competition, and individual applications of several developing or less widely reported technologies are presented.
Systematic Identification of Combinatorial Drivers and Targets in Cancer Cell Lines
Tabchy, Adel; Eltonsy, Nevine; Housman, David E.; Mills, Gordon B.
2013-01-01
There is an urgent need to elicit and validate highly efficacious targets for combinatorial intervention from large scale ongoing molecular characterization efforts of tumors. We established an in silico bioinformatic platform in concert with a high throughput screening platform evaluating 37 novel targeted agents in 669 extensively characterized cancer cell lines reflecting the genomic and tissue-type diversity of human cancers, to systematically identify combinatorial biomarkers of response and co-actionable targets in cancer. Genomic biomarkers discovered in a 141 cell line training set were validated in an independent 359 cell line test set. We identified co-occurring and mutually exclusive genomic events that represent potential drivers and combinatorial targets in cancer. We demonstrate multiple cooperating genomic events that predict sensitivity to drug intervention independent of tumor lineage. The coupling of scalable in silico and biologic high throughput cancer cell line platforms for the identification of co-events in cancer delivers rational combinatorial targets for synthetic lethal approaches with a high potential to pre-empt the emergence of resistance. PMID:23577104
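One common way to flag co-occurring or mutually exclusive genomic events across a panel of cell lines is a one-sided Fisher's exact test on the 2×2 table of event presence. The abstract does not say which statistic the authors used, so the sketch below is an assumption offered only to illustrate the idea; the function names and the 0/1 encoding of events per cell line are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, n_total, n_a, n_b):
    """P(exactly k lines carry both events) under independence, margins fixed."""
    return comb(n_a, k) * comb(n_total - n_a, n_b - k) / comb(n_total, n_b)


def co_event_pvalues(event_a, event_b):
    """One-sided Fisher's exact p-values for two events across cell lines.

    event_a, event_b: equal-length sequences of 0/1 flags, one per cell line.
    Returns (overlap, p_cooccurrence, p_mutual_exclusivity).
    """
    n = len(event_a)
    n_a = sum(event_a)
    n_b = sum(event_b)
    both = sum(1 for a, b in zip(event_a, event_b) if a and b)
    k_min = max(0, n_a + n_b - n)  # smallest overlap the margins allow
    k_max = min(n_a, n_b)          # largest overlap the margins allow
    # Co-occurrence: probability of an overlap at least as large as observed.
    p_co = sum(hypergeom_pmf(k, n, n_a, n_b) for k in range(both, k_max + 1))
    # Mutual exclusivity: probability of an overlap at most as large as observed.
    p_ex = sum(hypergeom_pmf(k, n, n_a, n_b) for k in range(k_min, both + 1))
    return both, p_co, p_ex
```

In a screen covering many event pairs, such p-values would normally be corrected for multiple testing before any pair is treated as a candidate combinatorial target.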
Systematic identification of combinatorial drivers and targets in cancer cell lines.
Tabchy, Adel; Eltonsy, Nevine; Housman, David E; Mills, Gordon B
2013-01-01
There is an urgent need to elicit and validate highly efficacious targets for combinatorial intervention from large scale ongoing molecular characterization efforts of tumors. We established an in silico bioinformatic platform in concert with a high throughput screening platform evaluating 37 novel targeted agents in 669 extensively characterized cancer cell lines reflecting the genomic and tissue-type diversity of human cancers, to systematically identify combinatorial biomarkers of response and co-actionable targets in cancer. Genomic biomarkers discovered in a 141 cell line training set were validated in an independent 359 cell line test set. We identified co-occurring and mutually exclusive genomic events that represent potential drivers and combinatorial targets in cancer. We demonstrate multiple cooperating genomic events that predict sensitivity to drug intervention independent of tumor lineage. The coupling of scalable in silico and biologic high throughput cancer cell line platforms for the identification of co-events in cancer delivers rational combinatorial targets for synthetic lethal approaches with a high potential to pre-empt the emergence of resistance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
This volume contains the proceedings of the fourth Contractor-Grantee Workshop for the Department of Energy (DOE) Human Genome Program. Of the 204 abstracts in this book, some 200 describe the genome research of DOE-funded grantees and contractors located at the multidisciplinary centers at Lawrence Berkeley Laboratory, Lawrence Livermore National Laboratory, and Los Alamos National Laboratory; other DOE-supported laboratories; and more than 54 universities, research organizations, and companies in the United States and abroad. Included are 16 abstracts from ongoing projects in the Ethical, Legal, and Social Issues (ELSI) component, an area that continues to attract considerable attention from a wide variety of interested parties. Three abstracts summarize work in the new Microbial Genome Initiative launched this year by the Office of Health and Environmental Research (OHER) to provide genome sequence and mapping data on industrially important microorganisms and those that live under extreme conditions. Many of the projects will be discussed at plenary sessions held throughout the workshop, and all are represented in the poster sessions.
Deeg, Christoph M; Chow, Cheryl-Emiliane T
2018-01-01
Giant viruses are ecologically important players in aquatic ecosystems that have challenged concepts of what constitutes a virus. Herein, we present the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant group of giant viruses in ocean metagenomes, and the first isolate of a klosneuvirus, a subgroup of the Mimiviridae proposed from metagenomic data. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans. Its 1.39 Mb genome encodes 1227 predicted ORFs, including a complex replication machinery. Yet, much of its translational apparatus has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses. PMID:29582753
Single-cell sequencing and tumorigenesis: improved understanding of tumor evolution and metastasis.
Ellsworth, Darrell L; Blackburn, Heather L; Shriver, Craig D; Rabizadeh, Shahrooz; Soon-Shiong, Patrick; Ellsworth, Rachel E
2017-12-01
Extensive genomic and transcriptomic heterogeneity in human cancer often negatively impacts treatment efficacy and survival, thus posing a significant ongoing challenge for modern treatment regimens. State-of-the-art DNA- and RNA-sequencing methods now provide high-resolution genomic and gene expression portraits of individual cells, facilitating the study of complex molecular heterogeneity in cancer. Important developments in single-cell sequencing (SCS) technologies over the past 5 years provide numerous advantages over traditional sequencing methods for understanding the complexity of carcinogenesis, but significant hurdles must be overcome before SCS can be clinically useful. In this review, we: (1) highlight current methodologies and recent technological advances for isolating single cells, single-cell whole-genome and whole-transcriptome amplification using minute amounts of nucleic acids, and SCS, (2) summarize research investigating molecular heterogeneity at the genomic and transcriptomic levels and how this heterogeneity affects clonal evolution and metastasis, and (3) discuss the promise for integrating SCS in the clinical care arena for improved patient care.
POPcorn: An Online Resource Providing Access to Distributed and Diverse Maize Project Data.
Cannon, Ethalinda K S; Birkett, Scott M; Braun, Bremen L; Kodavali, Sateesh; Jennewein, Douglas M; Yilmaz, Alper; Antonescu, Valentin; Antonescu, Corina; Harper, Lisa C; Gardiner, Jack M; Schaeffer, Mary L; Campbell, Darwin A; Andorf, Carson M; Andorf, Destri; Lisch, Damon; Koch, Karen E; McCarty, Donald R; Quackenbush, John; Grotewold, Erich; Lushbough, Carol M; Sen, Taner Z; Lawrence, Carolyn J
2011-01-01
The purpose of the online resource presented here, POPcorn (Project Portal for corn), is to enhance accessibility of maize genetic and genomic resources for plant biologists. Currently, many online locations are difficult to find, some are best searched independently, and individual project websites often degrade over time, sometimes disappearing entirely. The POPcorn site makes available (1) a centralized, web-accessible resource to search and browse descriptions of ongoing maize genomics projects, (2) a single, stand-alone tool that uses web services and minimal data warehousing to search for sequence matches in online resources of diverse offsite projects, and (3) a set of tools that enables researchers to migrate their data to the long-term model organism database for maize genetic and genomic information: MaizeGDB. Examples demonstrating POPcorn's utility are provided herein.
POPcorn: An Online Resource Providing Access to Distributed and Diverse Maize Project Data
Cannon, Ethalinda K. S.; Birkett, Scott M.; Braun, Bremen L.; Kodavali, Sateesh; Jennewein, Douglas M.; Yilmaz, Alper; Antonescu, Valentin; Antonescu, Corina; Harper, Lisa C.; Gardiner, Jack M.; Schaeffer, Mary L.; Campbell, Darwin A.; Andorf, Carson M.; Andorf, Destri; Lisch, Damon; Koch, Karen E.; McCarty, Donald R.; Quackenbush, John; Grotewold, Erich; Lushbough, Carol M.; Sen, Taner Z.; Lawrence, Carolyn J.
2011-01-01
The purpose of the online resource presented here, POPcorn (Project Portal for corn), is to enhance accessibility of maize genetic and genomic resources for plant biologists. Currently, many online locations are difficult to find, some are best searched independently, and individual project websites often degrade over time, sometimes disappearing entirely. The POPcorn site makes available (1) a centralized, web-accessible resource to search and browse descriptions of ongoing maize genomics projects, (2) a single, stand-alone tool that uses web services and minimal data warehousing to search for sequence matches in online resources of diverse offsite projects, and (3) a set of tools that enables researchers to migrate their data to the long-term model organism database for maize genetic and genomic information: MaizeGDB. Examples demonstrating POPcorn's utility are provided herein. PMID:22253616
A global analysis of adaptive evolution of operons in cyanobacteria.
Memon, Danish; Singh, Abhay K; Pakrasi, Himadri B; Wangikar, Pramod P
2013-02-01
Operons are an important feature of prokaryotic genomes. Evolution of operons is hypothesized to be adaptive and has contributed significantly towards coordinated optimization of functions. Two conflicting theories, based on (i) in situ formation to achieve co-regulation and (ii) horizontal gene transfer of functionally linked gene clusters, are generally considered to explain why and how operons have evolved. Furthermore, effects of operon evolution on genomic traits such as intergenic spacing, operon size and co-regulation are relatively less explored. Based on the conservation level in a set of diverse prokaryotes, we categorize the operonic gene pair associations and in turn the operons as ancient and recently formed. This allowed us to perform a detailed analysis of operonic structure in cyanobacteria, a morphologically and physiologically diverse group of photoautotrophs. Clustering based on operon conservation showed significant similarity with the 16S rRNA-based phylogeny, which groups the cyanobacterial strains into three clades. Clade C, dominated by strains that are believed to have undergone genome reduction, shows a larger fraction of operonic genes that are tightly packed in larger sized operons. Ancient operons are in general larger, more tightly packed, better optimized for co-regulation and part of key cellular processes. A sub-clade within Clade B, which includes Synechocystis sp. PCC 6803, shows a reverse trend in intergenic spacing. Our results suggest that while in situ formation and vertical descent may be a dominant mechanism of operon evolution in cyanobacteria, optimization of intergenic spacing and co-regulation are part of an ongoing process in the life-cycle of operons.
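The intergenic spacing discussed in the abstract above can be made concrete with a small calculation. The following is only a minimal sketch, not the authors' pipeline: the function names and the gene coordinates are hypothetical, and genes are assumed to lie on the same strand with 1-based inclusive coordinates.

```python
from typing import List, Tuple

# (name, start, end); 1-based inclusive coordinates, same strand (assumption)
Gene = Tuple[str, int, int]


def intergenic_spacings(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (upstream, downstream, gap) for each adjacent gene pair.

    The gap is the number of bases strictly between the two genes;
    a negative value means the genes overlap.
    """
    ordered = sorted(genes, key=lambda g: g[1])  # order by start position
    spacings = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        spacings.append((name_a, name_b, start_b - end_a - 1))
    return spacings


def mean_spacing(genes: List[Gene]) -> float:
    """Average intergenic gap across an operon (0.0 for a single gene)."""
    gaps = [gap for _, _, gap in intergenic_spacings(genes)]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Hypothetical coordinates for a three-gene operon (illustrative only).
operon = [("geneA", 100, 400), ("geneB", 411, 900), ("geneC", 898, 1500)]
print(intergenic_spacings(operon))
print(mean_spacing(operon))
```

A smaller mean gap corresponds to the "tightly packed" operons the abstract describes; overlapping genes show up as negative gaps.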
Rare De Novo Copy Number Variants in Patients with Congenital Pulmonary Atresia
Xie, Li; Chen, Jin-Lan; Zhang, Wei-Zhi; Wang, Shou-Zheng; Zhao, Tian-Li; Huang, Can; Wang, Jian; Yang, Jin-Fu; Yang, Yi-Feng; Tan, Zhi-Ping
2014-01-01
Background: Ongoing studies using genomic microarrays and next-generation sequencing have demonstrated that the genetic contributions to cardiovascular diseases have been significantly ignored in the past. The aim of this study was to identify rare copy number variants in individuals with congenital pulmonary atresia (PA). Methods and Results: Based on the hypothesis that rare structural variants encompassing key genes play an important role in heart development in PA patients, we performed high-resolution genome-wide microarrays for copy number variations (CNVs) in 82 PA patient-parent trios and 189 controls with an Illumina SNP array platform. CNVs were identified in 17/82 patients (20.7%), and eight of these CNVs (9.8%) are considered potentially pathogenic. Five de novo CNVs occurred at two known congenital heart disease (CHD) loci (16p13.1 and 22q11.2). Two de novo CNVs that may affect folate and vitamin B12 metabolism were identified for the first time. A de novo 1-Mb deletion at 17p13.2 may represent a rare genomic disorder that involves mild intellectual disability and associated facial features. Conclusions: Rare CNVs contribute to the pathogenesis of PA (9.8%), suggesting that the causes of PA are heterogeneous and pleiotropic. Together with previous data from animal models, our results might help identify a link between CHD and folate-mediated one-carbon metabolism (FOCM). With the accumulation of high-resolution SNP array data, these previously undescribed rare CNVs may help reveal critical gene(s) in CHD and may provide novel insights about CHD pathogenesis. PMID:24826987
Čejková, Darina; Strouhal, Michal; Norris, Steven J; Weinstock, George M; Šmajs, David
2015-01-01
Pathogenic uncultivable treponemes comprise human and animal pathogens including agents of syphilis, yaws, bejel, pinta, and venereal spirochetosis in rabbits and hares. A set of 10 treponemal genome sequences including those of 4 Treponema pallidum ssp. pallidum (TPA) strains (Nichols, DAL-1, Mexico A, SS14), 4 T. p. ssp. pertenue (TPE) strains (CDC-2, Gauthier, Samoa D, Fribourg-Blanc), 1 T. p. ssp. endemicum (TEN) strain (Bosnia A) and one strain (Cuniculi A) of Treponema paraluisleporidarum ecovar Cuniculus (TPLC) were examined with respect to the presence of nucleotide intrastrain heterogeneous sites. The number of identified intrastrain heterogeneous sites in individual genomes ranged between 0 and 7. Altogether, 23 intrastrain heterogeneous sites (in 17 genes) were found in 5 out of 10 investigated treponemal genomes including TPA strains Nichols (n = 5), DAL-1 (n = 4), and SS14 (n = 7), TPE strain Samoa D (n = 1), and TEN strain Bosnia A (n = 5). Although only one heterogeneous site was identified among 4 tested TPE strains, 16 such sites were identified among 4 TPA strains. Heterogeneous sites were mostly strain-specific and were identified in four tpr genes (tprC, GI, I, K), in genes involved in bacterial motility and chemotaxis (fliI, cheC-fliY), in genes involved in cell structure (murC), translation (prfA), general and DNA metabolism (putative SAM dependent methyltransferase, topA), and in seven hypothetical genes. Heterogeneous sites likely represent both the selection of adaptive changes during infection of the host as well as an ongoing diversifying evolutionary process.
Judge, Kim; Hunt, Martin; Reuter, Sandra; Tracey, Alan; Quail, Michael A; Parkhill, Julian; Peacock, Sharon J
2016-09-01
Translating the Oxford Nanopore MinION sequencing technology into medical microbiology requires on-going analysis that keeps pace with technological improvements to the instrument and release of associated analysis software. Here, we use a multidrug-resistant Enterobacter kobei isolate as a model organism to compare open source software for the assembly of genome data, and relate this to the time taken to generate actionable information. Three software tools (PBcR, Canu and miniasm) were used to assemble MinION data and a fourth (SPAdes) was used to combine MinION and Illumina data to produce a hybrid assembly. All four had a similar number of contigs and were more contiguous than the assembly using Illumina data alone, with SPAdes producing a single chromosomal contig. Evaluation of the four assemblies to represent the genome structure revealed a single large inversion in the SPAdes assembly, which also incorrectly integrated a plasmid into the chromosomal contig. Almost 50 %, 80 % and 90 % of MinION pass reads were generated in the first 6, 9 and 12 h, respectively. Using data from the first 6 h alone led to a less accurate, fragmented assembly, but data from the first 9 or 12 h generated similar assemblies to that from 48 h sequencing. Assemblies were generated in 2 h using Canu, indicating that going from isolate to assembled data is possible in less than 48 h. MinION data identified that genes responsible for resistance were carried by two plasmids encoding resistance to carbapenem and to sulphonamides, rifampicin and aminoglycosides, respectively.
Rare de novo copy number variants in patients with congenital pulmonary atresia.
Xie, Li; Chen, Jin-Lan; Zhang, Wei-Zhi; Wang, Shou-Zheng; Zhao, Tian-Li; Huang, Can; Wang, Jian; Yang, Jin-Fu; Yang, Yi-Feng; Tan, Zhi-Ping
2014-01-01
Ongoing studies using genomic microarrays and next-generation sequencing have demonstrated that the genetic contributions to cardiovascular diseases have been significantly ignored in the past. The aim of this study was to identify rare copy number variants in individuals with congenital pulmonary atresia (PA). Based on the hypothesis that rare structural variants encompassing key genes play an important role in heart development in PA patients, we performed high-resolution genome-wide microarrays for copy number variations (CNVs) in 82 PA patient-parent trios and 189 controls with an Illumina SNP array platform. CNVs were identified in 17/82 patients (20.7%), and eight of these CNVs (9.8%) are considered potentially pathogenic. Five de novo CNVs occurred at two known congenital heart disease (CHD) loci (16p13.1 and 22q11.2). Two de novo CNVs that may affect folate and vitamin B12 metabolism were identified for the first time. A de novo 1-Mb deletion at 17p13.2 may represent a rare genomic disorder that involves mild intellectual disability and associated facial features. Rare CNVs contribute to the pathogenesis of PA (9.8%), suggesting that the causes of PA are heterogeneous and pleiotropic. Together with previous data from animal models, our results might help identify a link between CHD and folate-mediated one-carbon metabolism (FOCM). With the accumulation of high-resolution SNP array data, these previously undescribed rare CNVs may help reveal critical gene(s) in CHD and may provide novel insights about CHD pathogenesis.
Challenges in NMR-based structural genomics
NASA Astrophysics Data System (ADS)
Sue, Shih-Che; Chang, Chi-Fon; Huang, Yao-Te; Chou, Ching-Yu; Huang, Tai-huang
2005-05-01
Understanding the functions of the vast number of proteins encoded in the many genomes that have recently been completely sequenced is the main challenge for biologists in the post-genomics era. Since the function of a protein is determined by its exact three-dimensional structure, it is paramount to determine the 3D structures of all proteins. This need has driven structural biologists to undertake the structural genomics project aimed at determining the structures of all known proteins. Several centers for structural genomics studies have been established throughout the world. Nuclear magnetic resonance (NMR) spectroscopy has played a major role in determining protein structures in atomic detail and in a physiologically relevant solution state. Since the number of new genes being discovered daily far exceeds the number of structures determined by both NMR and X-ray crystallography, a high-throughput method for speeding up the process of protein structure determination is essential for the success of the structural genomics effort. In this article we will describe NMR methods currently being employed for protein structure determination. We will also describe methods under development which may drastically increase the throughput, as well as point out areas where opportunities exist for biophysicists to make significant contributions in this important field.
Optimized guide RNA structure for genome editing via Cas9
Xu, Jianyong; Lian, Wei; Jia, Yuning; Li, Lingyun; Huang, Zhong
2017-01-01
The genome editing tool Cas9-gRNA (guide RNA) has been successfully applied in different cell types and organisms with high efficiency. However, more efforts need to be made to enhance both efficiency and specificity. In the current study, we optimized the guide RNA structure of the Streptococcus pyogenes CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated) system to improve its genome editing efficiency. Compared with the original functional structure of guide RNA, which is composed of crRNA and tracrRNA, the widely used chimeric gRNA has shorter crRNA and tracrRNA sequences. The deleted RNA sequence could form an extra loop structure, which might enhance the stability of the guide RNA structure and subsequently the genome editing efficiency. Thus, the genome editing efficiency of different forms of guide RNA was tested. We found that the chimeric gRNA structure with the original full-length crRNA and tracrRNA showed higher genome editing efficiency than the conventional chimeric structure or the other types of gRNA we tested. Our data therefore uncover a new type of gRNA structure with higher genome editing efficiency. PMID:29212218
Thépault, Amandine; Méric, Guillaume; Rivoal, Katell; Pascoe, Ben; Mageiros, Leonardos; Touzain, Fabrice; Rose, Valérie; Béven, Véronique; Chemaly, Marianne
2017-01-01
ABSTRACT Campylobacter is among the most common worldwide causes of bacterial gastroenteritis. This organism is part of the commensal microbiota of numerous host species, including livestock, and these animals constitute potential sources of human infection. Molecular typing approaches, especially multilocus sequence typing (MLST), have been used to attribute the source of human campylobacteriosis by quantifying the relative abundance of alleles at seven MLST loci among isolates from animal reservoirs and human infection, implicating chicken as a major infection source. The increasing availability of bacterial genomes provides data on allelic variation at loci across the genome, providing the potential to improve the discriminatory power of data for source attribution. Here we present a source attribution approach based on the identification of novel epidemiological markers among a reference pan-genome list of 1,810 genes identified by gene-by-gene comparison of 884 genomes of Campylobacter jejuni isolates from animal reservoirs, the environment, and clinical cases. Fifteen loci involved in metabolic activities, protein modification, signal transduction, and stress response or coding for hypothetical proteins were selected as host-segregating markers and used to attribute the source of 42 French and 281 United Kingdom clinical C. jejuni isolates. Consistent with previous studies of British campylobacteriosis, analyses performed using STRUCTURE software attributed 56.8% of British clinical cases to chicken, emphasizing the importance of this host reservoir as an infection source in the United Kingdom. However, among French clinical isolates, approximately equal proportions of isolates were attributed to chicken and ruminant reservoirs, suggesting possible differences in the relative importance of animal host reservoirs and indicating a benefit for further national-scale attribution modeling to account for differences in production, behavior, and food consumption. 
IMPORTANCE Accurately quantifying the relative contribution of different host reservoirs to human Campylobacter infection is an ongoing challenge. This study, based on the development of a novel source attribution approach, provides the first results of source attribution in Campylobacter jejuni in France. A systematic analysis using gene-by-gene comparison of 884 genomes of C. jejuni isolates, with a pan-genome list of genes, identified 15 novel epidemiological markers for source attribution. The different proportions of French and United Kingdom clinical isolates attributed to each host reservoir illustrate a potential role for local/national variations in C. jejuni transmission dynamics. PMID:28115376
Producing genome structure populations with the dynamic and automated PGS software.
Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank
2018-05-01
Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.
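The notion of a structure population being "consistent with" a Hi-C contact frequency matrix can be illustrated by going the other way: given an ensemble of 3D models, count how often each pair of domains is in contact. This is a minimal sketch under stated assumptions, not the PGS implementation; the function name, the distance cutoff, and the toy coordinates are all hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]      # (x, y, z) centre of one domain
Structure = List[Point]      # one 3D model: a coordinate per domain


def contact_frequencies(population: List[Structure], cutoff: float) -> List[List[float]]:
    """Fraction of structures in which each pair of domains lies within `cutoff`.

    The diagonal is left at 0.0 because self-contacts are not counted.
    """
    n = len(population[0])
    counts = [[0] * n for _ in range(n)]
    for structure in population:
        for i in range(n):
            for j in range(i + 1, n):
                if math.dist(structure[i], structure[j]) <= cutoff:
                    counts[i][j] += 1
                    counts[j][i] += 1  # keep the matrix symmetric
    total = len(population)
    return [[c / total for c in row] for row in counts]


# Toy population: four models of a three-domain genome (illustrative only).
population = [
    [(0, 0, 0), (1, 0, 0), (5, 0, 0)],
    [(0, 0, 0), (1, 0, 0), (1.5, 0, 0)],
    [(0, 0, 0), (3, 0, 0), (3.5, 0, 0)],
    [(0, 0, 0), (0.5, 0, 0), (4, 0, 0)],
]
freq = contact_frequencies(population, cutoff=1.0)
print(freq)
```

In this toy case domains 0 and 1 touch in three of four models, so no single averaged structure would reproduce the matrix, which is the motivation the abstract gives for modelling a population rather than one consensus structure.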
Xiao, Shijun; Li, Jiongtang; Ma, Fengshou; Fang, Lujing; Xu, Shuangbin; Chen, Wei; Wang, Zhi Yong
2015-09-03
Large yellow croaker (Larimichthys crocea) is an important commercial fish in China and East Asia. The annual production of the species from the aqua-farming industry is about 90 thousand tons. In spite of its economic importance, genetic studies of economic traits and genomic selection in the species are hindered by the lack of genomic resources. Specifically, a whole-genome physical map of large yellow croaker is still missing. The traditional BAC-based fingerprint method is extremely time- and labour-consuming. Here we report the first genome map construction using the high-throughput whole-genome mapping technique by nanochannel arrays in the BioNano Genomics Irys system. For an optimal marker density of ~10 per 100 kb, the nicking endonuclease Nt.BspQ1 was chosen for the genome map generation. 645,305 DNA molecules with a total length of ~112 Gb were labelled and detected, covering more than 160X of the large yellow croaker genome. Employing the IrysView package and signature patterns in raw DNA molecules, a whole-genome map of large yellow croaker was assembled into 686 maps with a total length of 727 Mb, which was consistent with the estimated genome size. The N50 length of the whole-genome map, including 126 maps, was up to 1.7 Mb. The excellent hybrid alignment with the large yellow croaker draft genome validated the consensus genome map assembly and highlighted a promising application of whole-genome mapping to draft genome sequence super-scaffolding. The genome map data of large yellow croaker are accessible on lycgenomics.jmu.edu.cn/pm. Using the state-of-the-art whole-genome mapping technique in the Irys system, the first whole-genome map for large yellow croaker has been constructed, which greatly facilitates the ongoing genomic and evolutionary studies for the species. To our knowledge, this is the first public report on genome map construction by whole-genome mapping for aquatic organisms. Our study demonstrates a promising application of whole-genome mapping to genome map construction for other non-model organisms in a fast and reliable manner.
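The N50 statistic quoted in the abstract above (an N50 of 1.7 Mb reached with 126 maps) follows a standard definition: sort the maps from longest to shortest and report the length at which the running total first reaches half of the summed length. A minimal sketch of that definition, using made-up lengths rather than the study's data, might look like this; the function name is illustrative.

```python
from typing import Iterable, Tuple


def n50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50 length, number of maps needed to reach it).

    Maps are taken longest first; the N50 is the length of the map at which
    the cumulative length first reaches half of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Made-up map lengths (arbitrary units), for illustration only.
example = [8, 5, 4, 3, 2, 2]
print(n50(example))  # total is 24; 8 + 5 = 13 reaches half, so N50 = 5 with 2 maps
```

The second element of the result plays the same role as the "126 maps" in the abstract: it is the count of longest maps that together span half of the assembly.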
Expanding the Karyotype of Slash Pine as a Prelude to Physical Mapping
M. Oard
1999-01-01
Cytological exploration of the pine genome has been ongoing for more than a century. For the first seventy years we knew little more than chromosome number for pines. Constancy in chromosome number throughout the genus coupled with uniformity in size and morphology between chromosomes within species has given cytologists few practical means by which to distinguish...
G2S: a web-service for annotating genomic variants on 3D protein structures.
Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong
2018-06-01
Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API uses a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The web server and source code are freely available at https://g2s.genomenexus.org. Contact: g2s@genomenexus.org. Supplementary data are available at Bioinformatics online.
Liu, Siyang; Huang, Shujia; Rao, Junhua; Ye, Weijian; Krogh, Anders; Wang, Jun
2015-01-01
Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population with high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure.
The pig genome project has plenty to squeal about.
Fan, B; Gorbach, D M; Rothschild, M F
2011-01-01
Significant progress on pig genetics and genomics research has been witnessed in recent years due to the integration of advanced molecular biology techniques, bioinformatics and computational biology, and the collaborative efforts of researchers in the swine genomics community. Progress on expanding the linkage map has slowed down, but the efforts have created a higher-resolution physical map integrating the clone map and BAC end sequence. The number of QTL mapped is still growing and most of the updated QTL mapping results are available through PigQTLdb. Additionally, expression studies using high-throughput microarrays and other gene expression techniques have made significant advancements. The number of identified non-coding RNAs is rapidly increasing and their exact regulatory functions are being explored. A publishable draft (build 10) of the swine genome sequence was available for the pig genomics community by the end of December 2010. Build 9 of the porcine genome is currently available with Ensembl annotation; manual annotation is ongoing. These drafts provide useful tools for such endeavors as comparative genomics and SNP scans for fine QTL mapping. A recent community-wide effort to create a 60K porcine SNP chip has greatly facilitated whole-genome association analyses, haplotype block construction and linkage disequilibrium mapping, which can contribute to whole-genome selection. The future 'systems biology' that integrates and optimizes the information from all research levels can enhance the pig community's understanding of the full complexity of the porcine genome. These recent technological advances and where they may lead are reviewed. Copyright © 2011 S. Karger AG, Basel.
24 CFR 35.1220 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 24 Housing and Urban Development 1 2014-04-01 2014-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Tenant-Based Rental Assistance § 35.1220 Ongoing lead-based paint maintenance activities...
24 CFR 35.1220 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 24 Housing and Urban Development 1 2012-04-01 2012-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Tenant-Based Rental Assistance § 35.1220 Ongoing lead-based paint maintenance activities...
24 CFR 35.1220 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 24 Housing and Urban Development 1 2011-04-01 2011-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Tenant-Based Rental Assistance § 35.1220 Ongoing lead-based paint maintenance activities...
24 CFR 35.1220 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 24 Housing and Urban Development 1 2013-04-01 2013-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Tenant-Based Rental Assistance § 35.1220 Ongoing lead-based paint maintenance activities...
24 CFR 35.1220 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Tenant-Based Rental Assistance § 35.1220 Ongoing lead-based paint maintenance activities...
The Paris-Sud yeast structural genomics pilot-project: from structure to function.
Quevillon-Cheruel, Sophie; Liger, Dominique; Leulliot, Nicolas; Graille, Marc; Poupon, Anne; Li de La Sierra-Gallay, Inès; Zhou, Cong-Zhao; Collinet, Bruno; Janin, Joël; Van Tilbeurgh, Herman
2004-01-01
We present here the outlines and results from our yeast structural genomics (YSG) pilot-project. A lab-scale platform for the systematic production and structure determination is presented. In order to validate this approach, 250 non-membrane proteins of unknown structure were targeted. Strategies and final statistics are evaluated. We finally discuss the opportunity of structural genomics programs to contribute to functional biochemical annotation.
Kugelman, Jeffrey R; Wiley, Michael R; Mate, Suzanne; Ladner, Jason T; Beitzel, Brett; Fakoli, Lawrence; Taweh, Fahn; Prieto, Karla; Diclaro, Joseph W; Minogue, Timothy; Schoepp, Randal J; Schaecher, Kurt E; Pettitt, James; Bateman, Stacey; Fair, Joseph; Kuhn, Jens H; Hensley, Lisa; Park, Daniel J; Sabeti, Pardis C; Sanchez-Lockhart, Mariano; Bolay, Fatorma K; Palacios, Gustavo
2015-07-01
To support Liberia's response to the ongoing Ebola virus (EBOV) disease epidemic in Western Africa, we established in-country advanced genomic capabilities to monitor EBOV evolution. Twenty-five EBOV genomes were sequenced at the Liberian Institute for Biomedical Research, which provided an in-depth view of EBOV diversity in Liberia during September 2014-February 2015. These sequences were consistent with a single virus introduction to Liberia; however, shared ancestry with isolates from Mali indicated at least 1 additional instance of movement into or out of Liberia. The pace of change is generally consistent with previous estimates of mutation rate. We observed 23 nonsynonymous mutations and 1 nonsense mutation. Six of these changes are within known binding sites for sequence-based EBOV medical countermeasures; however, the diagnostic and therapeutic impact of EBOV evolution within Liberia appears to be low.
Kugelman, Jeffrey R.; Wiley, Michael R.; Mate, Suzanne; Ladner, Jason T.; Beitzel, Brett; Fakoli, Lawrence; Taweh, Fahn; Prieto, Karla; Diclaro, Joseph W.; Minogue, Timothy; Schoepp, Randal J.; Schaecher, Kurt E.; Pettitt, James; Bateman, Stacey; Fair, Joseph; Kuhn, Jens H.; Hensley, Lisa; Park, Daniel J.; Sabeti, Pardis C.; Sanchez-Lockhart, Mariano; Bolay, Fatorma K.
2015-01-01
To support Liberia’s response to the ongoing Ebola virus (EBOV) disease epidemic in Western Africa, we established in-country advanced genomic capabilities to monitor EBOV evolution. Twenty-five EBOV genomes were sequenced at the Liberian Institute for Biomedical Research, which provided an in-depth view of EBOV diversity in Liberia during September 2014–February 2015. These sequences were consistent with a single virus introduction to Liberia; however, shared ancestry with isolates from Mali indicated at least 1 additional instance of movement into or out of Liberia. The pace of change is generally consistent with previous estimates of mutation rate. We observed 23 nonsynonymous mutations and 1 nonsense mutation. Six of these changes are within known binding sites for sequence-based EBOV medical countermeasures; however, the diagnostic and therapeutic impact of EBOV evolution within Liberia appears to be low. PMID:26079255
Population and clinical genetics of human transposable elements in the (post) genomic era
Rishishwar, Lavanya; Wang, Lu; Clayton, Evan A.; Mariño-Ramírez, Leonardo; McDonald, John F.; Jordan, I. King
2017-01-01
ABSTRACT Recent technological developments—in genomics, bioinformatics and high-throughput experimental techniques—are providing opportunities to study ongoing human transposable element (TE) activity at an unprecedented level of detail. It is now possible to characterize genome-wide collections of TE insertion sites for multiple human individuals, within and between populations, and for a variety of tissue types. Comparison of TE insertion site profiles between individuals captures the germline activity of TEs and reveals insertion site variants that segregate as polymorphisms among human populations, whereas comparison among tissue types ascertains somatic TE activity that generates cellular heterogeneity. In this review, we provide an overview of these new technologies and explore their implications for population and clinical genetic studies of human TEs. We cover both recent published results on human TE insertion activity as well as the prospects for future TE studies related to human evolution and health. PMID:28228978
The Persistent Contributions of RNA to Eukaryotic Gen(om)e Architecture and Cellular Function
Brosius, Jürgen
2014-01-01
Currently, the best scenario for earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolutionary principles valid today become apparent in such models already. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery. PMID:25081515
Creating reference gene annotation for the mouse C57BL6/J genome assembly.
Mudge, Jonathan M; Harrow, Jennifer
2015-10-01
Annotation on the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species.
Genomic landscape of gastric cancer: molecular classification and potential targets.
Guo, Jiawei; Yu, Weiwei; Su, Hui; Pang, Xiufeng
2017-02-01
Gastric cancer imposes a considerable health burden worldwide, and its mortality ranks as the second highest for all types of cancers. The limited knowledge of the molecular mechanisms underlying gastric cancer tumorigenesis hinders the development of therapeutic strategies. However, ongoing collaborative sequencing efforts facilitate molecular classification and unveil the genomic landscape of gastric cancer. Several new drivers and tumorigenic pathways in gastric cancer, including chromatin remodeling genes, RhoA-related pathways, TP53 dysregulation, activation of receptor tyrosine kinases, stem cell pathways and abnormal DNA methylation, have been revealed. These newly identified genomic alterations await translation into clinical diagnosis and targeted therapies. Considering that loss-of-function mutations are intractable, synthetic lethality could be employed when discussing feasible therapeutic strategies. Although many challenges remain to be tackled, we are optimistic regarding improvements in the prognosis and treatment of gastric cancer in the near future.
CSGRqtl: A Comparative Quantitative Trait Locus Database for Saccharinae Grasses.
Zhang, Dong; Paterson, Andrew H
2017-01-01
Conventional biparental quantitative trait locus (QTL) mapping has led to some successes in the identification of causal genes in many organisms. QTL likelihood intervals not only provide "prior information" for finer-resolution approaches such as GWAS but also provide better statistical power than GWAS to detect variants with low/rare frequency in a natural population. Here, we describe a new element of an ongoing effort to provide online resources to facilitate study and improvement of the important Saccharinae clade. The primary goal of this new resource is the anchoring of published QTLs for this clade to the Sorghum genome. Genetic map alignments translate a wealth of genomic information from sorghum to Saccharum spp., Miscanthus spp., and other taxa. In addition, genome alignments facilitate comparison of the Saccharinae QTL sets to those of other taxa that enjoy comparable resources, exemplified herein by rice.
Murine endogenous retroviruses
2016-01-01
Up to 10% of the mouse genome is composed of endogenous retrovirus (ERV) sequences, and most represent the remains of ancient germ line infections. Our knowledge of the three distinct classes of ERVs is inversely correlated with their copy number, and their characterization has benefited from the availability of divergent wild mouse species and subspecies, and from ongoing analysis of the Mus genome sequence. In contrast to human ERVs, which are nearly all extinct, active mouse ERVs can still be found in all three ERV classes. The distribution and diversity of ERVs have been shaped by host-virus interactions over the course of evolution, but ERVs have also been pivotal in shaping the mouse genome by altering host genes through insertional mutagenesis, by adding novel regulatory and coding sequences, and by their co-option by host cells as retroviral resistance genes. We review mechanisms by which an adaptive coexistence has evolved. (Part of a Multi-author Review) PMID:18818872
Fifty Years of Research in ARDS. Genomic Contributions and Opportunities.
Reilly, John P; Christie, Jason D; Meyer, Nuala J
2017-11-01
Clinical factors alone poorly explain acute respiratory distress syndrome (ARDS) risk and ARDS outcome. In the search for individual factors that may influence ARDS risk, the past 20 years have witnessed the identification of numerous genes and genetic variants that are associated with ARDS. The field of ARDS genomics has cycled from candidate gene association studies to bias-free approaches that identify new candidates, and increasing effort is made to understand the functional consequences that may underlie significant associations. More recently, methodologies of causal inference are being applied to maximize the information gained from genetic associations. Although challenges of sample size, both recognized and unrecognized phenotypic heterogeneity, and the paucity of early ARDS lung tissue limit some applications of the rapidly evolving field of genomic investigation, ongoing genetic research offers unique contributions to elucidating ARDS pathogenesis and the paradigm of precision ARDS medicine.
Engineered LINE-1 retrotransposition in nondividing human neurons
Macia, Angela; Widmann, Thomas J.; Heras, Sara R.; Ayllon, Veronica; Sanchez, Laura; Benkaddour-Boumzaouad, Meriem; Muñoz-Lopez, Martin; Rubio, Alejandro; Amador-Cubero, Suyapa; Blanco-Jimenez, Eva; Garcia-Castro, Javier; Menendez, Pablo; Ng, Philip; Muotri, Alysson R.; Goodier, John L.; Garcia-Perez, Jose L.
2017-01-01
Half the human genome is made of transposable elements (TEs), whose ongoing activity continues to impact our genome. LINE-1 (or L1) is an autonomous non-LTR retrotransposon in the human genome, comprising 17% of its genomic mass and containing an average of 80–100 active L1s per average genome that provide a source of inter-individual variation. New LINE-1 insertions are thought to accumulate mostly during human embryogenesis. Surprisingly, the activity of L1s can further impact the somatic human brain genome. However, it is currently unknown whether L1 can retrotranspose in other somatic healthy tissues or if L1 mobilization is restricted to neuronal precursor cells (NPCs) in the human brain. Here, we took advantage of an engineered L1 retrotransposition assay to analyze L1 mobilization rates in human mesenchymal (MSCs) and hematopoietic (HSCs) somatic stem cells. Notably, we have observed that L1 expression and engineered retrotransposition are much lower in both MSCs and HSCs when compared to NPCs. Remarkably, we have further demonstrated for the first time that engineered L1s can retrotranspose efficiently in mature nondividing neuronal cells. Thus, these findings suggest that the degree of somatic mosaicism and the impact of L1 retrotransposition in the human brain are likely much higher than previously thought. PMID:27965292
The clinical applications of genome editing in HIV.
Wang, Cathy X; Cannon, Paula M
2016-05-26
HIV/AIDS has long been at the forefront of the development of gene- and cell-based therapies. Although conventional gene therapy approaches typically involve the addition of anti-HIV genes to cells using semirandomly integrating viral vectors, newer genome editing technologies based on engineered nucleases are now allowing more precise genetic manipulations. The possible outcomes of genome editing include gene disruption, which has been most notably applied to the CCR5 coreceptor gene, or the introduction of small mutations or larger whole gene cassette insertions at a targeted locus. Disruption of CCR5 using zinc finger nucleases was the first-in-human application of genome editing and remains the most clinically advanced platform, with 7 completed or ongoing clinical trials in T cells and hematopoietic stem/progenitor cells (HSPCs). Here we review the laboratory and clinical findings of CCR5 editing in T cells and HSPCs for HIV therapy and summarize other promising genome editing approaches for future clinical development. In particular, recent advances in the delivery of genome editing reagents and the demonstration of highly efficient homology-directed editing in both T cells and HSPCs are expected to spur the development of even more sophisticated applications of this technology for HIV therapy. © 2016 by The American Society of Hematology.
Transitioning from genotypes to epigenotypes: why the time has come for medulloblastoma epigenomics.
Batora, N V; Sturm, D; Jones, D T W; Kool, M; Pfister, S M; Northcott, P A
2014-04-04
Recent advances in genomic technologies have allowed for tremendous progress in our understanding of the biology underlying medulloblastoma, a malignant childhood brain tumor. Consensus molecular subgroups have been put forth by the pediatric neuro-oncology community and next-generation genomic studies have led to an improved description of driver genes and pathways somatically altered in these subgroups. In contrast to the impressive pace at which advances have been made at the level of the medulloblastoma genome, comparable studies of the epigenome have lagged behind. Complementary data yielded from genomic sequencing and copy number profiling have verified frequent targeting of chromatin modifiers in medulloblastoma, highly suggestive of prominent epigenetic deregulation in the disease. Past studies of DNA methylation-dependent gene silencing and microRNA expression analyses further support the concept of medulloblastoma as an epigenetic disease. In this Review, we aim to summarize the key findings of past reports pertaining to medulloblastoma epigenetics as well as recent and ongoing genomic efforts linking somatic alterations of the genome with inferred deregulation of the epigenome. In addition, we predict what is on the horizon for medulloblastoma epigenetics and how aberrant changes in the medulloblastoma epigenome might serve as an attractive target for future therapies. Copyright © 2013 IBRO. Published by Elsevier Ltd. All rights reserved.
Uchiyama, Ikuo
2008-10-31
Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.
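The idea of defining a core by conserved gene order can be illustrated with a minimal sketch. This is not the method from the abstract, which works across many genomes and optimizes the ordering of orthologous groups; it is a simplified pairwise version under stated assumptions: each genome is a list of OG identifiers, each OG occurs at most once per genome, and only same-direction collinearity is considered. The function name `conserved_segments` and the `min_len` threshold are hypothetical.

```python
def conserved_segments(genome_a, genome_b, min_len=3):
    """Return runs of orthologous groups (OGs) that are adjacent and in the
    same order in both genomes. Simplified pairwise illustration only."""
    # Position of each OG in genome_b (assumes one copy per genome).
    pos_b = {og: i for i, og in enumerate(genome_b)}
    segments, current = [], []
    for og in genome_a:
        # Extend the run only if this OG directly follows the previous one in genome_b.
        if og in pos_b and current and pos_b[og] == pos_b[current[-1]] + 1:
            current.append(og)
        else:
            # Close the previous run, keeping it if it is long enough.
            if len(current) >= min_len:
                segments.append(current)
            current = [og] if og in pos_b else []
    if len(current) >= min_len:
        segments.append(current)
    return segments


a = ["og1", "og2", "og3", "x", "og4", "og5", "og6", "og7"]
b = ["og4", "og5", "og6", "og7", "y", "og1", "og2", "og3"]
print(conserved_segments(a, b))
```

Requiring a minimum run length is what distinguishes this from a plain shared-gene intersection: an OG present in both genomes but isolated from its neighbors does not count toward the core.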
Zhang, Jia; Yang, Ming-Kun; Zeng, Honghui; Ge, Feng
2016-11-01
Although the number of sequenced prokaryotic genomes is growing rapidly, experimentally verified annotation of prokaryotic genomes remains patchy and challenging. To facilitate genome annotation efforts for prokaryotes, we developed an open-source software tool called GAPP for genome annotation and global profiling of post-translational modifications (PTMs) in prokaryotes. With a single command, it provides a standard workflow to validate and refine predicted genetic models and discover diverse PTM events. We demonstrated the utility of GAPP using proteomic data from Helicobacter pylori, one of the major human pathogens that is responsible for many gastric diseases. Our results confirmed 84.9% of the existing predicted H. pylori proteins, identified 20 novel protein coding genes, and corrected four existing gene models with regard to translation initiation sites. In particular, GAPP revealed a large repertoire of PTMs using the same proteomic data and provided a rich resource that can be used to examine the functions of reversible modifications in this human pathogen. This software is a powerful tool for genome annotation and global discovery of PTMs and is applicable to any sequenced prokaryotic organism; we expect that it will become an integral part of ongoing genome annotation efforts for prokaryotes. GAPP is freely available at https://sourceforge.net/projects/gappproteogenomic/. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
24 CFR 35.935 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2011 CFR
2011-04-01
..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Rehabilitation § 35.935 Ongoing lead-based paint maintenance activities. In the case of a rental... 24 Housing and Urban Development 1 2011-04-01 2011-04-01 false Ongoing lead-based paint...
24 CFR 35.935 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2012 CFR
2012-04-01
..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Rehabilitation § 35.935 Ongoing lead-based paint maintenance activities. In the case of a rental... 24 Housing and Urban Development 1 2012-04-01 2012-04-01 false Ongoing lead-based paint...
24 CFR 35.935 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2014 CFR
2014-04-01
..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Rehabilitation § 35.935 Ongoing lead-based paint maintenance activities. In the case of a rental... 24 Housing and Urban Development 1 2014-04-01 2014-04-01 false Ongoing lead-based paint...
24 CFR 35.935 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2010 CFR
2010-04-01
..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Rehabilitation § 35.935 Ongoing lead-based paint maintenance activities. In the case of a rental... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Ongoing lead-based paint...
24 CFR 35.935 - Ongoing lead-based paint maintenance activities.
Code of Federal Regulations, 2013 CFR
2013-04-01
..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Rehabilitation § 35.935 Ongoing lead-based paint maintenance activities. In the case of a rental... 24 Housing and Urban Development 1 2013-04-01 2013-04-01 false Ongoing lead-based paint...
24 CFR 35.825 - Ongoing lead-based paint maintenance and reevaluation.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 24 Housing and Urban Development 1 2013-04-01 2013-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES HUD-Owned and Mortgagee-in-Possession Multifamily Property § 35.825 Ongoing lead-based paint...
24 CFR 35.825 - Ongoing lead-based paint maintenance and reevaluation.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 24 Housing and Urban Development 1 2012-04-01 2012-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES HUD-Owned and Mortgagee-in-Possession Multifamily Property § 35.825 Ongoing lead-based paint...
24 CFR 35.825 - Ongoing lead-based paint maintenance and reevaluation.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 24 Housing and Urban Development 1 2014-04-01 2014-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES HUD-Owned and Mortgagee-in-Possession Multifamily Property § 35.825 Ongoing lead-based paint...
24 CFR 35.825 - Ongoing lead-based paint maintenance and reevaluation.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 24 Housing and Urban Development 1 2011-04-01 2011-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES HUD-Owned and Mortgagee-in-Possession Multifamily Property § 35.825 Ongoing lead-based paint...
24 CFR 35.825 - Ongoing lead-based paint maintenance and reevaluation.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Ongoing lead-based paint..., Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES HUD-Owned and Mortgagee-in-Possession Multifamily Property § 35.825 Ongoing lead-based paint...
Bowden, Deborah L; Vargas-Caro, Carolina; Ovenden, Jennifer R; Bennett, Michael B; Bustamante, Carlos
2016-11-01
The complete mitochondrial genome of the grey nurse shark Carcharias taurus is described from 25 963 828 sequences obtained using Illumina NGS technology. Total length of the mitogenome is 16 715 bp, consisting of 2 rRNAs, 13 protein-coding regions, 22 tRNAs and 2 non-coding regions, thus updating the previously published mitogenome for this species. The phylogenomic reconstruction inferred from the mitogenomes of 15 species of Lamniform and Carcharhiniform sharks supports the inclusion of C. taurus in a clade with the Lamnidae and Cetorhinidae. This complete mitogenome contributes to ongoing investigation into the monophyly of the Family Odontaspididae.
The rapidly expanding universe of giant viruses: Mimivirus, Pandoravirus, Pithovirus and Mollivirus.
Abergel, Chantal; Legendre, Matthieu; Claverie, Jean-Michel
2015-11-01
More than a century ago, the term 'virus' was introduced to describe infectious agents that are invisible by light microscopy and capable of passing through sterilizing filters. In addition to their extremely small size, most viruses have minimal genomes and gene contents, and rely almost entirely on host cell-encoded functions to multiply. Unexpectedly, four different families of eukaryotic 'giant viruses' have been discovered over the past 10 years with genome sizes, gene contents and particle dimensions overlapping with that of cellular microbes. Their ongoing analyses are challenging accepted ideas about the diversity, evolution and origin of DNA viruses. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Multiscale modeling of three-dimensional genome
NASA Astrophysics Data System (ADS)
Zhang, Bin; Wolynes, Peter
The genome, the blueprint of life, contains nearly all the information needed to build and maintain an entire organism. A comprehensive understanding of the genome is of paramount interest to human health and will advance progress in many areas, including life sciences, medicine, and biotechnology. The overarching goal of my research is to understand the structure-dynamics-function relationships of the human genome. In this talk, I will present our efforts in moving towards that goal, with a particular emphasis on studying the three-dimensional organization of the genome with multi-scale approaches. Specifically, I will discuss the reconstruction of genome structures at both interphase and metaphase by making use of data from chromosome conformation capture experiments. Computational modeling of the chromatin fiber at the atomistic level from first principles will also be presented as part of our effort to study genome structure from the bottom up.
G23D: Online tool for mapping and visualization of genomic variants on 3D protein structures.
Solomon, Oz; Kunik, Vered; Simon, Amos; Kol, Nitzan; Barel, Ortal; Lev, Atar; Amariglio, Ninette; Somech, Raz; Rechavi, Gidi; Eyal, Eran
2016-08-26
Evaluation of the possible implications of genomic variants is an increasingly important task in the current high throughput sequencing era. Structural information, however, is still not routinely exploited during this evaluation process. The main reasons can be attributed to the partial structural coverage of the human proteome and the lack of tools which conveniently convert genomic positions, which are the frequent output of genomic pipelines, to protein and structure coordinates. We present G23D, a tool for conversion of human genomic coordinates to protein coordinates and protein structures. G23D allows mapping of genomic positions/variants on evolutionarily related (and not only identical) protein three-dimensional (3D) structures as well as on theoretical models. By doing so it significantly extends the space of variants for which structural insight is feasible. To facilitate interpretation of the variant consequence, pathogenic variants, functional sites and polymorphism sites are displayed on protein sequence and structure diagrams alongside the input variants. G23D also provides modeling of the mutant structure, analysis of intra-protein contacts and instant access to functional predictions and predictions of thermo-stability changes. G23D is available at http://www.sheba-cancer.org.il/G23D . G23D extends the fraction of variants for which structural analysis is applicable and provides better and faster accessibility for structural data to biologists and geneticists who routinely work with genomic information.
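The conversion from a genomic position to a protein coordinate, which the abstract names as the missing step, can be sketched in a few lines. This is an illustration only and does not reflect how G23D is implemented. It assumes a forward-strand transcript whose coding segments are given as sorted, 1-based, inclusive `(start, end)` intervals; the function name `genomic_to_residue` is hypothetical.

```python
def genomic_to_residue(pos, coding_exons):
    """Map a 1-based genomic position to (residue number, codon position).

    coding_exons: sorted list of 1-based inclusive (start, end) coding
    intervals on the forward strand. Returns None for positions outside
    the coding sequence (introns, UTRs, intergenic).
    """
    offset = 0  # coding bases contributed by the exons already passed
    for start, end in coding_exons:
        if start <= pos <= end:
            cds_index = offset + (pos - start)  # 0-based index in the CDS
            return cds_index // 3 + 1, cds_index % 3 + 1
        offset += end - start + 1
    return None


exons = [(100, 104), (200, 206)]  # 12 coding bases, i.e. 4 codons
print(genomic_to_residue(200, exons))
```

A reverse-strand gene would need the exons walked from the highest coordinate down; the residue number obtained this way is what would then be looked up in a structure or alignment.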
A Genome Wide Survey of SNP Variation Reveals the Genetic Structure of Sheep Breeds
USDA-ARS's Scientific Manuscript database
The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identi...
Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko
2008-06-23
The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperm and the evolution of the cp genome.
Shao, Changwei; Niu, Yongchao; Rastas, Pasi; Liu, Yang; Xie, Zhiyuan; Li, Hengde; Wang, Lei; Jiang, Yong; Tai, Shuaishuai; Tian, Yongsheng; Sakamoto, Takashi; Chen, Songlin
2015-01-01
High-resolution genetic maps are essential for fine mapping of complex traits, genome assembly, and comparative genomic analysis. Single-nucleotide polymorphisms (SNPs) are the primary molecular markers used for genetic map construction. In this study, we identified 13,362 SNPs evenly distributed across the Japanese flounder (Paralichthys olivaceus) genome. Of these SNPs, 12,712 high-confidence SNPs were subjected to high-throughput genotyping and assigned to 24 consensus linkage groups (LGs). The total length of the genetic linkage map was 3,497.29 cM with an average distance of 0.47 cM between loci, thereby representing the densest genetic map currently reported for Japanese flounder. Nine positive quantitative trait loci (QTLs) forming two main clusters for Vibrio anguillarum disease resistance were detected. All QTLs could explain 5.1–8.38% of the total phenotypic variation. Synteny analysis of the QTL regions on the genome assembly revealed 12 immune-related genes, among them 4 genes strongly associated with V. anguillarum disease resistance. In addition, 246 genome assembly scaffolds with an average size of 21.79 Mb were anchored onto the LGs; these scaffolds, comprising 522.99 Mb, represented 95.78% of assembled genomic sequences. The mapped assembly scaffolds in Japanese flounder were used for genome synteny analyses against zebrafish (Danio rerio) and medaka (Oryzias latipes). Flounder and medaka were found to possess almost one-to-one synteny, whereas flounder and zebrafish exhibited a multi-syntenic correspondence. The newly developed high-resolution genetic map, which will facilitate QTL mapping, scaffold assembly, and genome synteny analysis of Japanese flounder, marks a milestone in the ongoing genome project for this species. PMID:25762582
Butyaev, Alexander; Mavlyutov, Ruslan; Blanchette, Mathieu; Cudré-Mauroux, Philippe; Waldispühl, Jérôme
2015-09-18
Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, the storage technology and visualization tools need to evolve to offer to the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions, and importantly gracefully scales with the size of data. We also illustrate the usefulness of our 3D genome Web browser to explore human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Butyaev, Alexander; Mavlyutov, Ruslan; Blanchette, Mathieu; Cudré-Mauroux, Philippe; Waldispühl, Jérôme
2015-01-01
Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, the storage technology and visualization tools need to evolve to offer to the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions, and importantly gracefully scales with the size of data. We also illustrate the usefulness of our 3D genome Web browser to explore human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/. PMID:25990738
A High-Density Linkage Map for Astyanax mexicanus Using Genotyping-by-Sequencing Technology
Carlson, Brian M.; Onusko, Samuel W.; Gross, Joshua B.
2014-01-01
The Mexican tetra, Astyanax mexicanus, is a unique model system consisting of cave-adapted and surface-dwelling morphotypes that diverged >1 million years (My) ago. This remarkable natural experiment has enabled powerful genetic analyses of cave adaptation. Here, we describe the application of next-generation sequencing technology to the creation of a high-density linkage map. Our map comprises more than 2200 markers populating 25 linkage groups constructed from genotypic data generated from a single genotyping-by-sequencing project. We leveraged emergent genomic and transcriptomic resources to anchor hundreds of anonymous Astyanax markers to the genome of the zebrafish (Danio rerio), the most closely related model organism to our study species. This facilitated the identification of 784 distinct connections between our linkage map and the Danio rerio genome, highlighting several regions of conserved genomic architecture between the two species despite ∼150 My of divergence. Using a Mendelian cave-associated trait as a proof-of-principle, we successfully recovered the genomic position of the albinism locus near the gene Oca2. Further, our map successfully informed the positions of unplaced Astyanax genomic scaffolds within particular linkage groups. This ability to identify the relative location, orientation, and linear order of unaligned genomic scaffolds will facilitate ongoing efforts to improve on the current early draft and assemble future versions of the Astyanax physical genome. Moreover, this improved linkage map will enable higher-resolution genetic analyses and catalyze the discovery of the genetic basis for cave-associated phenotypes. PMID:25520037
Allying with armored snails: the complete genome of gammaproteobacterial endosymbiont.
Nakagawa, Satoshi; Shimamura, Shigeru; Takaki, Yoshihiro; Suzuki, Yohey; Murakami, Shun-ichi; Watanabe, Tamaki; Fujiyoshi, So; Mino, Sayaka; Sawabe, Tomoo; Maeda, Takahiro; Makita, Hiroko; Nemoto, Suguru; Nishimura, Shin-Ichiro; Watanabe, Hiromi; Watsuji, Tomo-o; Takai, Ken
2014-01-01
Deep-sea vents harbor dense populations of various animals that have their specific symbiotic bacteria. Scaly-foot gastropods, which are snails with mineralized scales covering the sides of their foot, have a gammaproteobacterial endosymbiont in their enlarged esophageal glands and diverse epibionts on the surface of their scales. In this study, we report the complete genome sequence of the gammaproteobacterial endosymbiont. The endosymbiont genome displays features consistent with ongoing genome reduction such as large proportions of pseudogenes and insertion elements. The genome encodes functions commonly found in deep-sea vent chemoautotrophs such as sulfur oxidation and carbon fixation. Stable carbon isotope (13C)-labeling experiments confirmed the endosymbiont chemoautotrophy. The genome also includes an intact hydrogenase gene cluster that potentially has been horizontally transferred from phylogenetically distant bacteria. Notable findings include the presence and transcription of genes for flagellar assembly, through which proteins are potentially exported from the bacterium to the host. Symbionts of snail individuals exhibited extreme genetic homogeneity, showing only two synonymous changes in 19 different genes (13 810 positions in total) determined for 32 individual gastropods collected from a single colony at one time. The extremely low genetic individuality in endosymbionts probably reflects stringent symbiont selection by the host, which prevents random genetic drift in the small population of the horizontally transmitted symbiont. This study is the first complete genome analysis of a gastropod endosymbiont and offers an opportunity to study genome evolution in a recently evolved endosymbiont.
Bridging the Resolution Gap in Structural Modeling of 3D Genome Organization
Marti-Renom, Marc A.; Mirny, Leonid A.
2011-01-01
Over the last decade, and especially after the advent of fluorescent in situ hybridization imaging and chromosome conformation capture methods, the availability of experimental data on genome three-dimensional organization has dramatically increased. We now have access to unprecedented details of how genomes organize within the interphase nucleus. Development of new computational approaches to leverage this data has already resulted in the first three-dimensional structures of genomic domains and genomes. Such approaches expand our knowledge of the chromatin folding principles, which has been classically studied using polymer physics and molecular simulations. Our outlook describes computational approaches for integrating experimental data with polymer physics, thereby bridging the resolution gap for structural determination of genomes and genomic domains. PMID:21779160
Application of Comparative Functional Genomics to Identify Regeneration-Specific Genes
2014-08-25
The first objective will extend an ongoing study of the transcriptional basis of limb regeneration in the Mexican axolotl (Ambystoma mexicanum) to...Army Research Office P.O. Box 12211 Research Triangle Park, NC 27709-2211 Limb Regeneration, Transcriptome, Salamander, Axolotl REPORT...transcriptional basis of limb regeneration in the Mexican axolotl (Ambystoma mexicanum) to three additional salamander species (A. tigrinum, A. maculatum, and
Eyres, Isobel; Boschetti, Chiara; Crisp, Alastair; Smith, Thomas P; Fontaneto, Diego; Tunnacliffe, Alan; Barraclough, Timothy G
2015-11-04
Although prevalent in prokaryotes, horizontal gene transfer (HGT) is rarer in multicellular eukaryotes. Bdelloid rotifers are microscopic animals that contain a higher proportion of horizontally transferred, non-metazoan genes in their genomes than typical of animals. It has been hypothesized that bdelloids incorporate foreign DNA when they repair their chromosomes following double-strand breaks caused by desiccation. HGT might thereby contribute to species divergence and adaptation, as in prokaryotes. If so, we expect that species should differ in their complement of foreign genes, rather than sharing the same set of foreign genes inherited from a common ancestor. Furthermore, there should be more foreign genes in species that desiccate more frequently. We tested these hypotheses by surveying HGT in four congeneric species of bdelloids from different habitats: two from permanent aquatic habitats and two from temporary aquatic habitats that desiccate regularly. Transcriptomes of all four species contain many genes with a closer match to non-metazoan genes than to metazoan genes. Whole genome sequencing of one species confirmed the presence of these foreign genes in the genome. Nearly half of foreign genes are shared between all four species and an outgroup from another family, but many hundreds are unique to particular species, which indicates that HGT is ongoing. Using a dated phylogeny, we estimate an average of 12.8 gains versus 2.0 losses of foreign genes per million years. Consistent with the desiccation hypothesis, the level of HGT is higher in the species that experience regular desiccation events than those that do not. However, HGT still contributed hundreds of foreign genes to the species from permanently aquatic habitats. Foreign genes were mainly enzymes with various annotated functions that include catabolism of complex polysaccharides and stress responses. 
We found evidence of differential loss of ancestral foreign genes previously associated with desiccation protection in the two non-desiccating species. Nearly half of foreign genes were acquired before the divergence of bdelloid families over 60 Mya. Nonetheless, HGT is ongoing in bdelloids and has contributed to putative functional differences among species. Variation among our study species is consistent with the hypothesis that desiccating habitats promote HGT.
Scanning the human genome at kilobase resolution.
Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming
2008-05-01
Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of high-frequent restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
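The five DGS steps listed in the abstract can be illustrated with a small in-silico sketch: cut a sequence at a restriction site, keep one tag from each end of every fragment, and look the joined ditags up in a reference set. The recognition site (`CATG`), the tag length, and the function names `make_ditags` and `map_ditags` are assumptions chosen for illustration; the abstract does not name the enzyme or the tag length, and this is not the authors' pipeline.

```python
import re


def make_ditags(genome, site="CATG", tag_len=16):
    """Cut `genome` at each occurrence of `site` and build one ditag per fragment.

    Simplification (assumption): the cut is placed at the start of the
    recognition site, whereas real enzymes cut at a defined offset.
    Returns a list of (fragment_start, ditag) pairs.
    """
    cuts = [m.start() for m in re.finditer(re.escape(site), genome)]
    bounds = [0] + cuts + [len(genome)]
    ditags = []
    for start, end in zip(bounds, bounds[1:]):
        fragment = genome[start:end]
        if len(fragment) < 2 * tag_len:
            continue  # too short to yield two non-overlapping end tags
        # A ditag is the 5' tag joined to the 3' tag of the same fragment.
        ditags.append((start, fragment[:tag_len] + fragment[-tag_len:]))
    return ditags


def map_ditags(observed, reference):
    """Map observed ditag strings to fragment starts in a reference ditag set."""
    index = {}
    for pos, tag in reference:
        index.setdefault(tag, []).append(pos)
    # Ditags absent from the reference map to an empty list; in a real
    # analysis these would be candidates for structural differences.
    return {tag: index.get(tag, []) for tag in observed}
```

In this toy model, an experimental ditag that fails to map to the reference set flags a fragment whose ends do not match the reference genome, which is the kind of signal the abstract describes using to detect insertions, deletions, and rearrangements.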
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trindade, Inês B.; Fonseca, Bruno M.; Matias, Pedro M.
The gene encoding a putative siderophore-interacting protein from the marine bacterium S. frigidimarina was successfully cloned, followed by expression and purification of the gene product. Optimized crystals diffracted to 1.35 Å resolution and preliminary crystallographic analysis is promising with respect to structure determination and increased insight into the poorly understood molecular mechanisms underlying iron acquisition. Siderophore-binding proteins (SIPs) perform a key role in iron acquisition in multiple organisms. In the genome of the marine bacterium Shewanella frigidimarina NCIMB 400, the gene tagged as SFRI-RS12295 encodes a protein from this family. Here, the cloning, expression, purification and crystallization of this protein are reported, together with its preliminary X-ray crystallographic analysis to 1.35 Å resolution. The SIP crystals belonged to the monoclinic space group P2₁, with unit-cell parameters a = 48.04, b = 78.31, c = 67.71 Å, α = 90, β = 99.94, γ = 90°, and are predicted to contain two molecules per asymmetric unit. Structure determination by molecular replacement and the use of previously determined ∼2 Å resolution SIP structures with ∼30% sequence identity as templates are ongoing.
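As a worked illustration of the crystallographic figures above, the following sketch computes the unit-cell volume from the reported monoclinic parameters, using V = a·b·c·sin β (valid when α = γ = 90°). The Matthews-coefficient helper is included only to show how a "two molecules per asymmetric unit" prediction is typically checked; the molecular weight is not given in the abstract, so it is left as a caller-supplied parameter, and both function names are hypothetical.

```python
from math import radians, sin


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume in cubic angstroms for a monoclinic cell (alpha = gamma = 90 deg)."""
    return a * b * c * sin(radians(beta_deg))


def matthews_coefficient(volume, mw_da, n_asu=2, z=2):
    """Matthews coefficient V_M in cubic angstroms per dalton.

    z is the number of asymmetric units per cell (2 for space group P2_1);
    n_asu is the number of molecules per asymmetric unit. mw_da is the
    molecular weight of one molecule, which the abstract does not state.
    """
    return volume / (z * n_asu * mw_da)


# Unit-cell parameters as reported in the abstract.
volume = monoclinic_cell_volume(48.04, 78.31, 67.71, 99.94)
```

A Matthews coefficient in the range commonly observed for protein crystals at a given `n_asu` is what usually supports a statement such as "predicted to contain two molecules per asymmetric unit".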
Visualization of RNA structure models within the Integrative Genomics Viewer.
Busan, Steven; Weeks, Kevin M
2017-07-01
Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
USDA-ARS's Scientific Manuscript database
Genomic structural variations, including segmental duplications (SD) and copy number variations (CNV), contribute significantly to individual health and disease in primates and rodents. As a part of the bovine genome annotation effort, we performed the first genome-wide analysis of SD in cattle usin...
The structure and evolution of angiosperm nuclear genomes.
Bennetzen, J L
1998-04-01
Despite several decades of investigation, the organization of angiosperm genomes remained largely unknown until very recently. Data describing the sequence composition of large segments of genomes, covering hundreds of kilobases of contiguous sequence, have only become available in the past two years. Recent results indicate commonalities in the characteristics of many plant genomes, including in the structure of chromosomal components like telomeres and centromeres, and in the order and content of genes. Major differences between angiosperms have been associated mainly with repetitive DNAs, both gene families and mobile elements. Intriguing new studies have begun to characterize the dynamic three-dimensional structures of chromosomes and chromatin, and the relationship between genome structure and co-ordinated gene function.
Thiesen, H.-J.; Steinbeck, F.; Maruschke, M.; Koczan, D.; Ziems, B.; Hakenberg, O. W.
2017-01-01
Tumorigenic processes are understood to be driven by epi-/genetic and genomic alterations from single point mutations to chromosomal alterations such as insertions and deletions of nucleotides up to gains and losses of large chromosomal fragments including products of chromosomal rearrangements e.g. fusion genes and proteins. Overall comparisons of copy number alterations (CNAs) presented in 48 clear cell renal cell carcinoma (ccRCC) genomes resulted in ratios of gene losses versus gene gains between 26 ccRCC Fuhrman malignancy grades G1 (ratio 1.25) and 20 G3 (ratio 0.58). Gene losses and gains of 15762 CNA genes were mapped to 795 chromosomal cytoband loci including 280 KEGG pathways. CNAs were classified according to their contribution to Fuhrman tumour gradings G1 and G3. Gene gains and losses turned out to be highly structured processes in ccRCC genomes enabling the subclassification and stratification of ccRCC tumours in a genome-wide manner. CNAs of ccRCC seem to start with common tumour related gene losses flanked by CNAs specifying Fuhrman grade G1 losses and CNA gains favouring grade G3 tumours. The appearance of recurrent CNA signatures implies the presence of causal mechanisms most likely implicated in the pathogenesis and disease-outcome of ccRCC tumours distinguishing lower from higher malignant tumours. The diagnostic quality of initial 201 genes (108 genes supporting G1 and 93 genes G3 phenotypes) has been successfully validated on published Swiss data (GSE19949) leading to a restricted CNA gene set of 171 CNA genes of which 85 genes favour Fuhrman grade G1 and 86 genes Fuhrman grade G3. Regarding these gene sets overall survival decreased with the number of G3 related gene losses plus G3 related gene gains. CNA gene sets presented define an entry to a gene-directed and pathway-related functional understanding of ongoing copy number alterations within and between individual ccRCC tumours leading to CNA genes of prognostic and predictive value. 
PMID:28486536
O'Neill, F J; Gao, Y; Xu, X
1993-11-01
The DNAs of polyomaviruses ordinarily exist as a single circular molecule of approximately 5000 base pairs. Variants of SV40, BKV and JCV have been described which contain two complementing defective DNA molecules. These defectives, which form a bipartite genome structure, contain either the viral early region or the late region. The defectives have the unique property of being able to tolerate variable sized reiterations of regulatory and terminus region sequences, and portions of the coding region. They can also exchange coding region sequences with other polyomaviruses. It has been suggested that the bipartite genome structure might be a stage in the evolution of polyomaviruses which can uniquely sustain genome and sequence diversity. However, it is not known if the regulatory and terminus region sequences are highly mutable. Also, it is not known if the bipartite genome structure is reversible and what the conditions might be which would favor restoration of the monomolecular genome structure. We addressed the first question by sequencing the reiterated regulatory and terminus regions of E- and L-SV40 DNAs. This revealed a large number of mutations in the regulatory regions of the defective genomes, including deletions, insertions, rearrangements and base substitutions. We also detected insertions and base substitutions in the T-antigen gene. We addressed the second question by introducing into permissive simian cells, E- and L-SV40 genomes which had been engineered to contain only a single regulatory region. Analysis of viral DNA from transfected cells demonstrated recombined genomes containing a wild type monomolecular DNA structure. However, the complete defectives, containing reiterated regulatory regions, could often compete away the wild type genomes. The recombinant monomolecular genomes were isolated, cloned and found to be infectious. 
All of the DNA alterations identified in one of the regulatory regions of E-SV40 DNA were present in the recombinant monomolecular genomes. These and other findings indicate that the bipartite genome state can sustain many mutations which wtSV40 cannot directly sustain. However, the mutations can later be introduced into the wild type genomes when the E- and L-SV40 DNAs recombine to generate a new monomolecular genome structure.
Genomics-Guided Precise Anti-Epileptic Drug Development.
Delanty, Norman; Cavalleri, Gianpiero
2017-07-01
Traditional antiepileptic drug development approaches have yielded many important clinically valuable anti-epileptic drugs. However, the screening of promising compounds has been naturally agnostic to epilepsy etiology in individual human patients. Now, genomic medicine is changing the way we view human disease. International collaborations are unraveling the many molecular genetic causes of the epilepsies, including the early onset epileptic encephalopathies, and some of the familial focal epilepsies. Further advances in precision diagnostics will be facilitated by ongoing large collaborations and the wider availability of whole exome and whole genome sequencing in clinical practice. Securing a precise molecular diagnosis in some individual patients will pave the way for the advent of precision therapeutics of new and re-purposed compounds in the treatment of the epilepsies. This new approach is already beginning, e.g., with the use of everolimus in patients with tuberous sclerosis complex (and perhaps other mTORopathies), the use of quinidine in some children with KCNT1 mutations, and the use of the ketogenic diet in individuals with GLUT-1 deficiency. This article explores the promise of genomics guided drug development as an approach to complement the more traditional model.
Gao, Jingfang; Wang, Bang; Han, Xiaoyun; Tian, Chaoguang
2017-01-25
The lignocellulolytic filamentous fungus Neurospora crassa is able to assimilate various mono- and oligo-saccharides. However, more than half of the predicted sugar transporters in the genome still await functional elucidation. In this study, a systematic analysis of the substrate spectra of predicted sugar transporters in N. crassa was performed at the genome-wide level. NCU01868 and NCU08152 are capable of taking up various hexoses and were named NcHXT-1 and NcHXT-2, respectively. Their transport activities for glucose were further confirmed by fluorescence resonance energy transfer analysis. Over-expression of either NcHXT-1 or NcHXT-2 in the null-hexose-transporter yeast EBY.VW4000 restored growth and ethanol fermentation under submerged fermentation with glucose, galactose, or mannose as the sole carbon source. NcHXT-1/-2 homologues were found in a variety of cellulolytic fungi. The two filamentous fungal-conserved hexose transporters NcHXT-1/-2, functionally identified via genome scanning, represent novel targets for ongoing efforts in engineering cellulolytic fungi and hexose fermentation in yeast.
Molecular hyperdiversity and evolution in very large populations.
Cutter, Asher D; Jovelin, Richard; Dey, Alivia
2013-04-01
The genomic density of sequence polymorphisms critically affects the sensitivity of inferences about ongoing sequence evolution, function and demographic history. Most animal and plant genomes have relatively low densities of polymorphisms, but some species are hyperdiverse with neutral nucleotide heterozygosity exceeding 5%. Eukaryotes with extremely large populations, mimicking bacterial and viral populations, present novel opportunities for studying molecular evolution in sexually reproducing taxa with complex development. In particular, hyperdiverse species can help answer controversial questions about the evolution of genome complexity, the limits of natural selection, modes of adaptation and subtleties of the mutation process. However, such systems have some inherent complications and here we identify topics in need of theoretical developments. Close relatives of the model organisms Caenorhabditis elegans and Drosophila melanogaster provide known examples of hyperdiverse eukaryotes, encouraging functional dissection of resulting molecular evolutionary patterns. We recommend how best to exploit hyperdiverse populations for analysis, for example, in quantifying the impact of noncrossover recombination in genomes and for determining the identity and micro-evolutionary selective pressures on noncoding regulatory elements. © 2013 Blackwell Publishing Ltd.
Mitochondrial DNA repairs double-strand breaks in yeast chromosomes.
Ricchetti, M; Fairhead, C; Dujon, B
1999-11-04
The endosymbiotic theory for the origin of eukaryotic cells proposes that genetic information can be transferred from mitochondria to the nucleus of a cell, and genes that are probably of mitochondrial origin have been found in nuclear chromosomes. Occasionally, short or rearranged sequences homologous to mitochondrial DNA are seen in the chromosomes of different organisms including yeast, plants and humans. Here we report a mechanism by which fragments of mitochondrial DNA, in single or tandem array, are transferred to yeast chromosomes under natural conditions during the repair of double-strand breaks in haploid mitotic cells. These repair insertions originate from noncontiguous regions of the mitochondrial genome. Our analysis of the Saccharomyces cerevisiae mitochondrial genome indicates that the yeast nuclear genome does indeed contain several short sequences of mitochondrial origin which are similar in size and composition to those that repair double-strand breaks. These sequences are located predominantly in non-coding regions of the chromosomes, frequently in the vicinity of retrotransposon long terminal repeats, and appear as recent integration events. Thus, colonization of the yeast genome by mitochondrial DNA is an ongoing process.
Biased Gene Fractionation and Dominant Gene Expression among the Subgenomes of Brassica rapa
Cheng, Feng; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Lin, Ke; Bonnema, Guusje; Wang, Xiaowu
2012-01-01
Polyploidization, both ancient and recent, is frequent among plants. A “two-step theory” was proposed to explain the meso-triplication of the Brassica “A” genome: Brassica rapa. By accurately partitioning this genome, we observed that genes in the less fractioned subgenome (LF) were dominantly expressed over the genes in more fractioned subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant against gene fractionation. By re-sequencing two B. rapa accessions: a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had fewer non-synonymous or frameshift mutations than genes in MFs; however, mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and on-going gene death among the three subgenomes suggest that “two-step” genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa. PMID:22567157
RNA structural constraints in the evolution of the influenza A virus genome NP segment
Gultyaev, Alexander P; Tsyganov-Bodounov, Anton; Spronken, Monique IJ; van der Kooij, Sander; Fouchier, Ron AM; Olsthoorn, René CL
2014-01-01
Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length, including protein-coding regions. Calculations of mutual information values at the paired nucleotide positions demonstrate that these structures impose considerable constraints on the virus genome evolution. Functional importance of a pseudoknot structure, predicted in the NP packaging signal region, was confirmed by plaque assays of the mutant viruses with disrupted structure and those with restored folding using compensatory substitutions. Possible functions of the conserved RNA folding patterns in the influenza A virus genome are discussed. PMID:25180940
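The mutual information calculation mentioned in the abstract can be sketched for a single pair of alignment columns as follows. This is the standard definition, MI = Σ p(a,b) · log2[p(a,b) / (p(a)·p(b))], applied to observed nucleotide frequencies; the function name and the toy columns are illustrative assumptions, and the abstract does not say which variant or correction the authors used.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two equally long alignment columns.

    Each column is a sequence of nucleotides, one per aligned sequence.
    High values indicate covariation, as expected for positions that stay
    base-paired through compensatory substitutions.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi
```

Two columns that switch together between G-C and A-U pairs give a high value, while a pair of invariant columns gives zero, so invariant base pairs are supported by conservation rather than by covariation.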
Energy Landscapes of Folding Chromosomes
NASA Astrophysics Data System (ADS)
Zhang, Bin
The genome, the blueprint of life, contains nearly all the information needed to build and maintain an entire organism. A comprehensive understanding of the genome is of paramount interest to human health and will advance progress in many areas, including life sciences, medicine, and biotechnology. The overarching goal of my research is to understand the structure-dynamics-function relationships of the human genome. In this talk, I will present our efforts toward that goal, with a particular emphasis on studying the three-dimensional organization of the genome with multi-scale approaches. Specifically, I will discuss the reconstruction of genome structures at both interphase and metaphase using data from chromosome conformation capture experiments. Computational modeling of the chromatin fiber at the atomistic level from first principles will also be presented as our effort to study genome structure from the bottom up.
Kanamori, Hajime; Parobek, Christian M; Juliano, Jonathan J; Johnson, James R; Johnston, Brian D; Johnson, Timothy J; Weber, David J; Rutala, William A; Anderson, Deverick J
2017-08-01
Escherichia coli sequence type 131 (ST131) predominates globally among multidrug-resistant (MDR) E. coli strains. We used whole-genome sequencing (WGS) to investigate 63 MDR E. coli isolates from 7 North Carolina community hospitals (2010 to 2015). Of these, 39 (62%) represented ST131, including 37 (95%) from the ST131-H30R subclone: 10 (27%) from its H30R1 subset and 27 (69%) from its H30Rx subset. ST131 core genomes differed by a median of 15 (range, 0 to 490) single-nucleotide variants (SNVs) overall versus only 7 within H30R1 (range, 3 to 12 SNVs) and 11 within H30Rx (range, 0 to 21). The four isolates with identical core genomes were all H30Rx. Epidemiological and clinical characteristics did not vary significantly by strain type, but many patients with MDR E. coli or H30Rx infection were critically ill and had poor outcomes. H30Rx isolates characteristically exhibited fluoroquinolone resistance and CTX-M-15 production, had a high prevalence of trimethoprim-sulfamethoxazole resistance (89%), sul1 (89%), and dfrA17 (85%), and were enriched for specific virulence traits, and all qualified as extraintestinal pathogenic E. coli. The high overall prevalence of CTX-M-15 appeared to be possibly attributable to its association with the ST131-H30Rx subclone and IncF[F2:A1:B-] plasmids. Some phylogenetically clustered non-ST131 MDR E. coli isolates also had distinctive serotypes/fimH types, fluoroquinolone mutations, CTX-M variants, and IncF types. Thus, WGS analysis of our community hospital source MDR E. coli isolates suggested ongoing circulation and differentiation of E. coli ST131 subclones, with clonal segregation of CTX-M variants, other resistance genes, Inc-type plasmids, and virulence genes. Copyright © 2017 American Society for Microbiology.
Seal, B S; Neill, J D; Ridpath, J F
1994-07-01
Caliciviruses are nonenveloped with a polyadenylated genome of approximately 7.6 kb and a single capsid protein. The "RNA Fold" computer program was used to analyze 3'-terminal noncoding sequences of five feline calicivirus (FCV), rabbit hemorrhagic disease virus (RHDV), and two San Miguel sea lion virus (SMSV) isolates. The FCV 3'-terminal sequences are 40-46 nucleotides in length and 72-91% similar. The FCV sequences were predicted to contain two possible duplex structures and one stem-loop structure with free energies of -2.1 to -18.2 kcal/mole. The RHDV genomic 3'-terminal RNA sequences are 54 nucleotides in length and share 49% sequence similarity to homologous regions of the FCV genome. The RHDV sequence was predicted to form two duplex structures in the 3'-terminal noncoding region with a single stem-loop structure, resembling that of FCV. In contrast, the SMSV 1 and 4 genomic 3'-terminal noncoding sequences were 185 and 182 nucleotides in length, respectively. Ten possible duplex structures were predicted with an average structural free energy of -35 kcal/mole. Sequence similarity between the two SMSV isolates was 75%. Furthermore, extensive cloverleaflike structures are predicted in the 3' noncoding region of the SMSV genome, in contrast to the predicted single stem-loop structures of FCV or RHDV.
Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun
2012-01-01
Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNPs (single nucleotide polymorphisms) between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV), in the rice cp genomes. There are two and four RCVs identified in Hassawi-1 when compared to 93-11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, with the longest length of 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicates ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta, since genetic diversity between the two Hassawi cultivars is very low, although the historic origin of the wild-type Hassawi rice is unknown. PMID:22870184
Vallenet, David; Belda, Eugeni; Calteau, Alexandra; Cruveiller, Stéphane; Engelen, Stefan; Lajus, Aurélie; Le Fèvre, François; Longin, Cyrille; Mornico, Damien; Roche, David; Rouy, Zoé; Salvignol, Gregory; Scarpelli, Claude; Thil Smith, Adam Alexander; Weiman, Marion; Médigue, Claudine
2013-01-01
MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest. PMID:23193269
Real-time, portable genome sequencing for Ebola surveillance.
Quick, Joshua; Loman, Nicholas J; Duraffour, Sophie; Simpson, Jared T; Severi, Ettore; Cowley, Lauren; Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan Hj; Becker-Ziaja, Beate; Boettcher, Jan-Peter; Cabeza-Cabrerizo, Mar; Camino-Sanchez, Alvaro; Carter, Lisa L; Doerrbecker, Juiliane; Enkirch, Theresa; Dorival, Isabel Graciela García; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigail; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallash, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Yasmin; Sachse, Andrea; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Trina, Racine; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N'Faly; Williams, Cecelia V; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Franck; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, Jamie; Rachwal, Philip; Turner, Daniel; Pollakis, Georgios; Hiscox, Julian A; Matthews, David A; O'Shea, Matthew K; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Woelfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A; Koivogui, Lamine; Diallo, Boubacar; Keita, Sakoba; Rambaut, Andrew; Formenty, Pierre; Gunther, Stephan; Carroll, Miles W
2016-02-11
The Ebola virus disease epidemic in West Africa is the largest on record, responsible for over 28,599 cases and more than 11,299 deaths. Genome sequencing in viral outbreaks is desirable to characterize the infectious agent and determine its evolutionary rate. Genome sequencing also allows the identification of signatures of host adaptation, identification and monitoring of diagnostic targets, and characterization of responses to vaccines and treatments. The Ebola virus (EBOV) genome substitution rate in the Makona strain has been estimated at between 0.87 × 10⁻³ and 1.42 × 10⁻³ mutations per site per year. This is equivalent to 16-27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions. Genomic surveillance during the epidemic has been sporadic owing to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities. To address this problem, here we devise a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. We present sequence data and analysis of 142 EBOV samples collected during the period March to October 2015. We were able to generate results less than 24 h after receiving an Ebola-positive sample, with the sequencing process taking as little as 15-60 min. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks.
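The step from a per-site substitution rate to "16-27 mutations in each genome" is a simple multiplication by genome length, accumulated over one year. The sketch below reproduces that arithmetic. The genome length of roughly 18,959 nucleotides is an assumption brought in for illustration (the abstract does not state it), and the function name is hypothetical.

```python
# Assumption: an EBOV genome length of about 18,959 nt; this value is not
# given in the abstract and is used here only to illustrate the arithmetic.
GENOME_LENGTH_NT = 18_959

def expected_substitutions_per_year(rate_per_site_per_year, genome_length=GENOME_LENGTH_NT):
    """Expected substitutions accumulated across one genome in one year."""
    return rate_per_site_per_year * genome_length

# Lower and upper rate estimates quoted in the abstract.
low = expected_substitutions_per_year(0.87e-3)
high = expected_substitutions_per_year(1.42e-3)

print(f"{low:.1f} to {high:.1f} substitutions per genome per year")
```

Under this assumed genome length the two bounds round to 16 and 27, matching the range quoted in the abstract.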
Musunuru, Kiran; Bernstein, Daniel; Cole, F Sessions; Khokha, Mustafa K; Lee, Frank S; Lin, Shin; McDonald, Thomas V; Moskowitz, Ivan P; Quertermous, Thomas; Sankaran, Vijay G; Schwartz, David A; Silverman, Edwin K; Zhou, Xiaobo; Hasan, Ahmed A K; Luo, Xiao-Zhong James
2018-04-01
The National Institutes of Health have made substantial investments in genomic studies and technologies to identify DNA sequence variants associated with human disease phenotypes. The National Heart, Lung, and Blood Institute has been at the forefront of these commitments to ascertain genetic variation associated with heart, lung, blood, and sleep diseases and related clinical traits. Genome-wide association studies, exome- and genome-sequencing studies, and exome-genotyping studies of the National Heart, Lung, and Blood Institute-funded epidemiological and clinical case-control studies are identifying large numbers of genetic variants associated with heart, lung, blood, and sleep phenotypes. However, investigators face challenges in identification of genomic variants that are functionally disruptive among the myriad of computationally implicated variants. Studies to define mechanisms of genetic disruption encoded by computationally identified genomic variants require reproducible, adaptable, and inexpensive methods to screen candidate variant and gene function. High-throughput strategies will permit a tiered variant discovery and genetic mechanism approach that begins with rapid functional screening of a large number of computationally implicated variants and genes for discovery of those that merit mechanistic investigation. As such, improved variant-to-gene and gene-to-function screens (and adequate support for such studies) are critical to accelerating the translation of genomic findings. In this White Paper, we outline the variety of novel technologies, assays, and model systems that are making such screens faster, cheaper, and more accurate, referencing published work and ongoing work supported by the National Heart, Lung, and Blood Institute's R21/R33 Functional Assays to Screen Genomic Hits program. We discuss priorities that can accelerate the impressive but incomplete progress represented by big data genomic research. © 2018 American Heart Association, Inc.
Discovery of novel bacterial toxins by genomics and computational biology.
Doxey, Andrew C; Mansfield, Michael J; Montecucco, Cesare
2018-06-01
Hundreds and hundreds of bacterial protein toxins are presently known. Traditionally, toxin identification begins with pathological studies of bacterial infectious disease. Following identification and cultivation of a bacterial pathogen, the protein toxin is purified from the culture medium and its pathogenic activity is studied using the methods of biochemistry and structural biology, cell biology, tissue and organ biology, and appropriate animal models, supplemented by bioimaging techniques. The ongoing and explosive development of high-throughput DNA sequencing and bioinformatic approaches has set in motion a revolution in many fields of biology, including microbiology. One consequence is that genes encoding novel bacterial toxins can be identified by bioinformatic and computational methods based on previous knowledge accumulated from studies of the biology and pathology of thousands of known bacterial protein toxins. Starting from the paradigmatic cases of diphtheria toxin, tetanus and botulinum neurotoxins, this review discusses traditional experimental approaches as well as bioinformatics and genomics-driven approaches that facilitate the discovery of novel bacterial toxins. We discuss recent work on the identification of novel botulinum-like toxins from genera such as Weissella, Chryseobacterium, and Enterococcus, and the implications of these computationally identified toxins in the field. Finally, we discuss the promise of metagenomics in the discovery of novel toxins and their ecological niches, and present data suggesting the existence of uncharacterized, botulinum-like toxin genes in insect gut metagenomes. Copyright © 2018. Published by Elsevier Ltd.
New Implications on Genomic Adaptation Derived from the Helicobacter pylori Genome Comparison
Lara-Ramírez, Edgar Eduardo; Segura-Cabrera, Aldo; Guo, Xianwu; Yu, Gongxin; García-Pérez, Carlos Armando; Rodríguez-Pérez, Mario A.
2011-01-01
Background: Helicobacter pylori has a reduced genome and persists long-term in a harsh environment. It evolved particular characteristics for biological adaptation. Because several H. pylori genome sequences are available, comparative analysis could help to better understand genomic adaptation of this particular bacterium. Principal Findings: We analyzed nine H. pylori genomes with emphasis on microevolution from a different perspective. Inversion was an important factor in shaping the genome structure. Illegitimate recombination led not only to genomic inversion but also to inverted fragment duplication, both of which contributed to the creation of new genes and gene families; in addition, homologous recombination contributed to inversion events. Based on the information on genomic rearrangement, the first genome scaffold structure of the H. pylori last common ancestor was produced. The core genome consists of 1186 genes, of which 22 genes may be specifically adapted to the human stomach niche. H. pylori contains a high proportion of pseudogenes whose genesis was principally caused by homopolynucleotide (HPN) mutations. Such mutations are reversible and facilitate the control of gene expression through the change of DNA structure. The reversible mutations and a quasi-panmictic feature could allow such genes or gene fragments to be transferred frequently within or between populations. Hence, pseudogenes could be a reservoir of adaptation materials and the HPN mutations could be favorable to H. pylori adaptation, leading to HPN accumulation in the genomes, which corresponds to a special feature of Helicobacter species: extremely high HPN composition of the genome. Conclusion: Our research demonstrated that both the genome content and structure of H. pylori have been highly adapted to its particular lifestyle. PMID:21387011
Mms1 is an assistant for regulating G-quadruplex DNA structures.
Schwindt, Eike; Paeschke, Katrin
2018-06-01
The preservation of genome stability is fundamental for every cell. Genomic integrity is constantly challenged. Among those challenges are non-canonical nucleic acid structures. In recent years, scientists became aware of the impact of G-quadruplex (G4) structures on genome stability. It has been shown that folded G4-DNA structures cause changes in the cell, such as transcriptional up/down-regulation, replication stalling, or enhanced genome instability. Multiple helicases have been identified that regulate G4 structures and thereby preserve genome stability. Interestingly, although these helicases are mostly ubiquitously expressed, they show specificity for G4 regulation in certain cellular processes (e.g., DNA replication). To date, it is not clear how this process and target specificity of helicases are achieved. Recently, Mms1, a ubiquitin ligase complex protein, was identified as a novel G4-DNA-binding protein that supports genome stability by aiding Pif1 helicase binding to these regions. In this perspective review, we discuss the question of whether G4-DNA interacting proteins are fundamental for helicase function and specificity at G4-DNA structures.
Genetic Characterization of Simian Foamy Viruses Infecting Humans
Rua, Réjane; Betsem, Edouard; Calattini, Sara; Saib, Ali
2012-01-01
Simian foamy viruses (SFVs) are retroviruses that are widespread among nonhuman primates (NHPs). SFVs actively replicate in their oral cavity and can be transmitted to humans after NHP bites, giving rise to a persistent infection even decades after primary infection. Very few data on the genetic structure of such SFVs found in humans are available. In the framework of ongoing studies searching for SFV-infected humans in south Cameroon rainforest villages, we studied 38 SFV-infected hunters whose times of infection had presumably been determined. By long-term cocultures of peripheral blood mononuclear cells with BHK-21 cells, we isolated five new SFV strains and obtained complete genomes of SFV strains from chimpanzee (Pan troglodytes troglodytes; strains BAD327 and AG15), monkey (Cercopithecus nictitans; strain AG16), and gorilla (Gorilla gorilla; strains BAK74 and BAD468). These zoonotic strains share a very high degree of similarity with their NHP counterparts and have a high degree of conservation of the genetic elements important for viral replication. Interestingly, analysis of FV DNA sequences obtained before cultivation revealed variants with deletions in both the U3 region and tas that may correlate with in vivo chronicity in humans. Genomic changes in bet (a premature stop codon) and gag were also observed. To determine if such changes were specific to zoonotic strains, we studied local SFV-infected chimpanzees and found the same genomic changes. Our study reveals that natural polymorphism of SFV strains does exist at both the intersubspecies level (gag, bet) and the intrasubspecies (U3, tas) levels but does not seem to reflect a viral adaptation specific to zoonotic SFV strains. PMID:23015714
Bourret, Vincent; Kent, Matthew P; Primmer, Craig R; Vasemägi, Anti; Karlsson, Sten; Hindar, Kjetil; McGinnity, Philip; Verspoor, Eric; Bernatchez, Louis; Lien, Sigbjørn
2013-02-01
Atlantic salmon (Salmo salar) is one of the most extensively studied fish species in the world due to its significance in aquaculture, fisheries and ongoing conservation efforts to protect declining populations. Yet, limited genomic resources have hampered our understanding of genetic architecture in the species and the genetic basis of adaptation to the wide range of natural and artificial environments it occupies. In this study, we describe the development of a medium-density Atlantic salmon single nucleotide polymorphism (SNP) array based on expressed sequence tags (ESTs) and genomic sequencing. The array was used in the most extensive assessment of population genetic structure performed to date in this species. A total of 6176 informative SNPs were successfully genotyped in 38 anadromous and freshwater wild populations distributed across the species' natural range. Principal component analysis clearly differentiated European and North American populations, and within Europe, three major regional genetic groups were identified for the first time in a single analysis. We assessed the potential for the array to disentangle neutral and putative adaptive divergence of SNP allele frequencies across populations and among regional groups. In Europe, secondary contact zones were identified between major clusters where endogenous and exogenous barriers could be associated, rendering the interpretation of environmental influence on potentially adaptive divergence equivocal. A small number of markers highly divergent in allele frequencies (outliers) were observed between (multiple) freshwater and anadromous populations, between northern and southern latitudes, and when comparing Baltic populations to all others. We also discuss the potential future applications of the SNP array for conservation, management and aquaculture. © 2012 Blackwell Publishing Ltd.
Multi-scale structural community organisation of the human genome.
Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin
2017-04-11
Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology, which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction, the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.
Genome expansion via lineage splitting and genome reduction in the cicada endosymbiont Hodgkinia.
Campbell, Matthew A; Van Leuven, James T; Meister, Russell C; Carey, Kaitlin M; Simon, Chris; McCutcheon, John P
2015-08-18
Comparative genomics from mitochondria, plastids, and mutualistic endosymbiotic bacteria has shown that the stable establishment of a bacterium in a host cell results in genome reduction. Although many highly reduced genomes from endosymbiotic bacteria are stable in gene content and genome structure, organelle genomes are sometimes characterized by dramatic structural diversity. Previous results from Candidatus Hodgkinia cicadicola, an endosymbiont of cicadas, revealed that some lineages of this bacterium had split into two new cytologically distinct yet genetically interdependent species. It was hypothesized that the long life cycle of cicadas in part enabled this unusual lineage-splitting event. Here we test this hypothesis by investigating the structure of the Ca. Hodgkinia genome in one of the longest-lived cicadas, Magicicada tredecim. We show that the Ca. Hodgkinia genome from M. tredecim has fragmented into multiple new chromosomes or genomes, with at least some remaining partitioned into discrete cells. We also show that this lineage-splitting process has resulted in a complex of Ca. Hodgkinia genomes that are 1.1 Mb in length when considered together, an almost 10-fold increase in size from the hypothetical single-genome ancestor. These results parallel some examples of genome fragmentation and expansion in organelles, although the mechanisms that give rise to these extreme genome instabilities are likely different.
Dukić, Marinela; Berner, Daniel; Roesti, Marius; Haag, Christoph R; Ebert, Dieter
2016-10-13
Recombination rate is an essential parameter for many genetic analyses. Recombination rates are highly variable across species, populations, individuals and different genomic regions. Due to the profound influence that recombination can have on intraspecific diversity and interspecific divergence, characterization of recombination rate variation emerges as a key resource for population genomic studies and emphasises the importance of high-density genetic maps as tools for studying genome biology. Here we present such a high-density genetic map for Daphnia magna, and analyse patterns of recombination rate across the genome. An F2 intercross panel was genotyped by Restriction-site Associated DNA sequencing to construct the third-generation linkage map of D. magna. The resulting high-density map included 4037 markers covering 813 scaffolds and contigs that sum up to 77% of the currently available genome draft sequence (v2.4) and 55% of the estimated genome size (238 Mb). The total genetic length of the map presented here is 1614.5 cM and the genome-wide recombination rate is estimated at 6.78 cM/Mb. Merging genetic and physical information, we consistently found that recombination rate estimates are high towards the peripheral parts of the chromosomes, while chromosome centres, harbouring centromeres in D. magna, show very low recombination rate estimates. Due to its high density, the third-generation linkage map for D. magna can be coupled with the draft genome assembly, providing an essential tool for genome investigation in this model organism. Thus, our linkage map can be used for the on-going improvements of the genome assembly, but more importantly, it has enabled us to characterize variation in recombination rate across the genome of D. magna for the first time. These new insights can provide valuable assistance in future studies of genome evolution, mapping of quantitative traits and population genetic studies.
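The genome-wide recombination rate quoted above appears to follow from dividing the total genetic map length by the estimated genome size. The sketch below checks that arithmetic and also derives the physical length implied by the stated coverage percentages. The derived lengths are back-calculated approximations, not figures reported in the abstract, and the variable names are illustrative.

```python
# Values taken from the abstract.
MAP_LENGTH_CM = 1614.5      # total genetic length of the linkage map
GENOME_SIZE_MB = 238.0      # estimated genome size
FRACTION_OF_GENOME = 0.55   # share of the estimated genome covered by mapped scaffolds
FRACTION_OF_DRAFT = 0.77    # share of the draft assembly (v2.4) covered

# Genome-wide average recombination rate in cM per Mb.
recombination_rate = MAP_LENGTH_CM / GENOME_SIZE_MB

# Derived (not reported) quantities: physical length covered by the map and
# the draft assembly size implied by the two coverage percentages.
covered_mb = FRACTION_OF_GENOME * GENOME_SIZE_MB
implied_draft_mb = covered_mb / FRACTION_OF_DRAFT

print(f"recombination rate: {recombination_rate:.2f} cM/Mb")
print(f"mapped sequence:    {covered_mb:.1f} Mb")
print(f"implied draft size: {implied_draft_mb:.0f} Mb")
```

Dividing by the mapped sequence instead of the estimated genome size would give a noticeably higher rate, so the choice of denominator matters when comparing rates across studies.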
Palermo, Giulia; Miao, Yinglong; Walker, Ross C; Jinek, Martin; McCammon, J Andrew
2016-10-26
The CRISPR (clustered regularly interspaced short palindromic repeats)-Cas9 system recently emerged as a transformative genome-editing technology that is innovating basic bioscience and applied medicine and biotechnology. The endonuclease Cas9 associates with a guide RNA to match and cleave complementary sequences in double stranded DNA, forming an RNA:DNA hybrid and a displaced non-target DNA strand. Although extensive structural studies are ongoing, the conformational dynamics of Cas9 and its interplay with the nucleic acids during association and DNA cleavage are largely unclear. Here, by employing multi-microsecond time scale molecular dynamics, we reveal the conformational plasticity of Cas9 and identify key determinants that allow its large-scale conformational changes during nucleic acid binding and processing. We show how the "closure" of the protein, which accompanies nucleic acid binding, fundamentally relies on highly coupled and specific motions of the protein domains, collectively initiating the prominent conformational changes needed for nucleic acid association. We further reveal a key role of the non-target DNA during the process of activation of the nuclease HNH domain, showing how the non-target DNA positioning triggers local conformational changes that favor the formation of a catalytically competent Cas9. Finally, a remarkable conformational plasticity is identified as an intrinsic property of the HNH domain, constituting a necessary element that allows for the HNH repositioning. These novel findings constitute a reference for future experimental studies aimed at a full characterization of the dynamic features of the CRISPR-Cas9 system, and, more importantly, call for novel structure engineering efforts that are of fundamental importance for the rational design of new genome-engineering applications.
Life in the fast lane for protein crystallization and X-ray crystallography
NASA Technical Reports Server (NTRS)
Pusey, Marc L.; Liu, Zhi-Jie; Tempel, Wolfram; Praissman, Jeremy; Lin, Dawei; Wang, Bi-Cheng; Gavira, Jose A.; Ng, Joseph D.
2005-01-01
The common goal for structural genomic centers and consortiums is to decipher as quickly as possible the three-dimensional structures for a multitude of recombinant proteins derived from known genomic sequences. Since X-ray crystallography is the foremost method to acquire atomic resolution for macromolecules, the limiting step is obtaining protein crystals that can be useful for structure determination. High-throughput methods have been developed in recent years to clone, express, purify, crystallize and determine the three-dimensional structure of a protein gene product rapidly using automated devices, commercialized kits and consolidated protocols. However, the average number of protein structures obtained for most structural genomic groups has been very low compared to the total number of proteins purified. As more entire genomic sequences are obtained for different organisms from the three kingdoms of life, only the proteins that can be crystallized and whose structures can be obtained easily are studied. Consequently, an astonishing number of genomic proteins remain unexamined. In the era of high-throughput processes, traditional methods in molecular biology, protein chemistry and crystallization are eclipsed by automation and pipeline practices. The necessity for high-rate production of protein crystals and structures has prevented the usage of more intellectual strategies and creative approaches in experimental executions. Fundamental principles and personal experiences in protein chemistry and crystallization are minimally exploited only to obtain "low-hanging fruit" protein structures. We review the practical aspects of today's high-throughput manipulations and discuss the challenges in fast-paced protein crystallization and tools for crystallography. Structural genomic pipelines can be improved with information gained from low-throughput tactics that may help us reach the higher-bearing fruits.
Examples of recent developments in this area are reported from the efforts of the Southeast Collaboratory for Structural Genomics (SECSG).
Child Development and Structural Variation in the Human Genome
ERIC Educational Resources Information Center
Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.
2013-01-01
Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…
An Approach to Using Toxicogenomic Data in US EPA Human ...
EPA announced the availability of the final report, An Approach to Using Toxicogenomic Data in U.S. EPA Human Health Risk Assessments: A Dibutyl Phthalate Case Study. This report outlines an approach to evaluate genomic data for use in risk assessment and a case study to illustrate the approach. The dibutyl phthalate (DBP) case study example focuses on male reproductive developmental effects and the qualitative application of genomic data because of the available data on DBP. The case study presented in this report is a separate activity from any of the ongoing IRIS human health assessments for the phthalates. The National Center for Environmental Assessment (NCEA) prepared this document for the purpose of describing and illustrating an approach for using toxicogenomic data in risk assessment.
Engineered LINE-1 retrotransposition in nondividing human neurons.
Macia, Angela; Widmann, Thomas J; Heras, Sara R; Ayllon, Veronica; Sanchez, Laura; Benkaddour-Boumzaouad, Meriem; Muñoz-Lopez, Martin; Rubio, Alejandro; Amador-Cubero, Suyapa; Blanco-Jimenez, Eva; Garcia-Castro, Javier; Menendez, Pablo; Ng, Philip; Muotri, Alysson R; Goodier, John L; Garcia-Perez, Jose L
2017-03-01
Half the human genome is made of transposable elements (TEs), whose ongoing activity continues to impact our genome. LINE-1 (or L1) is an autonomous non-LTR retrotransposon in the human genome, comprising 17% of its genomic mass and containing an average of 80-100 active L1s per average genome that provide a source of inter-individual variation. New LINE-1 insertions are thought to accumulate mostly during human embryogenesis. Surprisingly, the activity of L1s can further impact the somatic human brain genome. However, it is currently unknown whether L1 can retrotranspose in other somatic healthy tissues or if L1 mobilization is restricted to neuronal precursor cells (NPCs) in the human brain. Here, we took advantage of an engineered L1 retrotransposition assay to analyze L1 mobilization rates in human mesenchymal (MSCs) and hematopoietic (HSCs) somatic stem cells. Notably, we have observed that L1 expression and engineered retrotransposition are much lower in both MSCs and HSCs when compared to NPCs. Remarkably, we have further demonstrated for the first time that engineered L1s can retrotranspose efficiently in mature nondividing neuronal cells. Thus, these findings suggest that the degree of somatic mosaicism and the impact of L1 retrotransposition in the human brain are likely much higher than previously thought. © 2017 Macia et al.; Published by Cold Spring Harbor Laboratory Press.
Oliver, J M; Slashinski, M J; Wang, T; Kelly, P A; Hilsenbeck, S G; McGuire, A L
2012-01-01
Technological advancements are rapidly propelling the field of genome research forward, while lawmakers attempt to keep pace with the risks these advances bear. Balancing the normative concerns of maximizing data utility and protecting human subjects, whose privacy is at risk due to the identifiability of DNA data, is central to policy decisions. Research on genome research participants making real-time data sharing decisions is limited; yet, these perspectives could provide critical information to ongoing deliberations. We conducted a randomized trial of 3 consent types affording varying levels of control over data release decisions. After debriefing participants about the randomization process, we invited them to a follow-up interview to assess their attitudes toward genetic research, privacy and data sharing. Participants were more restrictive in their reported data sharing preferences than in their actual data sharing decisions. They saw both benefits and risks associated with sharing their genomic data, but risks were seen as less concrete or happening in the future, and were largely outweighed by purported benefits. Policymakers must respect that participants' assessment of the risks and benefits of data sharing and their privacy-utility determinations, which are associated with their final data release decisions, vary. In order to advance the ethical conduct of genome research, proposed policy changes should carefully consider these stakeholder perspectives. Copyright © 2011 S. Karger AG, Basel.
Xu, Dong; Zhang, Yang
2013-01-01
Genome-wide protein structure prediction and structure-based function annotation have been a long-term goal in molecular biology but have not yet become possible due to difficulties in modeling distant-homology targets. We developed a hybrid pipeline combining ab initio folding and template-based modeling for genome-wide structure prediction applied to the Escherichia coli genome. The pipeline was tested on 43 known sequences, where QUARK-based ab initio folding simulation generated models with TM-score 17% higher than that by traditional comparative modeling methods. For 495 unknown hard sequences, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 have a substantial portion of structure correctly modeled (TM-score > 0.35). 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms. PMID:23719418
The Divided Bacterial Genome: Structure, Function, and Evolution.
diCenzo, George C; Finan, Turlough M
2017-09-01
Approximately 10% of bacterial genomes are split between two or more large DNA fragments, a genome architecture referred to as a multipartite genome. This multipartite organization is found in many important organisms, including plant symbionts, such as the nitrogen-fixing rhizobia, and plant, animal, and human pathogens, including the genera Brucella, Vibrio, and Burkholderia. The availability of many complete bacterial genome sequences means that we can now examine on a broad scale the characteristics of the different types of DNA molecules in a genome. Recent work has begun to shed light on the unique properties of each class of replicon, the unique functional role of chromosomal and nonchromosomal DNA molecules, and how the exploitation of novel niches may have driven the evolution of the multipartite genome. The aims of this review are to (i) outline the literature regarding bacterial genomes that are divided into multiple fragments, (ii) provide a meta-analysis of completed bacterial genomes from 1,708 species as a way of reviewing the abundant information present in these genome sequences, and (iii) provide an encompassing model to explain the evolution and function of the multipartite genome structure. This review covers, among other topics, salient genome terminology; mechanisms of multipartite genome formation; the phylogenetic distribution of multipartite genomes; how each part of a genome differs with respect to genomic signatures, genetic variability, and gene functional annotation; how each DNA molecule may interact; as well as the costs and benefits of this genome structure. Copyright © 2017 American Society for Microbiology.
Population-based 3D genome structure analysis reveals driving forces in spatial genome organization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tjong, Harianto; Li, Wenyuan; Kalhor, Reza
Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Here, our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.
Population-based 3D genome structure analysis reveals driving forces in spatial genome organization
Tjong, Harianto; Li, Wenyuan; Kalhor, Reza; ...
2016-03-07
Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Here, our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.
Reorganization of wheat and rye genomes in octoploid triticale (× Triticosecale).
Kalinka, Anna; Achrem, Magdalena
2018-04-01
The analysis of early generations of triticale showed numerous rearrangements of the genome. Complex transformations included loss of chromosomes, t-heterochromatin content changes and the emergence of retrotransposons in new locations. This study investigated certain aspects of genomic transformations in the early generations (F5 and F8) of the primary octoploid triticale derived from the cross of hexaploid wheat with the diploid rye. Most of the plants tested were hypoploid; among eliminated chromosomes were rye chromosomes 4R and 5R and a variable number of wheat chromosomes. Wheat chromosomes were eliminated to a higher extent. A lower content of telomeric heterochromatin was also found in rye chromosomes in comparison with parental rye. Studying the location of selected retrotransposons from Ty1-copia and Ty3-gypsy families using fluorescence in situ hybridization revealed additional locations of these retrotransposons that were not present in chromosomes of parental species. ISSR, IRAP and REMAP analyses showed significant changes at the level of specific DNA nucleotide sequences. In most cases, the disappearance of certain types of bands was observed; less frequently, new types of bands appeared that were not present in parental species. This demonstrates the scale of genome rearrangement and, above all, the elimination of wheat and rye sequences, largely due to the reduction of chromosome number. With regard to the proportion of wheat to rye genome, the rye genome was more affected by the changes, thus this study was focused more on the rye genome. Observations suggest that genome reorganization is not finished in the F5 generation but is still ongoing in the F8 generation.
Centromere reference models for human chromosomes X and Y satellite arrays
Miga, Karen H.; Newton, Yulia; Jain, Miten; Altemose, Nicolas; Willard, Huntington F.; Kent, W. James
2014-01-01
The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes. PMID:24501022
Amino acid mutations in Ebola virus glycoprotein of the 2014 epidemic.
Giovanetti, Marta; Grifoni, Alba; Lo Presti, Alessandra; Cella, Eleonora; Montesano, Carla; Zehender, Gianguglielmo; Colizzi, Vittorio; Amicosante, Massimo; Ciccozzi, Massimo
2015-06-01
Zaire Ebola virus (EBOV) is an enveloped non-segmented negative strand RNA virus of 19 kb in length belonging to the family Filoviridae. The virus was isolated and identified in 1976 during the epidemic of hemorrhagic fever in Zaire. The most recent outbreak of EBOV among humans occurred in the forested areas of south-eastern Guinea; it began in February 2014 and is still ongoing. The recent Ebola outbreak is affecting other countries in West Africa in addition to Guinea: Liberia, Nigeria, and Sierra Leone. In this article, a selective pressure analysis and homology modeling based on the G Glycoprotein (GP) sequences retrieved from public databases were used to investigate the genetic diversity and modification of antibody response in the recent outbreak of Ebola Virus. Structural and evolutionary analyses indicate that the 2014 epidemic virus, being under negative selective pressure, has not changed with respect to the old epidemic in terms of genome adaptation. © 2015 Wiley Periodicals, Inc.
Wong, Vanessa K; Baker, Stephen; Pickard, Derek J; Parkhill, Julian; Page, Andrew J; Feasey, Nicholas A; Kingsley, Robert A; Thomson, Nicholas R; Keane, Jacqueline A; Weill, François-Xavier; Edwards, David J; Hawkey, Jane; Harris, Simon R; Mather, Alison E; Cain, Amy K; Hadfield, James; Hart, Peter J; Thieu, Nga Tran Vu; Klemm, Elizabeth J; Glinos, Dafni A; Breiman, Robert F; Watson, Conall H; Kariuki, Samuel; Gordon, Melita A; Heyderman, Robert S; Okoro, Chinyere; Jacobs, Jan; Lunguya, Octavie; Edmunds, W John; Msefula, Chisomo; Chabalgoity, Jose A; Kama, Mike; Jenkins, Kylie; Dutta, Shanta; Marks, Florian; Campos, Josefina; Thompson, Corinne; Obaro, Stephen; MacLennan, Calman A; Dolecek, Christiane; Keddy, Karen H; Smith, Anthony M; Parry, Christopher M; Karkey, Abhilasha; Mulholland, E Kim; Campbell, James I; Dongol, Sabina; Basnyat, Buddha; Dufour, Muriel; Bandaranayake, Don; Naseri, Take Toleafoa; Singh, Shalini Pravin; Hatta, Mochammad; Newton, Paul; Onsare, Robert S; Isaia, Lupeoletalalei; Dance, David; Davong, Viengmon; Thwaites, Guy; Wijedoru, Lalith; Crump, John A; De Pinna, Elizabeth; Nair, Satheesh; Nilles, Eric J; Thanh, Duy Pham; Turner, Paul; Soeng, Sona; Valcanis, Mary; Powling, Joan; Dimovski, Karolina; Hogg, Geoff; Farrar, Jeremy; Holt, Kathryn E; Dougan, Gordon
2015-06-01
The emergence of multidrug-resistant (MDR) typhoid is a major global health threat affecting many countries where the disease is endemic. Here whole-genome sequence analysis of 1,832 Salmonella enterica serovar Typhi (S. Typhi) identifies a single dominant MDR lineage, H58, that has emerged and spread throughout Asia and Africa over the last 30 years. Our analysis identifies numerous transmissions of H58, including multiple transfers from Asia to Africa and an ongoing, unrecognized MDR epidemic within Africa itself. Notably, our analysis indicates that H58 lineages are displacing antibiotic-sensitive isolates, transforming the global population structure of this pathogen. H58 isolates can harbor a complex MDR element residing either on transmissible IncHI1 plasmids or within multiple chromosomal integration sites. We also identify new mutations that define the H58 lineage. This phylogeographical analysis provides a framework to facilitate global management of MDR typhoid and is applicable to similar MDR lineages emerging in other bacterial species.
Petty, Tom J.; Nishimura, Taisuke; Emamzadah, Soheila; Gabus, Caroline; Paszkowski, Jerzy; Halazonetis, Thanos D.; Thore, Stéphane
2010-01-01
Of the known epigenetic control regulators found in plants, the Morpheus’ molecule 1 (MOM1) protein is atypical in that the deletion of MOM1 does not affect the level of epigenetic marks controlling the transcriptional status of the genome. A short 197-amino-acid fragment of the MOM1 protein sequence can complement MOM1 deletion when coupled to a nuclear localization signal, suggesting that this region contains a functional domain that compensates for the loss of the full-length protein. Numerous constructs centred on the highly conserved MOM1 motif 2 (CMM2) present in these 197 residues have been generated and expressed in Escherichia coli. Following purification and crystallization screening, diamond-shaped single crystals were obtained that diffracted to ∼3.2 Å resolution. They belonged to the trigonal space group P3(1)21 (or P3(2)21), with unit-cell parameters a = 85.64, c = 292.74 Å. Structure determination is ongoing. PMID:20693667
Petty, Tom J; Nishimura, Taisuke; Emamzadah, Soheila; Gabus, Caroline; Paszkowski, Jerzy; Halazonetis, Thanos D; Thore, Stéphane
2010-08-01
Of the known epigenetic control regulators found in plants, the Morpheus' molecule 1 (MOM1) protein is atypical in that the deletion of MOM1 does not affect the level of epigenetic marks controlling the transcriptional status of the genome. A short 197-amino-acid fragment of the MOM1 protein sequence can complement MOM1 deletion when coupled to a nuclear localization signal, suggesting that this region contains a functional domain that compensates for the loss of the full-length protein. Numerous constructs centred on the highly conserved MOM1 motif 2 (CMM2) present in these 197 residues have been generated and expressed in Escherichia coli. Following purification and crystallization screening, diamond-shaped single crystals were obtained that diffracted to approximately 3.2 Å resolution. They belonged to the trigonal space group P3(1)21 (or P3(2)21), with unit-cell parameters a = 85.64, c = 292.74 Å. Structure determination is ongoing.
Wong, Vanessa K; Baker, Stephen; Pickard, Derek J; Parkhill, Julian; Page, Andrew J; Feasey, Nicholas A; Kingsley, Robert A; Thomson, Nicholas R; Keane, Jacqueline A; Weill, François-Xavier; Edwards, David J; Hawkey, Jane; Harris, Simon R; Mather, Alison E; Cain, Amy K; Hadfield, James; Hart, Peter J; Thieu, Nga Tran Vu; Klemm, Elizabeth J; Glinos, Dafni A; Breiman, Robert F; Watson, Conall H; Kariuki, Samuel; Gordon, Melita A; Heyderman, Robert S; Okoro, Chinyere; Jacobs, Jan; Lunguya, Octavie; Edmunds, W John; Msefula, Chisomo; Chabalgoity, Jose A; Kama, Mike; Jenkins, Kylie; Dutta, Shanta; Marks, Florian; Campos, Josefina; Thompson, Corinne; Obaro, Stephen; MacLennan, Calman A; Dolecek, Christiane; Keddy, Karen H; Smith, Anthony M; Parry, Christopher M; Karkey, Abhilasha; Mulholland, E Kim; Campbell, James I; Dongol, Sabina; Basnyat, Buddha; Dufour, Muriel; Bandaranayake, Don; Naseri, Take Toleafoa; Singh, Shalini Pravin; Hatta, Mochammad; Newton, Paul; Onsare, Robert S; Isaia, Lupeoletalalei; Dance, David; Davong, Viengmon; Thwaites, Guy; Wijedoru, Lalith; Crump, John A; De Pinna, Elizabeth; Nair, Satheesh; Nilles, Eric J; Thanh, Duy Pham; Turner, Paul; Soeng, Sona; Valcanis, Mary; Powling, Joan; Dimovski, Karolina; Hogg, Geoff; Farrar, Jeremy; Holt, Kathryn E; Dougan, Gordon
2016-01-01
The emergence of multidrug-resistant (MDR) typhoid is a major global health threat affecting many countries where the disease is endemic. Here whole-genome sequence analysis of 1,832 Salmonella enterica serovar Typhi (S. Typhi) identifies a single dominant MDR lineage, H58, that has emerged and spread throughout Asia and Africa over the last 30 years. Our analysis identifies numerous transmissions of H58, including multiple transfers from Asia to Africa and an ongoing, unrecognized MDR epidemic within Africa itself. Notably, our analysis indicates that H58 lineages are displacing antibiotic-sensitive isolates, transforming the global population structure of this pathogen. H58 isolates can harbor a complex MDR element residing either on transmissible IncHI1 plasmids or within multiple chromosomal integration sites. We also identify new mutations that define the H58 lineage. This phylogeographical analysis provides a framework to facilitate global management of MDR typhoid and is applicable to similar MDR lineages emerging in other bacterial species. PMID:25961941
[Progress on salt resistance in autopolyploid plants].
Zhu, Hong Ju; Liu, Wen Ge
2018-04-20
Polyploidization is a key driving force that plays a vital role in the evolution of higher plants. Autopolyploid plants often demonstrate altered physiological phenomena due to their different genome composition and gene expression patterns. For example, autopolyploid plants are more resistant to stresses than their homologous diploid ancestors. Soil salinity and secondary salinization are two vital factors affecting crop production which severely limit the sustainable development of agriculture in China. Polyploid plants are important germplasm resources in crop genetic improvement due to their higher salt tolerance. Revealing the mechanism of salt tolerance in homologous plants will provide a foundation for breeding new plants with improved salt resistance. In this review, we describe the existing and ongoing characterization of the mechanism of salt tolerance in autopolyploid plants, including research on salt tolerance evolution, physiology, biochemistry, cell structure and the molecular level. Finally, we also discuss the prospects in this field by using polyploid watermelon as an example, which will be helpful in polyploid research and plant breeding.
Voorhees, Ian E H; Dalziel, Benjamin D; Glaser, Amy; Dubovi, Edward J; Murcia, Pablo R; Newbury, Sandra; Toohey-Kurth, Kathy; Su, Shuo; Kriti, Divya; Van Bakel, Harm; Goodman, Laura B; Leutenegger, Christian; Holmes, Edward C; Parrish, Colin R
2018-06-06
Avian-origin H3N2 canine influenza virus (CIV) transferred to dogs in Asia around 2005, becoming enzootic throughout China and Korea before reaching the USA in early 2015. To understand the post-transfer evolution and epidemiology of this virus, particularly the cause of recent and ongoing increases in incidence in the USA, we performed an integrated analysis of whole-genome sequence data from 64 newly sequenced viruses and comprehensive surveillance data. This reveals that the circulation of H3N2 CIV within the USA is typified by recurrent epidemic burst-fadeout dynamics driven by multiple introductions of virus from Asia. Although all major viral lineages displayed similar rates of genomic sequence evolution, H3N2 CIV consistently exhibited proportionally more non-synonymous substitutions per site compared to avian reservoir viruses, indicative of a large-scale change in selection pressures. Despite these genotypic differences, we found no evidence of adaptive evolution or increased viral transmission, with epidemiological models indicating a basic reproductive number, R0, of between 1 and 1.5 across nearly all USA outbreaks, consistent with maintained, but heterogeneous circulation. We propose that CIV's mode of viral circulation may have resulted in evolutionary cul-de-sacs, in which there is little opportunity for the selection of the more transmissible H3N2 CIV phenotypes necessary to enable circulation through a general dog population characterized by widespread contact heterogeneity. CIV must therefore rely on metapopulations of high host density (notably animal shelters) within the greater dog population and reintroduction from other populations or face complete epidemic extinction. IMPORTANCE The relatively recent appearance of influenza A virus (IAV) epidemics in dogs expands our understanding of IAV host-range and ecology, providing useful and relevant models for understanding critical factors involved in viral emergence.
Here, we integrate viral whole-genome sequence analysis and comprehensive surveillance data to examine the evolution of the emerging avian-origin H3N2 canine influenza virus (CIV), particularly the factors driving ongoing circulation and recent increase in incidence of the virus within the USA. Our results provide a detailed understanding of how H3N2 CIV achieves sustained circulation within the USA, despite widespread host contact heterogeneity and recurrent epidemic fade-out. Moreover, our findings suggest that the types and intensity of selection pressures an emerging virus experiences are highly dependent on host population structure and ecology, and may inhibit an emerging virus from acquiring sustained epidemic or pandemic circulation. Copyright © 2018 American Society for Microbiology.
Phenetic Comparison of Prokaryotic Genomes Using k-mers
Déraspe, Maxime; Raymond, Frédéric; Boisvert, Sébastien; Culley, Alexander; Roy, Paul H.; Laviolette, François; Corbeil, Jacques
2017-01-01
Bacterial genomics studies are getting more extensive and complex, requiring new ways to envision analyses. Using the Ray Surveyor software, we demonstrate that comparison of genomes based on their k-mer content allows reconstruction of phenetic trees without the need of prior data curation, such as core genome alignment of a species. We validated the methodology using simulated genomes and previously published phylogenomic studies of Streptococcus pneumoniae and Pseudomonas aeruginosa. We also investigated the relationship of specific genetic determinants with bacterial population structures. By comparing clusters from the complete genomic content of a genome population with clusters from specific functional categories of genes, we can determine how the population structures are correlated. Indeed, the strain clustering based on a subset of k-mers allows determination of its similarity with the whole genome clusters. We also applied this methodology on 42 species of bacteria to determine the correlational significance of five important bacterial genomic characteristics. For example, intrinsic resistance is more important in P. aeruginosa than in S. pneumoniae, and the former has increased correlation of its population structure with antibiotic resistance genes. The global view of the pangenome of bacteria also demonstrated the taxa-dependent interaction of population structure with antibiotic resistance, bacteriophage, plasmid, and mobile element k-mer data sets. PMID:28957508
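The idea of comparing genomes by k-mer content, as described in the abstract above, can be illustrated with a small sketch. This is not the Ray Surveyor implementation: the function names (`kmer_set`, `jaccard_distance`, `distance_matrix`), the choice of Jaccard distance, the value of k, and the toy sequences are all assumptions made for illustration only.

```python
from itertools import combinations


def kmer_set(sequence: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |A & B| / |A | B|: 0.0 for identical sets, 1.0 for disjoint sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes: dict[str, str], k: int = 4) -> dict[tuple[str, str], float]:
    """Pairwise k-mer distances, keyed by alphabetically ordered name pairs."""
    profiles = {name: kmer_set(seq, k) for name, seq in genomes.items()}
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


# Hypothetical toy "genomes"; real inputs would be full assemblies.
toy_genomes = {
    "strain_A": "ATGCGTACGTTAGC",
    "strain_B": "ATGCGTACGTTAGG",
    "strain_C": "TTTTCCCCGGGGAAAA",
}

if __name__ == "__main__":
    for pair, dist in distance_matrix(toy_genomes).items():
        print(pair, round(dist, 3))
```

A matrix like this could then be passed to any standard clustering or tree-building method to obtain a phenetic tree; restricting the k-mer sets to a functional category (for example, k-mers from resistance genes) would give the subset-based comparison the abstract mentions.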
Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.
McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael
2014-08-01
Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species implicate this process as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event. Copyright © 2014 by the Genetics Society of America.
Clinical Applications of Genome Editing to HIV Cure.
Wang, Cathy X; Cannon, Paula M
2016-12-01
Despite significant advances in HIV drug treatment regimens, which grant near-normal life expectancies to infected individuals who have good virological control, HIV infection itself remains incurable. In recent years, novel gene- and cell-based therapies have gained increasing attention due to their potential to provide a functional or even sterilizing cure for HIV infection with a one-shot treatment. A functional cure would keep the infection in check and prevent progression to AIDS, while a sterilizing cure would eradicate all HIV viruses from the patient. Genome editing is the most precise form of gene therapy, able to achieve permanent genetic disruption, modification, or insertion at a predesignated genetic locus. The most well-studied candidate for anti-HIV genome editing is CCR5, an essential coreceptor for the majority of HIV strains, and the lack of which confers HIV resistance in naturally occurring homozygous individuals. Genetic disruption of CCR5 to treat HIV has undergone clinical testing, with seven completed or ongoing trials in T cells and hematopoietic stem and progenitor cells, and has shown promising safety and potential efficacy profiles. Here we summarize clinical findings of CCR5 editing for HIV therapy, as well as other genome editing-based approaches under pre-clinical development. The anticipated development of more sophisticated genome editing technologies should continue to benefit HIV cure efforts.
Informed Consent in Genome-Scale Research: What Do Prospective Participants Think?
Trinidad, Susan Brown; Fullerton, Stephanie M.; Bares, Julie M.; Jarvik, Gail P.; Larson, Eric B.; Burke, Wylie
2012-01-01
Background: To promote effective genome-scale research, genomic and clinical data for large population samples must be collected, stored, and shared. Methods: We conducted focus groups with 45 members of a Seattle-based integrated healthcare delivery system to learn about their views and expectations for informed consent in genome-scale studies. Results: Participants viewed information about study purpose, aims, and how and by whom study data could be used to be at least as important as information about risks and possible harms. They generally supported a tiered consent approach for specific issues, including research purpose, data sharing, and access to individual research results. Participants expressed a continuum of opinions with respect to the acceptability of broad consent, ranging from completely acceptable to completely unacceptable. Older participants were more likely to view the consent process in relational – rather than contractual – terms, compared with younger participants. The majority of participants endorsed seeking study subjects’ permission regarding material changes in study purpose and data sharing. Conclusions: Although this study sample was limited in terms of racial and socioeconomic diversity, our results suggest a strong positive interest in genomic research on the part of at least some prospective participants and indicate a need for increased public engagement, as well as strategies for ongoing communication with study participants. PMID:23493836
Dhir, Mashaal; Choudry, Haroon A; Holtzman, Matthew P; Pingpank, James F; Ahrendt, Steven A; Zureikat, Amer H; Hogg, Melissa E; Bartlett, David L; Zeh, Herbert J; Singhi, Aatur D; Bahary, Nathan
2017-01-01
The impact of genomic profiling on the outcomes of patients with advanced gastrointestinal (GI) malignancies remains unknown. The primary objectives of the study were to investigate the clinical benefit of genomic-guided therapy, defined as complete response (CR), partial response (PR), or stable disease (SD) at 3 months, and its impact on progression-free survival (PFS) in patients with advanced GI malignancies. Clinical and genomic data of all consecutive GI tumor samples from April, 2013 to April, 2016 sequenced by FoundationOne were obtained and analyzed. A total of 101 samples from 97 patients were analyzed. Ninety-eight samples from 95 patients could be amplified making this approach feasible in 97% of the samples. After removing duplicates, 95 samples from 95 patients were included in the further analysis. Median time from specimen collection to reporting was 11 days. Genomic alteration-guided treatment recommendations were considered new and clinically relevant in 38% (36/95) of the patients. Rapid decline in functional status was noted in 25% (9/36) of these patients who could therefore not receive genomic-guided therapy. Genomic-guided therapy was utilized in 13 patients (13.7%) and 7 patients (7.4%) experienced clinical benefit (6 PR and 1 SD). Among these seven patients, median PFS was 10 months with some ongoing durable responses. Genomic profiling-guided therapy can lead to clinical benefit in a subset of patients with advanced GI malignancies. Attempting genomic profiling earlier in the course of treatment prior to functional decline may allow more patients to benefit from these therapies. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
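The proportions quoted in this abstract can be re-derived from the counts it states. The following is a minimal sketch that uses only those counts; the dictionary keys and the helper name `pct` are illustrative and do not come from the study.

```python
# Counts (numerator, denominator) as stated in the abstract above.
# Key names are illustrative labels, not terms from the study.
counts = {
    "samples_amplified": (98, 101),
    "new_relevant_recommendations": (36, 95),
    "functional_decline": (9, 36),
    "received_guided_therapy": (13, 95),
    "clinical_benefit": (7, 95),
}


def pct(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits`."""
    return round(100.0 * numerator / denominator, digits)


percentages = {name: pct(n, d) for name, (n, d) in counts.items()}

for name, value in percentages.items():
    print(f"{name}: {value}%")

# Derived figure not quoted in the abstract: benefit among treated patients.
benefit_among_treated = pct(7, 13)
print(f"benefit_among_treated: {benefit_among_treated}%")
```

Rounded to whole numbers or one decimal place, these values agree with the 97%, 38%, 25%, 13.7%, and 7.4% reported. The same counts also imply that 7 of the 13 treated patients, roughly 54%, experienced clinical benefit.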
I-motif DNA structures are formed in the nuclei of human cells
NASA Astrophysics Data System (ADS)
Zeraati, Mahdi; Langley, David B.; Schofield, Peter; Moye, Aaron L.; Rouet, Romain; Hughes, William E.; Bryan, Tracy M.; Dinger, Marcel E.; Christ, Daniel
2018-06-01
Human genome function is underpinned by the primary storage of genetic information in canonical B-form DNA, with a second layer of DNA structure providing regulatory control. I-motif structures are thought to form in cytosine-rich regions of the genome and to have regulatory functions; however, in vivo evidence for the existence of such structures has so far remained elusive. Here we report the generation and characterization of an antibody fragment (iMab) that recognizes i-motif structures with high selectivity and affinity, enabling the detection of i-motifs in the nuclei of human cells. We demonstrate that the in vivo formation of such structures is cell-cycle and pH dependent. Furthermore, we provide evidence that i-motif structures are formed in regulatory regions of the human genome, including promoters and telomeric regions. Our results support the notion that i-motif structures provide key regulatory roles in the genome.
Sun, Xun; Lu, You; Bish, Lawrence T; Calcedo, Roberto; Wilson, James M; Gao, Guangping
2010-06-01
Vectors based on several new adeno-associated viral (AAV) serotypes demonstrated strong hepatocyte tropism and transduction efficiency in both small- and large-animal models for liver-directed gene transfer. Efficiency of liver transduction by AAV vectors can be further improved in both murine and nonhuman primate (NHP) animals when the vector genomes are packaged in a self-complementary (sc) format. In an attempt to understand potential molecular mechanism(s) responsible for enhanced transduction efficiency of the sc vector in liver, we performed extensive molecular studies of genome structures of conventional single-stranded (ss) and sc AAV vectors from liver after AAV gene transfer in both mice and NHPs. These included treatment with exonucleases with specific substrate preferences, single-cutter restriction enzyme digestion and polarity-specific hybridization-based vector genome mapping, and bacteriophage phi29 DNA polymerase-mediated and double-stranded circular template-specific rescue of persisted circular genomes. In mouse liver, vector genomes of both genome formats seemed to persist primarily as episomal circular forms, but sc vectors converted into circular forms more rapidly and efficiently. However, the overall differences in vector genome abundance and structure in the liver between ss and sc vectors could not account for the remarkable differences in transduction. Molecular structures of persistent genomes of both ss and sc vectors were significantly more heterogeneous in macaque liver, with noticeable structural rearrangements that warrant further characterizations.
Sun, Xun; Lu, You; Bish, Lawrence T.; Calcedo, Roberto; Wilson, James M.
2010-01-01
Vectors based on several new adeno-associated viral (AAV) serotypes demonstrated strong hepatocyte tropism and transduction efficiency in both small- and large-animal models for liver-directed gene transfer. Efficiency of liver transduction by AAV vectors can be further improved in both murine and nonhuman primate (NHP) animals when the vector genomes are packaged in a self-complementary (sc) format. In an attempt to understand potential molecular mechanism(s) responsible for enhanced transduction efficiency of the sc vector in liver, we performed extensive molecular studies of genome structures of conventional single-stranded (ss) and sc AAV vectors from liver after AAV gene transfer in both mice and NHPs. These included treatment with exonucleases with specific substrate preferences, single-cutter restriction enzyme digestion and polarity-specific hybridization-based vector genome mapping, and bacteriophage ϕ29 DNA polymerase-mediated and double-stranded circular template-specific rescue of persisted circular genomes. In mouse liver, vector genomes of both genome formats seemed to persist primarily as episomal circular forms, but sc vectors converted into circular forms more rapidly and efficiently. However, the overall differences in vector genome abundance and structure in the liver between ss and sc vectors could not account for the remarkable differences in transduction. Molecular structures of persistent genomes of both ss and sc vectors were significantly more heterogeneous in macaque liver, with noticeable structural rearrangements that warrant further characterizations. PMID:20113166
[Adherence to the Ongoing Education Program for family doctors in a southeastern Brazilian state].
d'Ávila, Luciana Souza; Assis, Lucília Nunes de; Melo, Marilene Barros de; Brant, Luiz Carlos
2014-02-01
Ongoing Health Education is a strategy for transformation of health practices, though the adherence of professionals is one of the challenges facing its implementation. Thus, the objective of this study was to investigate the factors associated with adherence of family doctors to the Ongoing Education Program in a southeastern Brazilian state from the perception of supervisors. It is a cross-sectional and quantitative study with the use of online questionnaires. Data were analyzed using the chi-square test with continuity correction to determine the association between structure, topics, activities and difficulties of the supervisors working in Ongoing Health Education, difficulties of the physicians in Primary Health Care (PHC) and poor and good adherence to the program. Excellent medical participation was statistically related to the adequacy of physical space (p = 0.001), a multidisciplinary approach (p = 0.035) and epidemiological aspects (p = 0.043). Low adherence was associated with the inadequacy of the physical structure, difficulty understanding the methodology, less time in a supervisory position, and multiple workdays, among others. A good adherence to Ongoing Health Education is a possibility for collective reconstruction of the everyday work of physicians in Primary Health Care.
Chromatin Insulators and Topological Domains: Adding New Dimensions to 3D Genome Architecture
Matharu, Navneet K.; Ahanger, Sajad H.
2015-01-01
The spatial organization of metazoan genomes has a direct influence on fundamental nuclear processes that include transcription, replication, and DNA repair. It is imperative to understand the mechanisms that shape the 3D organization of the eukaryotic genomes. Chromatin insulators have emerged as one of the central components of the genome organization tool-kit across species. Recent advancements in chromatin conformation capture technologies have provided important insights into the architectural role of insulators in genomic structuring. Insulators are involved in 3D genome organization at multiple spatial scales and are important for dynamic reorganization of chromatin structure during reprogramming and differentiation. In this review, we will discuss the classical view and our renewed understanding of insulators as global genome organizers. We will also discuss the plasticity of chromatin structure and its re-organization during pluripotency and differentiation and in situations of cellular stress. PMID:26340639
NASA Astrophysics Data System (ADS)
Martín-González, Natalia; Guérin Darvas, Sofía M.; Durana, Aritz; Marti, Gerardo A.; Guérin, Diego M. A.; de Pablo, Pedro J.
2018-03-01
Even though viruses evolve mainly in a liquid milieu, their horizontal transmission routes often include episodes of dry environment. Throughout their life cycle, some insect viruses, such as viruses from the Dicistroviridae family, withstand dehydrated conditions with presently unknown consequences to their structural stability. Here, we use atomic force microscopy to monitor the structural changes of viral particles of Triatoma virus (TrV) after desiccation. Our results demonstrate that TrV capsids preserve their genome inside, conserving their height after exposure to dehydrating conditions, which is in stark contrast with other viruses that expel their genome when desiccated. Moreover, empty capsids (without genome) resulted in collapsed particles after desiccation. We also explored the role of structural ions in the dehydration process of the virions (capsid containing genome) by chelating the accessible cations from the external solvent milieu. We observed that ion suppression helps to preserve the virus height upon desiccation. Our results show that under drying conditions, the genome of TrV prevents the capsid from collapsing during dehydration, while the structural ions are responsible for promoting solvent exchange through the virion wall.
Terminal structures of West Nile virus genomic RNA and their interactions with viral NS5 protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dong Hongping; Zhang Bo; Shi Peiyong
2008-11-10
Genome cyclization is essential for flavivirus replication. We used RNases to probe the structures formed by the 5'-terminal 190 nucleotides and the 3'-terminal 111 nucleotides of the West Nile virus (WNV) genomic RNA. When analyzed individually, the two RNAs adopt stem-loop structures as predicted by the thermodynamic-folding program. However, when mixed together, the two RNAs form a duplex that is mediated through base-pairings of two sets of RNA elements (5'CS/3'CSI and 5'UAR/3'UAR). Formation of the RNA duplex facilitates a conformational change that leaves the 3'-terminal nucleotides of the genome (positions -8 to -16) single-stranded. Viral NS5 binds specifically to the 5'-terminal stem-loop (SL1) of the genomic RNA. The 5'SL1 RNA structure is essential for WNV replication. The study has provided further evidence to suggest that flavivirus genome cyclization and NS5/5'SL1 RNA interaction facilitate NS5 binding to the 3' end of the genome for the initiation of viral minus-strand RNA synthesis.
GAP Final Technical Report 12-14-04
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrew J. Bordner, PhD, Senior Research Scientist
2004-12-14
The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, and to benchmark and apply those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.
Self-similarity analysis of eubacteria genome based on weighted graph.
Qi, Zhao-Hui; Li, Ling; Zhang, Zhi-Meng; Qi, Xiao-Qin
2011-07-07
We introduce a weighted graph model to investigate the self-similarity characteristics of eubacteria genomes. The usual purpose of similarity comparison between genomes is to estimate the evolutionary distance among different genomes. Few studies focus on the overall statistical characteristics of each gene compared with other genes in the same genome. In our model, each genome is represented as a weighted graph, whose topology describes the similarity relationship among genes in the same genome. Based on the related weighted graph theory, we extract some quantified statistical variables from the topology, and give the distribution of some variables derived from the largest social structure in the topology. The 23 eubacteria recently studied by Sorimachi and Okayasu are markedly classified into two different groups by their double logarithmic point-plots describing the similarity relationship among genes of the largest social structure in genome. The results show that the proposed model may provide some new insights into the structures and evolution patterns determined from the complete genomes. Copyright © 2011 Elsevier Ltd. All rights reserved.
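The idea of representing one genome as a weighted graph of its own genes can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the similarity measure (Jaccard overlap of 3-mers), the 0.3 threshold, the toy sequences, and the reading of "largest social structure" as the largest connected component. None of it is taken from the authors' method.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=3):
    """Toy gene-gene similarity: Jaccard overlap of k-mer sets (assumed measure)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def build_graph(genes, threshold=0.3):
    """Connect two genes when their similarity reaches the (assumed) threshold."""
    graph = {name: {} for name in genes}
    for u, v in combinations(genes, 2):
        weight = jaccard(genes[u], genes[v])
        if weight >= threshold:
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


def node_strength(graph):
    """Sum of edge weights per gene, one simple statistic of the topology."""
    return {u: sum(neighbours.values()) for u, neighbours in graph.items()}


def largest_component(graph):
    """Largest connected component, found by depth-first search."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(graph[u].keys() - component)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy "genome" of four short genes.
genes = {
    "g1": "ATGGCGTACGTT",
    "g2": "ATGGCGTACGTA",
    "g3": "ATGGCGTTCGTT",
    "g4": "CCCCAAAATTTT",
}
graph = build_graph(genes)
strength = node_strength(graph)
component = largest_component(graph)
print(sorted(component), strength)
```

In this toy genome the three related genes form one component and the unrelated gene stays isolated. Statistics such as node strength could then be collected over the largest component and plotted on log-log axes, in the spirit of the point-plots the abstract mentions.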
Brown, Nathan M; Mueller, Ryan S; Shepardson, Jonathan W; Landry, Zachary C; Morré, Jeffrey T; Maier, Claudia S; Hardy, F Joan; Dreher, Theo W
2016-06-13
Very few closed genomes of the cyanobacteria that commonly produce toxic blooms in lakes and reservoirs are available, limiting our understanding of the properties of these organisms. A new anatoxin-a-producing member of the Nostocaceae, Anabaena sp. WA102, was isolated from a freshwater lake in Washington State, USA, in 2013 and maintained in non-axenic culture. The Anabaena sp. WA102 5.7 Mbp genome assembly has been closed with long-read, single-molecule sequencing and separately a draft genome assembly has been produced with short-read sequencing technology. The closed and draft genome assemblies are compared, showing a correlation between long repeats in the genome and the many gaps in the short-read assembly. Anabaena sp. WA102 encodes anatoxin-a biosynthetic genes, as does its close relative Anabaena sp. AL93 (also introduced in this study). These strains are distinguished by differences in the genes for light-harvesting phycobilins, with Anabaena sp. AL93 possessing a phycoerythrocyanin operon. Biologically relevant structural variants in the Anabaena sp. WA102 genome were detected only by long-read sequencing: a tandem triplication of the anaBCD promoter region in the anatoxin-a synthase gene cluster (not triplicated in Anabaena sp. AL93) and a 5-kbp deletion variant present in two-thirds of the population. The genome has a large number of mobile elements (160). Strikingly, there was no synteny with the genome of its nearest fully assembled relative, Anabaena sp. 90. Structural and functional genome analyses indicate that Anabaena sp. WA102 has a flexible genome. Genome closure, which can be readily achieved with long-read sequencing, reveals large scale (e.g., gene order) and local structural features that should be considered in understanding genome evolution and function.
Age-Related Macular Degeneration: Genetics and Biology Coming Together
Fritsche, Lars G.; Fariss, Robert N.; Stambolian, Dwight; Abecasis, Gonçalo R.; Curcio, Christine A.
2014-01-01
Genetic and genomic studies have enhanced our understanding of complex neurodegenerative diseases that exert a devastating impact on individuals and society. One such disease, age-related macular degeneration (AMD), is a major cause of progressive and debilitating visual impairment. Since the pioneering discovery in 2005 of complement factor H (CFH) as a major AMD susceptibility gene, extensive investigations have confirmed 19 additional genetic risk loci, and more are anticipated. In addition to common variants identified by now-conventional genome-wide association studies, targeted genomic sequencing and exome-chip analyses are uncovering rare variant alleles of high impact. Here, we provide a critical review of the ongoing genetic studies and of common and rare risk variants at a total of 20 susceptibility loci, which together explain 40–60% of the disease heritability but provide limited power for diagnostic testing of disease risk. Identification of these susceptibility loci has begun to untangle the complex biological pathways underlying AMD pathophysiology, pointing to new testable paradigms for treatment. PMID:24773320
Benschop, Jackie; Biggs, Patrick J.; Marshall, Jonathan C.; Hayman, David T.S.; Carter, Philip E.; Midwinter, Anne C.; Mather, Alison E.; French, Nigel P.
2017-01-01
During 1998–2012, an extended outbreak of Salmonella enterica serovar Typhimurium definitive type 160 (DT160) affected >3,000 humans and killed wild birds in New Zealand. However, the relationship between DT160 within these 2 host groups and the origin of the outbreak are unknown. Whole-genome sequencing was used to compare 109 Salmonella Typhimurium DT160 isolates from sources throughout New Zealand. We provide evidence that DT160 was introduced into New Zealand around 1997 and rapidly propagated throughout the country, becoming more genetically diverse over time. The genetic heterogeneity was evenly distributed across multiple predicted functional protein groups, and we found no evidence of host group differentiation between isolates collected from human, poultry, bovid, and wild bird sources, indicating ongoing transmission between these host groups. Our findings demonstrate how a comparative genomic approach can be used to gain insight into outbreaks, disease transmission, and the evolution of a multihost pathogen after a probable point-source introduction. PMID:28516864
Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora
2015-05-01
Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders. © 2015 WILEY Periodicals, Inc.
Coordinates and intervals in graph-based reference genomes.
Rand, Knut D; Grytten, Ivar; Nederbragt, Alexander J; Storvik, Geir O; Glad, Ingrid K; Sandve, Geir K
2017-05-18
It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph-based reference genomes. We formalize offset-based coordinate systems on graph-based reference genomes and introduce methods for representing intervals on these reference structures. We show the advantage of our methods by representing genes on a graph-based representation of the newest assembly of the human genome (GRCh38) and its alternative loci for regions that are highly variable. More complex reference genomes, containing alternative loci, require methods to represent genomic data on these structures. Our proposed notation for genomic intervals makes it possible to fully utilize the alternative loci of the GRCh38 assembly and potential future graph-based reference genomes. We have made a Python package for representing such intervals on offset-based coordinate systems, available at https://github.com/uio-cels/offsetbasedgraph . An interactive web-tool using this Python package to visualize genes on a graph created from GRCh38 is available at https://github.com/uio-cels/genomicgraphcoords .
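An offset-based interval on a graph-based reference can be sketched without the authors' package. The classes below are hypothetical and written only for illustration: they are not the API of `offsetbasedgraph`, and the block names and lengths are invented. A position is a (block, offset) pair, and an interval is a start position, an exclusive end position, and the ordered list of blocks it traverses.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Position:
    block: str   # identifier of a region (block) in the graph
    offset: int  # 0-based offset within that block


@dataclass(frozen=True)
class Interval:
    start: Position
    end: Position            # end offset is exclusive
    blocks: Tuple[str, ...]  # blocks traversed, in order


class ToyGraph:
    """Hypothetical minimal graph: block lengths plus directed edges."""

    def __init__(self, lengths: Dict[str, int], edges: List[Tuple[str, str]]):
        self.lengths = lengths
        self.edges: Set[Tuple[str, str]] = set(edges)

    def is_valid(self, iv: Interval) -> bool:
        """Check that the interval follows edges and stays within block bounds."""
        if not iv.blocks:
            return False
        if iv.start.block != iv.blocks[0] or iv.end.block != iv.blocks[-1]:
            return False
        if any(b not in self.lengths for b in iv.blocks):
            return False
        if not all((a, b) in self.edges for a, b in zip(iv.blocks, iv.blocks[1:])):
            return False
        if not 0 <= iv.start.offset < self.lengths[iv.start.block]:
            return False
        if not 0 < iv.end.offset <= self.lengths[iv.end.block]:
            return False
        if len(iv.blocks) == 1 and iv.start.offset >= iv.end.offset:
            return False
        return True

    def length(self, iv: Interval) -> int:
        """Number of bases covered along the traversed blocks."""
        total = sum(self.lengths[b] for b in iv.blocks)
        unused_tail = self.lengths[iv.end.block] - iv.end.offset
        return total - iv.start.offset - unused_tail


# Invented example: a main path a -> b -> c with an alternative locus replacing b.
graph = ToyGraph(
    lengths={"chr1a": 100, "chr1b": 50, "chr1c": 80, "alt1": 60},
    edges=[("chr1a", "chr1b"), ("chr1b", "chr1c"),
           ("chr1a", "alt1"), ("alt1", "chr1c")],
)
gene_main = Interval(Position("chr1a", 90), Position("chr1c", 10),
                     ("chr1a", "chr1b", "chr1c"))
gene_alt = Interval(Position("chr1a", 90), Position("chr1c", 10),
                    ("chr1a", "alt1", "chr1c"))
broken = Interval(Position("chr1b", 0), Position("alt1", 5), ("chr1b", "alt1"))

print(graph.length(gene_main), graph.length(gene_alt), graph.is_valid(broken))
```

The two gene intervals share the same start and end coordinates but differ in length because they pass through different blocks, which is why the traversed path has to be part of the interval representation.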
Genome Pool Strategy for Structural Coverage of Protein Families
Jaroszewski, Lukasz; Slabinski, Lukasz; Wooley, John; Deacon, Ashley M.; Lesley, Scott A.; Wilson, Ian A.; Godzik, Adam
2010-01-01
As noticed by generations of structural biologists, closely homologous proteins may have substantially different crystallization properties and propensities. These observations can be used to systematically introduce additional dimensionality into crystallization trials by targeting homologous proteins from multiple genomes in a “genome pool” strategy. Through extensive use of our recently introduced “crystallization feasibility score” (Slabinski et al., 2007a), we can explain that the genome pool strategy works well because the crystallization feasibility scores are surprisingly broad within families of homologous proteins, with most families containing a range of optimal to very difficult targets. We also show that some families can be regarded as relatively “easy”, where a significant number of proteins are predicted to have optimal crystallization features, and others are “very difficult”, where almost none are predicted to result in a crystal structure. Thus, such variable distributions of “crystallizability” preferences lead to uneven structural coverage of known families, with “easier” or “optimal” families having several times more solved structures than “very difficult” ones. Nevertheless, this latter category can be successfully targeted by increasing the number of genomes that are used to select targets from a given family. On average, adding 10 new genomes to the “genome pool” provides more promising targets for 7 “very difficult” families. In contrast, our crystallization feasibility score does not indicate that any specific microbial genomes can be readily classified as “easier” or “very difficult” with respect to providing suitable candidates for crystallization and structure determination.
Finally, our analyses show that specific physicochemical properties of the protein sequence favor successful outcomes for structure determination and, hence, the group of proteins with known 3D structures is systematically different from the general pool of known proteins. We, therefore, assess the structural consequences of these differences in protein sequence and protein biophysical properties. PMID:19000818
Organtini, Lindsey J; Shingler, Kristin L; Ashley, Robert E; Capaldi, Elizabeth A; Durrani, Kulsoom; Dryden, Kelly A; Makhov, Alexander M; Conway, James F; Pizzorno, Marie C; Hafenstein, Susan
2017-01-15
The picornavirus-like deformed wing virus (DWV) has been directly linked to colony collapse; however, little is known about the mechanisms of host attachment or entry for DWV or its molecular and structural details. Here we report the three-dimensional (3-D) structures of DWV capsids isolated from infected honey bees, including the immature procapsid, the genome-filled virion, the putative entry intermediate (A-particle), and the empty capsid that remains after genome release. The capsids are decorated by large spikes around the 5-fold vertices. The 5-fold spikes had an open flower-like conformation for the procapsid and genome-filled capsids, whereas the putative A-particle and empty capsids that had released the genome had a closed tube-like spike conformation. Between the two conformations, the spikes undergo a significant hinge-like movement that we predicted using a Robetta model of the structure comprising the spike. We conclude that the spike structures likely serve a function during host entry, changing conformation to release the genome, and that the genome may escape from a 5-fold vertex to initiate infection. Finally, the structures illustrate that, similarly to picornaviruses, DWV forms alternate particle conformations implicated in assembly, host attachment, and RNA release. Honey bees are critical for global agriculture, but dramatic losses of entire hives have been reported in numerous countries since 2006. Deformed wing virus (DWV) and infestation with the ectoparasitic mite Varroa destructor have been linked to colony collapse disorder. DWV was purified from infected adult worker bees to pursue biochemical and structural studies that allowed the first glimpse into the conformational changes that may be required during transmission and genome release for DWV. Copyright © 2017 American Society for Microbiology.
Divergence of Mammalian Higher Order Chromatin Structure Is Associated with Developmental Loci
Chambers, Emily V.; Bickmore, Wendy A.; Semple, Colin A.
2013-01-01
Several recent studies have examined different aspects of mammalian higher order chromatin structure – replication timing, lamina association and Hi-C inter-locus interactions — and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. 
These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution. PMID:23592965
Single cell Hi-C reveals cell-to-cell variability in chromosome structure
Schoenfelder, Stefan; Yaffe, Eitan; Dean, Wendy; Laue, Ernest D.; Tanay, Amos; Fraser, Peter
2013-01-01
Large-scale chromosome structure and spatial nuclear arrangement have been linked to control of gene expression and DNA replication and repair. Genomic techniques based on chromosome conformation capture assess contacts for millions of loci simultaneously, but do so by averaging chromosome conformations from millions of nuclei. Here we introduce single cell Hi-C, combined with genome-wide statistical analysis and structural modeling of single copy X chromosomes, to show that individual chromosomes maintain domain organisation at the megabase scale, but show variable cell-to-cell chromosome territory structures at larger scales. Despite this structural stochasticity, localisation of active gene domains to boundaries of territories is a hallmark of chromosomal conformation. Single cell Hi-C data bridge current gaps between genomics and microscopy studies of chromosomes, demonstrating how modular organisation underlies dynamic chromosome structure, and how this structure is probabilistically linked with genome activity patterns. PMID:24067610
Stewart, H.; Bingham, R.J.; White, S. J.; Dykeman, E. C.; Zothner, C.; Tuplin, A. K.; Stockley, P. G.; Twarock, R.; Harris, M.
2016-01-01
The specific packaging of the hepatitis C virus (HCV) genome is hypothesised to be driven by Core-RNA interactions. To identify the regions of the viral genome involved in this process, we used SELEX (systematic evolution of ligands by exponential enrichment) to identify RNA aptamers which bind specifically to Core in vitro. Comparison of these aptamers to multiple HCV genomes revealed the presence of a conserved terminal loop motif within short RNA stem-loop structures. We postulated that interactions of these motifs, as well as sub-motifs which were present in HCV genomes at statistically significant levels, with the Core protein may drive virion assembly. We mutated 8 of these predicted motifs within the HCV infectious molecular clone JFH-1, thereby producing a range of mutant viruses predicted to possess altered RNA secondary structures. RNA replication and viral titre were unaltered in viruses possessing only one mutated structure. However, infectivity titres were decreased in viruses possessing a higher number of mutated regions. This work thus identified multiple novel RNA motifs which appear to contribute to genome packaging. We suggest that these structures act as cooperative packaging signals to drive specific RNA encapsidation during HCV assembly. PMID:26972799
Qin, Yanhong; Wang, Li; Zhang, Zhenchen; Qiao, Qi; Zhang, Desheng; Tian, Yuting; Wang, Shuang; Wang, Yongjiang; Yan, Zhaoling
2014-01-01
Background: Sweet potato chlorotic stunt virus (SPCSV; family Closteroviridae, genus Crinivirus) features a large bipartite, single-stranded, positive-sense RNA genome. To date, only three complete genomic sequences of SPCSV can be accessed through GenBank. SPCSV was first detected in China in 2011, and only partial genomic sequences have been determined in the country. No report on the complete genomic sequence and genome structure of Chinese SPCSV isolates or the genetic relation between isolates from China and other countries is available. Methodology/Principal Findings: The complete genomic sequences of five isolates from different areas in China were characterized. This study is the first to report the complete genome sequences of SPCSV from whitefly vectors. Genome structure analysis showed that isolates of WA and EA strains from China have the same coding protein as isolates Can181-9 and m2-47, respectively. Twenty cp genes and four RNA1 partial segments were sequenced and analyzed, and the nucleotide identities of complete genomic, cp, and RNA1 partial sequences were determined. Results indicated high conservation among strains and significant differences between WA and EA strains. Genetic analysis demonstrated that, except for isolates from Guangdong Province, SPCSVs from other areas belong to the WA strain. Genome organization analysis showed that the isolates in this study lack the p22 gene. Conclusions/Significance: We presented the complete genome sequences of SPCSV in China. Comparison of nucleotide identities and genome structures between these isolates and previously reported isolates showed slight differences. The nucleotide identities of different SPCSV isolates showed high conservation among strains and significant differences between strains. All nine isolates in this study lacked the p22 gene. WA strains were more extensively distributed than EA strains in China. 
These data provide important insights into the molecular variation and genomic structure of SPCSV in China as well as genetic relationships among isolates from China and other countries. PMID:25170926
Shao, Changwei; Niu, Yongchao; Rastas, Pasi; Liu, Yang; Xie, Zhiyuan; Li, Hengde; Wang, Lei; Jiang, Yong; Tai, Shuaishuai; Tian, Yongsheng; Sakamoto, Takashi; Chen, Songlin
2015-04-01
High-resolution genetic maps are essential for fine mapping of complex traits, genome assembly, and comparative genomic analysis. Single-nucleotide polymorphisms (SNPs) are the primary molecular markers used for genetic map construction. In this study, we identified 13,362 SNPs evenly distributed across the Japanese flounder (Paralichthys olivaceus) genome. Of these SNPs, 12,712 high-confidence SNPs were subjected to high-throughput genotyping and assigned to 24 consensus linkage groups (LGs). The total length of the genetic linkage map was 3,497.29 cM with an average distance of 0.47 cM between loci, thereby representing the densest genetic map currently reported for Japanese flounder. Nine positive quantitative trait loci (QTLs) forming two main clusters for Vibrio anguillarum disease resistance were detected. All QTLs could explain 5.1-8.38% of the total phenotypic variation. Synteny analysis of the QTL regions on the genome assembly revealed 12 immune-related genes, among them 4 genes strongly associated with V. anguillarum disease resistance. In addition, 246 genome assembly scaffolds with an average size of 21.79 Mb were anchored onto the LGs; these scaffolds, comprising 522.99 Mb, represented 95.78% of assembled genomic sequences. The mapped assembly scaffolds in Japanese flounder were used for genome synteny analyses against zebrafish (Danio rerio) and medaka (Oryzias latipes). Flounder and medaka were found to possess almost one-to-one synteny, whereas flounder and zebrafish exhibited a multi-syntenic correspondence. The newly developed high-resolution genetic map, which will facilitate QTL mapping, scaffold assembly, and genome synteny analysis of Japanese flounder, marks a milestone in the ongoing genome project for this species. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Austin, Melissa A
2002-04-01
Recent completion of the draft sequence of the human genome has been greeted with both excitement and skepticism, and the potential of this accomplishment for advancing public health has been tempered by ethical concerns about the protection of human subjects. This commentary explores ethical issues arising in human genome epidemiology by using a case study approach based on the ongoing Japanese American Family Study at the University of Washington in Seattle (1994-2003). Ethical issues encountered in designing the study, collecting the data, and reporting the study results are considered. When developing studies, investigators must consider whether to restrict the study to specific racial or ethnic groups and whether community involvement is appropriate. Once the study design is in place, further ethical issues emerge, including obtaining informed consent for DNA banking and protecting the privacy and confidentiality of family members. Finally, investigators must carefully consider whether to report genotype results to study participants and whether pedigrees illustrating the results of the study will be published. Overall, the promise of genomics for improving public health must be pursued based on the fundamental ethical principles of respect for persons, beneficence, and justice.
Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A
2011-01-01
PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of the various strains and isolates available for 85 different species of prokaryotes, extracted the SSRs showing length variation, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, their length variation across genomes, and the regions harboring PSSRs. The information provided in this database is useful for further research and analysis of SSRs in prokaryotes.
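The SSR definition in this abstract (tandem repeats of 1-6 bp motifs) can be illustrated with a short Python sketch. This is not the PSSRdb extraction pipeline, which the abstract does not describe; the function name find_ssrs and the min_repeats threshold are assumptions chosen for illustration only.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for tandem repeats of short motifs.

    Hypothetical helper: min_repeats is an illustrative threshold, not a
    value taken from PSSRdb.
    """
    seq = seq.upper()
    # Lazy motif capture prefers the shortest repeating unit (e.g. "A", not "AA");
    # the backreference then requires at least min_repeats - 1 further copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

print(find_ssrs("TTACACACACGG"))
```

Running the same function on the corresponding region of two strains and comparing the repeat counts is one simple way to flag a length-polymorphic SSR.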
Cohn, Elizabeth Gross; Husamudeen, Maryam; Larson, Elaine L.; Williams, Janet K.
2016-01-01
Achieving equitable minority representation in genomic biobanking is one of the most difficult challenges faced by researchers today. Capacity building, a framework for research that includes collaborations and ongoing engagement, can be used to help researchers, clinicians and communities better understand the process, utility, and clinical application of genomic science. The purpose of this exploratory descriptive study was to examine factors that influence the decision to participate in genomic research, and identify essential components of capacity building with a community at risk of being under-represented in biobanks. Results of focus groups conducted in Central Harlem with 46 participants were analyzed by a collaborative team of community and academic investigators using content analysis and ATLAS.ti. Key themes identified were: (1) the potential contribution of biobanking to individual and community health, for example the effect of the environment on health, (2) the societal context of the science, such as DNA criminal databases and paternity testing, that may affect the decision to participate, and (3) the researchers’ commitment to community health as an outcome of capacity building. These key factors can contribute to achieving equity in biobank participation, and guide genetic specialists in biobank planning and implementation. PMID:25228357
Ding, Jiun-Yan; Shiu, Jia-Ho; Chen, Wen-Ming; Chiang, Yin-Ru; Tang, Sen-Lin
2016-01-01
The bacterial genus Endozoicomonas has been commonly detected in healthy corals in many studies of coral-associated bacteria over the past decade. Although it is likely to be a core member of the coral microbiota, little is known about its ecological roles. To decipher potential interactions between bacteria and their coral hosts, we sequenced and investigated the first culturable endozoicomonal bacterium from coral, E. montiporae CL-33T. Its genome showed potential signs of ongoing genome erosion and gene exchange with its host. Testosterone degradation and a type III secretion system are commonly present in Endozoicomonas and may play roles in recognizing hosts and delivering effectors to them. Moreover, genes of eukaryotic ephrin ligand B2 are present in its genome; presumably, this bacterium could move into coral cells via endocytosis after binding to the coral's Eph receptors. In addition, 7,8-dihydro-8-oxoguanine triphosphatase and isocitrate lyase are possible type III secretion effectors that might help the coral to prevent mitochondrial dysfunction and promote gluconeogenesis, especially under stress conditions. Based on all these findings, we inferred that E. montiporae is a facultative endosymbiont that can recognize, translocate into, communicate with and modulate its coral host. PMID:27014194
O'Keefe, James H; Cordain, Loren
2004-01-01
Our genetic make-up, shaped through millions of years of evolution, determines our nutritional and activity needs. Although the human genome has remained primarily unchanged since the agricultural revolution 10,000 years ago, our diet and lifestyle have become progressively more divergent from those of our ancient ancestors. Accumulating evidence suggests that this mismatch between our modern diet and lifestyle and our Paleolithic genome is playing a substantial role in the ongoing epidemics of obesity, hypertension, diabetes, and atherosclerotic cardiovascular disease. Until 500 generations ago, all humans consumed only wild and unprocessed food foraged and hunted from their environment. These circumstances provided a diet high in lean protein, polyunsaturated fats (especially omega-3 fatty acids), monounsaturated fats, fiber, vitamins, minerals, antioxidants, and other beneficial phytochemicals. Historical and anthropological studies show hunter-gatherers generally to be healthy, fit, and largely free of the degenerative cardiovascular diseases common in modern societies. This review outlines the essence of our hunter-gatherer genetic legacy and suggests practical steps to re-align our modern milieu with our ancient genome in an effort to improve cardiovascular health.
Genome sequences of two closely related strains of Escherichia coli K-12 GM4792.
Zhang, Yan-Cong; Zhang, Yan; Zhu, Bi-Ru; Zhang, Bo-Wen; Ni, Chuan; Zhang, Da-Yong; Huang, Ying; Pang, Erli; Lin, Kui
2015-01-01
Escherichia coli lab strains K-12 GM4792 Lac(+) and GM4792 Lac(-) carry opposite lactose markers, which are useful for distinguishing evolved lines as they produce different colored colonies. The two closely related strains are chosen as ancestors for our ongoing studies of experimental evolution. Here, we describe the genome sequences, annotation, and features of GM4792 Lac(+) and GM4792 Lac(-). GM4792 Lac(+) has a 4,622,342-bp long chromosome with 4,061 protein-coding genes and 83 RNA genes. Similarly, the genome of GM4792 Lac(-) consists of a 4,621,656-bp chromosome containing 4,043 protein-coding genes and 74 RNA genes. Genome comparison analysis reveals that the differences between GM4792 Lac(+) and GM4792 Lac(-) are minimal and limited to only the targeted lac region. Moreover, a previous study on competitive experimentation indicates the two strains are identical or nearly identical in survivability except for lactose utilization in a nitrogen-limited environment. Therefore, at both a genetic and a phenotypic level, GM4792 Lac(+) and GM4792 Lac(-), with opposite neutral markers, are ideal systems for future experimental evolution studies.
Digestive tumor bank protocol: from surgical specimens to genomic studies of digestive cancers.
Popescu, I; Stroescu, C; Dumitrascu, T; Herlea, V; Paslaru, Liliana; Lazar, V; Boissin, H; Taieb, J; Horeanga, Ionela
2006-01-01
Cancer is a complex polygenic and multifactorial disease, resulting from successive dynamic changes in the genome of somatic cells and from the accumulation of molecular alterations in both tumour cells and host cells. For the majority of cancers, including many malignancies of the gastrointestinal tract, our current means of diagnosis and treatment of the tumors are grossly insufficient. In recent years the development of several gene expression profiling methods such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE) and DNA arrays, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complete cascade of molecular events leading to tumor development and progression. Given the central role played by surgeons in the current management of patients with solid cancers, it is of paramount importance for them to know the principles characterizing these laboratory tools in order to critically assess the results originating from this biotechnology. We describe in this article the scientific partnership between Fundeni Clinical Institute, Bucharest, Romania and RNtech Company, Paris, France for the development of a center of biological resources (Biobank), as well as the standardized protocol for working with the biological samples, the ongoing projects and the future perspectives.
Lau, Cia-Hin; Suh, Yousin
2017-01-01
Adeno-associated virus (AAV) has shown promising therapeutic efficacy with a good safety profile in a wide range of animal models and human clinical trials. With the advent of clustered regularly interspaced short palindromic repeat (CRISPR)-based genome-editing technologies, AAV provides one of the most suitable viral vectors to package, deliver, and express CRISPR components for targeted gene editing. Recent discoveries of smaller Cas9 orthologues have enabled the packaging of Cas9 nuclease and its chimeric guide RNA into a single AAV delivery vehicle for robust in vivo genome editing. Here, we discuss how the combined use of small Cas9 orthologues, tissue-specific minimal promoters, AAV serotypes, and different routes of administration has advanced the development of efficient and precise in vivo genome editing and comprehensively review the various AAV-CRISPR systems that have been effectively used in animals. We then discuss the clinical implications and potential strategies to overcome off-target effects, immunogenicity, and toxicity associated with CRISPR components and AAV delivery vehicles. Finally, we discuss ongoing non-viral-based ex vivo gene therapy clinical trials to underscore the current challenges and future prospects of CRISPR/Cas9 delivery for human therapeutics. PMID:29333255
Informational laws of genome structures
Bonnici, Vincenzo; Manca, Vincenzo
2016-01-01
In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155
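As a rough illustration of the k-mer entropy idea in this abstract, the Python sketch below computes the Shannon entropy of the empirical k-mer distribution of a sequence. It assumes that lg2(n) denotes the base-2 logarithm of the genome length and that k is rounded to the nearest integer; both readings, and the function name kmer_entropy, are assumptions, and the paper's actual informational indexes are not reproduced here.

```python
import math
from collections import Counter

def kmer_entropy(genome, k=None):
    """Return (k, entropy in bits) of the k-mer distribution of `genome`.

    If k is omitted it defaults to round(log2(n)), an assumed reading of
    the abstract's k = lg2(n).
    """
    n = len(genome)
    if k is None:
        k = max(1, round(math.log2(n)))
    # Count every overlapping window of length k.
    counts = Counter(genome[i:i + k] for i in range(n - k + 1))
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return k, entropy

k, h = kmer_entropy("ACGT" * 4, k=1)
print(k, h)
```

Comparing this value with the entropy of a shuffled sequence of the same length is one simple way to approximate the "comparison with random genomes" mentioned in the abstract.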
Liu, Yun-Hua; Zhang, Meiping; Wu, Chengcang; Huang, James J; Zhang, Hong-Bin
2014-01-01
Knowledge of how a genome is structured and organized from its constituent elements is crucial to understanding its biology and evolution. Here, we report the genome structuring and organization pattern as revealed by systems analysis of the sequences of three model species, Arabidopsis, rice and yeast, at the whole-genome and chromosome levels. We found that all fundamental function elements (FFE) constituting the genomes, including genes (GEN), DNA transposable elements (DTE), retrotransposable elements (RTE), simple sequence repeats (SSR), and (or) low complexity repeats (LCR), are structured in a nonrandom and correlative manner, thus leading to a hypothesis that the DNA of the species is structured as a linear "jigsaw puzzle". Furthermore, we showed that different FFE differ in their importance in the formation and evolution of the DNA jigsaw puzzle structure between species. DTE and RTE play more important roles than GEN, LCR, and SSR in Arabidopsis, whereas GEN and RTE play more important roles than LCR, SSR, and DTE in rice. The genes having multiple recognized functions play more important roles than those having single functions. These results provide useful knowledge necessary for better understanding genome biology and evolution of the species and for effective molecular breeding of rice.
USDA-ARS's Scientific Manuscript database
Fast neutron radiation has been used as a mutagen to develop extensive mutant collections. However, the genome-wide structural consequences of fast neutron radiation are not well understood. Here, we examine the genome-wide structural variants observed among 264 soybean (Glycine max (L.) Merrill) pl...
USDA-ARS's Scientific Manuscript database
Comparative genetic mapping between clementine, pummelo and sweet orange and the interspecific structure of the Clementine genome. The availability of a saturated genetic map of Clementine was identified by the International Citrus Genome Consortium as an essential prerequisite to assist the assembly...
G-quadruplexes Significantly Stimulate Pif1 Helicase-catalyzed Duplex DNA Unwinding
Duan, Xiao-Lei; Liu, Na-Nv; Yang, Yan-Tao; Li, Hai-Hong; Li, Ming; Dou, Shuo-Xing; Xi, Xu-Guang
2015-01-01
The evolutionarily conserved G-quadruplexes (G4s) are faithfully inherited and serve a variety of cellular functions such as telomere maintenance, gene regulation, DNA replication initiation, and epigenetic regulation. Different from the Watson-Crick base-pairing found in duplex DNA, G4s are formed via Hoogsteen base pairing and are very stable and compact DNA structures. Failure of untangling them in the cell impedes DNA-based transactions and leads to genome instability. Cells have evolved highly specific helicases to resolve G4 structures. We used a recombinant nuclear form of Saccharomyces cerevisiae Pif1 to characterize Pif1-mediated DNA unwinding with a substrate mimicking an ongoing lagging strand synthesis stalled by G4s, which resembles a replication origin and a G4-structured flap in Okazaki fragment maturation. We find that the presence of G4 may greatly stimulate the Pif1 helicase to unwind duplex DNA. Further studies reveal that this stimulation results from G4-enhanced Pif1 dimerization, which is required for duplex DNA unwinding. This finding provides new insights into the properties and functions of G4s. We discuss the observed activation phenomenon in relation to the possible regulatory role of G4s in the rapid rescue of the stalled lagging strand synthesis by helping the replicator recognize and activate the replication origin as well as by quickly removing the G4-structured flap during Okazaki fragment maturation. PMID:25627683
No genome barriers to promiscuous DNA
NASA Astrophysics Data System (ADS)
Lewin, R.
1984-06-01
Farrelly and Butow (1983) used the term 'promiscuous DNA' in their report of the apparent natural transfer of yeast mitochondrial DNA sequences into the nuclear genome. Ellis (1982) applied the same term in an editorial comment. It is pointed out that since that time the subject of DNA's promiscuity has exploded with a series of reports. According to a report by Stern (1984), movement of DNA sequences between chloroplasts and mitochondria is not just a rare event but is a rampant process. It was recently concluded that 'the widespread presence of ctDNA sequences in plant mtDNA is best regarded as a dramatic demonstration of the dynamic nature of interactions between the chloroplast and the mitochondrion, similar to the ongoing process of interorganellar DNA transfer already documented between mitochondrion and nucleus and between chloroplast and nucleus'.
Experimental evolution and the dynamics of adaptation and genome evolution in microbial populations.
Lenski, Richard E
2017-10-01
Evolution is an on-going process, and it can be studied experimentally in organisms with rapid generations. My team has maintained 12 populations of Escherichia coli in a simple laboratory environment for >25 years and 60 000 generations. We have quantified the dynamics of adaptation by natural selection, seen some of the populations diverge into stably coexisting ecotypes, described changes in the bacteria's mutation rate, observed the new ability to exploit a previously untapped carbon source, characterized the dynamics of genome evolution and used parallel evolution to identify the genetic targets of selection. I discuss what the future might hold for this particular experiment, briefly highlight some other microbial evolution experiments and suggest how the fields of experimental evolution and microbial ecology might intersect going forward.
BPP: a sequence-based algorithm for branch point prediction.
Zhang, Qing; Fan, Xiaodan; Wang, Yejun; Sun, Ming-An; Shao, Jianlin; Guo, Dianjing
2017-10-15
Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can only detect a small fraction of the branch points subject to the sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. We here propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. Contact: djguo@cuhk.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Damoiseaux, Robert
2014-05-01
The Molecular Screening Shared Resource (MSSR) offers a comprehensive range of leading-edge high throughput screening (HTS) services including drug discovery, chemical and functional genomics, and novel methods for nano and environmental toxicology. The MSSR is an open access environment with investigators from UCLA as well as from the entire globe. Industrial clients are equally welcome as are non-profit entities. The MSSR is a fee-for-service entity and does not retain intellectual property. In conjunction with the Center for Environmental Implications of Nanotechnology, the MSSR is unique in its dedicated and ongoing efforts towards high throughput toxicity testing of nanomaterials. In addition, the MSSR engages in technology development eliminating bottlenecks from the HTS workflow and enabling novel assays and readouts currently not available.
Yamaguchi, Akihiro; Go, Mitiko
2006-01-01
We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers the whole of an ORF product are rare. When the portion of an ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of modeled 3D structures of soluble proteins has increased by 5% for archaebacteria and by 7% for eubacteria in the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes without the putative disordered regions will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting spatial arrangements of structural domains in an ORF will then be a coming issue of structural genomics. PMID:17146617
Musunuru, Kiran; Arora, Pankaj; Cooke, John P; Ferguson, Jane F; Hershberger, Ray E; Hickey, Kathleen T; Lee, Jin-Moo; Lima, João A C; Loscalzo, Joseph; Pereira, Naveen L; Russell, Mark W; Shah, Svati H; Sheikh, Farah; Wang, Thomas J; MacRae, Calum A
2018-06-01
The completion of the Human Genome Project has unleashed a wealth of human genomics information, but it remains unclear how best to implement this information for the benefit of patients. The standard approach of biomedical research, with researchers pursuing advances in knowledge in the laboratory and, separately, clinicians translating research findings into the clinic as much as decades later, will need to give way to new interdisciplinary models for research in genomic medicine. These models should include scientists and clinicians actively working as teams to study patients and populations recruited in clinical settings and communities to make genomics discoveries (through the combined efforts of data scientists, clinical researchers, epidemiologists, and basic scientists) and to rapidly apply these discoveries in the clinic for the prediction, prevention, diagnosis, prognosis, and treatment of cardiovascular diseases and stroke. The highly publicized US Precision Medicine Initiative, also known as All of Us, is a large-scale program funded by the US National Institutes of Health that will energize these efforts, but several ongoing studies such as the UK Biobank Initiative; the Million Veteran Program; the Electronic Medical Records and Genomics Network; the Kaiser Permanente Research Program on Genes, Environment and Health; and the DiscovEHR collaboration are already providing exemplary models of this kind of interdisciplinary work. In this statement, we outline the opportunities and challenges in broadly implementing new interdisciplinary models in academic medical centers and community settings and bringing the promise of genomics to fruition. © 2018 American Heart Association, Inc.
Adaptation to Low Salinity Promotes Genomic Divergence in Atlantic Cod (Gadus morhua L.)
Berg, Paul R.; Jentoft, Sissel; Star, Bastiaan; Ring, Kristoffer H.; Knutsen, Halvor; Lien, Sigbjørn; Jakobsen, Kjetill S.; André, Carl
2015-01-01
How genomic selection enables species to adapt to divergent environments is a fundamental question in ecology and evolution. We investigated the genomic signatures of local adaptation in Atlantic cod (Gadus morhua L.) along a natural salinity gradient, ranging from 35‰ in the North Sea to 7‰ within the Baltic Sea. By utilizing a 12 K SNPchip, we simultaneously assessed neutral and adaptive genetic divergence across the Atlantic cod genome. Combining outlier analyses with a landscape genomic approach, we identified a set of directionally selected loci that are strongly correlated with habitat differences in salinity, oxygen, and temperature. Our results show that discrete regions within the Atlantic cod genome are subject to directional selection and associated with adaptation to the local environmental conditions in the Baltic- and the North Sea, indicating divergence hitchhiking and the presence of genomic islands of divergence. We report a suite of outlier single nucleotide polymorphisms within or closely located to genes associated with osmoregulation, as well as genes known to play important roles in the hydration and development of oocytes. These genes are likely to have key functions within a general osmoregulatory framework and are important for the survival of eggs and larvae, contributing to the buildup of reproductive isolation between the low-salinity adapted Baltic cod and the adjacent cod populations. Hence, our data suggest that adaptive responses to the environmental conditions in the Baltic Sea may contribute to a strong and effective reproductive barrier, and that Baltic cod can be viewed as an example of ongoing speciation. PMID:25994933
Pathgroups, a dynamic data structure for genome reconstruction problems.
Zheng, Chunfang
2010-07-01
Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html. Contact: chunfang313@gmail.com. Supplementary data are available at Bioinformatics online.
Hamperl, Stephan; Cimprich, Karlene A.
2014-01-01
Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability are not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. PMID:24746923
2012-01-01
Background Amazona vittata is a critically endangered Puerto Rican endemic bird, the only surviving native parrot species in the United States territory, and the first parrot in the large Neotropical genus Amazona, to be studied on a genomic scale. Findings In a unique community-based funded project, DNA from an A. vittata female was sequenced using a HiSeq Illumina platform, resulting in a total of ~42.5 billion nucleotide bases. This provided approximately 26.89x average coverage depth at the completion of this funding phase. Filtering followed by assembly resulted in 259,423 contigs (N50 = 6,983 bp, longest = 75,003 bp), which was further scaffolded into 148,255 fragments (N50 = 19,470, longest = 206,462 bp). This provided ~76% coverage of the genome based on an estimated size of 1.58 Gb. The assembled scaffolds allowed basic genomic annotation and comparative analyses with other available avian whole-genome sequences. Conclusions The current data represents the first genomic information from and work carried out with a unique source of funding. This analysis further provides a means for directed training of young researchers in genetic and bioinformatics analyses and will facilitate progress towards a full assembly and annotation of the Puerto Rican parrot genome. It also adds extensive genomic data to a new branch of the avian tree, making it useful for comparative analyses with other avian species. Ultimately, the knowledge acquired from these data will contribute to an improved understanding of the overall population health of this species and aid in ongoing and future conservation efforts. PMID:23587420
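The coverage and N50 figures quoted in this abstract follow from two standard calculations, sketched below in Python. The function names are illustrative assumptions; only the input numbers (about 42.5 billion sequenced bases and an estimated 1.58 Gb genome) come from the abstract, and the contig lengths in the second call are made-up toy values.

```python
def mean_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size

def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50 requires at least one sequence length")

# Figures from the abstract: ~42.5 billion bases over an estimated 1.58 Gb genome.
print(round(mean_coverage(42.5e9, 1.58e9), 1))
# Toy contig lengths, invented purely to show how N50 is read off.
print(n50([2, 3, 4, 5, 6]))
```

The first value rounds to roughly 26.9x, in line with the depth reported in the abstract.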
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Gary; Detter, John C; Bruce, David C
We present here the complete 2.4 Mb genome of the actinobacterial thermophile, Acidothermus cellulolyticus 11B, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNA than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudogenes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.
Yang, Zhihong; Lin, James; Zhang, John; Fong, Wai Ieng; Li, Pei; Zhao, Rong; Liu, Xiaohong; Podevin, William; Kuang, Yanping; Liu, Jiaen
2015-06-23
Recent advances in next-generation sequencing (NGS) have provided new methods for preimplantation genetic screening (PGS) of human embryos from in vitro fertilization (IVF) cycles. However, there is still limited information about clinical applications of NGS in IVF and PGS (IVF-PGS) treatments. The present study aimed to investigate the effects of NGS screening on clinical pregnancy and implantation outcomes for PGS patients in comparison to array comparative genomic hybridization (aCGH) screening. This study was performed in two phases. Phase I study evaluated the accuracy of NGS for aneuploidy screening in comparison to aCGH. Whole-genome amplification (WGA) products (n = 164) derived from previous IVF-PGS cycles (n = 38) were retrospectively analyzed with NGS. The NGS results were then compared with those of aCGH. Phase II study further compared clinical pregnancy and implantation outcomes between NGS and aCGH for IVF-PGS patients. A total of 172 patients at mean age 35.2 ± 3.5 years were randomized into two groups: 1) NGS (Group A): patients (n = 86) had embryos screened with NGS and 2) aCGH (Group B): patients (n = 86) had embryos screened with aCGH. For both groups, blastocysts were vitrified after trophectoderm biopsy. One to two euploid blastocysts were thawed and transferred to individual patients primarily based on the PGS results. Ongoing pregnancy and implantation rates were compared between the two study groups. NGS detected all types of aneuploidies of human blastocysts accurately and provided a 100 % 24-chromosome diagnosis consistency with the highly validated aCGH method. Moreover, NGS screening identified euploid blastocysts for transfer and resulted in similarly high ongoing pregnancy rates for PGS patients compared to aCGH screening (74.7 % vs. 69.2 %, respectively, p >0.05). The observed implantation rates were also comparable between the NGS and aCGH groups (70.5 % vs. 66.2 %, respectively, p >0.05). 
While NGS screening has recently been introduced to assist IVF patients, this is the first randomized clinical study on the efficiency of NGS for preimplantation genetic screening in comparison to aCGH. With the observed high accuracy of 24-chromosome diagnosis and the resulting high ongoing pregnancy and implantation rates, NGS has proven to be an efficient, robust high-throughput technology for PGS.
Protein family clustering for structural genomics.
Yan, Yongpan; Moult, John
2005-10-28
A major goal of structural genomics is the provision of a structural template for a large fraction of protein domains. The magnitude of this task depends on the number and nature of protein sequence families. With a large number of bacterial genomes now fully sequenced, it is possible to obtain improved estimates of the number and diversity of families in that kingdom. We have used an automated clustering procedure to group all sequences in a set of genomes into protein families. Bench-marking shows the clustering method is sensitive at detecting remote family members, and has a low level of false positives. This comprehensive protein family set has been used to address the following questions. (1) What is the structure coverage for currently known families? (2) How will the number of known apparent families grow as more genomes are sequenced? (3) What is a practical strategy for maximizing structure coverage in future? Our study indicates that approximately 20% of known families with three or more members currently have a representative structure. The study indicates also that the number of apparent protein families will be considerably larger than previously thought: We estimate that, by the criteria of this work, there will be about 250,000 protein families when 1000 microbial genomes have been sequenced. However, the vast majority of these families will be small, and it will be possible to obtain structural templates for 70-80% of protein domains with an achievable number of representative structures, by systematically sampling the larger families.
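The closing point of this abstract, that sampling the larger families first covers most domains with relatively few structures, can be illustrated with a small greedy calculation. This is a minimal sketch under stated assumptions: the function name and the family sizes are hypothetical, each family is assumed to need exactly one representative structure, and none of this reproduces the authors' clustering procedure.

```python
def families_needed(family_sizes, target_fraction):
    """Count how many families must be solved, largest first,
    so that the covered members reach target_fraction of all members."""
    ordered = sorted(family_sizes, reverse=True)
    total = sum(ordered)
    covered = 0
    for count, size in enumerate(ordered, start=1):
        covered += size
        if covered >= target_fraction * total:
            return count
    return len(ordered)


# Hypothetical, highly skewed family-size distribution (100 members in total).
toy_sizes = [50, 30, 10, 5, 3, 1, 1]
needed = families_needed(toy_sizes, 0.75)
print(f"{needed} of {len(toy_sizes)} families cover at least 75% of members")
```

With a skewed size distribution like the toy one, a small number of large families accounts for most members, which is the intuition behind targeting large families first.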
Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne
2015-02-10
Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
Nandi, Tannistha; Holden, Matthew T.G.; Didelot, Xavier; Mehershahi, Kurosh; Boddey, Justin A.; Beacham, Ifor; Peak, Ian; Harting, John; Baybayan, Primo; Guo, Yan; Wang, Susana; How, Lee Chee; Sim, Bernice; Essex-Lopresti, Angela; Sarkar-Tyson, Mitali; Nelson, Michelle; Smither, Sophie; Ong, Catherine; Aw, Lay Tin; Hoon, Chua Hui; Michell, Stephen; Studholme, David J.; Titball, Richard; Chen, Swaine L.; Parkhill, Julian
2015-01-01
Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. PMID:25236617
Mottawea, Walid; Duceppe, Marc-Olivier; Dupras, Andrée A; Usongo, Valentine; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Hamel, Jeremie; Kukavica-Ibrulj, Irena; Boyle, Brian; Gill, Alexander; Burnett, Elton; Franz, Eelco; Arya, Gitanjali; Weadge, Joel T; Gruenheid, Samantha; Wiedmann, Martin; Huang, Hongsheng; Daigle, France; Moineau, Sylvain; Bekal, Sadjia; Levesque, Roger C; Goodridge, Lawrence D; Ogunremi, Dele
2018-01-01
Non-typhoidal Salmonella is a leading cause of foodborne illness worldwide. Prompt and accurate identification of the sources of Salmonella responsible for disease outbreaks is crucial to minimize infections and eliminate ongoing sources of contamination. Current subtyping tools including single nucleotide polymorphism (SNP) typing may be inadequate, in some instances, to provide the required discrimination among epidemiologically unrelated Salmonella strains. Prophage genes represent the majority of the accessory genes in bacterial genomes and have potential to be used as high discrimination markers in Salmonella. In this study, the prophage sequence diversity in different Salmonella serovars and genetically related strains was investigated. Using whole genome sequences of 1,760 isolates of S. enterica representing 151 Salmonella serovars and 66 closely related bacteria, prophage sequences were identified from assembled contigs using PHASTER. We detected 154 different prophages in S. enterica genomes. Prophage sequences were highly variable among S. enterica serovars with a median ± interquartile range (IQR) of 5 ± 3 prophage regions per genome. While some prophage sequences were highly conserved among the strains of specific serovars, few regions were lineage specific. Therefore, strains belonging to each serovar could be clustered separately based on their prophage content. Analysis of S. Enteritidis isolates from seven outbreaks generated distinct prophage profiles for each outbreak. Taken together, the diversity of the prophage sequences correlates with genome diversity. Prophage repertoires provide an additional marker for differentiating S. enterica subtypes during foodborne outbreaks.
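One common way to compare isolates by prophage content is to treat each isolate as a set of detected prophages and compute a pairwise Jaccard distance. The sketch below assumes that representation; the isolate names and prophage labels are hypothetical, and the abstract does not state which distance or clustering method the authors used.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty profiles are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(profiles: dict) -> dict:
    """Jaccard distance for every unordered pair of isolates."""
    names = sorted(profiles)
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical presence/absence profiles (not data from the study).
profiles = {
    "isolate_A": {"phage_1", "phage_2", "phage_3"},
    "isolate_B": {"phage_1", "phage_2", "phage_3"},
    "isolate_C": {"phage_2", "phage_4"},
}
distances = pairwise_distances(profiles)
for pair, dist in distances.items():
    print(pair, round(dist, 2))
```

Isolates with identical profiles sit at distance 0, so a distance matrix of this kind could feed any standard clustering method to group isolates by outbreak or serovar.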
Genome-wide introgression among distantly related Heliconius butterfly species.
Zhang, Wei; Dasmahapatra, Kanchon K; Mallet, James; Moreira, Gilson R P; Kronforst, Marcus R
2016-02-27
Although hybridization is thought to be relatively rare in animals, the raw genetic material introduced via introgression may play an important role in fueling adaptation and adaptive radiation. The butterfly genus Heliconius is an excellent system to study hybridization and introgression but most studies have focused on closely related species such as H. cydno and H. melpomene. Here we characterize genome-wide patterns of introgression between H. besckei, the only species with a red and yellow banded 'postman' wing pattern in the tiger-striped silvaniform clade, and co-mimetic H. melpomene nanna. We find a pronounced signature of putative introgression from H. melpomene into H. besckei in the genomic region upstream of the gene optix, known to control red wing patterning, suggesting adaptive introgression of wing pattern mimicry between these two distantly related species. At least 39 additional genomic regions show signals of introgression as strong or stronger than this mimicry locus. Gene flow has been on-going, with evidence of gene exchange at multiple time points, and bidirectional, moving from the melpomene to the silvaniform clade and vice versa. The history of gene exchange has also been complex, with contributions from multiple silvaniform species in addition to H. besckei. We also detect a signature of ancient introgression of the entire Z chromosome between the silvaniform and melpomene/cydno clades. Our study provides a genome-wide portrait of introgression between distantly related butterfly species. We further propose a comprehensive and efficient workflow for gene flow identification in genomic data sets.
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.
Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción
2016-02-27
In the era of high-throughput DNA sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies.
A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Chemical biology on the genome.
Balasubramanian, Shankar
2014-08-15
In this article I discuss studies towards understanding the structure and function of DNA in the context of genomes from the perspective of a chemist. The first area I describe concerns the studies that led to the invention and subsequent development of a method for sequencing DNA on a genome scale at high speed and low cost, now known as Solexa/Illumina sequencing. The second theme will feature the four-stranded DNA structure known as a G-quadruplex with a focus on its fundamental properties, its presence in cellular genomic DNA and the prospects for targeting such a structure in cells with small molecules. The final topic for discussion is naturally occurring chemically modified DNA bases with an emphasis on chemistry for decoding (or sequencing) such modifications in genomic DNA. The genome is a fruitful topic to be further elucidated by the creation and application of chemical approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.
A sequence-based survey of the complex structural organization of tumor genomes
Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav
2008-04-03
The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.
MIPSPlantsDB—plant database resource for integrative and comparative plant genome research
Spannagl, Manuel; Noubibou, Octave; Haase, Dirk; Yang, Li; Gundlach, Heidrun; Hindemitt, Tobias; Klee, Kathrin; Haberer, Georg; Schoof, Heiko; Mayer, Klaus F. X.
2007-01-01
Genome-oriented plant research delivers a rapidly increasing amount of plant genome data. Comprehensive and structured information resources are required to structure and communicate genome and associated analytical data for model organisms as well as for crops. The increase in available plant genomic data enables powerful comparative analysis and integrative approaches. PlantsDB aims to provide data and information resources for individual plant species and in addition to build a platform for integrative and comparative plant genome research. PlantsDB is constituted from genome databases for Arabidopsis, Medicago, Lotus, rice, maize and tomato. Complementary data resources for cis elements, repetitive elements and extensive cross-species comparisons are implemented. The PlantsDB portal can be reached at . PMID:17202173
2017-10-01
properties associated with active genes; 5) take advantage of ongoing clinical trials in which patients with myelodysplastic syndrome are being treated...and Molecular Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA 3. Dept. of Radiation Oncology & Molecular Radiation ...these impart durable changes in both genome-wide DNA methylation and transcriptome, while avoiding acute cytotoxicity (Tsai et al., 2012). 5 Most
Clonal Evaluation of Prostate Cancer by ERG/SPINK1 Status to Improve Prognosis Prediction
2015-10-01
waiting for case numbers to increase. Significant changes in use or care of human subjects, vertebrate animals, biohazards, and/or select agents...IHC at UM is no longer with the University. He has been replaced by Connie Brenke, who is now performing the staining which is ongoing (as shown in...and genomics. Institutional partners include the FHCRC and the UW Institute of Stem Cell Sciences. Core A: Tissues/Sera/Models The major function
YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.
Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh
2015-01-16
Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include a JavaScript-based genome browser (JBrowse) and the Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing a phylogenetic tree of Yersinia.
We ran analyses based on the tools and genomic data in YersiniaBase and the preliminary results showed differences in virulence genes found in Yersinia pestis and Yersinia pseudotuberculosis compared to other Yersinia species, and differences between Yersinia enterocolitica subsp. enterocolitica and Yersinia enterocolitica subsp. palearctica. YersiniaBase offers free access to a wide range of genomic data and analysis tools for the analysis of Yersinia. YersiniaBase can be accessed at http://yersinia.um.edu.my.
Sabatini, Linda M; Mathews, Charles; Ptak, Devon; Doshi, Shivang; Tynan, Katherine; Hegde, Madhuri R; Burke, Tara L; Bossler, Aaron D
2016-05-01
The increasing use of advanced nucleic acid sequencing technologies for clinical diagnostics and therapeutics has made it vital to understand the costs of performing these procedures and their value to patients, providers, and payers. The Association for Molecular Pathology invested in a cost and value analysis of specific genomic sequencing procedures (GSPs) newly coded by the American Medical Association Current Procedural Terminology Editorial Panel. Cost data and work effort, including the development and use of data analysis pipelines, were gathered from representative laboratories currently performing these GSPs. Results were aggregated to generate representative cost ranges given the complexity and variability of performing the tests. Cost-impact models for three clinical scenarios were generated with assistance from key opinion leaders: impact of using a targeted gene panel in optimizing care for patients with advanced non-small-cell lung cancer, use of a targeted gene panel in the diagnosis and management of patients with sensorineural hearing loss, and exome sequencing in the diagnosis and management of children with neurodevelopmental disorders of unknown genetic etiology. Each model demonstrated value by either reducing health care costs or identifying appropriate care pathways. The templates generated will aid laboratories in assessing their individual costs, considering the value structure in their own patient populations, and contributing their data to the ongoing dialogue regarding the impact of GSPs on improving patient care. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Mahillon, Jacques; Chandler, Michael
1998-01-01
Insertion sequences (ISs) constitute an important component of most bacterial genomes. Over 500 individual ISs have been described in the literature to date, and many more are being discovered in the ongoing prokaryotic and eukaryotic genome-sequencing projects. The last 10 years have also seen some striking advances in our understanding of the transposition process itself. Not least of these has been the development of various in vitro transposition systems for both prokaryotic and eukaryotic elements and, for several of these, a detailed understanding of the transposition process at the chemical level. This review presents a general overview of the organization and function of insertion sequences of eubacterial, archaebacterial, and eukaryotic origins with particular emphasis on bacterial elements and on different aspects of the transposition mechanism. It also attempts to provide a framework for classification of these elements by assigning them to various families or groups. A total of 443 members of the collection have been grouped in 17 families based on combinations of the following criteria: (i) similarities in genetic organization (arrangement of open reading frames); (ii) marked identities or similarities in the enzymes which mediate the transposition reactions, the recombinases/transposases (Tpases); (iii) similar features of their ends (terminal IRs); and (iv) fate of the nucleotide sequence of their target sites (generation of a direct target duplication of determined length). A brief description of the mechanism(s) involved in the mobility of individual ISs in each family and of the structure-function relationships of the individual Tpases is included where available. PMID:9729608
Casas-Marce, Mireia; Marmesat, Elena; Soriano, Laura; Martínez-Cruz, Begoña; Lucena-Perez, Maria; Nocete, Francisco; Rodríguez-Hidalgo, Antonio; Canals, Antoni; Nadal, Jordi; Detry, Cleia; Bernáldez-Sánchez, Eloísa; Fernández-Rodríguez, Carlos; Pérez-Ripoll, Manuel; Stiller, Mathias; Hofreiter, Michael; Rodríguez, Alejandro; Revilla, Eloy; Delibes, Miguel; Godoy, José A.
2017-01-01
There is a tendency to assume that endangered species have been both genetically and demographically healthier in the past, so that any genetic erosion observed today was caused by their recent decline. The Iberian lynx (Lynx pardinus) suffered a dramatic and continuous decline during the 20th century, and now shows extremely low genome- and species-wide genetic diversity among other signs of genomic erosion. We analyze ancient (N = 10), historical (N = 245), and contemporary (N = 172) samples with microsatellite and mitogenome data to reconstruct the species' demography and investigate patterns of genetic variation across space and time. Iberian lynx populations transitioned from low but significantly higher genetic diversity than today and shallow geographical differentiation millennia ago, through a structured metapopulation with varying levels of diversity during the last centuries, to two extremely genetically depauperate and differentiated remnant populations by 2002. The historical subpopulations show varying extents of genetic drift in relation to their recent size and time in isolation, but these do not predict whether the populations persisted or eventually went extinct. In conclusion, current genetic patterns were mainly shaped by genetic drift, supporting the current admixture of the two genetic pools and calling for a comprehensive genetic management of the ongoing conservation program. This study illustrates how a retrospective analysis of demographic and genetic patterns of endangered species can shed light onto their evolutionary history and this, in turn, can inform conservation actions. PMID:28962023
RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”
Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu
2012-01-01
Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using the RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome, of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon structures (730 genes in total) in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic regions unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
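The "disproportionate" share of sRNAs in strain-specific regions can be made concrete with a short calculation. The sketch below is an illustration, not the authors' analysis: it assumes ∼30% of 94 sRNAs corresponds to 28 sRNAs, and it assumes a null model in which each sRNA falls in the strain-specific regions with probability 0.18, the fraction of the genome they occupy. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_srnas = 94            # sRNAs reported in the abstract
unique_fraction = 0.18  # share of the genome unique to strain 2336 (approximate)
observed = 28           # assumed count: about 30% of 94, rounded

# Fold enrichment of sRNAs in the strain-specific regions.
fold_enrichment = 0.30 / unique_fraction
# Chance of seeing at least `observed` sRNAs there under the uniform null model.
tail = binomial_upper_tail(observed, n_srnas, unique_fraction)

print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"P(X >= {observed}) under the null: {tail:.4f}")
```

Because both percentages in the abstract are approximate, the resulting probability should be read as an order-of-magnitude indication only.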
Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-01-01
Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the three-dimensional Zernike descriptors. These compact representations allow fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259
Real-time ligand binding pocket database search using local surface descriptors.
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-07-01
Because of the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the three-dimensional Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed.
Functional RNA structures throughout the Hepatitis C Virus genome.
Adams, Rebecca L; Pirakitikulr, Nathan; Pyle, Anna Marie
2017-06-01
The single-stranded Hepatitis C Virus (HCV) genome adopts a set of elaborate RNA structures that are involved in every stage of the viral lifecycle. Recent advances in chemical probing, sequencing, and structural biology have facilitated analysis of RNA folding on a genome-wide scale, revealing novel structures and networks of interactions. These studies have underscored the active role played by RNA in every function of HCV and they open the door to new types of RNA-targeted therapeutics. Copyright © 2017 Elsevier B.V. All rights reserved.
2012-01-01
Background: Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results: We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. Phylogenetic analyses of all 13 mitochondrial protein-coding gene sequences consistently yield trees that place pseudoscorpions as sister to acariform mites. Conclusion: The well-supported phylogenetic placement of pseudoscorpions as sister to Acariformes differs from some previous analyses based on morphology. However, these two lineages share multiple molecular evolutionary traits, including substantial mitochondrial genome rearrangements, extensive nucleotide substitution, and loss of helices in their inferred tRNA and rRNA structures. PMID:22409411
Prakash, Hariprasath; Rudramurthy, Shivaprakash Mandya; Gandham, Prasad S; Ghosh, Anup Kumar; Kumar, Milner M; Badapanda, Chandan; Chakrabarti, Arunaloke
2017-09-18
Apophysomyces species are prevalent in tropical countries and A. variabilis is the second most frequent agent causing mucormycosis in India. Among Apophysomyces species, A. elegans, A. trapeziformis and A. variabilis are commonly incriminated in human infections. The genome sequences of A. elegans and A. trapeziformis are available in public databases, but that of A. variabilis is not. We therefore sequenced the whole genome of A. variabilis to explore its genomic structure and possible genes determining the virulence of the organism. The whole genome of A. variabilis NCCPF 102052 was sequenced and its genomic structure was compared with the already available genome structures of A. elegans, A. trapeziformis and other medically important Mucorales. The total size of the genome assembly of A. variabilis was 39.38 Mb with 12,764 protein-coding genes. The transposable element (TE) content was low in the Apophysomyces genomes and the retrotransposon Ty3-gypsy was the common TE. Phylogenetically, Apophysomyces species were grouped closely with Phycomyces blakesleeanus. OrthoMCL analysis revealed 3025 orthologous proteins that were common to the three pathogenic Apophysomyces species. Expansion/duplication of multiple gene families was observed in Apophysomyces genomes. Approximately 6% of Apophysomyces genes were predicted to be associated with virulence by PHIbase analysis. The virulence determinants included the protein families of CotH proteins (invasins), proteases, iron utilisation pathways, siderophores and signal transduction pathways. Serine proteases were the major group of proteases found in all Apophysomyces genomes. The carbohydrate active enzymes (CAZymes) constitute the majority of the secretory proteins. The present study is the first attempt to sequence and analyze the genomic structure of A. variabilis. Together with the available genome sequences of A. elegans and A. trapeziformis, the study helped to indicate the possible virulence determinants of pathogenic Apophysomyces species. The presence of unique CAZymes in the cell wall might be exploited in future for antifungal drug development.
Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C.; Fiser, Andras
2014-01-01
The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins—including proteins for which reliable homology models can be generated—on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long. PMID:24567391
Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C; Fiser, Andras
2014-03-11
The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins--including proteins for which reliable homology models can be generated--on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long.
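The projection in this abstract can be checked with back-of-the-envelope arithmetic. The sketch below assumes a simple linear extrapolation of residue-level coverage (the authors' actual trend model is not given in the abstract) and assumes that removing structural genomics would remove its ∼50% share of the yearly coverage gain. The function name is illustrative.

```python
def years_to_target(current_pct: float, target_pct: float, rate_pct_per_year: float) -> float:
    """Years needed to reach target coverage at a constant yearly gain."""
    return (target_pct - current_pct) / rate_pct_per_year

# Figures quoted in the abstract.
coverage_then, coverage_now, span_years = 30.0, 40.0, 10.0
target_coverage = 55.0
sg_share_of_new_coverage = 0.5  # "~50% of this new structural coverage"

# Assumption: coverage keeps growing linearly at the historical rate.
rate_total = (coverage_now - coverage_then) / span_years       # percentage points per year
rate_without_sg = rate_total * (1.0 - sg_share_of_new_coverage)

with_sg = years_to_target(coverage_now, target_coverage, rate_total)
without_sg = years_to_target(coverage_now, target_coverage, rate_without_sg)

print(f"with structural genomics:    {with_sg:.0f} y")
print(f"without structural genomics: {without_sg:.0f} y")
```

Under these assumptions the linear estimate reproduces the abstract's figures: about 15 y with structural genomics and about twice as long without.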
Scoping the polymer genome: A roadmap for rational polymer dielectrics design and beyond
Mannodi-Kanakkithodi, Arun; Chandrasekaran, Anand; Kim, Chiho; ...
2017-12-19
The Materials Genome Initiative (MGI) has heralded a sea change in the philosophy of materials design. In an increasing number of applications, the successful deployment of novel materials has benefited from the use of computational methodologies, data descriptors, and machine learning. Polymers have long suffered from a lack of data on electronic, mechanical, and dielectric properties across large chemical spaces, causing a stagnation in the set of suitable candidates for various applications. Extensive efforts over the last few years have seen the fruitful application of MGI principles toward the accelerated discovery of attractive polymer dielectrics for capacitive energy storage. Here, we review these efforts, highlighting the importance of computational data generation and screening, targeted synthesis and characterization, polymer fingerprinting and machine-learning prediction models, and the creation of an online knowledgebase to guide ongoing and future polymer discovery and design. We lay special emphasis on the fingerprinting of polymers in terms of their genome or constituent atomic and molecular fragments, an idea that pays homage to the pioneers of the human genome project who identified the basic building blocks of the human DNA. As a result, by scoping the polymer genome, we present an essential roadmap for the design of polymer dielectrics, and provide future perspectives and directions for expansions to other polymer subclasses and properties.
Scoping the polymer genome: A roadmap for rational polymer dielectrics design and beyond
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mannodi-Kanakkithodi, Arun; Chandrasekaran, Anand; Kim, Chiho
The Materials Genome Initiative (MGI) has heralded a sea change in the philosophy of materials design. In an increasing number of applications, the successful deployment of novel materials has benefited from the use of computational methodologies, data descriptors, and machine learning. Polymers have long suffered from a lack of data on electronic, mechanical, and dielectric properties across large chemical spaces, causing a stagnation in the set of suitable candidates for various applications. Extensive efforts over the last few years have seen the fruitful application of MGI principles toward the accelerated discovery of attractive polymer dielectrics for capacitive energy storage. Here, we review these efforts, highlighting the importance of computational data generation and screening, targeted synthesis and characterization, polymer fingerprinting and machine-learning prediction models, and the creation of an online knowledgebase to guide ongoing and future polymer discovery and design. We lay special emphasis on the fingerprinting of polymers in terms of their genome or constituent atomic and molecular fragments, an idea that pays homage to the pioneers of the human genome project who identified the basic building blocks of the human DNA. As a result, by scoping the polymer genome, we present an essential roadmap for the design of polymer dielectrics, and provide future perspectives and directions for expansions to other polymer subclasses and properties.
Localized Retroprocessing as a Model of Intron Loss in the Plant Mitochondrial Genome
Cuenca, Argelia; Ross, T. Gregory; Graham, Sean W.; Barrett, Craig F.; Davis, Jerrold I.; Seberg, Ole; Petersen, Gitte
2016-01-01
Loss of introns in plant mitochondrial genes is commonly explained by retroprocessing. Under this model, an mRNA is reverse transcribed and integrated back into the genome, simultaneously affecting the contents of introns and edited sites. To evaluate the extent to which retroprocessing explains intron loss, we analyzed patterns of intron content and predicted RNA editing for whole mitochondrial genomes of 30 species in the monocot order Alismatales. In this group, we found an unusually high degree of variation in the intron content, even expanding the hitherto known variation among angiosperms. Some species have lost some two-thirds of the cis-spliced introns. We found a strong correlation between intron content and editing frequency, and detected 27 events in which intron loss is consistent with the presence of nucleotides in an edited state, supporting retroprocessing. However, we also detected seven cases of intron loss that are not readily explained by retroprocessing. Our analyses are also not consistent with the entire length of a fully processed cDNA copy being integrated into the genome, but instead indicate that retroprocessing usually occurs for only part of the gene. In some cases, several rounds of retroprocessing may explain intron loss in genes completely devoid of introns. In a number of taxa, retroprocessing seems to be very common and possibly an ongoing process. It affects the entire mitochondrial genome. PMID:27435795
gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances
Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav
2016-01-01
Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos). PMID:27846272
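The reconstruction step described here, choosing the best local alignment for each query region, can be illustrated with a short sketch. This is not the gmos implementation: the `LocalAlignment` record, the `mosaic_structure` function, and the per-position scan are hypothetical simplifications that assume alignments are already computed and carry a comparable score.

```python
from typing import List, NamedTuple, Optional, Tuple

class LocalAlignment(NamedTuple):
    """Hypothetical record for one query-to-subject local alignment."""
    q_start: int   # first aligned query position (inclusive)
    q_end: int     # end of the aligned query interval (exclusive)
    subject: str   # identifier of the subject genome
    score: float   # alignment score; higher is better

Segment = Tuple[int, int, Optional[str]]  # (start, end, subject or None)

def mosaic_structure(query_len: int, alignments: List[LocalAlignment]) -> List[Segment]:
    """Label each query position with the subject of its best-scoring
    alignment, then merge adjacent positions that share a label."""
    best: List[Optional[LocalAlignment]] = [None] * query_len
    for aln in alignments:
        for pos in range(max(0, aln.q_start), min(query_len, aln.q_end)):
            current = best[pos]
            if current is None or aln.score > current.score:
                best[pos] = aln

    segments: List[Segment] = []
    for pos, aln in enumerate(best):
        subject = aln.subject if aln is not None else None
        if segments and segments[-1][2] == subject:
            start, _, _ = segments[-1]
            segments[-1] = (start, pos + 1, subject)   # extend the current run
        else:
            segments.append((pos, pos + 1, subject))   # start a new run
    return segments

# Toy example: two overlapping alignments; the higher score wins the overlap.
example = mosaic_structure(
    10,
    [LocalAlignment(0, 6, "subjectA", 50.0), LocalAlignment(4, 10, "subjectB", 80.0)],
)
print(example)
```

The per-position scan is quadratic in the worst case and is chosen only for readability; a tool aiming at whole prokaryotic genomes would need an interval-based approach. Positions covered by no alignment are reported with `None` as the subject.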
Statistical Significance of Optical Map Alignments
Sarkar, Deepayan; Goldstein, Steve; Schwartz, David C.
2012-01-01
The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities. PMID:22506568
Salem, Nida’ M.; Miller, W. Allen; Rowhani, Adib; Golino, Deborah A.; Moyne, Anne-Laure; Falk, Bryce W.
2015-01-01
We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus that of clones generated by 5′- and 3′-RACE, showed the RSDaV genomic RNA to be 5,808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3′-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs (sgRNAs) of approximately 3 kb and 1 kb. Putative 5′ ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5′ end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These show similarities to, but also informative differences from, the BYDV structures, including a strikingly different structure predicted for the 3′ cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae. PMID:18329064
Salem, Nida' M; Miller, W Allen; Rowhani, Adib; Golino, Deborah A; Moyne, Anne-Laure; Falk, Bryce W
2008-06-05
We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus that of clones generated by 5'- and 3'-RACE, showed the RSDaV genomic RNA to be 5808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3'-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs (sgRNAs) of approximately 3 kb and 1 kb. Putative 5' ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5' end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These show similarities to, but also informative differences from, the BYDV structures, including a strikingly different structure predicted for the 3' cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae.
The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lang, Daniel; Ullrich, Kristian K.; Murat, Florent
Here, the draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. In order to enable analyses of genome structure and evolution we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical for most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical for mosses or bryophytes.
The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution
Lang, Daniel; Ullrich, Kristian K.; Murat, Florent; ...
2017-12-13
Here, the draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. In order to enable analyses of genome structure and evolution we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical for most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical for mosses or bryophytes.
Nanoscale platforms for messenger RNA delivery.
Li, Bin; Zhang, Xinfu; Dong, Yizhou
2018-05-04
Messenger RNA (mRNA) has become a promising class of drugs for diverse therapeutic applications in the past few years. A series of clinical trials are ongoing or will be initiated in the near future for the treatment of a variety of diseases. Currently, mRNA-based therapeutics mainly focus on ex vivo transfection and local administration in clinical studies. Efficient and safe delivery of therapeutically relevant mRNAs remains one of the major challenges for their broad application in humans. Thus, effective delivery systems are urgently needed to overcome this limitation. In recent years, numerous nanoscale biomaterials have been constructed for mRNA delivery in order to protect mRNA from extracellular degradation and facilitate endosomal escape after cellular uptake. Nanoscale platforms have expanded the feasibility of mRNA-based therapeutics, and enabled their potential application to protein replacement therapy, cancer immunotherapy, therapeutic vaccines, regenerative medicine, and genome editing. This review focuses on recent advances, challenges, and future directions in nanoscale platforms designed for mRNA delivery, including lipid and lipid-derived nanoparticles, polymer-based nanoparticles, protein derivative-mRNA complexes, and other types of nanomaterials. This article is categorized under: Nanotechnology Approaches to Biology > Nanoscale Systems in Biology Biology-Inspired Nanomaterials > Lipid-Based Structures Biology-Inspired Nanomaterials > Nucleic Acid-Based Structures. © 2018 Wiley Periodicals, Inc.
Behind Every Good Metabolite there is a Great Enzyme (and perhaps a structure)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buchko, Garry W.; Phan, Isabelle; Cron, Lisabeth
Today, due to great technological advancements, it is possible to study everything at the same time. This ability has given birth to “totality” studies in the fields of genomics, transcriptomics, proteomics, and metabolomics. In turn, the combined study of all these global analyses gave birth to the field of systems biology. Another “totality” field brought to life with new emerging technologies is structural genomics, an effort to determine the three-dimensional structure of every protein encoded in a genome. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a specialized structural genomics effort composed of academic (University of Washington), government (Pacific Northwest National Laboratory), not-for-profit (Seattle BioMed), and commercial (Emerald BioStructures) institutions that is funded by the National Institute of Allergy and Infectious Diseases (Federal Contract: HHSN272200700057C and HHSN27220120025C) to apply genome-scale approaches in solving protein structures from biodefense organisms, as well as those causing emerging and re-emerging disease. In five years over 540 structures have been deposited into the Protein Data Bank (PDB) by SSGCID. About one third of all SSGCID structures contain bound ligands, many of which are metabolites or metabolite analogues present in the cell. These protein structures are the blueprints for the structure-based design of the next generation of drugs against bacterial pathogens and other infectious diseases. Many of the selected SSGCID targets are annotated enzymes from known metabolomic pathways essential to cellular vitality, since selectively “knocking out” one of the enzymes in an important pathway with a drug may be fatal to the organism. One reason metabolomic pathways are important is the small molecules, or metabolites, produced at various steps in these pathways and identified by metabolomic studies.
Unlike genomics, transcriptomics, and proteomics that may be influenced by epigenetic, post-transcriptional, and post-translational modifications, respectively, the metabolites present in the cell at any one time represent downstream biochemical endproducts, and therefore, metabolite profiles may be most closely associated with a phenotype and provide valuable information for infectious disease research. Metabolomic data would be even more useful if it could be linked to the vast amount of structural genomics data. Towards this goal SSGCID has created an automated website (http://apps.sbri.org/SSGCIDTargetStatus/Pathway) that assigns selected SSGCID target proteins to MetaCyc pathways (http://metacyc.org/). Details of this website will be provided here. The SSGCID-Pathway website represents a first big step towards linking metabolites and metabolic pathways to structural genomic data with the goal of accelerating the discovery of new agents to battle infectious diseases.
Tree decomposition based fast search of RNA structures including pseudoknots in genomes.
Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming
2005-01-01
Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of searching genomes for RNA secondary structures quickly and effectively have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t = 2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a covariance model with a significant reduction in computation time. In particular, very accurate searches of tmRNAs in bacterial genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to the months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site http://w.uga.edu/RNA-informatics/software/index.php.
Barrick, Jeffrey E; Colburn, Geoffrey; Deatherage, Daniel E; Traverse, Charles C; Strand, Matthew D; Borges, Jordan J; Knoester, David B; Reba, Aaron; Meyer, Austin G
2014-11-29
Mutations that alter chromosomal structure play critical roles in evolution and disease, including in the origin of new lifestyles and pathogenic traits in microbes. Large-scale rearrangements in genomes are often mediated by recombination events involving new or existing copies of mobile genetic elements, recently duplicated genes, or other repetitive sequences. Most current software programs for predicting structural variation from short-read DNA resequencing data are intended primarily for use on human genomes. They typically disregard information in reads mapping to repeat sequences, and significant post-processing and manual examination of their output is often required to rule out false-positive predictions and precisely describe mutational events. We have implemented an algorithm for identifying structural variation from DNA resequencing data as part of the breseq computational pipeline for predicting mutations in haploid microbial genomes. Our method evaluates the support for new sequence junctions present in a clonal sample from split-read alignments to a reference genome, including matches to repeat sequences. Then, it uses a statistical model of read coverage evenness to accept or reject these predictions. Finally, breseq combines predictions of new junctions and deleted chromosomal regions to output biologically relevant descriptions of mutations and their effects on genes. We demonstrate the performance of breseq on simulated Escherichia coli genomes with deletions generating unique breakpoint sequences, new insertions of mobile genetic elements, and deletions mediated by mobile elements. Then, we reanalyze data from an E. coli K-12 mutation accumulation evolution experiment in which structural variation was not previously identified. Transposon insertions and large-scale chromosomal changes detected by breseq account for ~25% of spontaneous mutations in this strain. 
In all cases, we find that breseq is able to reliably predict structural variation with modest read-depth coverage of the reference genome (>40-fold). Using breseq to predict structural variation should be useful for studies of microbial epidemiology, experimental evolution, synthetic biology, and genetics when a reference genome for a closely related strain is available. In these cases, breseq can discover mutations that may be responsible for important or unintended changes in genomes that might otherwise go undetected.
LINE-1 Elements in Structural Variation and Disease
Beck, Christine R.; Garcia-Perez, José Luis; Badge, Richard M.; Moran, John V.
2014-01-01
The completion of the human genome reference sequence ushered in a new era for the study and discovery of human transposable elements. It now is undeniable that transposable elements, historically dismissed as junk DNA, have had an instrumental role in sculpting the structure and function of our genomes. In particular, long interspersed element-1 (LINE-1 or L1) and short interspersed elements (SINEs) continue to affect our genome, and their movement can lead to sporadic cases of disease. Here, we briefly review the types of transposable elements present in the human genome and their mechanisms of mobility. We next highlight how advances in DNA sequencing and genomic technologies have enabled the discovery of novel retrotransposons in individual genomes. Finally, we discuss how L1-mediated retrotransposition events impact human genomes. PMID:21801021
Genome-wide diversity and selective pressure in the human rhinovirus
Kistler, Amy L; Webster, Dale R; Rouskin, Silvi; Magrini, Vince; Credle, Joel J; Schnurr, David P; Boushey, Homer A; Mardis, Elaine R; Li, Hao; DeRisi, Joseph L
2007-01-01
Background The human rhinoviruses (HRV) are one of the most common and diverse respiratory pathogens of humans. Over 100 distinct HRV serotypes are known, yet only 6 genomes are available. Due to the paucity of HRV genome sequence, little is known about the genetic diversity within HRV or the forces driving this diversity. Previous comparative genome sequence analyses indicate that recombination drives diversification in multiple genera of the picornavirus family, yet it remains unclear if this holds for HRV. Results To resolve this and gain insight into the forces driving diversification in HRV, we generated a representative set of 34 fully sequenced HRVs. Analysis of these genomes shows consistent phylogenies across the genome, conserved non-coding elements, and only limited recombination. However, spikes of genetic diversity at both the nucleotide and amino acid level are detectable within every locus of the genome. Despite this, the HRV genome as a whole is under purifying selective pressure, with islands of diversifying pressure in the VP1, VP2, and VP3 structural genes and two non-structural genes, the 3C protease and 3D polymerase. Mapping diversifying residues in these factors onto available 3-dimensional structures revealed the diversifying capsid residues partition to the external surface of the viral particle in statistically significant proximity to antigenic sites. Diversifying pressure in the pleconaril binding site is confined to a single residue known to confer drug resistance (VP1 191). In contrast, diversifying pressure in the non-structural genes is less clear, mapping both nearby and beyond characterized functional domains of these factors. Conclusion This work provides a foundation for understanding HRV genetic diversity and insight into the underlying biology driving evolution in HRV. 
It expands our knowledge of the genome sequence space that HRV reference serotypes occupy and how the pattern of genetic diversity across HRV genomes differs from other picornaviruses. It also reveals evidence of diversifying selective pressure in both structural genes known to interact with the host immune system and in domains of unassigned function in the non-structural 3C and 3D genes, raising the possibility that diversification of undiscovered functions in these essential factors may influence HRV fitness and evolution. PMID:17477878
Perspective: Role of structure prediction in materials discovery and design
NASA Astrophysics Data System (ADS)
Needs, Richard J.; Pickard, Chris J.
2016-05-01
Materials informatics owes much to bioinformatics and the Materials Genome Initiative has been inspired by the Human Genome Project. But there is more to bioinformatics than genomes, and the same is true for materials informatics. Here we describe the rapidly expanding role of searching for structures of materials using first-principles electronic-structure methods. Structure searching has played an important part in unraveling structures of dense hydrogen and in identifying the record-high-temperature superconducting component in hydrogen sulfide at high pressures. We suggest that first-principles structure searching has already demonstrated its ability to determine structures of a wide range of materials and that it will play a central and increasing part in materials discovery and design.
From genomics to chemical genomics: new developments in KEGG
Kanehisa, Minoru; Goto, Susumu; Hattori, Masahiro; Aoki-Kinoshita, Kiyoko F.; Itoh, Masumi; Kawashima, Shuichi; Katayama, Toshiaki; Araki, Michihiro; Hirakawa, Mika
2006-01-01
The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps. PMID:16381885
Chromosomal distribution of microsatellite repeats in Amazon cichlids genome (Pisces, Cichlidae)
Schneider, Carlos Henrique; Gross, Maria Claudia; Terencio, Maria Leandra; de Tavares, Édika Sabrina Girão Mitozo; Martins, Cesar; Feldberg, Eliana
2015-01-01
Abstract Fish of the family Cichlidae are recognized as an excellent model for evolutionary studies because of their morphological and behavioral adaptations to a wide diversity of explored ecological niches. In addition, the family has a dynamic genome with variable structure, composition and karyotype organization. Microsatellites represent the most dynamic genomic component and a better understanding of their organization may help clarify the role of repetitive DNA elements in the mechanisms of chromosomal evolution. Thus, in this study, microsatellite sequences were mapped in the chromosomes of Cichla monoculus Agassiz, 1831, Pterophyllum scalare Schultze, 1823, and Symphysodon discus Heckel, 1840. Four microsatellites demonstrated positive results in the genome of Cichla monoculus and Symphysodon discus, and five demonstrated positive results in the genome of Pterophyllum scalare. In most cases, the microsatellite was dispersed in the chromosome with conspicuous markings in the centromeric or telomeric regions, which suggests that sequences contribute to chromosome structure and may have played a role in the evolution of this fish family. The comparative genome mapping data presented here provide novel information on the structure and organization of the repetitive DNA region of the cichlid genome and contribute to a better understanding of this fish family’s genome. PMID:26753076
Computational characterization of chromatin domain boundary-associated genomic elements
Hong, Seungpyo
2017-01-01
Abstract Topologically associated domains (TADs) are 3D genomic structures with high internal interactions that play important roles in genome compaction and gene regulation. Their genomic locations and their association with CCCTC-binding factor (CTCF)-binding sites and transcription start sites (TSSs) were recently reported. However, the relationship between TADs and other genomic elements has not been systematically evaluated. This was addressed in the present study, with a focus on the enrichment of these genomic elements and their ability to predict the TAD boundary region. We found that consensus CTCF-binding sites were strongly associated with TAD boundaries as well as with the transcription factors (TFs) Zinc finger protein (ZNF)143 and Yin Yang (YY)1. TAD boundary-associated genomic elements include DNase I-hypersensitive sites, H3K36 trimethylation, TSSs, RNA polymerase II, and TFs such as Specificity protein 1, ZNF274 and SIX homeobox 5. Computational modeling with these genomic elements suggests that they have distinct roles in TAD boundary formation. We propose a structural model of TAD boundaries based on these findings that provides a basis for studying the mechanism of chromatin structure formation and gene regulation. PMID:28977568
Elevated Rate of Genome Rearrangements in Radiation-Resistant Bacteria.
Repar, Jelena; Supek, Fran; Klanjscek, Tin; Warnecke, Tobias; Zahradka, Ksenija; Zahradka, Davor
2017-04-01
A number of bacterial, archaeal, and eukaryotic species are known for their resistance to ionizing radiation. One of the challenges these species face is a potent environmental source of DNA double-strand breaks, potential drivers of genome structure evolution. Efficient and accurate DNA double-strand break repair systems have been demonstrated in several unrelated radiation-resistant species and are putative adaptations to the DNA damaging environment. Such adaptations are expected to compensate for the genome-destabilizing effect of environmental DNA damage and may be expected to result in a more conserved gene order in radiation-resistant species. However, here we show that rates of genome rearrangements, measured as loss of gene order conservation with time, are higher in radiation-resistant species in multiple, phylogenetically independent groups of bacteria. Comparison of indicators of selection for genome organization between radiation-resistant and phylogenetically matched, nonresistant species argues against tolerance to disruption of genome structure as a strategy for radiation resistance. Interestingly, an important mechanism affecting genome rearrangements in prokaryotes, the symmetrical inversions around the origin of DNA replication, shapes genome structure of both radiation-resistant and nonresistant species. In conclusion, the opposing effects of environmental DNA damage and DNA repair result in elevated rates of genome rearrangements in radiation-resistant bacteria. Copyright © 2017 Repar et al.
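The abstract above measures rearrangement as loss of gene order conservation but does not give the formula. As a rough illustration only, the sketch below uses one simple proxy: the fraction of neighbouring ortholog pairs in one circular genome that are still neighbours in another. The function name `gene_order_conservation` and the toy gene labels are hypothetical, and this is not necessarily the statistic used by the authors.

```python
def gene_order_conservation(order_a, order_b):
    """Fraction of adjacent ortholog pairs in genome A that are also adjacent in genome B.

    Assumptions (illustrative, not taken from the paper): both genomes are
    circular, each ortholog occurs once per genome, and adjacency ignores
    strand and orientation.
    """
    shared = set(order_a) & set(order_b)
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]
    if len(a) < 3:
        raise ValueError("need at least three shared orthologs")

    def adjacencies(order):
        # Unordered neighbour pairs, wrapping around the circular chromosome.
        n = len(order)
        return {frozenset((order[i], order[(i + 1) % n])) for i in range(n)}

    adj_a = adjacencies(a)
    adj_b = adjacencies(b)
    return len(adj_a & adj_b) / len(adj_a)


# Toy example: genome B carries an inversion of the g3-g4 segment.
genome_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
genome_b = ["g1", "g2", "g4", "g3", "g5", "g6"]
print(gene_order_conservation(genome_a, genome_b))
```

In the toy example a single inversion breaks two of six adjacencies; plotting such a score against divergence time for many genome pairs would give a rate of order loss in the spirit of the comparison described above.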
In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.
Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M
2014-01-30
RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian
2009-11-01
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to analyze, on a genome-wide scale, the CRISPR in extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were greatly conserved, and two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
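Two of the sequence features mentioned in this abstract, GC content and palindromic (reverse-complement) symmetry, are easy to compute. The sketch below is a minimal illustration, not the authors' pipeline; the function names and the toy sequence are made up for the example, and the input is assumed to contain only A, C, G and T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def terminal_stem_length(seq):
    """Number of bases at the 5' end that can pair with the 3' end.

    A crude indicator of stem-loop potential; it only counts Watson-Crick
    pairs between the two termini and ignores G-U pairs and energetics.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    length = 0
    for i in range(len(seq) // 2):
        if seq[i] != rc[i]:
            break
        length += 1
    return length


# Hypothetical toy repeat: a 4-base stem (GCGA ... TCGC) around a TTTT loop.
toy_repeat = "GCGATTTTTCGC"
print(gc_content(toy_repeat), terminal_stem_length(toy_repeat))
```

A real analysis of stem-loop formation would use a thermodynamic folding program rather than this terminal-pairing count, which is only meant to make the idea of a palindromic repeat concrete.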
Wan, KangKang; Zhang, Zhong; Pang, Xiaoming; Yin, Xiao; Bai, Yang; Sun, Xiaoqing; Gao, Lizhi; Li, Ruiqiang; Zhang, Jinbo
2016-01-01
Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar ‘Junzao’ and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of ‘Dongzao’, a fresh jujube, was ~86.5 Mb larger than that of the ‘Junzao’, partially due to the recent insertions of transposable elements in the ‘Dongzao’ genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication. PMID:28005948
Dash, Prasanta K; Rai, Rhitu
2016-01-01
Evolutionarily frozen, genetically sterile and globally iconic fruit "Banana" remained untouched by the green revolution and, as of today, researchers face intrinsic impediments to its varietal improvement. Recently, this wonder crop entered the genomics era with decoding of the structural genome of the double haploid Pahang (AA genome constitution) genotype of Musa acuminata. Its complex genome, decoded by hybrid sequencing strategies, revealed a panoply of genes and transcription factors involved in the process of sucrose conversion that imparts sweetness to its fruit. Historically, banana has faced the wrath of pandemic bacterial, fungal, and viral diseases and a multitude of abiotic stresses that have ruined the livelihood of small/marginal farmers and destroyed commercial plantations. Decoding the structural genome of this climacteric fruit has given impetus to a deeper understanding of the repertoire of genes involved in disease resistance, understanding the mechanism of dwarfing to develop an ideal plant type, and unraveling the processes of parthenocarpy and fruit ripening for better fruit quality. Further, injunction of comparative genomics will usher in integration of information from its decoded genome and other monocots into field applications in banana related to, but not limited to, yield enhancement, food security, livelihood assurance, and energy sustainability. In this mini review, we discuss pre- and post-genomic discoveries and highlight accomplishments in structural genomics, genetic engineering and forward genetic accomplishments with an aim to target genes and transcription factors for translational research in banana.
Huang, Jian; Zhang, Chunmei; Zhao, Xing; Fei, Zhangjun; Wan, KangKang; Zhang, Zhong; Pang, Xiaoming; Yin, Xiao; Bai, Yang; Sun, Xiaoqing; Gao, Lizhi; Li, Ruiqiang; Zhang, Jinbo; Li, Xingang
2016-12-01
Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.
A score-statistic approach for determining threshold values in QTL mapping.
Kao, Chen-Hung; Ho, Hsiang-An
2012-06-01
Issues in determining the threshold values of QTL mapping have so far mostly been investigated for backcross and F2 populations, which have relatively simple genome structures. Investigations of these issues in the progeny populations after F2 (advanced populations), which have relatively more complicated genomes, are generally inadequate. As these advanced populations have been well implemented in QTL mapping, it is important to address these issues for them in more detail. Due to an increasing number of meiosis cycles, the genomes of the advanced populations can be very different from the backcross and F2 genomes. Therefore, special devices that consider the specific genome structures present in the advanced populations are required to resolve these issues. By considering the differences in genome structure between populations, we formulate more general score test statistics and Gaussian processes to evaluate their threshold values. In general, we found that, given a significance level and a genome size, threshold values for QTL detection are higher in the denser marker maps and in the more advanced populations. Simulations were performed to validate our approach.
Reconstructing Past Admixture Processes from Local Genomic Ancestry Using Wavelet Transformation
Sanderson, Jean; Sudoyo, Herawati; Karafet, Tatiana M.; Hammer, Michael F.; Cox, Murray P.
2015-01-01
Admixture between long-separated populations is a defining feature of the genomes of many species. The mosaic block structure of admixed genomes can provide information about past contact events, including the time and extent of admixture. Here, we describe an improved wavelet-based technique that better characterizes ancestry block structure from observed genomic patterns. Principal components analysis is first applied to genomic data to identify the primary population structure, followed by wavelet decomposition to develop a new characterization of local ancestry information along the chromosomes. For testing purposes, this method is applied to human genome-wide genotype data from Indonesia, as well as virtual genetic data generated using genome-scale sequential coalescent simulations under a wide range of admixture scenarios. Time of admixture is inferred using an approximate Bayesian computation framework, providing robust estimates of both admixture times and their associated levels of uncertainty. Crucially, we demonstrate that this revised wavelet approach, which we have released as the R package adwave, provides improved statistical power over existing wavelet-based techniques and can be used to address a broad range of admixture questions. PMID:25852078
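To make the wavelet step concrete, the sketch below applies a plain Haar transform to a toy local-ancestry signal and reports how much signal energy sits at each scale. It is a simplified stand-in written in Python, not the adwave package or its API: the function names are hypothetical, the signal is assumed to be already derived (for example from principal component scores) and sampled at evenly spaced markers, and its length must be a power of two.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def haar_detail_energy(signal):
    """Sum of squared Haar detail coefficients per level, finest scale first."""
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    energies = []
    current = list(signal)
    while len(current) > 1:
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    return energies


# Toy ancestry signals along a chromosome (1 = population A, 0 = population B).
long_blocks = [1, 1, 1, 1, 0, 0, 0, 0]   # few, long ancestry blocks
short_blocks = [1, 0, 1, 0, 1, 0, 1, 0]  # many, short ancestry blocks
print(haar_detail_energy(long_blocks))
print(haar_detail_energy(short_blocks))
```

Long blocks put their energy at the coarsest scale and short blocks at the finest, which is the intuition behind reading admixture time from the scale distribution: recombination shortens ancestry blocks with each generation since contact.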
Visualizing the global secondary structure of a viral RNA genome with cryo-electron microscopy
Garmann, Rees F.; Gopal, Ajaykumar; Athavale, Shreyas S.; Knobler, Charles M.; Gelbart, William M.; Harvey, Stephen C.
2015-01-01
The lifecycle, and therefore the virulence, of single-stranded (ss)-RNA viruses is regulated not only by their particular protein gene products, but also by the secondary and tertiary structure of their genomes. The secondary structure of the entire genomic RNA of satellite tobacco mosaic virus (STMV) was recently determined by selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE). The SHAPE analysis suggested a single highly extended secondary structure with much less branching than occurs in the ensemble of structures predicted by purely thermodynamic algorithms. Here we examine the solution-equilibrated STMV genome by direct visualization with cryo-electron microscopy (cryo-EM), using an RNA of similar length transcribed from the yeast genome as a control. The cryo-EM data reveal an ensemble of branching patterns that are collectively consistent with the SHAPE-derived secondary structure model. Thus, our results both elucidate the statistical nature of the secondary structure of large ss-RNAs and give visual support for modern RNA structure determination methods. Additionally, this work introduces cryo-EM as a means to distinguish between competing secondary structure models if the models differ significantly in terms of the number and/or length of branches. Furthermore, with the latest advances in cryo-EM technology, we suggest the possibility of developing methods that incorporate restraints from cryo-EM into the next generation of algorithms for the determination of RNA secondary and tertiary structures. PMID:25752599
Single-molecule optical genome mapping of a human HapMap and a colorectal cancer cell line.
Teo, Audrey S M; Verzotto, Davide; Yao, Fei; Nagarajan, Niranjan; Hillmer, Axel M
2015-01-01
Next-generation sequencing (NGS) technologies have changed our understanding of the variability of the human genome. However, the identification of genome structural variations based on NGS approaches with read lengths of 35-300 bases remains a challenge. Single-molecule optical mapping technologies allow the analysis of DNA molecules of up to 2 Mb and as such are suitable for the identification of large-scale genome structural variations, and for de novo genome assemblies when combined with short-read NGS data. Here we present optical mapping data for two human genomes: the HapMap cell line GM12878 and the colorectal cancer cell line HCT116. High molecular weight DNA was obtained by embedding GM12878 and HCT116 cells, respectively, in agarose plugs, followed by DNA extraction under mild conditions. Genomic DNA was digested with KpnI and 310,000 and 296,000 DNA molecules (≥ 150 kb and 10 restriction fragments), respectively, were analyzed per cell line using the Argus optical mapping system. Maps were aligned to the human reference by OPTIMA, a new glocal alignment method. Genome coverage of 6.8× and 5.7× was obtained, respectively; 2.9× and 1.7× more than the coverage obtained with previously available software. Optical mapping allows the resolution of large-scale structural variations of the genome, and the scaffold extension of NGS-based de novo assemblies. OPTIMA is an efficient new alignment method; our optical mapping data provide a resource for genome structure analyses of the human HapMap reference cell line GM12878, and the colorectal cancer cell line HCT116.
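Optical maps are ordered lists of restriction fragment sizes, so aligning them to a reference starts from an in silico digest of the reference sequence. The sketch below shows that step for KpnI, whose recognition site is GGTACC with the cut after the fifth base on the top strand. The function names are hypothetical, the molecule filter reads the abstract's "≥ 150 kb and 10 restriction fragments" as "at least 10 fragments" (an assumption), and none of this reproduces OPTIMA itself.

```python
def in_silico_digest(sequence, site="GGTACC", cut_offset=5):
    """Fragment lengths from cutting a linear sequence at every site occurrence.

    cut_offset is the top-strand cut position within the site
    (5 for KpnI, which cuts GGTAC^C).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


def passes_filter(fragments, min_length=150_000, min_fragments=10):
    """Keep molecules that are long enough and carry enough fragments.

    Thresholds follow the abstract; treating 10 as a minimum is an assumption.
    """
    return sum(fragments) >= min_length and len(fragments) >= min_fragments


# Toy sequence with two KpnI sites.
toy = "AAAA" + "GGTACC" + "TTTT" + "GGTACC" + "CC"
print(in_silico_digest(toy))  # [9, 10, 3]
```

Real optical maps also carry sizing error and missing or spurious cut sites, which is why alignment methods score fragment-size patterns rather than match them exactly.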
The eukaryotic genome is structurally and functionally more like a social insect colony than a book.
Qiu, Guo-Hua; Yang, Xiaoyan; Zheng, Xintian; Huang, Cuiqin
2017-11-01
Traditionally, the genome has been described as the 'book of life'. However, the metaphor of a book may not reflect the dynamic nature of the structure and function of the genome. In the eukaryotic genome, the number of centrally located protein-coding sequences is relatively constant across species, but the amount of noncoding DNA increases considerably with increasing organismal evolutionary complexity. Therefore, it has been hypothesized that the abundant peripheral noncoding DNA protects the genome and the central protein-coding sequences in the eukaryotic genome. Upon comparison with the habitation, sociality and defense mechanisms of a social insect colony, it is found that the genome is similar to a social insect colony in various aspects. A social insect colony may thus be a better metaphor than a book to describe the spatial organization and physical functions of the genome. The potential implications of the metaphor are also discussed.
Kim, Eun Bae
2014-01-01
Certain strains of Enterococcus faecium and Enterococcus faecalis contribute beneficially to animal health and food production, while others are associated with nosocomial infections. To determine whether there are structural and functional genomic features that are distinct between nonclinical (NC) and clinical (CL) strains of those species, we analyzed the genomes of 31 E. faecium and 38 E. faecalis strains. Hierarchical clustering of 7,017 orthologs found in the E. faecium pangenome revealed that NC strains clustered into two clades and are distinct from CL strains. NC E. faecium genomes are significantly smaller than CL genomes, and this difference was partly explained by significantly fewer mobile genetic elements (ME), virulence factors (VF), and antibiotic resistance (AR) genes. E. faecium ortholog comparisons identified 68 and 153 genes that are enriched for NC and CL strains, respectively. Proximity analysis showed that CL-enriched loci, and not NC-enriched loci, are more frequently colocalized on the genome with ME. In CL genomes, AR genes are also colocalized with ME, and VF are more frequently associated with CL-enriched loci. Genes in 23 functional groups are also differentially enriched between NC and CL E. faecium genomes. In contrast, differences were not observed between NC and CL E. faecalis genomes despite their having larger genomes than E. faecium. Our findings show that unlike E. faecalis, NC and CL E. faecium strains are equipped with distinct structural and functional genomic features indicative of adaptation to different environments. PMID:24141120
First and second trimester screening for fetal structural anomalies.
Edwards, Lindsay; Hui, Lisa
2018-04-01
Fetal structural anomalies are found in up to 3% of all pregnancies and ultrasound-based screening has been an integral part of routine prenatal care for decades. The prenatal detection of fetal anomalies allows for optimal perinatal management, providing expectant parents with opportunities for additional imaging, genetic testing, and the provision of information regarding prognosis and management options. Approximately one-half of all major structural anomalies can now be detected in the first trimester, including acrania/anencephaly, abdominal wall defects, holoprosencephaly and cystic hygromata. Due to the ongoing development of some organ systems however, some anomalies will not be evident until later in the pregnancy. To this extent, the second trimester anatomy is recommended by professional societies as the standard investigation for the detection of fetal structural anomalies. The reported detection rates of structural anomalies vary according to the organ system being examined, and are also dependent upon factors such as the equipment settings and sonographer experience. Technological advances over the past two decades continue to support the role of ultrasound as the primary imaging modality in pregnancy, and the safety of ultrasound for the developing fetus is well established. With increasing capabilities and experience, detailed examination of the central nervous system and cardiovascular system is possible, with dedicated examinations such as the fetal neurosonogram and the fetal echocardiogram now widely performed in tertiary centers. Magnetic resonance imaging (MRI) is well recognized for its role in the assessment of fetal brain anomalies; other potential indications for fetal MRI include lung volume measurement (in cases of congenital diaphragmatic hernia), and pre-surgical planning prior to fetal spina bifida repair. When a major structural abnormality is detected prenatally, genetic testing with chromosomal microarray is recommended over routine karyotype due to its higher genomic resolution. Copyright © 2017 Elsevier Ltd. All rights reserved.
Leulliot, Nicolas; Trésaugues, Lionel; Bremang, Michael; Sorel, Isabelle; Ulryck, Nathalie; Graille, Marc; Aboulfath, Ilham; Poupon, Anne; Liger, Dominique; Quevillon-Cheruel, Sophie; Janin, Joël; van Tilbeurgh, Herman
2005-06-01
Crystallization has long been regarded as one of the major bottlenecks in high-throughput structural determination by X-ray crystallography. Structural genomics projects have addressed this issue by using robots to set up automated crystal screens using nanodrop technology. This has moved the bottleneck from obtaining the first crystal hit to obtaining diffraction-quality crystals, as crystal optimization is a notoriously slow process that is difficult to automate. This article describes the high-throughput optimization strategies used in the Yeast Structural Genomics project, with selected successful examples.
2011-02-01
Thrombocytopenia, Grade 3 in 1 patient • Hypomagnesemia, Grade 3 in 1 patient • Hypokalemia, Grade 3 in 2 patients • Pneumonia, Grade 3 in 7 patients...urgently needed. While the molecular events involved in lung cancer pathogenesis are being unraveled by ongoing large scale genomics, proteomics, and...tumor initiation, progression and metastasis are an important first step leading to the development of new prognostic markers and targets for therapy
The impact of transposable elements on mammalian development
Garcia-Perez, Jose L.; Widmann, Thomas J.; Adams, Ian R.
2018-01-01
Summary Despite often being classified as selfish or junk DNA, transposable elements (TEs) are a group of abundant genetic sequences that significantly impact on mammalian development and genome regulation. In recent years, our understanding of how pre-existing TEs affect genome architecture, gene regulatory networks and protein function during mammalian embryogenesis has dramatically expanded. In addition, the mobilization of active TEs in selected cell types has been shown to generate genetic variation during development and in fully differentiated tissues. Importantly, the ongoing domestication and evolution of TEs appears to provide a rich source of regulatory elements, functional modules and genetic variation that fuels the evolution of mammalian developmental processes. Here, we review the functional impact that TEs exert on mammalian developmental processes and how the somatic activity of TEs can influence gene regulatory networks. PMID:27875251
“Shovel-ready” Sequences as a Stimulus for the Next Generation of Life Scientists
Boyle, Michael D.
2010-01-01
Genomics and bioinformatics are dynamic fields well-suited for capturing the imagination of undergraduates in both research laboratories and classrooms. Currently, raw nucleotide sequence is being provided, as part of several genomics research initiatives, for undergraduate research and teaching. These initiatives could be easily extended and much more effective if the source of the sequenced material and the subsequent focus of the data analysis were aligned with the research interests of individual faculty at undergraduate institutions. By judicious use of surplus capacity in existing nucleotide sequencing cores, raw sequence data could be generated to support ongoing research efforts involving undergraduates. This would allow these students to participate actively in discovery research, with a goal of making novel contributions to their field through original research while nurturing the next generation of talented research scientists. PMID:23653696
Maggi, Elaine; Montagna, Cristina
2015-12-01
The American Association for Cancer Research (AACR) Precision Medicine Series "Integrating Clinical Genomics and Cancer Therapy" took place June 13-16, 2015 in Salt Lake City, Utah. The conference was co-chaired by Charles L. Sawyers from Memorial Sloan Kettering Cancer Center in New York, Elaine R. Mardis from Washington University School of Medicine in St. Louis, and Arul M. Chinnaiyan from University of Michigan in Ann Arbor. About 500 clinicians, basic science investigators, bioinformaticians, and postdoctoral fellows joined together to discuss the current state of Clinical Genomics and the advances and challenges of integrating Next Generation Sequencing (NGS) technologies into clinical practice. The plenary sessions and panel discussions covered current platforms and sequencing approaches adopted for NGS assays of cancer genome at several national and international institutions, different approaches used to map and classify targetable sequence variants, and how information acquired with the sequencing of the cancer genome is used to guide treatment options. While challenges still exist from a technological perspective, it emerged that there exists considerable need for the development of tools to aid the identification of the therapy most suitable based on the mutational profile of the somatic cancer genome. The process to match patients to ongoing clinical trials is still complex. In addition, the need for centralized data repositories, preferably linked to well annotated clinical records, that aid sharing of sequencing information is central to begin understanding the contribution of variants of unknown significance to tumor etiology and response to therapy. Here we summarize the highlights of this stimulating four-day conference with a major emphasis on the open problems that the clinical genomics community is currently facing and the tools most needed for advancing this field. Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Black, Michael; Moolhuijzen, Paula; Chapman, Brett; Barrero, Roberto; Howieson, John; Hungria, Mariangela; Bellgard, Matthew
2012-01-01
The symbiotic relationship between legumes and nitrogen fixing bacteria is critical for agriculture, as it may have profound impacts on lowering costs for farmers, on land sustainability, on soil quality, and on mitigation of greenhouse gas emissions. However, despite the importance of the symbioses to the global nitrogen cycling balance, very few rhizobial genomes have been sequenced so far, although there are some ongoing efforts in sequencing elite strains. In this study, the genomes of fourteen selected strains of the order Rhizobiales, all previously fully sequenced and annotated, were compared to assess differences between the strains and to investigate the feasibility of defining a core ‘symbiome’—the essential genes required by all rhizobia for nodulation and nitrogen fixation. Comparison of these whole genomes has revealed valuable information, such as several events of lateral gene transfer, particularly in the symbiotic plasmids and genomic islands that have contributed to a better understanding of the evolution of contrasting symbioses. Unique genes were also identified, as well as omissions of symbiotic genes that were expected to be found. Protein comparisons have also allowed the identification of a variety of similarities and differences in several groups of genes, including those involved in nodulation, nitrogen fixation, production of exopolysaccharides, Type I to Type VI secretion systems, among others, and identifying some key genes that could be related to host specificity and/or a better saprophytic ability. However, while several significant differences in the type and number of proteins were observed, the evidence presented suggests no simple core symbiome exists. A more abstract systems biology concept of nitrogen fixing symbiosis may be required. The results have also highlighted that comparative genomics represents a valuable tool for capturing specificities and generalities of each genome. PMID:24704847
Pool, John E
2015-12-01
North American populations of Drosophila melanogaster derive from both European and African source populations, but despite their importance for genetic research, patterns of ancestry along their genomes are largely undocumented. Here, I infer geographic ancestry along genomes of the Drosophila Genetic Reference Panel (DGRP) and the D. melanogaster reference genome, which may have implications for reference alignment, association mapping, and population genomic studies in Drosophila. Overall, the proportion of African ancestry was estimated to be 20% for the DGRP and 9% for the reference genome. Combining my estimate of admixture timing with historical records, I provide the first estimate of natural generation time for this species (approximately 15 generations per year). Ancestry levels were found to vary strikingly across the genome, with less African introgression on the X chromosome, in regions of high recombination, and at genes involved in specific processes (e.g., circadian rhythm). An important role for natural selection during the admixture process was further supported by evidence that many unlinked pairs of loci showed a deficiency of Africa-Europe allele combinations between them. Numerous epistatic fitness interactions may therefore exist between African and European genotypes, leading to ongoing selection against incompatible variants. By focusing on hubs in this network of fitness interactions, I identified a set of interacting loci that include genes with roles in sensation and neuropeptide/hormone reception. These findings suggest that admixed D. melanogaster samples could become an important study system for the genetics of early-stage isolation between populations. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Rockinger, Alexander; Sousa, Aretuza; Carvalho, Fernanda A; Renner, Susanne S
2016-06-01
Caricaceae include six genera and 34 species, among them papaya, a model species in plant sex chromosome research. The family was held to have a conserved karyotype with 2n = 18 chromosomes, an assumption based on few counts. We examined the karyotypes and genome size of species from all genera to test for possible cytogenetic variation. We used fluorescent in situ hybridization using standard telomere, 5S, and 45S rDNA probes. New and published data were combined with a phylogeny, molecular clock dating, and C values (available for ∼50% of the species) to reconstruct genome evolution. The African genus Cylicomorpha, which is sister to the remaining Caricaceae (all neotropical), has 2n = 18, as do the species in two other genera. A Mexican clade of five species that includes papaya, however, has 2n = 18 (papaya), 2n = 16 (Horovitzia cnidoscoloides), and 2n = 14 (Jarilla caudata and J. heterophylla; third Jarilla not counted), with the phylogeny indicating that the dysploidy events occurred ∼16.6 and ∼5.5 million years ago and that Jarilla underwent genome size doubling (∼450 to 830-920 Mbp/haploid genome). Pericentromeric interstitial telomere repeats occur in both Jarilla adjacent to 5S rDNA sites, and the variability of 5S rDNA sites across all genera is high. On the basis of outgroup comparison, 2n = 18 is the ancestral number, and repeated chromosomal fusions with simultaneous genome size increase as a result of repetitive elements accumulating near centromeres characterize the papaya clade. These results have implications for ongoing genome assemblies in Caricaceae. © 2016 Botanical Society of America.
Loquasto, Joseph R.; Barrangou, Rodolphe; Dudley, Edward G.; Stahl, Buffy; Chen, Chun
2013-01-01
Many strains of Bifidobacterium animalis subsp. lactis are considered health-promoting probiotic microorganisms and are commonly formulated into fermented dairy foods. Analyses of previously sequenced genomes of B. animalis subsp. lactis have revealed little genetic diversity, suggesting that it is a monomorphic subspecies. However, during a multilocus sequence typing survey of Bifidobacterium, it was revealed that B. animalis subsp. lactis ATCC 27673 gave a profile distinct from that of the other strains of the subspecies. As part of an ongoing study designed to understand the genetic diversity of this subspecies, the genome of this strain was sequenced and compared to other sequenced genomes of B. animalis subsp. lactis and B. animalis subsp. animalis. The complete genome of ATCC 27673 was 1,963,012 bp, contained 1,616 genes and 4 rRNA operons, and had a G+C content of 61.55%. Comparative analyses revealed that the genome of ATCC 27673 contained six distinct genomic islands encoding 83 open reading frames not found in other strains of the same subspecies. In four islands, either phage or mobile genetic elements were identified. In island 6, a novel clustered regularly interspaced short palindromic repeat (CRISPR) locus which contained 81 unique spacers was identified. This type I-E CRISPR-cas system differs from the type I-C systems previously identified in this subspecies, representing the first identification of a different system in B. animalis subsp. lactis. This study revealed that ATCC 27673 is a strain of B. animalis subsp. lactis with novel genetic content and suggests that the lack of genetic variability observed is likely due to the repeated sequencing of a limited number of widely distributed commercial strains. PMID:23995933
Identification of structural variation in mouse genomes.
Keane, Thomas M; Wong, Kim; Adams, David J; Flint, Jonathan; Reymond, Alexandre; Yalcin, Binnaz
2014-01-01
Structural variation is variation in structure of DNA regions affecting DNA sequence length and/or orientation. It generally includes deletions, insertions, copy-number gains, inversions, and transposable elements. Traditionally, the identification of structural variation in genomes has been challenging. However, with the recent advances in high-throughput DNA sequencing and paired-end mapping (PEM) methods, the ability to identify structural variation and their respective association to human diseases has improved considerably. In this review, we describe our current knowledge of structural variation in the mouse, one of the prime model systems for studying human diseases and mammalian biology. We further present the evolutionary implications of structural variation on transposable elements. We conclude with future directions on the study of structural variation in mouse genomes that will increase our understanding of molecular architecture and functional consequences of structural variation.
Real-time, portable genome sequencing for Ebola surveillance
Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan HJ; Becker-Ziaja, Beate; Boettcher, Jan-Peter; Cabeza-Cabrerizo, Mar; Camino-Sanchez, Alvaro; Carter, Lisa L.; Doerrbecker, Juiliane; Enkirch, Theresa; Dorival, Isabel Graciela García; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigail; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallash, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Yasmin; Sachse, Andrea; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Trina, Racine; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N’Faly; Williams, Cecelia V.; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Franck; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, Jamie; Rachwal, Philip; Turner, Daniel; Pollakis, Georgios; Hiscox, Julian A.; Matthews, David A.; O’Shea, Matthew K.; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Woelfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A.; Koivogui, Lamine; Diallo, Boubacar; Keita, Sakoba; Rambaut, Andrew; Formenty, Pierre; Gunther, Stephan; Carroll, Miles W.
2016-01-01
The Ebola virus disease (EVD) epidemic in West Africa is the largest on record, responsible for >28,599 cases and >11,299 deaths 1. Genome sequencing in viral outbreaks is desirable in order to characterize the infectious agent to determine its evolutionary rate, signatures of host adaptation, identification and monitoring of diagnostic targets and responses to vaccines and treatments. The Ebola virus genome (EBOV) substitution rate in the Makona strain has been estimated at between 0.87 × 10⁻³ and 1.42 × 10⁻³ mutations per site per year. This is equivalent to 16 to 27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic 2-7. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought-after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions 8. Genomic surveillance during the epidemic has been sporadic due to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities 9. In order to address this problem, we devised a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. Here we present sequence data and analysis of 142 Ebola virus (EBOV) samples collected during the period March to October 2015. We were able to generate results in less than 24 hours after receiving an Ebola positive sample, with the sequencing process taking as little as 15-60 minutes. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks. PMID:26840485
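The step from a per-site substitution rate to "16 to 27 mutations in each genome" in the abstract above can be checked with a short calculation. The sketch below multiplies the quoted rates by a genome length of about 19 kb; that length (18,959 nt) is an assumption supplied for this example, since the abstract does not state it, and the result is read here as substitutions per genome per year.

```python
# Assumed approximate EBOV genome length in nucleotides; not given in the abstract.
EBOV_GENOME_LENGTH_NT = 18_959


def expected_substitutions_per_year(rate_per_site_per_year,
                                    genome_length_nt=EBOV_GENOME_LENGTH_NT):
    """Expected substitutions per genome per year = per-site rate x genome length."""
    return rate_per_site_per_year * genome_length_nt


# Lower and upper rate estimates quoted in the abstract (per site per year).
low = expected_substitutions_per_year(0.87e-3)
high = expected_substitutions_per_year(1.42e-3)

if __name__ == "__main__":
    print(f"{low:.1f} to {high:.1f} substitutions per genome per year")
```

Under the assumed genome length, the two estimates round to 16 and 27, which matches the range given in the abstract.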
[Three-dimensional genome organization: a lesson from the Polycomb-Group proteins].
Bantignies, Frédéric
2013-01-01
As more and more genomes are being explored and annotated, important features of three-dimensional (3D) genome organization are just being uncovered. In the light of what we know about Polycomb group (PcG) proteins, we will present the latest findings on this topic. The PcG proteins are well-conserved chromatin factors that repress transcription of numerous target genes. They bind the genome at specific sites, forming chromatin domains of associated histone modifications as well as higher-order chromatin structures. These 3D chromatin structures involve the interactions between PcG-bound regulatory regions at short- and long-range distances, and may significantly contribute to PcG function. Recent high throughput "Chromosome Conformation Capture" (3C) analyses have revealed many other higher order structures along the chromatin fiber, partitioning the genomes into well demarcated topological domains. This revealed an unprecedented link between linear epigenetic domains and chromosome architecture, which might be intimately connected to genome function. © Société de Biologie, 2013.
Genome Structure of the Legume, Lotus japonicus
Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi
2008-01-01
The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435
Evolutionary genomics and population structure of Entamoeba histolytica
Das, Koushik; Ganguly, Sandipan
2014-01-01
Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Study of the genome and evolution of this fascinating parasite will help us to understand the basis of its virulence and explain why, when and how it causes disease. In this review, we summarize current knowledge regarding the evolutionary genomics of E. histolytica and discuss its association with parasite phenotypes and differential pathogenic behavior. How genetic diversity reveals parasite population structure is also discussed. Open questions concerning its evolution and population structure that remain to be addressed are also highlighted. This significantly large amount of genomic data will improve our knowledge about this pathogenic species of Entamoeba. PMID:25505504
Alignment-free inference of hierarchical and reticulate phylogenomic relationships.
Bernard, Guillaume; Chan, Cheong Xin; Chan, Yao-Ban; Chua, Xin-Yi; Cong, Yingnan; Hogan, James M; Maetschke, Stefan R; Ragan, Mark A
2017-06-30
We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed. © The Author 2017. Published by Oxford University Press.
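To make the k-mer idea in the abstract above concrete, here is a minimal sketch of one simple alignment-free measure: the Jaccard distance between the sets of k-mers found in two sequences. It is chosen purely for illustration and is not necessarily one of the methods the review evaluates; the function names and the default k are assumptions.

```python
def kmers(sequence, k):
    """Return the set of all overlapping substrings of length k."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=3):
    """1 - |A & B| / |A | B| over the two k-mer sets; 0.0 when both sets are empty."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


if __name__ == "__main__":
    # No alignment is computed: only shared k-mer content is compared.
    print(jaccard_distance("ACGT", "ACGA", k=2))
```

A matrix of such pairwise distances could then be passed to a distance-based tree or network method. Because no multiple sequence alignment is needed, approaches of this kind remain applicable to unassembled or misassembled data.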
Sequence variation of the feline immunodeficiency virus genome and its clinical relevance.
Stickney, A L; Dunowska, M; Cave, N J
2013-06-08
The ongoing evolution of feline immunodeficiency virus (FIV) has resulted in the existence of a diverse continuum of viruses. FIV isolates differ with regard to their mutation and replication rates, plasma viral loads, cell tropism and the ability to induce apoptosis. Clinical disease in FIV-infected cats is also inconsistent. Genomic sequence variation of FIV is likely to be responsible for some of the variation in viral behaviour. The specific genetic sequences that influence these key viral properties remain to be determined. With knowledge of the specific key determinants of pathogenicity, there is the potential for veterinarians in the future to apply this information for prognostic purposes. Genomic sequence variation of FIV also presents an obstacle to effective vaccine development. Most challenge studies demonstrate acceptable efficacy of a dual-subtype FIV vaccine (Fel-O-Vax FIV) against FIV infection under experimental settings; however, vaccine efficacy in the field still remains to be proven. It is important that we discover the key determinants of immunity induced by this vaccine; such data would complement vaccine field efficacy studies and provide the basis to make informed recommendations on its use.
Kugelman, Jeffrey R; Sanchez-Lockhart, Mariano; Andersen, Kristian G; Gire, Stephen; Park, Daniel J; Sealfon, Rachel; Lin, Aaron E; Wohl, Shirlee; Sabeti, Pardis C; Kuhn, Jens H; Palacios, Gustavo F
2015-01-20
Until recently, Ebola virus (EBOV) was a rarely encountered human pathogen that caused disease among small populations with extraordinarily high lethality. At the end of 2013, EBOV initiated an unprecedented disease outbreak in West Africa that is still ongoing and has already caused thousands of deaths. Recent studies revealed the genomic changes this particular EBOV variant undergoes over time during human-to-human transmission. Here we highlight the genomic changes that might negatively impact the efficacy of currently available EBOV sequence-based candidate therapeutics, such as small interfering RNAs (siRNAs), phosphorodiamidate morpholino oligomers (PMOs), and antibodies. Ten of the observed mutations modify the sequence of the binding sites of monoclonal antibody (MAb) 13F6, MAb 1H3, MAb 6D8, MAb 13C6, and siRNA EK-1, VP24, and VP35 targets and might influence the binding efficacy of the sequence-based therapeutics, suggesting that their efficacy should be reevaluated against the currently circulating strain. Copyright © 2015 Kugelman, et al.
Morgan, Andrew P.; Didion, John P.; Doran, Anthony G.; Holt, James M.; McMillan, Leonard; Keane, Thomas M.; de Villena, Fernando Pardo-Manuel
2016-01-01
Wild-derived mouse inbred strains are becoming increasingly popular for complex traits analysis, evolutionary studies, and systems genetics. Here, we report the whole-genome sequencing of two wild-derived mouse inbred strains, LEWES/EiJ and ZALENDE/EiJ, of Mus musculus domesticus origin. These two inbred strains were selected based on their geographic origin, karyotype, and use in ongoing research. We generated 14× and 18× coverage sequence, respectively, and discovered over 1.1 million novel variants, most of which are private to one of these strains. This report expands the number of wild-derived inbred genomes in the Mus genus from six to eight. The sequence variation can be accessed via an online query tool; variant calls (VCF format) and alignments (BAM format) are available for download from a dedicated ftp site. Finally, the sequencing data have also been stored in a lossless, compressed, and indexed format using the multi-string Burrows-Wheeler transform. All data can be used without restriction. PMID:27765810
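The Burrows-Wheeler transform mentioned in the abstract above is lossless because it can be inverted exactly. The sketch below shows the plain single-string transform and its inverse using naive sorting. It is not the multi-string implementation used for the mouse data, and the function names and the "$" terminator are assumptions made for this example.

```python
def bwt(text, terminator="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if terminator in text:
        raise ValueError("terminator must not occur in the input text")
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, terminator="$"):
    """Rebuild the original text by repeatedly prepending and re-sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(char + row for char, row in zip(transformed, table))
    # The row ending in the terminator is the original text plus terminator.
    original = next(row for row in table if row.endswith(terminator))
    return original[:-1]


if __name__ == "__main__":
    encoded = bwt("GATTACA")
    print(encoded, inverse_bwt(encoded))
```

The transform tends to group identical characters into runs, which helps compression, and the round trip recovers the input exactly. Production tools use suffix-array construction and indexed structures instead of the quadratic sorting shown here.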
Genome Editing of Structural Variations: Modeling and Gene Correction.
Park, Chul-Yong; Sung, Jin Jea; Kim, Dong-Wook
2016-07-01
The analysis of chromosomal structural variations (SVs), such as inversions and translocations, was made possible by the completion of the human genome project and the development of genome-wide sequencing technologies. SVs contribute to genetic diversity and evolution, although some SVs can cause diseases such as hemophilia A in humans. Genome engineering technology using programmable nucleases (e.g., ZFNs, TALENs, and CRISPR/Cas9) has been rapidly developed, enabling precise and efficient genome editing for SV research. Here, we review advances in modeling and gene correction of SVs, focusing on inversion, translocation, and nucleotide repeat expansion. Copyright © 2016 Elsevier Ltd. All rights reserved.
Measuring cancer evolution from the genome.
Graham, Trevor A; Sottoriva, Andrea
2017-01-01
The temporal dynamics of cancer evolution remain elusive, because it is impractical to longitudinally observe cancers unperturbed by treatment. Consequently, our knowledge of how cancers grow largely derives from inferences made from a single point in time - the endpoint in the cancer's evolution, when it is removed from the body and studied in the laboratory. Fortuitously however, the cancer genome, by virtue of ongoing mutations that uniquely mark clonal lineages within the tumour, provides a rich, yet surreptitious, record of cancer development. In this review, we describe how a cancer's genome can be analysed to reveal the temporal history of mutation and selection, and discuss why both selective and neutral evolution feature prominently in carcinogenesis. We argue that selection in cancer can only be properly studied once we have some understanding of what the absence of selection looks like. We review the data describing punctuated evolution in cancer, and reason that punctuated phenotype evolution is consistent with both gradual and punctuated genome evolution. We conclude that, to map and predict evolutionary trajectories during carcinogenesis, it is critical to better understand the relationship between genotype change and phenotype change. Copyright © 2016 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
Illumina Production Sequencing at the DOE Joint Genome Institute - Workflow and Optimizations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tarver, Angela; Fern, Alison; Diego, Matthew San
2010-06-18
The U.S. Department of Energy (DOE) Joint Genome Institute's (JGI) Production Sequencing group is committed to the generation of high-quality genomic DNA sequence to support the DOE mission areas of renewable energy generation, global carbon management, and environmental characterization and clean-up. Within the JGI's Production Sequencing group, the Illumina Genome Analyzer pipeline has been established as one of three sequencing platforms, along with Roche/454 and ABI/Sanger. Optimization of the Illumina pipeline has been ongoing with the aim of continual process improvement of the laboratory workflow. These process improvement projects are being led by the JGI's Process Optimization, Sequencing Technologies, Instrumentation & Engineering, and the New Technology Production groups. Primary focus has been on improving the procedural ergonomics and the technicians' operating environment, reducing manually intensive technician operations with different tools, reducing associated production costs, and improving the overall process and generated sequence quality. The U.S. DOE JGI was established in 1997 in Walnut Creek, CA, to unite the expertise and resources of five national laboratories (Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest) along with HudsonAlpha Institute for Biotechnology. JGI is operated by the University of California for the U.S. DOE.
Enhancing knowledge discovery from cancer genomics data with Galaxy
Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.
2017-01-01
Abstract The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. PMID:28327945
Genetic Diversity in the Modern Horse Illustrated from Genome-Wide SNP Data
Petersen, Jessica L.; Mickelson, James R.; Cothran, E. Gus; Andersson, Lisa S.; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M.; Borges, Alexandre S.; Brama, Pieter; da Câmara Machado, Artur; Distl, Ottmar; Felicetti, Michela; Fox-Clipsham, Laura; Graves, Kathryn T.; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A.; Mikko, Sofia; Orr, Nicholas; Penedo, M. Cecilia T; Piercy, Richard J.; Raekallio, Marja; Rieder, Stefan; Røed, Knut H.; Silvestrelli, Maurizio; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; M. Wade, Claire; McCue, Molly E.
2013-01-01
Horses were domesticated from the Eurasian steppes 5,000–6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. PMID:23383025
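To make the FST calculations mentioned above more concrete, here is a minimal sketch of a fixation index for one biallelic SNP, computed as (HT - HS) / HT from per-breed allele frequencies with equal weighting. The function name `fst_biallelic` and the example frequencies are hypothetical, and this simple heterozygosity-based estimator is an assumption for illustration; the abstract does not say which estimator the study used.

```python
def fst_biallelic(allele_freqs):
    """FST for one biallelic SNP from per-population allele frequencies.

    allele_freqs: frequency of the same allele in each population (0..1).
    Populations are weighted equally, which is a simplifying assumption.
    """
    if not allele_freqs:
        raise ValueError("need at least one population")
    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic site: no differentiation measurable
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / n
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical frequencies of one allele in two breeds.
    print(fst_biallelic([0.2, 0.8]))
    print(fst_biallelic([0.5, 0.5]))
```

Identical frequencies give a value of zero, while populations fixed for different alleles give a value of one; genome-wide summaries usually combine many SNPs and correct for sample size.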
Enhancing knowledge discovery from cancer genomics data with Galaxy.
Albuquerque, Marco A; Grande, Bruno M; Ritch, Elie J; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K; Shah, Sohrab P; Boutros, Paul C; Morin, Ryan D
2017-05-01
The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. © The Author 2017. Published by Oxford University Press.
Pujar, Shashikant; O’Leary, Nuala A; Farrell, Catherine M; Mudge, Jonathan M; Wallin, Craig; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bult, Carol J; Frankish, Adam; Pruitt, Kim D
2018-01-01
Abstract The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. PMID:29126148
CoVaCS: a consensus variant calling system.
Chiara, Matteo; Gioiosa, Silvia; Chillemi, Giovanni; D'Antonio, Mattia; Flati, Tiziano; Picardi, Ernesto; Zambelli, Federico; Horner, David Stephen; Pesole, Graziano; Castrignanò, Tiziana
2018-02-05
The advent and ongoing development of next generation sequencing technologies (NGS) has led to a rapid increase in the rate of human genome re-sequencing data, paving the way for personalized genomics and precision medicine. The body of genome resequencing data is progressively increasing, underlining the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for identification of candidate causal mutations in diagnostic screens. Here we present CoVaCS, a fully automated, highly accurate system with a web based graphical interface for genotyping and variant annotation. Extensive tests on a gold standard benchmark data-set (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command line based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software. CoVaCS offers optimized pipelines integrating state of the art tools for variant calling and annotation for whole genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca, and offers the speed of an HPC computing facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all the analyses are performed automatically, allowing high reproducibility of the results. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at: https://bioinformatics.cineca.it/covacs.
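The idea of a consensus call-set can be sketched as a simple vote across callers: a variant is kept only if enough independent tools report it. The abstract does not describe the actual consensus rule used by CoVaCS, so the support threshold, the function name `consensus_calls`, and the tuple representation of a variant below are all assumptions made for illustration.

```python
from collections import Counter


def consensus_calls(call_sets, min_support=2):
    """Return variants reported by at least `min_support` callers.

    call_sets: one iterable of variants per caller, where each variant is a
    hashable tuple such as (chrom, pos, ref, alt). Hypothetical format.
    """
    support = Counter()
    for calls in call_sets:
        # set() ensures a caller counts at most once per variant.
        support.update(set(calls))
    return sorted(v for v, n in support.items() if n >= min_support)


if __name__ == "__main__":
    # Toy call-sets from three hypothetical callers.
    caller_a = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")]
    caller_b = [("chr1", 100, "A", "G"), ("chr2", 40, "G", "A")]
    caller_c = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")]
    for variant in consensus_calls([caller_a, caller_b, caller_c]):
        print(variant)
```

Raising `min_support` trades sensitivity for specificity; a real pipeline would also need to normalize variant representation before comparing calls across tools.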
Hall, William A; Bergom, Carmen; Thompson, Reid F; Baschnagel, Andrew M; Vijayakumar, Srinivasan; Willers, Henning; Li, X Allen; Schultz, Christopher J; Wilson, George D; West, Catharine M L; Capala, Jacek; Coleman, C Norman; Torres-Roca, Javier F; Weidhaas, Joanne; Feng, Felix Y
2018-06-01
To summarize important talking points from a 2016 symposium focusing on real-world challenges to advancing precision medicine in radiation oncology, and to help radiation oncologists navigate the practical challenges of precision radiation oncology. The American Society for Radiation Oncology, American Association of Physicists in Medicine, and National Cancer Institute cosponsored a meeting on precision medicine in radiation oncology. In June 2016 numerous scientists, clinicians, and physicists convened at the National Institutes of Health to discuss challenges and future directions toward personalized radiation therapy. Various breakout sessions were held to discuss particular components and approaches to the implementation of personalized radiation oncology. This article summarizes the genomically guided radiation therapy breakout session. A summary of existing genomic data enabling personalized radiation therapy, ongoing clinical trials, current challenges, and future directions was collected. The group attempted to provide both a current overview of data that radiation oncologists could use to personalize therapy, along with data that are anticipated in the coming years. It seems apparent from the provided review that a considerable opportunity exists to truly bring genomically guided radiation therapy into clinical reality. Genomically guided radiation therapy is a necessity that must be embraced in the coming years. Incorporating these data into treatment recommendations will provide radiation oncologists with a substantial opportunity to improve outcomes for numerous cancer patients. More research focused on this topic is needed to bring genomic signatures into routine standard of care. Published by Elsevier Inc.
Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.
2015-01-01
With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for orthology identification combined with extensive manual curation, including incorporation of the results of several completed and ongoing research projects in archaeal genomics. A new level of classification is introduced, superclusters that unite two or more arCOGs and more completely reflect gene family evolution than individual, disconnected arCOGs. Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. In addition to their utility for genome annotation, arCOGs also are a platform for phylogenomic analysis. We explore this aspect of arCOGs by performing a phylogenomic study of the Thermococci that are traditionally viewed as the basal branch of the Euryarchaeota. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea. PMID:25764277
Aquatic Plant Genomics: Advances, Applications, and Prospects
Li, Gaojie; Yang, Jingjing
2017-01-01
Genomics is a discipline in genetics that studies the genome composition of organisms and the precise structure of genes and their expression and regulation. Genomics research has resolved many problems where other biological methods have failed. Here, we summarize advances in aquatic plant genomics with a focus on molecular markers, the genes related to photosynthesis and stress tolerance, comparative study of genomes and genome/transcriptome sequencing technology. PMID:28900619
USDA-ARS's Scientific Manuscript database
Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...
Sun, Sangrong; Wang, Jinpeng; Yu, Jigao; Meng, Fanbo; Xia, Ruiyan; Wang, Li; Wang, Zhenyi; Ge, Weina; Liu, Xiaojian; Li, Yuxian; Liu, Yinzhe; Yang, Nanshan; Wang, Xiyin
2017-01-01
Grass genomes are complicated structures as they share a common tetraploidization, and particular genomes have been further affected by extra polyploidizations. These events and the following genomic re-patternings have resulted in a complex, interweaving gene homology both within a genome and between genomes. Accurately deciphering the structure of these complicated plant genomes would help us better understand their compositional and functional evolution at multiple scales. Here, we build on our previous research by performing a hierarchical alignment of the common wheat genome vis-à-vis eight other sequenced grass genomes with the most up-to-date assemblies and annotations. With these data, we constructed a list of the homologous genes and then, in a layer-by-layer process, separated their orthology and paralogy, which were established by speciations and recursive polyploidizations, respectively. Compared with the other grasses, the far fewer collinear outparalogous genes within each of the three subgenomes of common wheat suggest that homoeologous recombination and genomic fractionation should have occurred after its formation. In sum, this work contributes to the establishment of an important and timely comparative genomics platform for researchers in the grass community and possibly beyond. The homologous gene list can be found in the Supplemental material. PMID:28912789
Genetics and evolution of triatomines: from phylogeny to vector control
Gourbière, S; Dorn, P; Tripet, F; Dumonteil, E
2012-01-01
Triatomines are hemipteran bugs acting as vectors of the protozoan parasite Trypanosoma cruzi. This parasite causes Chagas disease, one of the major parasitic diseases in the Americas. Studies of triatomine genetics and evolution have been particularly useful in the design of rational vector control strategies, and are reviewed here. The phylogeography of several triatomine species is now slowly emerging, and the struggle to reconcile the phenotypic, phylogenetic, ecological and epidemiological species concepts makes for a very dynamic field. Population genetic studies using different markers indicate a wide range of population structures, depending on the triatomine species, ranging from highly fragmented to mobile, interbreeding populations. Triatomines transmit T. cruzi in the context of complex interactions between the insect vectors, their bacterial symbionts and the parasites; however, an integrated view of the significance of these interactions in triatomine biology, evolution and in disease transmission is still lacking. The development of novel genetic markers, together with the ongoing sequencing of the Rhodnius prolixus genome and more integrative studies, will provide key tools to expanding our understanding of these important insect vectors and allow the design of improved vector control strategies. PMID:21897436
USDA-ARS's Scientific Manuscript database
Genomic structural variations are an important source of genetic diversity. Copy number variations (CNVs), gains and losses of large regions of genomic sequence between individuals of a species, are known to be associated with both diseases and phenotypic traits. Deeply sequenced genomes are often u...
Insights into structural variations and genome rearrangements in prokaryotic genomes.
Periwal, Vinita; Scaria, Vinod
2015-01-01
Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Schlautman, Brandon; Deutsch, Joseph; Salazar, Walter; Hernandez-Ochoa, Miguel; Grygleski, Edward; Steffan, Shawn; Iorizzo, Massimo; Polashock, James; Vorsa, Nicholi; Zalapa, Juan
2016-06-13
The application of genotyping by sequencing (GBS) approaches, combined with data imputation methodologies, is narrowing the genetic knowledge gap between major and understudied, minor crops. GBS is an excellent tool to characterize the genomic structure of recently domesticated (~200 years) and understudied species, such as cranberry (Vaccinium macrocarpon Ait.), by generating large numbers of markers for genomic studies such as genetic mapping. We identified 10842 potentially mappable single nucleotide polymorphisms (SNPs) in a cranberry pseudo-testcross population wherein 5477 SNPs and 211 short sequence repeats (SSRs) were used to construct a high density linkage map in cranberry of which a total of 4849 markers were mapped. Recombination frequency, linkage disequilibrium (LD), and segregation distortion at the genomic level in the parental and integrated linkage maps were characterized for the first time in cranberry. SSR markers, used as the backbone in the map, revealed high collinearity with previously published linkage maps. The 4849 point map consisted of twelve linkage groups spanning 1112 cM, which anchored 2381 nuclear scaffolds accounting for ~13 Mb of the estimated 470 Mb cranberry genome. Bin mapping identified 592 and 672 unique bins in the parentals and a total of 1676 unique marker positions in the integrated map. Synteny analyses comparing the order of anchored cranberry scaffolds to their homologous positions in kiwifruit, grape, and coffee genomes provided initial evidence of homology between cranberry and closely related species. GBS data was used to rapidly saturate the cranberry genome with markers in a pseudo-testcross population. Collinearity between the present saturated genetic map and previous cranberry SSR maps suggests that the SNP locations represent accurate marker order and chromosome structure of the cranberry genome. SNPs greatly improved current marker genome coverage, which allowed for genome-wide structure investigations such as segregation distortion, recombination, linkage disequilibrium, and synteny analyses. In the future, GBS can be used to accelerate cranberry molecular breeding through QTL mapping and genome-wide association studies (GWAS).
Deep ancestry of programmed genome rearrangement in lampreys.
Timoshevskiy, Vladimir A; Lampman, Ralph T; Hess, Jon E; Porter, Laurie L; Smith, Jeramiah J
2017-09-01
In most multicellular organisms, the structure and content of the genome is rigorously maintained over the course of development. However some species have evolved genome biologies that permit, or require, developmentally regulated changes in the physical structure and content of the genome (programmed genome rearrangement: PGR). Relatively few vertebrates are known to undergo PGR, although all agnathans surveyed to date (several hagfish and one lamprey: Petromyzon marinus) show evidence of large scale PGR. To further resolve the ancestry of PGR within vertebrates, we developed probes that allow simultaneous tracking of nearly all sequences eliminated by PGR in P. marinus and a second lamprey species (Entosphenus tridentatus). These comparative analyses reveal conserved subcellular structures (lagging chromatin and micronuclei) associated with PGR and provide the first comparative embryological evidence in support of the idea that PGR represents an ancient and evolutionarily stable strategy for regulating inherent developmental/genetic conflicts between germline and soma. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome-wide Association Study Identifies Loci for the Polled Phenotype in Yak
Wu, Xiaoyun; Wang, Kun; Ding, Xuezhi; Wang, Mingcheng; Chu, Min; Xie, Xiuyue; Qiu, Qiang; Yan, Ping
2016-01-01
The absence of horns, known as the polled phenotype, is an economically important trait in modern yak husbandry, but the genomic structure and genetic basis of this phenotype have yet to be discovered. Here, we conducted a genome-wide association study with a panel of 10 horned and 10 polled yaks using whole genome sequencing. We mapped the POLLED locus to a 200-kb interval, which comprises three protein-coding genes. Further characterization of the candidate region showed recent artificial selection signals resulting from the breeding process. We suggest that expressional variations rather than structural variations in protein probably contribute to the polled phenotype. Our results not only represent the first and important step in establishing the genomic structure of the polled region in yak, but also add to our understanding of the polled trait in bovid species. PMID:27389700
24 CFR 35.1355 - Ongoing lead-based paint maintenance and reevaluation activities.
Code of Federal Regulations, 2012 CFR
2012-04-01
... Secretary, Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Methods and Standards for Lead-Paint Hazard Evaluation and Hazard Reduction Activities... 24 Housing and Urban Development 1 2012-04-01 2012-04-01 false Ongoing lead-based paint...
24 CFR 35.1355 - Ongoing lead-based paint maintenance and reevaluation activities.
Code of Federal Regulations, 2013 CFR
2013-04-01
... Secretary, Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Methods and Standards for Lead-Paint Hazard Evaluation and Hazard Reduction Activities... 24 Housing and Urban Development 1 2013-04-01 2013-04-01 false Ongoing lead-based paint...
24 CFR 35.1355 - Ongoing lead-based paint maintenance and reevaluation activities.
Code of Federal Regulations, 2011 CFR
2011-04-01
... Secretary, Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Methods and Standards for Lead-Paint Hazard Evaluation and Hazard Reduction Activities... 24 Housing and Urban Development 1 2011-04-01 2011-04-01 false Ongoing lead-based paint...
24 CFR 35.1355 - Ongoing lead-based paint maintenance and reevaluation activities.
Code of Federal Regulations, 2014 CFR
2014-04-01
... Secretary, Department of Housing and Urban Development LEAD-BASED PAINT POISONING PREVENTION IN CERTAIN RESIDENTIAL STRUCTURES Methods and Standards for Lead-Paint Hazard Evaluation and Hazard Reduction Activities... 24 Housing and Urban Development 1 2014-04-01 2014-04-01 false Ongoing lead-based paint...
De Chiara, Matteo; Hood, Derek; Muzzi, Alessandro; Pickard, Derek J.; Perkins, Tim; Pizza, Mariagrazia; Dougan, Gordon; Rappuoli, Rino; Moxon, E. Richard; Soriani, Marco; Donati, Claudio
2014-01-01
One of the main hurdles for the development of an effective and broadly protective vaccine against nonencapsulated isolates of Haemophilus influenzae (NTHi) lies in the genetic diversity of the species, which renders extremely difficult the identification of cross-protective candidate antigens. To assess whether a population structure of NTHi could be defined, we performed genome sequencing of a collection of diverse clinical isolates representative of both carriage and disease and of the diversity of the natural population. Analysis of the distribution of polymorphic sites in the core genome and of the composition of the accessory genome defined distinct evolutionary clades and supported a predominantly clonal evolution of NTHi, with the majority of genetic information transmitted vertically within lineages. A correlation between the population structure and the presence of selected surface-associated proteins and lipooligosaccharide structure, known to contribute to virulence, was found. This high-resolution, genome-based population structure of NTHi provides the foundation to obtain a better understanding of NTHi adaptation to the host as well as its commensal and virulence behavior, that could facilitate intervention strategies against disease caused by this important human pathogen. PMID:24706866
De Chiara, Matteo; Hood, Derek; Muzzi, Alessandro; Pickard, Derek J; Perkins, Tim; Pizza, Mariagrazia; Dougan, Gordon; Rappuoli, Rino; Moxon, E Richard; Soriani, Marco; Donati, Claudio
2014-04-08
One of the main hurdles for the development of an effective and broadly protective vaccine against nonencapsulated isolates of Haemophilus influenzae (NTHi) lies in the genetic diversity of the species, which renders extremely difficult the identification of cross-protective candidate antigens. To assess whether a population structure of NTHi could be defined, we performed genome sequencing of a collection of diverse clinical isolates representative of both carriage and disease and of the diversity of the natural population. Analysis of the distribution of polymorphic sites in the core genome and of the composition of the accessory genome defined distinct evolutionary clades and supported a predominantly clonal evolution of NTHi, with the majority of genetic information transmitted vertically within lineages. A correlation between the population structure and the presence of selected surface-associated proteins and lipooligosaccharide structure, known to contribute to virulence, was found. This high-resolution, genome-based population structure of NTHi provides the foundation to obtain a better understanding of NTHi adaptation to the host as well as its commensal and virulence behavior, that could facilitate intervention strategies against disease caused by this important human pathogen.
Genome alignment with graph data structures: a comparison
2014-01-01
Background Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. PMID:24712884
Visualization of DNA Replication in the Vertebrate Model System DT40 using the DNA Fiber Technique
Schwab, Rebekka A.V.; Niedzwiedz, Wojciech
2011-01-01
Maintenance of replication fork stability is of utmost importance for dividing cells to preserve viability and prevent disease. The processes involved not only ensure faithful genome duplication in the face of endogenous and exogenous DNA damage but also prevent genomic instability, a recognized causative factor in tumor development. Here, we describe a simple and cost-effective fluorescence microscopy-based method to visualize DNA replication in the avian B-cell line DT40. This cell line provides a powerful tool to investigate protein function in vivo by reverse genetics in vertebrate cells [1]. DNA fiber fluorography in DT40 cells lacking a specific gene allows one to elucidate the function of this gene product in DNA replication and genome stability. Traditional methods to analyze replication fork dynamics in vertebrate cells rely on measuring the overall rate of DNA synthesis in a population of pulse-labeled cells. This is a quantitative approach and does not allow for qualitative analysis of parameters that influence DNA synthesis. In contrast, the rate of movement of active forks can be followed directly when using the DNA fiber technique [2-4]. In this approach, nascent DNA is labeled in vivo by incorporation of halogenated nucleotides (Fig 1A). Subsequently, individual fibers are stretched onto a microscope slide, and the labeled DNA replication tracts are stained with specific antibodies and visualized by fluorescence microscopy (Fig 1B). Initiation of replication as well as fork directionality is determined by the consecutive use of two differently modified analogues. Furthermore, the dual-labeling approach allows for quantitative analysis of parameters that influence DNA synthesis during the S-phase, i.e. replication structures such as ongoing and stalled forks, replication origin density as well as fork terminations. Finally, the experimental procedure can be accomplished within a day, and requires only general laboratory equipment and a fluorescence microscope. PMID:22064662
Firacative, Carolina; Roe, Chandler C.; Malik, Richard; Ferreira-Paim, Kennio; Escandón, Patricia; Sykes, Jane E.; Castañón-Olivares, Laura Rocío; Contreras-Peres, Cudberto; Samayoa, Blanca; Sorrell, Tania C.; Castañeda, Elizabeth; Lockhart, Shawn R.; Engelthaler, David M.; Meyer, Wieland
2016-01-01
The emerging pathogen Cryptococcus gattii causes life-threatening disease in immunocompetent and immunocompromised hosts. Of the four major molecular types (VGI-VGIV), the molecular type VGIII has recently emerged as a cause of disease in otherwise healthy individuals, prompting a need to investigate its population genetic structure to understand if there are potential genotype-dependent characteristics in its epidemiology, environmental niche(s), host range and clinical features of disease. Multilocus sequence typing (MLST) of 122 clinical, environmental and veterinary C. gattii VGIII isolates from Australia, Colombia, Guatemala, Mexico, New Zealand, Paraguay, USA and Venezuela, and whole genome sequencing (WGS) of 60 isolates representing all established MLST types identified four divergent sub-populations. The majority of the isolates belong to two main clades, corresponding either to serotype B or C, indicating an ongoing species evolution. Both major clades included clinical, environmental and veterinary isolates. The C. gattii VGIII population was genetically highly diverse, with minor differences between countries, isolation source, serotype and mating type. Little to no recombination was found between the two major groups, serotype B and C, at the whole and mitochondrial genome level. C. gattii VGIII is widespread in the Americas, with sporadic cases occurring elsewhere. WGS revealed Mexico and USA as a likely origin of the serotype B VGIII population and Colombia as a possible origin of the serotype C VGIII population. Serotype B isolates are more virulent than serotype C isolates in a murine model of infection, causing predominantly pulmonary cryptococcosis. No specific link between genotype and virulence was observed. Antifungal susceptibility testing against six antifungal drugs revealed that serotype B isolates are more susceptible to azoles than serotype C isolates, highlighting the importance of strain typing to guide effective treatment to improve the disease outcome. PMID:27494185
Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R
2014-01-01
The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
Structure and variation of the mitochondrial genome of fishes.
Satoh, Takashi P; Miya, Masaki; Mabuchi, Kohji; Nishida, Mutsumi
2016-09-07
The mitochondrial (mt) genome has been used as an effective tool for phylogenetic and population genetic analyses in vertebrates. However, the structure and variability of the vertebrate mt genome are not well understood. A potential strategy for improving our understanding is to conduct a comprehensive comparative study of large mt genome data. The aim of this study was to characterize the structure and variability of the fish mt genome through comparative analysis of large datasets. An analysis of the secondary structure of proteins for 250 fish species (248 ray-finned and 2 cartilaginous fishes) illustrated that cytochrome c oxidase subunits (COI, COII, and COIII) and a cytochrome bc1 complex subunit (Cyt b) had substantial amino acid conservation. Among the four proteins, COI was the most conserved, as more than half of all amino acid sites were invariable among the 250 species. Our models identified 43 and 58 stems within 12S rRNA and 16S rRNA, respectively, with larger numbers than proposed previously for vertebrates. The models also identified 149 and 319 invariable sites in 12S rRNA and 16S rRNA, respectively, in all fishes. In particular, the present result verified that a region corresponding to the peptidyl transferase center in prokaryotic 23S rRNA, which is homologous to mt 16S rRNA, is also conserved in fish mt 16S rRNA. Concerning the gene order, we found 35 variations (in 32 families) that deviated from the common gene order in vertebrates. These gene rearrangements were mostly observed in the area spanning the ND5 gene to the control region as well as two tRNA gene cluster regions (IQM and WANCY regions). Although many of such gene rearrangements were unique to a specific taxon, some were shared polyphyletically between distantly related species. Through a large-scale comparative analysis of 250 fish species mt genomes, we elucidated various structural aspects of the fish mt genome and the encoded genes. 
The present results will be important for understanding functions of the mt genome and developing programs for nucleotide sequence analysis. This study demonstrated the significance of extensive comparisons for understanding the structure of the mt genome.
RNA 3D Modules in Genome-Wide Predictions of RNA 2D Structure
Theis, Corinna; Zirbel, Craig L.; zu Siederdissen, Christian Höner; Anthon, Christian; Hofacker, Ivo L.; Nielsen, Henrik; Gorodkin, Jan
2015-01-01
Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D module prediction tools and apply them on a 13-way vertebrate sequence-based alignment. We find that RNA 3D modules predicted by metaRNAmodules and JAR3D are significantly enriched in the screened windows compared to their shuffled counterparts. The initially estimated FDR of 47.0% is lowered to below 25% when certain 3D module predictions are present in the window of the 2D prediction. We discuss the implications and prospects for further development of computational strategies for detection of RNA 2D structure in genomic sequence. PMID:26509713
Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick
2014-02-01
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y. F.; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie
2014-01-01
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here. PMID:24284329
Characterizing polymorphic inversions in human genomes by single-cell sequencing
Sanders, Ashley D.; Hills, Mark; Porubský, David; Guryev, Victor; Falconer, Ester; Lansdorp, Peter M.
2016-01-01
Identifying genomic features that differ between individuals and cells can help uncover the functional variants that drive phenotypes and disease susceptibilities. For this, single-cell studies are paramount, as it becomes increasingly clear that the contribution of rare but functional cellular subpopulations is important for disease prognosis, management, and progression. Until now, studying these associations has been challenged by our inability to map structural rearrangements accurately and comprehensively. To overcome this, we coupled single-cell sequencing of DNA template strands (Strand-seq) with custom analysis software to rapidly discover, map, and genotype genomic rearrangements at high resolution. This allowed us to explore the distribution and frequency of inversions in a heterogeneous cell population, identify several polymorphic domains in complex regions of the genome, and locate rare alleles in the reference assembly. We then mapped the entire genomic complement of inversions within two unrelated individuals to characterize their distinct inversion profiles and built a nonredundant global reference of structural rearrangements in the human genome. The work described here provides a powerful new framework to study structural variation and genomic heterogeneity in single-cell samples, whether from individuals for population studies or tissue types for biomarker discovery. PMID:27472961
Structure-seq2: sensitive and accurate genome-wide profiling of RNA structure in vivo
Ritchey, Laura E.; Su, Zhao; Tang, Yin; Tack, David C.
2017-01-01
RNA serves many functions in biology such as splicing, temperature sensing, and innate immunity. These functions are often determined by the structure of RNA. There is thus a pressing need to understand RNA structure and how it changes during diverse biological processes both in vivo and genome-wide. Here, we present Structure-seq2, which provides nucleotide-resolution RNA structural information in vivo and genome-wide. This optimized version of our original Structure-seq method increases sensitivity by at least 4-fold and improves data quality by minimizing formation of a deleterious by-product, reducing ligation bias, and improving read coverage. We also present a variation of Structure-seq2 in which a biotinylated nucleotide is incorporated during reverse transcription, which greatly facilitates the protocol by eliminating two PAGE purification steps. We benchmark Structure-seq2 on both mRNA and rRNA structure in rice (Oryza sativa). We demonstrate that Structure-seq2 can lead to new biological insights. Our Structure-seq2 datasets uncover hidden breaks in chloroplast rRNA and identify a previously unreported N1-methyladenosine (m1A) in a nuclear-encoded Oryza sativa rRNA. Overall, Structure-seq2 is a rapid, sensitive, and unbiased method to probe RNA in vivo and genome-wide that facilitates new insights into RNA biology. PMID:28637286
Comparative Reannotation of 21 Aspergillus Genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Salamov, Asaf; Riley, Robert; Kuo, Alan
2013-03-08
We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of various kinds, e.g. missing genes, incorrect exon-intron structures, 'chimeras' that fuse two or more real genes, or real genes split into two or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2% per genome), supported by comparative analysis, additionally correcting ~18% of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
Sorimachi, Kenji; Okayasu, Teiji; Ohhira, Shuji
2015-04-01
Normalized nucleotide and amino acid contents of complete genome sequences can be visualized as radar charts. The shapes of these charts depict the characteristics of an organism's genome. The normalized values calculated from the genome sequence theoretically exclude experimental errors. Further, because normalization is independent of both target size and kind, this procedure is applicable not only to single genes but also to whole genomes, which consist of a huge number of different genes. In this review, we discuss the applications of the normalization of the nucleotide and predicted amino acid contents of complete genomes to the investigation of genome structure and to evolutionary research from primitive organisms to Homo sapiens. Some of the results could never have been obtained from the analysis of individual nucleotide or amino acid sequences but were revealed only after the normalization of nucleotide and amino acid contents was applied to genome research. The discovery that genome structure was homogeneous was obtained only after normalization methods were applied to the nucleotide or predicted amino acid contents of genome sequences. Normalization procedures are also applicable to evolutionary research. Thus, normalization of the contents of whole genomes is a useful procedure that can help to characterize organisms.
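To make the normalization idea in this abstract concrete, here is a minimal Python sketch. It assumes that "normalized content" simply means each nucleotide count divided by the total number of counted bases; the review's exact procedure may differ. The function name `normalized_nucleotide_content` and the toy sequence are hypothetical and not taken from the source.

```python
from collections import Counter


def normalized_nucleotide_content(sequence):
    """Return the fraction of A, C, G and T among the counted bases.

    Assumption (not from the source): normalization is count / total,
    where total covers only A, C, G and T; other symbols such as N
    are ignored.
    """
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


# Toy input for illustration only; a real analysis would read a genome file.
toy_sequence = "ATGCGCATTA"
content = normalized_nucleotide_content(toy_sequence)
for base, fraction in content.items():
    print(base, round(fraction, 3))
```

Because the values are fractions of the total, they can be compared between a single gene and a whole genome regardless of sequence length, which is the property the abstract relies on.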
Genetic diversity and population structure of Musa accessions in ex situ conservation
2013-01-01
Background Banana cultivars are mostly derived from hybridization between wild diploid subspecies of Musa acuminata (A genome) and M. balbisiana (B genome), and they exhibit various levels of ploidy and genomic constitution. The Embrapa ex situ Musa collection contains over 220 accessions, of which only a few have been genetically characterized. Knowledge regarding the genetic relationships and diversity between modern cultivars and wild relatives would assist in conservation and breeding strategies. Our objectives were to determine the genomic constitution based on Internal Transcribed Spacer (ITS) region polymorphism and the ploidy of all accessions by flow cytometry and to investigate the population structure of the collection using Simple Sequence Repeat (SSR) loci as co-dominant markers based on Structure software, not previously performed in Musa. Results From the 221 accessions analyzed by flow cytometry, the correct ploidy was confirmed or established for 212 (95.9%), whereas digestion of the ITS region confirmed the genomic constitution of 209 (94.6%). Neighbor-joining clustering analysis derived from SSR binary data allowed the detection of two major groups, essentially distinguished by the presence or absence of the B genome, while subgroups were formed according to the genomic composition and commercial classification. The co-dominant nature of SSR was explored to analyze the structure of the population based on a Bayesian approach, detecting 21 subpopulations. Most of the subpopulations were in agreement with the clustering analysis. Conclusions The data generated by flow cytometry, ITS and SSR supported the hypothesis about the occurrence of homeologue recombination between A and B genomes, leading to discrepancies in the number of sets or portions from each parental genome. These phenomena have been largely disregarded in the evolution of banana, as the “single-step domestication” hypothesis had long predominated. These findings will have an impact in future breeding approaches. Structure analysis enabled the efficient detection of ancestry of recently developed tetraploid hybrids by breeding programs, and for some triploids. However, for the main commercial subgroups, Structure appeared to be less efficient to detect the ancestry in diploid groups, possibly due to sampling restrictions. The possibility of inferring the membership among accessions to correct the effects of genetic structure opens possibilities for its use in marker-assisted selection by association mapping. PMID:23497122
Genomic variation at the tips of the adaptive radiation of Darwin's finches.
Chaves, Jaime A; Cooper, Elizabeth A; Hendry, Andrew P; Podos, Jeffrey; De León, Luis F; Raeymaekers, Joost A M; MacMillan, W Owen; Uy, J Albert C
2016-11-01
Adaptive radiation unfolds as selection acts on the genetic variation underlying functional traits. The nature of this variation can be revealed by studying the tips of an ongoing adaptive radiation. We studied genomic variation at the tips of the Darwin's finch radiation; specifically focusing on polymorphism within, and variation among, three sympatric species of the genus Geospiza. Using restriction site-associated DNA (RAD-seq), we characterized 32 569 single-nucleotide polymorphisms (SNPs), from which 11 outlier SNPs for beak and body size were uncovered by a genomewide association study (GWAS). Principal component analysis revealed that these 11 SNPs formed four statistically linked groups. Stepwise regression then revealed that the first PC score, which included 6 of the 11 top SNPs, explained over 80% of the variation in beak size, suggesting that selection on these traits influences multiple correlated loci. The two SNPs most strongly associated with beak size were near genes associated with beak morphology across deeper branches of the radiation: delta-like 1 homologue (DLK1) and high-mobility group AT-hook 2 (HMGA2). Our results suggest that (i) key adaptive traits are associated with a small fraction of the genome (11 of 32 569 SNPs), (ii) SNPs linked to the candidate genes are dispersed throughout the genome (on several chromosomes), and (iii) micro- and macro-evolutionary variation (roots and tips of the radiation) involve some shared and some unique genomic regions. © 2016 John Wiley & Sons Ltd.
Dayer, Mohammad Reza; Dayer, Mohammad Saaid; Rezatofighi, Seyedeh Elham
2015-04-01
The Crimean-Congo Hemorrhagic Fever (CCHF) is an infectious disease of high virulence and mortality caused by a negative sense RNA nairovirus. The genomic RNA of CCHFV is enwrapped by its nucleoprotein. Positively charged residues on CCHFV nucleoprotein provide multiple binding sites to facilitate genomic RNA encapsidation. In the present work, we investigated the mechanism underlying preferential packaging of the negative sense genomic RNA by CCHFV nucleoprotein in the presence of host cell RNAs during viral assembly. The work included genome sequence analyses for different families of negative and positive sense RNA viruses, using serial docking experiments and molecular dynamic simulations. Our results indicated that the main determinant parameter of the nucleoprotein binding affinity for negative sense RNA is the ratio of purine/pyrimidine in the RNA molecule. A negative sense RNA with a purine/pyrimidine ratio (>1) higher than that of a positive sense RNA (<1) exhibits higher affinity for the nucleoprotein. Our calculations revealed that a negative sense RNA expresses about 0.5 kJ/mol higher binding energy per nucleotide compared to a positive sense RNA. This energy difference produces a binding energy high enough to make the negative sense RNA the preferred substrate for packaging by CCHFV nucleoprotein in the presence of cellular or complementary positive sense RNAs. The outcome of this study may contribute to ongoing research on other viral diseases caused by negative sense RNA viruses such as Ebola virus, which poses a security threat to all humanity.
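The purine/pyrimidine ratio that this abstract identifies as the main determinant can be illustrated with a short Python sketch. It is not the authors' analysis pipeline: the helper names `purine_pyrimidine_ratio` and `reverse_complement` and the six-nucleotide toy RNA are hypothetical. The sketch only relies on standard base pairing (A with U, G with C), under which every purine pairs with a pyrimidine, so a strand and its complement have reciprocal ratios.

```python
def purine_pyrimidine_ratio(rna):
    """Ratio of purines (A, G) to pyrimidines (C, U) in an RNA string."""
    rna = rna.upper()
    purines = rna.count("A") + rna.count("G")
    pyrimidines = rna.count("C") + rna.count("U")
    if pyrimidines == 0:
        raise ValueError("sequence contains no pyrimidines")
    return purines / pyrimidines


def reverse_complement(rna):
    """Complementary strand, read 5' to 3', using A-U and G-C pairing."""
    pairing = str.maketrans("ACGU", "UGCA")
    return rna.upper().translate(pairing)[::-1]


# Hypothetical toy strand, chosen only so that its ratio exceeds 1.
toy_strand = "AGGAUC"
toy_complement = reverse_complement(toy_strand)

ratio_strand = purine_pyrimidine_ratio(toy_strand)
ratio_complement = purine_pyrimidine_ratio(toy_complement)
print(toy_strand, ratio_strand)
print(toy_complement, ratio_complement)
```

For any strand containing both purines and pyrimidines, the two ratios multiply to 1, which is consistent with the abstract's contrast between a ratio above 1 for one sense and below 1 for its complementary sense.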
Adam, R D
2000-04-10
Giardia lamblia is a protozoan parasite of humans and other mammals that is thought to be one of the most primitive extant eukaryotic organisms. Although distinctly eukaryotic, it is notable for its lack of mitochondria, nucleoli, and peroxisomes. It has been suggested that Giardia spp. are pre-mitochondriate organisms, but the identification of genes in G. lamblia thought to be of mitochondrial origin has generated controversy regarding that designation. Giardia lamblia trophozoites have two nuclei that are identical in all ways that have been studied. They are polyploid with at least four, and perhaps eight or more, copies of each of five chromosomes per organism and have an estimated genome complexity of 1.2 × 10^7 bp of DNA, and GC content of 46%. There is evidence for recombination at the telomeres of some of the chromosomes, and multiple size variants of single chromosomes have been identified within cloned isolates. However, the internal regions of the chromosomes demonstrate no evidence of recombination. For example, there is no evidence for control of vsp gene expression by DNA recombination, and no evidence for rapid mutation in the vsp genes. Single pass sequences of approximately 9% of the G. lamblia genome have already been obtained. An ongoing genome project plans to obtain approximately 95% of the genome by a random approach, as well as a complete physical map using a bacterial artificial chromosome library. The results will facilitate a better understanding of the biology of Giardia spp. as well as their phylogenetic relationship to other primitive organisms.
Genomic Infectious Disease Epidemiology in Partially Sampled and Ongoing Outbreaks
Didelot, Xavier; Fraser, Christophe; Gardy, Jennifer; Colijn, Caroline
2017-01-01
Genomic data are increasingly being used to understand infectious disease epidemiology. Isolates from a given outbreak are sequenced, and the patterns of shared variation are used to infer which isolates within the outbreak are most closely related to each other. Unfortunately, the phylogenetic trees typically used to represent this variation are not directly informative about who infected whom; a phylogenetic tree is not a transmission tree. However, a transmission tree can be inferred from a phylogeny while accounting for within-host genetic diversity by coloring the branches of a phylogeny according to which host those branches were in. Here we extend this approach and show that it can be applied to partially sampled and ongoing outbreaks. This requires computing the correct probability of an observed transmission tree and we herein demonstrate how to do this for a large class of epidemiological models. We also demonstrate how the branch coloring approach can incorporate a variable number of unique colors to represent unsampled intermediates in transmission chains. The resulting algorithm is a reversible jump Monte–Carlo Markov Chain, which we apply to both simulated data and real data from an outbreak of tuberculosis. By accounting for unsampled cases and an outbreak which may not have reached its end, our method is uniquely suited to use in a public health environment during real-time outbreak investigations. We implemented this transmission tree inference methodology in an R package called TransPhylo, which is freely available from https://github.com/xavierdidelot/TransPhylo. PMID:28100788
Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies
Schatz, Michael C.; Phillippy, Adam M.; Sommer, Daniel D.; Delcher, Arthur L.; Puiu, Daniela; Narzisi, Giuseppe; Salzberg, Steven L.; Pop, Mihai
2013-01-01
Since its launch in 2004, the open-source AMOS project has released several innovative DNA sequence analysis applications including: Hawkeye, a visual analytics tool for inspecting the structure of genome assemblies; the Assembly Forensics and FRCurve pipelines for systematically evaluating the quality of a genome assembly; and AMOScmp, the first comparative genome assembler. These applications have been used to assemble and analyze dozens of genomes ranging in complexity from simple microbial species through mammalian genomes. Recent efforts have been focused on enhancing support for new data characteristics brought on by second- and now third-generation sequencing. This review describes the major components of AMOS in light of these challenges, with an emphasis on methods for assessing assembly quality and the visual analytics capabilities of Hawkeye. These interactive graphical aspects are essential for navigating and understanding the complexities of a genome assembly, from the overall genome structure down to individual bases. Hawkeye and AMOS are available open source at http://amos.sourceforge.net. PMID:22199379
Gericke, G S
2010-05-01
Previous reports of specific patterns of increased fragility at common chromosomal fragile sites (CFS) found in association with certain neurobehavioural disorders did not attract attention at the time due to a shift towards molecular approaches to delineate neuropsychiatric disorder candidate genes. Links with miRNA, altered methylation and the origin of copy number variation indicate that CFS region characteristics may be part of chromatinomic mechanisms that are increasingly linked with neuroplasticity and memory. Current reports of large-scale double-stranded DNA breaks in differentiating neurons and evidence of ongoing DNA demethylation of specific gene promoters in adult hippocampus may shed new light on the dynamic epigenetic changes that are increasingly appreciated as contributing to long-term memory consolidation. The expression of immune recombination activating genes in key stress-induced memory regions suggests the adoption by the brain of this ancient pattern recognition and memory system to establish a structural basis for long-term memory through controlled chromosomal breakage at highly specific genomic regions. It is furthermore considered that these mechanisms for management of epigenetic information related to stress memory could be linked, in some instances, with the transfer of the somatically acquired information to the germline. Here, rearranged sequences can be subjected to further selection and possible eventual retrotranscription to become part of the more stable coding machinery if proven to be crucial for survival and reproduction. While linkage of cognitive memory with stress and fear circuitry and memory establishment through structural DNA modification is proposed as a normal process, inappropriate activation of immune-like genomic rearrangement processes through traumatic stress memory may have the potential to lead to undesirable activation of neuro-inflammatory processes. 
These theories could have a significant impact on the interpretation of risks posed by heredity and the environment and the search for neuropsychiatric candidate genes.
Molecular motors and their functions in plants
NASA Technical Reports Server (NTRS)
Reddy, A. S.
2001-01-01
Molecular motors that hydrolyze ATP and use the derived energy to generate force are involved in a variety of diverse cellular functions. Genetic, biochemical, and cellular localization data have implicated motors in a variety of functions such as vesicle and organelle transport, cytoskeleton dynamics, morphogenesis, polarized growth, cell movements, spindle formation, chromosome movement, nuclear fusion, and signal transduction. In non-plant systems three families of molecular motors (kinesins, dyneins, and myosins) have been well characterized. These motors use microtubules (in the case of kinesins and dyneins) or actin filaments (in the case of myosins) as tracks to transport cargo materials intracellularly. During the last decade tremendous progress has been made in understanding the structure and function of various motors in animals. These studies are yielding interesting insights into the functions of molecular motors and the origin of different families of motors. Furthermore, the paradigm that motors bind cargo and move along cytoskeletal tracks does not explain the functions of some of the motors. Relatively little is known about the molecular motors and their roles in plants. In recent years, by using biochemical, cell biological, molecular, and genetic approaches a few molecular motors have been isolated and characterized from plants. These studies indicate that some of the motors in plants have novel features and regulatory mechanisms. The role of molecular motors in plant cell division, cell expansion, cytoplasmic streaming, cell-to-cell communication, membrane trafficking, and morphogenesis is beginning to be understood. Analyses of the Arabidopsis genome sequence database (51% of genome) with conserved motor domains of kinesin and myosin families indicate the presence of a large number (about 40) of molecular motors, and the functions of many of these motors remain to be discovered. 
It is likely that many more motors with novel regulatory mechanisms that perform plant-specific functions are yet to be discovered. Although the identification of motors in plants, especially in Arabidopsis, is progressing at a rapid pace because of the ongoing plant genome sequencing projects, only a few plant motors have been characterized in any detail. Elucidation of function and regulation of this multitude of motors in a given species is going to be a challenging and exciting area of research in plant cell biology. Structural features of some plant motors suggest calcium, through calmodulin, is likely to play a key role in regulating the function of both microtubule- and actin-based motors in plants.
Structural Genomics of Protein Phosphatases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Almo, S.; Bonanno, J.; Sauder, J.
The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.
Mullapudi, Edukondalu; Füzik, Tibor; Přidal, Antonín; Plevka, Pavel
2017-02-15
Viruses of the family Dicistroviridae can cause substantial economic damage by infecting agriculturally important insects. Israeli acute bee paralysis virus (IAPV) causes honeybee colony collapse disorder in the United States. High-resolution molecular details of the genome delivery mechanism of dicistroviruses are unknown. Here we present a cryo-electron microscopy analysis of IAPV virions induced to release their genomes in vitro. We determined structures of full IAPV virions primed to release their genomes to a resolution of 3.3 Å and of empty capsids to a resolution of 3.9 Å. We show that IAPV does not form expanded A particles before genome release as in the case of related enteroviruses of the family Picornaviridae. The structural changes observed in the empty IAPV particles include detachment of the VP4 minor capsid proteins from the inner face of the capsid and partial loss of the structure of the N-terminal arms of the VP2 capsid proteins. Unlike the case for many picornaviruses, the empty particles of IAPV are not expanded relative to the native virions and do not contain pores in their capsids that might serve as channels for genome release. Therefore, rearrangement of a unique region of the capsid is probably required for IAPV genome release. Honeybee populations in Europe and North America are declining due to pressure from pathogens, including viruses. Israeli acute bee paralysis virus (IAPV), a member of the family Dicistroviridae, causes honeybee colony collapse disorder in the United States. The delivery of virus genomes into host cells is necessary for the initiation of infection. Here we present a structural cryo-electron microscopy analysis of IAPV particles induced to release their genomes. We show that genome release is not preceded by an expansion of IAPV virions as in the case of related picornaviruses that infect vertebrates. Furthermore, minor capsid proteins detach from the capsid upon genome release. 
The genome leaves behind empty particles that have compact protein shells. Copyright © 2017 Mullapudi et al.
Functional RNA elements in the dengue virus genome.
Gebhard, Leopoldo G; Filomatori, Claudia V; Gamarnik, Andrea V
2011-09-01
Dengue virus (DENV) genome amplification is a process that involves the viral RNA, cellular and viral proteins, and a complex architecture of cellular membranes. The viral RNA is not a passive template during this process; it plays an active role providing RNA signals that act as promoters, enhancers and/or silencers of the replication process. RNA elements that modulate RNA replication were found at the 5' and 3' UTRs and within the viral coding sequence. The promoter for DENV RNA synthesis is a large stem loop structure located at the 5' end of the genome. This structure specifically interacts with the viral polymerase NS5 and promotes RNA synthesis at the 3' end of a circularized genome. The circular conformation of the viral genome is mediated by long range RNA-RNA interactions that span thousands of nucleotides. Recent studies have provided new information about the requirement of alternative, mutually exclusive, structures in the viral RNA, highlighting the idea that the viral genome is flexible and exists in different conformations. In this article, we describe elements in the promoter SLA and other RNA signals involved in NS5 polymerase binding and activity, and provide new ideas of how dynamic secondary and tertiary structures of the viral RNA participate in the viral life cycle.
Functions of the 3′ and 5′ genome RNA regions of members of the genus Flavivirus
Brinton, Margo A.; Basu, Mausumi
2015-01-01
The positive sense genomes of members of the genus Flavivirus in the family Flaviviridae are ~11 kb in length and have a 5′ type I cap but no 3′ poly A. The 5′ and 3′ terminal regions contain short conserved sequences that are proposed to be repeated remnants of an ancient sequence. However, the functions of most of these conserved sequences have not yet been determined. The terminal regions of the genome also contain multiple conserved RNA structures. Functional data for many of these structures have been obtained. Three sets of complementary 3′ and 5′ terminal region sequences, some of which are located in conserved RNA structures, interact to form a panhandle structure that is required for initiation of minus strand RNA synthesis with the 5′ terminal structure functioning as the promoter. How the switch from the terminal RNA structure base pairing to the long distance RNA-RNA interaction is triggered and regulated is not well understood, but evidence suggests involvement of a cell protein binding to three sites on the 3′ terminal RNA structures and a cis-acting metastable 3′ RNA element in the 3′ terminal structure. Cell proteins may also be involved in facilitating exponential replication of nascent genomic RNA within replication vesicles at later times in the infection cycle. Other conserved RNA structures and/or sequences in the 5′ and 3′ terminal regions have been proposed to regulate genome translation. Additional functions of the 5′ and 3′ terminal sequences have also been reported. PMID:25683510
Center for Cancer Genomics | Office of Cancer Genomics
The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approaches, CCG aims to accelerate structural, functional and computational research to explore cancer mechanisms, discover new cancer targets, and develop new therapeutics.
Kim, Sanghee; Lim, Byung-Jin; Min, Gi-Sik; Choi, Han-Gu
2013-05-10
Copepoda is the most diverse and abundant group of crustaceans, but its phylogenetic relationships are ambiguous. Mitochondrial (mt) genomes are useful for studying evolutionary history, but only six complete Copepoda mt genomes have been made available and these have extremely rearranged genome structures. This study determined the mt genome of Calanus hyperboreus, making it the first reported Arctic copepod mt genome and the first complete mt genome of a calanoid copepod. The mt genome of C. hyperboreus is 17,910 bp in length and it contains the entire set of 37 mt genes, including 13 protein-coding genes, 2 rRNAs, and 22 tRNAs. It has a very unusual gene structure, including the longest control region reported for a crustacean, a large tRNA gene cluster, and reversed GC skews in 11 out of 13 protein-coding genes (84.6%). Despite the unusual features, comparing this genome to published copepod genomes revealed retained pan-crustacean features, as well as a conserved calanoid-specific pattern. Our data provide a foundation for exploring the calanoid pattern and the mechanisms of mt gene rearrangement in the evolutionary history of the copepod mt genome. Copyright © 2012 Elsevier B.V. All rights reserved.
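The GC skew statistic mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the commonly used definition, (G - C) / (G + C) computed on one strand; the abstract does not state which formula or strand convention the authors used, and the sequences below are hypothetical toy strings, not data from the C. hyperboreus genome.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for a single strand.

    Assumed definition (not confirmed by the abstract). Returns 0.0
    when the sequence contains neither G nor C.
    """
    s = seq.upper()
    g = s.count("G")
    c = s.count("C")
    total = g + c
    return (g - c) / total if total else 0.0


# Hypothetical toy sequences, for illustration only.
toy_genes = {"geneA": "GGGCATGG", "geneB": "CCCGATCC"}
skews = {name: gc_skew(seq) for name, seq in toy_genes.items()}

# Arithmetic check of the figure quoted in the abstract:
# 11 of 13 protein-coding genes, expressed as a percentage.
reversed_percent = round(100 * 11 / 13, 1)
```

A positive value means the strand carries more G than C, and a negative value the opposite; a "reversed" skew is one whose sign differs from the expected pattern for that strand.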
Yan, Dankan; Tang, Yunxia; Hu, Min; Liu, Fengquan; Zhang, Dongfang; Fan, Jiaqin
2014-10-01
Thrips are an ideal group for studying the evolution of mitochondrial (mt) genomes at the genus and family levels because of independent rearrangements within this order. In this study, the complete mitochondrial DNA (mtDNA) sequence of the flower thrips Frankliniella intonsa was determined and annotated. The circular genome is 15,215 bp in length, has an A+T content of 75.9%, contains the typical 37 genes, and has three putative control regions (CRs). Nucleotide composition is A+T biased, and the majority of the protein-coding genes present opposite CG skew, which is reflected in nucleotide composition, codon usage, and amino acid usage. Although the known thrips mt genomes have massive gene rearrangements, this genome showed no reversal of strand asymmetry. Gene rearrangements have been found at the lower taxonomic levels of thrips. Three tRNA genes were translocated in the genus Frankliniella and eight tRNA genes in the family Thripidae. Although the gene arrangements of mt genomes of all three thrips species differ massively from the ancestral insect, they are all very similar to each other, indicating that there was a large rearrangement somewhere before the most recent common ancestor of these three species and very little genomic evolution or rearrangement since then. The extremely similar sequences among the CRs suggest that they are undergoing concerted evolution. Analyses of the sequences upstream and downstream of the CRs reveal that CR2 is actually the ancestral CR. The three CRs are in the same position in each of the three thrips mt genomes, which have identical inverted genes. These characteristics may have been inherited from the most recent common ancestor of these three thrips. These observations suggest that the mt genomes of the three thrips retain a single massive rearrangement from the common ancestor and have low evolutionary rates among them. Copyright © 2014 Elsevier Inc. All rights reserved.
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout
Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo
2015-01-01
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
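The step described above, keeping contigs longer than 500 bp and taking the longest contig of each cluster as its representative, can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout, the function name `longest_per_cluster`, and the toy records are hypothetical, and the abstract does not say how clusters were defined or how ties were broken (here the first contig seen wins a tie).

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (contig_id, cluster_id, length_in_bp).
Contig = Tuple[str, str, int]


def longest_per_cluster(
    contigs: Iterable[Contig], min_len: int = 500
) -> Dict[str, Tuple[str, int]]:
    """Map each cluster to its longest contig among those longer than min_len."""
    best: Dict[str, Tuple[str, int]] = {}
    for contig_id, cluster_id, length in contigs:
        if length <= min_len:
            continue  # the abstract reports contigs > 500 bp, so the bound is strict
        if cluster_id not in best or length > best[cluster_id][1]:
            best[cluster_id] = (contig_id, length)
    return best


# Toy input, for illustration only.
toy = [("c1", "A", 800), ("c2", "A", 1200), ("c3", "B", 450), ("c4", "B", 600)]
representatives = longest_per_cluster(toy)

# Arithmetic check of a percentage quoted in the abstract:
# 4,146 unmatched contigs out of 44,990 representatives.
unmatched_percent = round(100 * 4146 / 44990, 1)
```

A dictionary keyed by cluster keeps the selection to a single pass over the contigs, which matters little for a toy list but is convenient for hundreds of thousands of records.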
The complete genomic sequence of egg drop syndrome virus strain AAV-2.
Jin, Q; Zeng, L; Yang, F; Li, M; Hou, Y
1999-12-01
To characterize the genome of egg drop syndrome virus (EDSV-76) Chinese strain AAV-2, part of the restriction endonuclease physical map was analyzed and a complete genomic library was constructed. On this basis, the complete genome nucleotide sequence (32 838 bp in length, including terminal structures) was determined. Data analysis shows that, compared with other adenoviruses, strain AAV-2 differs considerably in genomic structure and in the distribution of open reading frames (ORFs). There are no clear E1, E3 and E4 regions in the AAV-2 genome. Two segments located at the two ends of the genome (1.1 kb and 8.3 kb in length, respectively) have no homology with other adenovirus genomes. In addition, the strain AAV-2 genome lacks ORFs encoding E1A, pV and pIX, which are common ORFs encoding early and late proteins in adenoviruses. This reveals, for the first time, differences between EDSV-76, the sole standard strain of group III avian adenoviruses, and the other avian adenoviruses. These findings will aid research on avian adenoviruses and on adenoviruses in general.
SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.
Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi
2016-11-11
The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located at the 5' end of the viral genome and is suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be the most important element and essential for dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1, and the formation of a kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and has rarely been questioned lately, there could still be room for reconsidering the functional SL1 structure in vivo (in the virion or cell). In this study, we performed several analyses to identify the nucleotides and/or base pairing within SL1 that are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute to the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C base pair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences in viral replication. Collectively, we aim to assemble a more comprehensive functional map of SL1 in the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in the HIV-1 DLS. The newly proposed structure model suggests that the hairpin loop of SL1 is larger, and that the genome dimerization process might involve a more complicated mechanism than previously understood. Further investigations are still required to fully understand the genome packaging and dimerization of HIV.
Haraksingh, Rajini R; Abyzov, Alexej; Urban, Alexander Eckehart
2017-04-24
High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. 
This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies.
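Benchmarking a CNV call set against a gold standard, as described above, can be sketched in a few lines. The matching rule here is an assumption: a call counts as validated when it shares at least 50% reciprocal overlap with some gold-standard CNV on the same chromosome. That is a common convention, but the abstract does not state the criterion the authors used. The function names and the toy intervals are hypothetical.

```python
from typing import List, Tuple

# Hypothetical interval layout: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def percent_non_validated(
    calls: List[Interval], gold: List[Interval], threshold: float = 0.5
) -> float:
    """Percentage of calls with no gold-standard match at the given threshold."""
    if not calls:
        return 0.0
    unvalidated = sum(
        1
        for call in calls
        if not any(reciprocal_overlap(call, g) >= threshold for g in gold)
    )
    return 100.0 * unvalidated / len(calls)


# Toy data, for illustration only.
toy_calls = [("chr1", 100, 200), ("chr1", 500, 600),
             ("chr2", 100, 300), ("chr2", 1000, 1100)]
toy_gold = [("chr1", 110, 210), ("chr2", 100, 300)]
toy_result = percent_non_validated(toy_calls, toy_gold)
```

The pairwise comparison is quadratic in the number of intervals, which is acceptable for call sets of a few hundred CNVs such as those reported here; larger sets would call for sorted intervals or an interval tree.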
Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W
2009-01-01
Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element ‘mobilome’. 
PMID:19840100
Evolution and spread of Ebola virus in Liberia, 2014–2015
Ladner, Jason T.; Wiley, Michael R.; Mate, Suzanne; Dudas, Gytis; Prieto, Karla; Lovett, Sean; Nagle, Elyse R.; Beitzel, Brett; Gilbert, Merle L.; Fakoli, Lawrence; Diclaro, Joseph W.; Schoepp, Randal J.; Fair, Joseph; Kuhn, Jens H.; Hensley, Lisa E.; Park, Daniel J.; Sabeti, Pardis C.; Rambaut, Andrew; Sanchez-Lockhart, Mariano; Bolay, Fatorma K.; Kugelman, Jeffrey R.; Palacios, Gustavo
2015-01-01
SUMMARY The 2013–present Western African Ebola virus disease (EVD) outbreak is the largest ever recorded with >28,000 reported cases. Ebola virus (EBOV) genome sequencing has played an important role throughout this outbreak; however, relatively few sequences have been determined from patients in Liberia, the second worst-affected country. Here, we report 140 EBOV genome sequences from the second wave of the Liberian outbreak and analyze them in combination with 782 previously published sequences from throughout the Western African outbreak. While multiple early introductions of EBOV to Liberia are evident, the majority of Liberian EVD cases are consistent with a single introduction, followed by spread and diversification within the country. Movement of the virus within Liberia was widespread and reintroductions from Liberia served as an important source for the continuation of the already ongoing EVD outbreak in Guinea. Overall, little evidence was found for incremental adaptation of EBOV to the human host. PMID:26651942
The impact of transposable elements on mammalian development.
Garcia-Perez, Jose L; Widmann, Thomas J; Adams, Ian R
2016-11-15
Despite often being classified as selfish or junk DNA, transposable elements (TEs) are a group of abundant genetic sequences that have a significant impact on mammalian development and genome regulation. In recent years, our understanding of how pre-existing TEs affect genome architecture, gene regulatory networks and protein function during mammalian embryogenesis has dramatically expanded. In addition, the mobilization of active TEs in selected cell types has been shown to generate genetic variation during development and in fully differentiated tissues. Importantly, the ongoing domestication and evolution of TEs appears to provide a rich source of regulatory elements, functional modules and genetic variation that fuels the evolution of mammalian developmental processes. Here, we review the functional impact that TEs exert on mammalian developmental processes and discuss how the somatic activity of TEs can influence gene regulatory networks. © 2016. Published by The Company of Biologists Ltd.
Ancestry and demography and descendants of Iron Age nomads of the Eurasian Steppe
NASA Astrophysics Data System (ADS)
Unterländer, Martina; Palstra, Friso; Lazaridis, Iosif; Pilipenko, Aleksandr; Hofmanová, Zuzana; Groß, Melanie; Sell, Christian; Blöcher, Jens; Kirsanow, Karola; Rohland, Nadin; Rieger, Benjamin; Kaiser, Elke; Schier, Wolfram; Pozdniakov, Dimitri; Khokhlov, Aleksandr; Georges, Myriam; Wilde, Sandra; Powell, Adam; Heyer, Evelyne; Currat, Mathias; Reich, David; Samashev, Zainolla; Parzinger, Hermann; Molodin, Vyacheslav I.; Burger, Joachim
2017-03-01
During the 1st millennium before the Common Era (BCE), nomadic tribes associated with the Iron Age Scythian culture spread over the Eurasian Steppe, covering a territory of more than 3,500 km in breadth. To understand the demographic processes behind the spread of the Scythian culture, we analysed genomic data from eight individuals and a mitochondrial dataset of 96 individuals originating in eastern and western parts of the Eurasian Steppe. Genomic inference reveals that Scythians in the east and the west of the steppe zone can best be described as a mixture of Yamnaya-related ancestry and an East Asian component. Demographic modelling suggests independent origins for eastern and western groups with ongoing gene-flow between them, plausibly explaining the striking uniformity of their material culture. We also find evidence that significant gene-flow from east to west Eurasia must have occurred early during the Iron Age.
Setup, Validation and Quality Control of a Centralized WGS laboratory - Lessons Learned.
Arnold, Cath; Edwards, Kirstin; Desai, Meeta; Platt, Steve; Green, Jonathan; Conway, David
2018-04-25
Routine whole genome analysis for infectious diseases can inform various public health scenarios, including identification of microbial pathogens; relating individual cases to an outbreak of infectious disease; establishing an association between an outbreak of food poisoning and a specific food vehicle; inferring drug susceptibility; source tracing of contaminants; and study of how variations in the genome affect pathogenicity/virulence. We describe the setup, validation and ongoing verification of a centralised WGS laboratory to carry out the sequencing for these public health functions for the National Infection Services, Public Health England in the UK. The performance characteristics and Quality Control metrics measured during validation and verification of the entire end-to-end process (accuracy, precision, reproducibility and repeatability) are described and include information regarding the automated pass and release of data to service users without intervention. © Crown copyright 2018.
Apollo: a community resource for genome annotation editing
Lee, Ed; Harris, Nomi; Gibson, Mark; Chetty, Raymond; Lewis, Suzanna
2009-01-01
Summary: Apollo is a genome annotation-editing tool with an easy to use graphical interface. It is a component of the GMOD project, with ongoing development driven by the community. Recent additions to the software include support for the generic feature format version 3 (GFF3), continuous transcriptome data, a full Chado database interface, integration with remote services for on-the-fly BLAST and Primer BLAST analyses, graphical interfaces for configuring user preferences and full undo of all edit operations. Apollo's user community continues to grow, including its use as an educational tool for college and high-school students. Availability: Apollo is a Java application distributed under a free and open source license. Installers for Windows, Linux, Unix, Solaris and Mac OS X are available at http://apollo.berkeleybop.org, and the source code is available from the SourceForge CVS repository at http://gmod.cvs.sourceforge.net/gmod/apollo. Contact: elee@berkeleybop.org PMID:19439563
A glimpse into the proteome of phototrophic bacterium Rhodobacter capsulatus.
Onder, Ozlem; Aygun-Sunar, Semra; Selamoglu, Nur; Daldal, Fevzi
2010-01-01
A first glimpse into the proteome of Rhodobacter capsulatus revealed more than 450 proteins (over 210 cytoplasmic and 185 extracytoplasmic known proteins, as well as 55 unknown ones) that were identified with a high degree of confidence using nLC-MS/MS analyses. The accumulated data provide a solid platform for ongoing efforts to establish the proteome of this species and the cellular locations of its constituents. They also indicate that at least 40 of the identified proteins, which were annotated in genome databases as unknown hypothetical proteins, correspond to predicted translation products that are indeed present in cells under the growth conditions used in this work. In addition, matching the identification labels of the proteins reported between the two available R. capsulatus genome databases (ERGO-light with RRCxxxxx and NT05 with NT05RCxxxx numbers) indicated that 11 such proteins are listed only in the latter database.
Apollo: a community resource for genome annotation editing.
Lee, Ed; Harris, Nomi; Gibson, Mark; Chetty, Raymond; Lewis, Suzanna
2009-07-15
Apollo is a genome annotation-editing tool with an easy to use graphical interface. It is a component of the GMOD project, with ongoing development driven by the community. Recent additions to the software include support for the generic feature format version 3 (GFF3), continuous transcriptome data, a full Chado database interface, integration with remote services for on-the-fly BLAST and Primer BLAST analyses, graphical interfaces for configuring user preferences and full undo of all edit operations. Apollo's user community continues to grow, including its use as an educational tool for college and high-school students. Apollo is a Java application distributed under a free and open source license. Installers for Windows, Linux, Unix, Solaris and Mac OS X are available at http://apollo.berkeleybop.org, and the source code is available from the SourceForge CVS repository at http://gmod.cvs.sourceforge.net/gmod/apollo.
Bacterial antisense RNAs are mainly the product of transcriptional noise.
Lloréns-Rico, Verónica; Cano, Jaime; Kamminga, Tjerko; Gil, Rosario; Latorre, Amparo; Chen, Wei-Hua; Bork, Peer; Glass, John I; Serrano, Luis; Lluch-Senar, Maria
2016-03-01
cis-Encoded antisense RNAs (asRNAs) are widespread along bacterial transcriptomes. However, the role of most of these RNAs remains unknown, and there is an ongoing discussion as to what extent these transcripts are the result of transcriptional noise. We show, by comparative transcriptomics of 20 bacterial species and one chloroplast, that the number of asRNAs is exponentially dependent on the genomic AT content and that expression of asRNA at low levels exerts little impact in terms of energy consumption. A transcription model simulating mRNA and asRNA production indicates that the asRNA regulatory effect is only observed above certain expression thresholds, substantially higher than physiological transcript levels. These predictions were verified experimentally by overexpressing nine different asRNAs in Mycoplasma pneumoniae. Our results suggest that most of the antisense transcripts found in bacteria are the consequence of transcriptional noise, arising at spurious promoters throughout the genome.
Bacterial antisense RNAs are mainly the product of transcriptional noise
Lloréns-Rico, Verónica; Cano, Jaime; Kamminga, Tjerko; Gil, Rosario; Latorre, Amparo; Chen, Wei-Hua; Bork, Peer; Glass, John I.; Serrano, Luis; Lluch-Senar, Maria
2016-01-01
cis-Encoded antisense RNAs (asRNAs) are widespread along bacterial transcriptomes. However, the role of most of these RNAs remains unknown, and there is an ongoing discussion as to what extent these transcripts are the result of transcriptional noise. We show, by comparative transcriptomics of 20 bacterial species and one chloroplast, that the number of asRNAs is exponentially dependent on the genomic AT content and that expression of asRNA at low levels exerts little impact in terms of energy consumption. A transcription model simulating mRNA and asRNA production indicates that the asRNA regulatory effect is only observed above certain expression thresholds, substantially higher than physiological transcript levels. These predictions were verified experimentally by overexpressing nine different asRNAs in Mycoplasma pneumoniae. Our results suggest that most of the antisense transcripts found in bacteria are the consequence of transcriptional noise, arising at spurious promoters throughout the genome. PMID:26973873
Rabah, Samar O; Lee, Chaehee; Hajrah, Nahid H; Makki, Rania M; Alharby, Hesham F; Alhebshi, Alawiah M; Sabir, Jamal S M; Jansen, Robert K; Ruhlman, Tracey A
2017-11-01
In plant evolution, intracellular gene transfer (IGT) is a prevalent, ongoing process. While nuclear and mitochondrial genomes are known to integrate foreign DNA via IGT and horizontal gene transfer (HGT), plastid genomes (plastomes) have resisted foreign DNA incorporation and only recently has IGT been uncovered in the plastomes of a few land plants. In this study, we completed plastome sequences for 10 crop species and describe a number of structural features including variation in gene and intron content, inversions, and expansion and contraction of the inverted repeat (IR). We identified a putative in cinnamon ( J. Presl) and other sequenced Lauraceae and an apparent functional transfer of to the nucleus of quinoa ( Willd.). In the orchard tree cashew ( L.), we report the insertion of an ∼6.7-kb fragment of mitochondrial DNA into the plastome IR. BLASTn analyses returned high identity hits to mitogenome sequences including an intact open reading frame. Using three plastome markers for five species of , we generated a phylogeny to investigate the distribution and timing of the insertion. Four species share the insertion, suggesting that this event occurred <20 million yr ago in a single clade in the genus. Our study extends the observation of mitochondrial to plastome IGT to include long-lived tree species. While previous studies have suggested possible mechanisms facilitating IGT to the plastome, more examples of this phenomenon, along with more complete mitogenome sequences, will be required before a common, or variable, mechanism can be elucidated. Copyright © 2017 Crop Science Society of America.
Characterizing Novel Archaeal Lineages in Salton Sea Sediments
NASA Astrophysics Data System (ADS)
Tarn, J.; Valentine, D. L.
2016-12-01
Biological communities in extreme environments are often dominated by microorganisms of the domain Archaea. Abundant microbial assemblages of this group are found in the hottest, saltiest, and most thermodynamically-limited ecosystems on earth. These taxing surroundings are thought to impose a state of chronic energy stress on resident organisms due to high costs of cellular maintenance relative to resource availability. Even in more temperate settings, Archaea are regularly associated with low-nutrient lifestyles, reflecting their adaptation to extreme, biologically-limiting conditions, which may be an ancestral, domain-wide trait. In this study, we seek to characterize the Archaeal community of the Salton Sea, where members of this domain are novel and highly abundant. Previous work by Swan et al. in 2010 showed that gradients in salinity, sulfate, carbon and nitrogen across sediment horizons of the Salton Sea are linked to changes in Archaeal dominance and community structure. In light of recent taxonomic revisions of the domain, I reclassified the 107 published small subunit rRNA Archaeal sequences from the 2010 study using updated reference databases. The majority of these Euryarchaeal sequences were reassigned to the so-called DPANN superphylum, with Pacearchaeota-related sequences being very abundant in shallow, organic-rich sediments. In deeper, energy-limited strata, several groups of Bathyarchaeota and one divergent DPANN clade were dominant. Ongoing metagenomic work on these sediment communities is being used to assemble genomes of these novel Archaeal groups. These results will help define genomic adaptations of Salton Sea Archaea to varying levels of energy stress as well as inform future cultivation efforts.
Casas-Marce, Mireia; Marmesat, Elena; Soriano, Laura; Martínez-Cruz, Begoña; Lucena-Perez, Maria; Nocete, Francisco; Rodríguez-Hidalgo, Antonio; Canals, Antoni; Nadal, Jordi; Detry, Cleia; Bernáldez-Sánchez, Eloísa; Fernández-Rodríguez, Carlos; Pérez-Ripoll, Manuel; Stiller, Mathias; Hofreiter, Michael; Rodríguez, Alejandro; Revilla, Eloy; Delibes, Miguel; Godoy, José A
2017-11-01
There is the tendency to assume that endangered species have been both genetically and demographically healthier in the past, so that any genetic erosion observed today was caused by their recent decline. The Iberian lynx (Lynx pardinus) suffered a dramatic and continuous decline during the 20th century, and now shows extremely low genome- and species-wide genetic diversity among other signs of genomic erosion. We analyze ancient (N = 10), historical (N = 245), and contemporary (N = 172) samples with microsatellite and mitogenome data to reconstruct the species' demography and investigate patterns of genetic variation across space and time. Iberian lynx populations transitioned from low but significantly higher genetic diversity than today and shallow geographical differentiation millennia ago, through a structured metapopulation with varying levels of diversity during the last centuries, to two extremely genetically depauperate and differentiated remnant populations by 2002. The historical subpopulations show varying extents of genetic drift in relation to their recent size and time in isolation, but these do not predict whether the populations persisted or went finally extinct. In conclusion, current genetic patterns were mainly shaped by genetic drift, supporting the current admixture of the two genetic pools and calling for a comprehensive genetic management of the ongoing conservation program. This study illustrates how a retrospective analysis of demographic and genetic patterns of endangered species can shed light onto their evolutionary history and this, in turn, can inform conservation actions. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Structural genomics reveals EVE as a new ASCH/PUA-related domain
Bertonati, Claudia; Punta, Marco; Fischer, Markus; Yachdav, Guy; Forouhar, Farhad; Zhou, Weihong; Kuzin, Alexander P.; Seetharaman, Jayaraman; Abashidze, Mariam; Ramelot, Theresa A.; Kennedy, Michael A.; Cort, John R.; Belachew, Adam; Hunt, John F.; Tong, Liang; Montelione, Gaetano T.; Rost, Burkhard
2014-01-01
Summary: We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggests that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would have not been sufficient to reveal these evolutionary links. PMID:19191354
Structural Genomics Reveals EVE as a New ASCH/PUA-Related Domain
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bertonati, C.; Punta, M; Fischer, M
2008-01-01
We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggests that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would have not been sufficient to reveal these evolutionary links.
Molecular Innovation in Ciliates with Complex Genome Rearrangements
NASA Astrophysics Data System (ADS)
Neme, R.; Landweber, L. F.
2017-07-01
We study molecular innovation in several ciliate species with unique massive genome rearrangements to understand how a radically distinct genome architecture can shape the process of acquiring new functions, genes and structures.
The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4).
Huntemann, Marcel; Ivanova, Natalia N; Mavromatis, Konstantinos; Tripp, H James; Paez-Espino, David; Palaniappan, Krishnaveni; Szeto, Ernest; Pillay, Manoj; Chen, I-Min A; Pati, Amrita; Nielsen, Torben; Markowitz, Victor M; Kyrpides, Nikos C
2015-01-01
The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.
Dash, Prasanta K.; Rai, Rhitu
2016-01-01
Evolutionarily frozen, genetically sterile and globally iconic fruit “Banana” remained untouched by the green revolution and, as of today, researchers face intrinsic impediments for its varietal improvement. Recently, this wonder crop entered the genomics era with decoding of the structural genome of the double haploid Pahang (AA genome constitution) genotype of Musa acuminata. Its complex genome, decoded by hybrid sequencing strategies, revealed a panoply of genes and transcription factors involved in the process of sucrose conversion that imparts sweetness to its fruit. Historically, banana has faced the wrath of pandemic bacterial, fungal, and viral diseases and a multitude of abiotic stresses that have ruined the livelihood of small/marginal farmers and destroyed commercial plantations. Decoding the structural genome of this climacteric fruit has given impetus to a deeper understanding of the repertoire of genes involved in disease resistance, understanding the mechanism of dwarfing to develop an ideal plant type, unraveling the process of parthenocarpy, and fruit ripening for better fruit quality. Further, injunction of comparative genomics will usher in integration of information from its decoded genome and other monocots into field applications in banana related but not limited to yield enhancement, food security, livelihood assurance, and energy sustainability. In this mini review, we discuss pre- and post-genomic discoveries and highlight accomplishments in structural genomics, genetic engineering and forward genetic accomplishments with an aim to target genes and transcription factors for translational research in banana. PMID:27833619
Organizational heterogeneity of vertebrate genomes.
Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham
2012-01-01
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
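The k-mer scoring idea mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' Compositional Spectra implementation: it counts only exact k-mer matches, and the function names (`kmer_spectrum`, `spectrum_distance`), the default `k`, and the Manhattan distance are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(sequence, k=3):
    """Return the relative frequency of every DNA k-mer in `sequence`.

    Illustrative simplification: exact matches only; windows containing
    symbols other than A, C, G, T (e.g. N) are skipped.
    """
    seq = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    # Enumerate all 4**k words in a fixed order so that spectra computed
    # from different sequences share the same keys and can be compared.
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return {w: (counts[w] / total if total else 0.0) for w in words}


def spectrum_distance(a, b):
    """Manhattan distance between two spectra (a hypothetical choice of metric)."""
    return sum(abs(a[w] - b[w]) for w in a)
```

Comparing the spectra of windows along a chromosome, for example with `spectrum_distance(kmer_spectrum(window_a), kmer_spectrum(window_b))`, gives one rough way to see regional variation in sequence composition; the heterogeneity scores reported in the study are defined by the authors and are not reproduced here.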
Ho, Michelle L; Adler, Benjamin A; Torre, Michael L; Silberg, Jonathan J; Suh, Junghae
2013-12-20
Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions.
Ho, Michelle L.; Adler, Benjamin A.; Torre, Michael L.; Silberg, Jonathan J.; Suh, Junghae
2013-01-01
Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications, but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions. PMID:23899192
Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou
2011-11-01
Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.
Muntean, Cristina M; Leopold, Nicolae; Tripon, Carmen; Coste, Ana; Halmagyi, Adela
2015-06-05
In this work the surface-enhanced Raman scattering (SERS) spectra of five genomic DNAs from non-cryopreserved control tomato plants (Lycopersicon esculentum Mill. cultivars Siriana, Darsirius, Kristin, Pontica and Capriciu) respectively, have been analyzed in the wavenumber range 400-1800 cm⁻¹. Structural changes induced in genomic DNAs upon cryopreservation were discussed in detail for four of the above mentioned tomato cultivars. The surface-enhanced Raman vibrational modes for each of these cases, spectroscopic band assignments and structural interpretations of genomic DNAs are reported. We have found that DNA isolated from Siriana cultivar leaf tissues suffers the weakest structural changes upon cryogenic storage of tomato shoot apices. On the contrary, genomic DNA extracted from Pontica cultivar is the most responsive system to cryopreservation process. Particularly, both C2'-endo-anti and C3'-endo-anti conformations have been detected. As a general observation, the wavenumber range 1511-1652 cm⁻¹, being due to dA, dG and dT residues, seems to be influenced by cryopreservation process. These changes could reflect unstacking of DNA bases. However, no significant structural changes of genomic DNAs from Siriana, Darsirius and Kristin have been found upon cryopreservation process of tomato cultivars. Based on this work, specific plant DNA-ligand interactions or accurate local structure of DNA in the proximity of a metallic surface might be further investigated using surface-enhanced Raman spectroscopy. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Muntean, Cristina M.; Leopold, Nicolae; Tripon, Carmen; Coste, Ana; Halmagyi, Adela
2015-06-01
In this work the surface-enhanced Raman scattering (SERS) spectra of five genomic DNAs from non-cryopreserved control tomato plants (Lycopersicon esculentum Mill. cultivars Siriana, Darsirius, Kristin, Pontica and Capriciu) respectively, have been analyzed in the wavenumber range 400-1800 cm⁻¹. Structural changes induced in genomic DNAs upon cryopreservation were discussed in detail for four of the above mentioned tomato cultivars. The surface-enhanced Raman vibrational modes for each of these cases, spectroscopic band assignments and structural interpretations of genomic DNAs are reported. We have found that DNA isolated from Siriana cultivar leaf tissues suffers the weakest structural changes upon cryogenic storage of tomato shoot apices. On the contrary, genomic DNA extracted from Pontica cultivar is the most responsive system to cryopreservation process. Particularly, both C2'-endo-anti and C3'-endo-anti conformations have been detected. As a general observation, the wavenumber range 1511-1652 cm⁻¹, being due to dA, dG and dT residues, seems to be influenced by cryopreservation process. These changes could reflect unstacking of DNA bases. However, no significant structural changes of genomic DNAs from Siriana, Darsirius and Kristin have been found upon cryopreservation process of tomato cultivars. Based on this work, specific plant DNA-ligand interactions or accurate local structure of DNA in the proximity of a metallic surface might be further investigated using surface-enhanced Raman spectroscopy.
Genome organization during the cell cycle: unity in division.
Golloshi, Rosela; Sanders, Jacob T; McCord, Rachel Patton
2017-09-01
During the cell cycle, the genome must undergo dramatic changes in structure, from a decondensed, yet highly organized interphase structure to a condensed, generic mitotic chromosome and then back again. For faithful cell division, the genome must be replicated and chromosomes and sister chromatids physically segregated from one another. Throughout these processes, there is feedback and tension between the information-storing role and the physical properties of chromosomes. With a combination of recent techniques in fluorescence microscopy, chromosome conformation capture (Hi-C), biophysical experiments, and computational modeling, we can now attribute mechanisms to many long-observed features of chromosome structure changes during cell division. Apparent conflicts that arise when integrating the concepts from these different proposed mechanisms emphasize that orchestrating chromosome organization during cell division requires a complex system of factors rather than a simple pathway. Cell division is both essential for and threatening to proper genome organization. As interphase three-dimensional (3D) genome structure is quite static at a global level, cell division provides an important window of opportunity to make substantial changes in 3D genome organization in daughter cells, allowing for proper differentiation and development. Mistakes in the process of chromosome condensation or rebuilding the structure after mitosis can lead to diseases such as cancer, premature aging, and neurodegeneration. WIREs Syst Biol Med 2017, 9:e1389. doi: 10.1002/wsbm.1389 © 2017 Wiley Periodicals, Inc.
Gherghe, Cristina; Lombo, Tania; Leonard, Christopher W.; Datta, Siddhartha A. K.; Bess, Julian W.; Gorelick, Robert J.; Rein, Alan; Weeks, Kevin M.
2010-01-01
All retroviral genomic RNAs contain a cis-acting packaging signal by which dimeric genomes are selectively packaged into nascent virions. However, it is not understood how Gag (the viral structural protein) interacts with these signals to package the genome with high selectivity. We probed the structure of murine leukemia virus RNA inside virus particles using SHAPE, a high-throughput RNA structure analysis technology. These experiments showed that NC (the nucleic acid binding domain derived from Gag) binds within the virus to the sequence UCUG-UR-UCUG. Recombinant Gag and NC proteins bound to this same RNA sequence in dimeric RNA in vitro; in all cases, interactions were strongest with the first U and final G in each UCUG element. The RNA structural context is critical: High-affinity binding requires base-paired regions flanking this motif, and two UCUG-UR-UCUG motifs are specifically exposed in the viral RNA dimer. Mutating the guanosine residues in these two motifs—only four nucleotides per genomic RNA—reduced packaging 100-fold, comparable to the level of nonspecific packaging. These results thus explain the selective packaging of dimeric RNA. This paradigm has implications for RNA recognition in general, illustrating how local context and RNA structure can create information-rich recognition signals from simple single-stranded sequence elements in large RNAs. PMID:20974908