structural genomics target: Topics by Science.gov

Sample records for structural genomics target

Genome Pool Strategy for Structural Coverage of Protein Families

PubMed Central

Jaroszewski, Lukasz; Slabinski, Lukasz; Wooley, John; Deacon, Ashley M.; Lesley, Scott A.; Wilson, Ian. A.; Godzik, Adam

2010-01-01

As noticed by generations of structural biologists, closely homologous proteins may have substantially different crystallization properties and propensities. These observations can be used to systematically introduce additional dimensionality into crystallization trials by targeting homologous proteins from multiple genomes in a “genome pool” strategy. Through extensive use of our recently introduced “crystallization feasibility score” (Slabinski et al., 2007a), we can explain that the genome pool strategy works well because the crystallization feasibility scores are surprisingly broad within families of homologous proteins, with most families containing a range of optimal to very difficult targets. We also show that some families can be regarded as relatively “easy”, where a significant number of proteins are predicted to have optimal crystallization features, and others are “very difficult”, where almost none are predicted to result in a crystal structure. Thus, the outcome of such variable distributions of “crystallizability” preferences is uneven structural coverage of known families, with “easier” or “optimal” families having several times more solved structures than “very difficult” ones. Nevertheless, this latter category can be successfully targeted by increasing the number of genomes that are used to select targets from a given family. On average, adding 10 new genomes to the “genome pool” provides more promising targets for 7 “very difficult” families. In contrast, our crystallization feasibility score does not indicate that any specific microbial genomes can be readily classified as “easier” or “very difficult” with respect to providing suitable candidates for crystallization and structure determination.
Finally, our analyses show that specific physicochemical properties of the protein sequence favor successful outcomes for structure determination and, hence, the group of proteins with known 3D structures is systematically different from the general pool of known proteins. We, therefore, assess the structural consequences of these differences in protein sequence and protein biophysical properties. PMID:19000818
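The genome pool idea in the abstract above can be illustrated with a minimal Python sketch: for each protein family, pick the homolog with the best predicted crystallization class among the genomes currently in the pool. The numeric scale (1 = optimal, 5 = very difficult), the family and genome names, and every score below are hypothetical placeholders chosen for illustration; they are not values or an implementation from the paper.

```python
# Hypothetical records: (family, genome, predicted crystallization class).
# Assumed scale for illustration only: 1 = optimal ... 5 = very difficult.
CANDIDATES = [
    ("famA", "genome1", 5),
    ("famA", "genome2", 4),
    ("famA", "genome3", 2),
    ("famB", "genome1", 5),
    ("famB", "genome2", 5),
]


def best_per_family(candidates, genomes):
    """Return {family: (genome, cls)} with the lowest class among pooled genomes."""
    best = {}
    for family, genome, cls in candidates:
        if genome not in genomes:
            continue  # this genome is not part of the current pool
        if family not in best or cls < best[family][1]:
            best[family] = (genome, cls)
    return best


# A one-genome pool versus a three-genome pool.
small_pool = best_per_family(CANDIDATES, {"genome1"})
large_pool = best_per_family(CANDIDATES, {"genome1", "genome2", "genome3"})
```

In this toy data, enlarging the pool gives the hypothetical family famA a much more promising target, while famB stays very difficult, which mirrors the abstract's point that adding genomes helps some, but not all, difficult families.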
Target-Pathogen: a structural bioinformatic approach to prioritize drug targets in pathogens

PubMed Central

Sosa, Ezequiel J; Burguener, Germán; Lanzarotti, Esteban; Radusky, Leandro; Pardo, Agustín M; Marti, Marcelo

2018-01-01

Available genomic data for pathogens has created new opportunities for drug discovery and development to fight them, including new resistant and multiresistant strains. In particular, structural data must be integrated with both gene information and experimental results. In this sense, there is a lack of an online resource that allows genome wide-based data consolidation from diverse sources together with thorough bioinformatic analysis that allows easy filtering and scoring for fast target selection for drug discovery. Here, we present Target-Pathogen database (http://target.sbg.qb.fcen.uba.ar/patho), designed and developed as an online resource that allows the integration and weighting of protein information such as: function, metabolic role, off-targeting, structural properties including druggability, essentiality and omic experiments, to facilitate the identification and prioritization of candidate drug targets in pathogens. We include in the database 10 genomes of some of the most relevant microorganisms for human health (Mycobacterium tuberculosis, Mycobacterium leprae, Klebsiella pneumoniae, Plasmodium vivax, Toxoplasma gondii, Leishmania major, Wolbachia bancrofti, Trypanosoma brucei, Shigella dysenteriae and Schistosoma mansoni) and show its applicability. New genomes can be uploaded upon request. PMID:29106651
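The "integration and weighting of protein information" mentioned in this abstract can be pictured as a weighted sum over per-protein features. The sketch below is only one plausible reading under stated assumptions: the feature names, their 0..1 scaling, the weights, and the protein identifiers are all hypothetical, and the actual scoring used by Target-Pathogen is not described in the abstract.

```python
# Hypothetical per-protein features, each assumed to be scaled to 0..1.
PROTEINS = {
    "protA": {"druggability": 0.9, "essentiality": 1.0, "off_target": 0.2},
    "protB": {"druggability": 0.4, "essentiality": 1.0, "off_target": 0.0},
    "protC": {"druggability": 0.8, "essentiality": 0.0, "off_target": 0.1},
}

# Hypothetical user-chosen weights; a negative weight penalises a feature.
WEIGHTS = {"druggability": 2.0, "essentiality": 3.0, "off_target": -1.0}


def score(features, weights):
    """Weighted sum of the features; a missing feature counts as 0."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


def rank(proteins, weights):
    """Protein identifiers sorted from highest to lowest score."""
    return sorted(proteins, key=lambda p: score(proteins[p], weights), reverse=True)


ranking = rank(PROTEINS, WEIGHTS)
```

Changing the weights changes the ranking, which is the point of letting users weight properties: a user who cares more about essentiality than druggability gets a different shortlist from the same data.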
Target-Pathogen: a structural bioinformatic approach to prioritize drug targets in pathogens.

PubMed

Sosa, Ezequiel J; Burguener, Germán; Lanzarotti, Esteban; Defelipe, Lucas; Radusky, Leandro; Pardo, Agustín M; Marti, Marcelo; Turjanski, Adrián G; Fernández Do Porto, Darío

2018-01-04

Available genomic data for pathogens has created new opportunities for drug discovery and development to fight them, including new resistant and multiresistant strains. In particular, structural data must be integrated with both gene information and experimental results. In this sense, there is a lack of an online resource that allows genome wide-based data consolidation from diverse sources together with thorough bioinformatic analysis that allows easy filtering and scoring for fast target selection for drug discovery. Here, we present Target-Pathogen database (http://target.sbg.qb.fcen.uba.ar/patho), designed and developed as an online resource that allows the integration and weighting of protein information such as: function, metabolic role, off-targeting, structural properties including druggability, essentiality and omic experiments, to facilitate the identification and prioritization of candidate drug targets in pathogens. We include in the database 10 genomes of some of the most relevant microorganisms for human health (Mycobacterium tuberculosis, Mycobacterium leprae, Klebsiella pneumoniae, Plasmodium vivax, Toxoplasma gondii, Leishmania major, Wolbachia bancrofti, Trypanosoma brucei, Shigella dysenteriae and Schistosoma mansoni) and show its applicability. New genomes can be uploaded upon request. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
The SGC beyond structural genomics: redefining the role of 3D structures by coupling genomic stratification with fragment-based discovery.

PubMed

Bradley, Anthony R; Echalier, Aude; Fairhead, Michael; Strain-Damerell, Claire; Brennan, Paul; Bullock, Alex N; Burgess-Brown, Nicola A; Carpenter, Elisabeth P; Gileadi, Opher; Marsden, Brian D; Lee, Wen Hwa; Yue, Wyatt; Bountra, Chas; von Delft, Frank

2017-11-08

The ongoing explosion in genomics data has long since outpaced the capacity of conventional biochemical methodology to verify the large number of hypotheses that emerge from the analysis of such data. In contrast, it is still a gold-standard for early phenotypic validation towards small-molecule drug discovery to use probe molecules (or tool compounds), notwithstanding the difficulty and cost of generating them. Rational structure-based approaches to ligand discovery have long promised the efficiencies needed to close this divergence; in practice, however, this promise remains largely unfulfilled, for a host of well-rehearsed reasons and despite the huge technical advances spearheaded by the structural genomics initiatives of the noughties. Therefore the current, fourth funding phase of the Structural Genomics Consortium (SGC), building on its extensive experience in structural biology of novel targets and design of protein inhibitors, seeks to redefine what it means to do structural biology for drug discovery. We developed the concept of a Target Enabling Package (TEP) that provides, through reagents, assays and data, the missing link between genetic disease linkage and the development of usefully potent compounds. There are multiple prongs to the ambition: rigorously assessing targets' genetic disease linkages through crowdsourcing to a network of collaborating experts; establishing a systematic approach to generate the protocols and data that comprise each target's TEP; developing new, X-ray-based fragment technologies for generating high quality chemical matter quickly and cheaply; and exploiting a stringently open access model to build multidisciplinary partnerships throughout academia and industry. By learning how to scale these approaches, the SGC aims to make structures finally serve genomics, as originally intended, and demonstrate how 3D structures systematically allow new modes of druggability to be discovered for whole classes of targets. 
© 2017 The Author(s).
Using in Vitro Evolution and Whole Genome Analysis To Discover Next Generation Targets for Antimalarial Drug Discovery

PubMed Central

2018-01-01

Although many new anti-infectives have been discovered and developed solely using phenotypic cellular screening and assay optimization, most researchers recognize that structure-guided drug design is more practical and less costly. In addition, a greater chemical space can be interrogated with structure-guided drug design. The practicality of structure-guided drug design has launched a search for the targets of compounds discovered in phenotypic screens. One method that has been used extensively in malaria parasites for target discovery and chemical validation is in vitro evolution and whole genome analysis (IVIEWGA). Here, small molecules from phenotypic screens with demonstrated antiparasitic activity are used in genome-based target discovery methods. In this Review, we discuss the newest, most promising druggable targets discovered or further validated by evolution-based methods, as well as some exceptions. PMID:29451780
Determinants for DNA target structure selectivity of the human LINE-1 retrotransposon endonuclease.

PubMed

Repanas, Kostas; Zingler, Nora; Layer, Liliana E; Schumann, Gerald G; Perrakis, Anastassis; Weichenrieder, Oliver

2007-01-01

The human LINE-1 endonuclease (L1-EN) is the targeting endonuclease encoded by the human LINE-1 (L1) retrotransposon. L1-EN guides the genomic integration of new L1 and Alu elements that presently account for approximately 28% of the human genome. L1-EN bears considerable technological interest, because its target selectivity may ultimately be engineered to allow the site-specific integration of DNA into defined genomic locations. Based on the crystal structure, we generated L1-EN mutants to analyze and manipulate DNA target site recognition. Crystal structures and their dynamic and functional analysis show entire loop grafts to be feasible, resulting in altered specificity, while individual point mutations do not change the nicking pattern of L1-EN. Structural parameters of the DNA target seem more important for recognition than the nucleotide sequence, and nicking profiles on DNA oligonucleotides in vitro are less well defined than the respective integration site consensus in vivo. This suggests that additional factors other than the DNA nicking specificity of L1-EN contribute to the targeted integration of non-LTR retrotransposons.
The Paris-Sud yeast structural genomics pilot-project: from structure to function.

PubMed

Quevillon-Cheruel, Sophie; Liger, Dominique; Leulliot, Nicolas; Graille, Marc; Poupon, Anne; Li de La Sierra-Gallay, Inès; Zhou, Cong-Zhao; Collinet, Bruno; Janin, Joël; Van Tilbeurgh, Herman

2004-01-01

We present here the outlines and results from our yeast structural genomics (YSG) pilot-project. A lab-scale platform for the systematic production and structure determination is presented. In order to validate this approach, 250 non-membrane proteins of unknown structure were targeted. Strategies and final statistics are evaluated. We finally discuss the opportunity of structural genomics programs to contribute to functional biochemical annotation.
Anti-infectious drug repurposing using an integrated chemical genomics and structural systems biology approach.

PubMed

Ng, Clara; Hauptman, Ruth; Zhang, Yinliang; Bourne, Philip E; Xie, Lei

2014-01-01

The emergence of multi-drug and extensive drug resistance of microbes to antibiotics poses a great threat to human health. Although drug repurposing is a promising solution for accelerating the drug development process, its application to anti-infectious drug discovery is limited by the scope of existing phenotype-, ligand-, or target-based methods. In this paper we introduce a new computational strategy to determine the genome-wide molecular targets of bioactive compounds in both human and bacterial genomes. Our method is based on the use of a novel algorithm, ligand Enrichment of Network Topological Similarity (ligENTS), to map the chemical universe to its global pharmacological space. ligENTS outperforms the state-of-the-art algorithms in identifying novel drug-target relationships. Furthermore, we integrate ligENTS with our structural systems biology platform to identify drug repurposing opportunities via target similarity profiling. Using this integrated strategy, we have identified novel P. falciparum targets of drug-like active compounds from the Malaria Box, and suggest that a number of approved drugs may be active against malaria. This study demonstrates the potential of an integrative chemical genomics and structural systems biology approach to drug repurposing.
microRNA-122 target sites in the hepatitis C virus RNA NS5B coding region and 3' untranslated region: function in replication and influence of RNA secondary structure.

PubMed

Gerresheim, Gesche K; Dünnes, Nadia; Nieder-Röhrmann, Anika; Shalamova, Lyudmila A; Fricke, Markus; Hofacker, Ivo; Höner Zu Siederdissen, Christian; Marz, Manja; Niepmann, Michael

2017-02-01

We have analyzed the binding of the liver-specific microRNA-122 (miR-122) to three conserved target sites of hepatitis C virus (HCV) RNA, two in the non-structural protein 5B (NS5B) coding region and one in the 3' untranslated region (3'UTR). miR-122 binding efficiency strongly depends on target site accessibility under conditions when the range of flanking sequences available for the formation of local RNA secondary structures changes. Our results indicate that the particular sequence feature that contributes most to the correlation between target site accessibility and binding strength varies between different target sites. This suggests that the dynamics of miRNA/Ago2 binding not only depends on the target site itself but also on flanking sequence context to a considerable extent, in particular in a small viral genome in which strong selection constraints act on coding sequence and overlapping cis-signals and model the accessibility of cis-signals. In full-length genomes, single and combination mutations in the miR-122 target sites reveal that site 5B.2 is positively involved in regulating overall genome replication efficiency, whereas mutation of site 5B.3 showed a weaker effect. Mutation of the 3'UTR site and double or triple mutants showed no significant overall effect on genome replication, whereas in a translation reporter RNA, the 3'UTR target site inhibits translation directed by the HCV 5'UTR. Thus, the miR-122 target sites in the 3'-region of the HCV genome are involved in a complex interplay in regulating different steps of the HCV replication cycle.
Structural Genomics and Drug Discovery for Infectious Diseases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anderson, W.F.

The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.
Advances in targeted genome editing.

PubMed

Perez-Pinera, Pablo; Ousterout, David G; Gersbach, Charles A

2012-08-01

New technologies have recently emerged that enable targeted editing of genomes in diverse systems. This includes precise manipulation of gene sequences in their natural chromosomal context and addition of transgenes to specific genomic loci. This progress has been facilitated by advances in engineering targeted nucleases with programmable, site-specific DNA-binding domains, including zinc finger proteins and transcription activator-like effectors (TALEs). Recent improvements have enhanced nuclease performance, accelerated nuclease assembly, and lowered the cost of genome editing. These advances are driving new approaches to many areas of biotechnology, including biopharmaceutical production, agriculture, creation of transgenic organisms and cell lines, and studies of genome structure, regulation, and function. Genome editing is also being investigated in preclinical and clinical gene therapies for many diseases. Copyright © 2012 Elsevier Ltd. All rights reserved.
Partial DNA-guided Cas9 enables genome editing with reduced off-target activity

PubMed Central

Yin, Hao; Song, Chun-Qing; Suresh, Sneha; Kwan, Suet-Yan; Wu, Qiongqiong; Walsh, Stephen; Ding, Junmei; Bogorad, Roman L; Zhu, Lihua Julie; Wolfe, Scot A; Koteliansky, Victor; Xue, Wen; Langer, Robert; Anderson, Daniel G

2018-01-01

CRISPR–Cas9 is a versatile RNA-guided genome editing tool. Here we demonstrate that partial replacement of RNA nucleotides with DNA nucleotides in CRISPR RNA (crRNA) enables efficient gene editing in human cells. This strategy of partial DNA replacement retains on-target activity when used with both crRNA and sgRNA, as well as with multiple guide sequences. Partial DNA replacement also works for crRNA of Cpf1, another CRISPR system. We find that partial DNA replacement in the guide sequence significantly reduces off-target genome editing through focused analysis of off-target cleavage, measurement of mismatch tolerance and genome-wide profiling of off-target sites. Using the structure of the Cas9–sgRNA complex as a guide, the majority of the 3′ end of crRNA can be replaced with DNA nucleotides, and the 5′- and 3′-DNA-replaced crRNA enables efficient genome editing. Cas9 guided by a DNA–RNA chimera may provide a generalized strategy to reduce both the cost and the off-target genome editing in human cells. PMID:29377001
Functional RNA structures throughout the Hepatitis C Virus genome.

PubMed

Adams, Rebecca L; Pirakitikulr, Nathan; Pyle, Anna Marie

2017-06-01

The single-stranded Hepatitis C Virus (HCV) genome adopts a set of elaborate RNA structures that are involved in every stage of the viral lifecycle. Recent advances in chemical probing, sequencing, and structural biology have facilitated analysis of RNA folding on a genome-wide scale, revealing novel structures and networks of interactions. These studies have underscored the active role played by RNA in every function of HCV and they open the door to new types of RNA-targeted therapeutics. Copyright © 2017 Elsevier B.V. All rights reserved.
Center for Cancer Genomics | Office of Cancer Genomics

Cancer.gov

The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approaches, CCG aims to accelerate structural, functional and computational research to explore cancer mechanisms, discover new cancer targets, and develop new therapeutics.
Leveraging structure determination with fragment screening for infectious disease drug targets: MECP synthase from Burkholderia pseudomallei

DOE Office of Scientific and Technical Information (OSTI.GOV)

Begley, Darren W.; Hartley, Robert C.; Davies, Douglas R.

As part of the Seattle Structural Genomics Center for Infectious Disease, we seek to enhance structural genomics with ligand-bound structure data which can serve as a blueprint for structure-based drug design. We have adapted fragment-based screening methods to our structural genomics pipeline to generate multiple ligand-bound structures of high priority drug targets from pathogenic organisms. In this study, we report fragment screening methods and structure determination results for 2C-methyl-D-erythritol-2,4-cyclo-diphosphate (MECP) synthase from Burkholderia pseudomallei, the gram-negative bacterium which causes melioidosis. Screening by nuclear magnetic resonance spectroscopy as well as crystal soaking followed by X-ray diffraction led to the identification of several small molecules which bind this enzyme in a critical metabolic pathway. A series of complex structures obtained with screening hits reveal distinct binding pockets and a range of small molecules which form complexes with the target. Additional soaks with these compounds further demonstrate a subset of fragments to only bind the protein when present in specific combinations. This ensemble of fragment-bound complexes illuminates several characteristics of MECP synthase, including a previously unknown binding surface external to the catalytic active site. These ligand-bound structures now serve to guide medicinal chemists and structural biologists in rational design of novel inhibitors for this enzyme.
Complete genome-wide screening and subtractive genomic approach revealed new virulence factors, potential drug targets against bio-war pathogen Brucella melitensis 16M

PubMed Central

Pradeepkiran, Jangampalli Adi; Sainath, Sri Bhashyam; Kumar, Konidala Kranthi; Bhaskar, Matcha

2015-01-01

Brucella melitensis 16M is a Gram-negative coccobacillus that infects both animals and humans. It causes a disease known as brucellosis, which is characterized by acute febrile illness in humans and causes abortions in livestock. To prevent and control brucellosis, identification of putative drug targets is crucial. The present study aimed to identify drug targets in B. melitensis 16M by using a subtractive genomic approach. We used available database repositories (Database of Essential Genes, Kyoto Encyclopedia of Genes and Genomes Automatic Annotation Server, and Kyoto Encyclopedia of Genes and Genomes) to identify putative genes that are nonhomologous to humans and essential for the pathogen B. melitensis 16M. The results revealed that, within the 3 Mb genome of the pathogen, 53 putative characterized and 13 uncharacterized hypothetical genes were identified; further, from Basic Local Alignment Search Tool protein analysis, one hypothetical protein showed a close resemblance (50%) to Silicibacter pomeroyi DUF1285 family protein (2RE3). A further homology model of the target was constructed using MODELLER 9.12 and optimized through the variable target function method by molecular dynamics optimization with simulated annealing. The stereochemical quality of the restrained model was evaluated by PROCHECK, VERIFY-3D, ERRAT, and WHATIF servers. Furthermore, structure-based virtual screening was carried out against the predicted active site of the respective protein using the glycerol structural analogs from the PubChem database. We identified five best inhibitors with strong affinities, stable interactions, and also with reliable drug-like properties. Hence, these leads might be used as the most effective inhibitors of the modeled protein. The outcome of the present work of virtual screening of putative gene targets might facilitate design of potential drugs for better treatment against brucellosis. PMID:25834405
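The core filter of a subtractive genomic approach, as described in this abstract, keeps genes that are essential to the pathogen and have no human homolog. A minimal sketch with Python sets follows; the gene identifiers and set memberships are hypothetical, and in practice each set would come from database searches such as those named in the abstract rather than being typed in by hand.

```python
# Hypothetical gene identifiers for illustration only.
pathogen_genes = {"g1", "g2", "g3", "g4", "g5"}
essential_genes = {"g1", "g2", "g4"}  # e.g. hits in an essential-gene database
human_homologs = {"g2", "g5"}         # genes with a close human homolog


def subtractive_filter(genes, essential, homologous_to_host):
    """Keep genes that are essential AND not homologous to the host."""
    return sorted((genes & essential) - homologous_to_host)


candidates = subtractive_filter(pathogen_genes, essential_genes, human_homologs)
```

Here g2 is essential but is dropped because of its human homolog, g3 is dropped because it is not essential, and the remaining genes form the candidate list.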
Combining functional and structural genomics to sample the essential Burkholderia structome.

PubMed

Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

2013-01-01

The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail.
This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases caused by Burkholderia. All expression clones and proteins created in this study are freely available by request.
Structural Basis for Guide RNA Processing and Seed-Dependent DNA Targeting by CRISPR-Cas12a.

PubMed

Swarts, Daan C; van der Oost, John; Jinek, Martin

2017-04-20

The CRISPR-associated protein Cas12a (Cpf1), which has been repurposed for genome editing, possesses two distinct nuclease activities: endoribonuclease activity for processing its own guide RNAs and RNA-guided DNase activity for target DNA cleavage. To elucidate the molecular basis of both activities, we determined crystal structures of Francisella novicida Cas12a bound to guide RNA and in complex with an R-loop formed by a non-cleavable guide RNA precursor and a full-length target DNA. Corroborated by biochemical experiments, these structures reveal the mechanisms of guide RNA processing and pre-ordering of the seed sequence in the guide RNA that primes Cas12a for target DNA binding. Furthermore, the R-loop complex structure reveals the strand displacement mechanism that facilitates guide-target hybridization and suggests a mechanism for double-stranded DNA cleavage involving a single active site. Together, these insights advance our mechanistic understanding of Cas12a enzymes and may contribute to further development of genome editing technologies. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility.

PubMed

Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna

2016-04-07

DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. Copyright © 2016 Vrljicak et al.
A machine-learned computational functional genomics-based approach to drug classification.

PubMed

Lötsch, Jörn; Ultsch, Alfred

2016-12-01

The public accessibility of "big data" about the molecular targets of drugs and the biological functions of genes allows novel data science-based approaches to pharmacology that link drugs directly with their effects on pathophysiologic processes. This provides a phenotypic path to drug discovery and repurposing. This paper compares the performance of a functional genomics-based criterion to the traditional drug target-based classification. Knowledge discovery in the DrugBank and Gene Ontology databases allowed the construction of a "drug target versus biological process" matrix as a combination of "drug versus genes" and "genes versus biological processes" matrices. As a canonical example, such matrices were constructed for classical analgesic drugs. These matrices were projected onto a toroid grid of 50 × 82 artificial neurons using a self-organizing map (SOM). The distance (or cluster) structure of the high-dimensional feature space of the matrices was visualized on top of this SOM using a U-matrix. The cluster structure emerging on the U-matrix provided a correct classification of the analgesics into two main classes of opioid and non-opioid analgesics. The classification was flawless with both the functional genomics and the traditional target-based criterion. The functional genomics approach inherently included the drugs' modulatory effects on biological processes. The main pharmacological actions known from pharmacological science were captured, e.g., actions on lipid signaling for non-opioid analgesics that comprised many NSAIDs and actions on neuronal signal transmission for opioid analgesics. Using machine-learned techniques for computational drug classification in a comparative assessment, a functional genomics-based criterion was found to be similarly suitable for drug classification as the traditional target-based criterion.
This supports a utility of functional genomics-based approaches to computational system pharmacology for drug discovery and repurposing.
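One way to read the abstract's "combination" of a "drug versus genes" matrix with a "genes versus biological processes" matrix is as a Boolean matrix product: a drug is linked to a process if at least one of its target genes is annotated to that process. The sketch below assumes that reading, which the abstract does not spell out, and uses small hypothetical 0/1 matrices rather than DrugBank or Gene Ontology data.

```python
def compose(drug_gene, gene_process):
    """Boolean matrix product: result[d][p] is 1 if drug d targets any gene
    that is annotated to process p, else 0."""
    n_genes = len(gene_process)
    n_processes = len(gene_process[0])
    return [
        [
            int(any(row[g] and gene_process[g][p] for g in range(n_genes)))
            for p in range(n_processes)
        ]
        for row in drug_gene
    ]


# Hypothetical data: 2 drugs x 3 genes, and 3 genes x 2 processes.
drug_gene = [
    [1, 1, 0],  # drug 0 targets genes 0 and 1
    [0, 0, 1],  # drug 1 targets gene 2
]
gene_process = [
    [1, 0],  # gene 0 is annotated to process 0
    [1, 0],  # gene 1 is annotated to process 0
    [0, 1],  # gene 2 is annotated to process 1
]

drug_process = compose(drug_gene, gene_process)
```

Each row of the resulting matrix is a drug's feature vector over biological processes, the kind of input that could then be passed to a clustering method such as the self-organizing map mentioned in the abstract.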

An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity.

PubMed

Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora

2015-05-01

Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders. © 2015 WILEY Periodicals, Inc.
Cloning, production, and purification of proteins for a medium-scale structural genomics project.

PubMed

Quevillon-Cheruel, Sophie; Collinet, Bruno; Trésaugues, Lionel; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Zhou, Cong-Zhao; Liger, Dominique; Bettache, Nabila; Poupon, Anne; Aboulfath, Ilham; Leulliot, Nicolas; Janin, Joël; van Tilbeurgh, Herman

2007-01-01

The South-Paris Yeast Structural Genomics Pilot Project (http://www.genomics.eu.org) aims at systematically expressing, purifying, and determining the three-dimensional structures of Saccharomyces cerevisiae proteins. We have already cloned 240 yeast open reading frames in the Escherichia coli pET system. Eighty-two percent of the targets can be expressed in E. coli, and 61% yield soluble protein. We have currently purified 58 proteins. Twelve X-ray structures have been solved, six are in progress, and six other proteins gave crystals. In this chapter, we present the general experimental flowchart applied for this project. One of the main difficulties encountered in this pilot project was the low solubility of a great number of target proteins. We have developed parallel strategies to recover these proteins from inclusion bodies, including refolding, coexpression with chaperones, and an in vitro expression system. A limited proteolysis protocol, developed to localize flexible regions in proteins that could hinder crystallization, is also described.
Toward Repurposing Metformin as a Precision Anti-Cancer Therapy Using Structural Systems Pharmacology

PubMed Central

Hart, Thomas; Dider, Shihab; Han, Weiwei; Xu, Hua; Zhao, Zhongming; Xie, Lei

2016-01-01

Metformin, a drug prescribed to treat type-2 diabetes, exhibits anti-cancer effects in a portion of patients, but the direct molecular and genetic interactions leading to this pleiotropic effect have not yet been fully explored. To repurpose metformin as a precision anti-cancer therapy, we have developed a novel structural systems pharmacology approach to elucidate metformin’s molecular basis and genetic biomarkers of action. We integrated structural proteome-scale drug target identification with network biology analysis by combining structural genomic, functional genomic, and interactomic data. Through searching the human structural proteome, we identified twenty putative metformin binding targets and their interaction models. We experimentally verified the interactions between metformin and our top-ranked kinase targets. Notably, kinases, particularly SGK1 and EGFR, were identified as key molecular targets of metformin. Subsequently, we linked these putative binding targets, through protein-protein interactions, to genes that do not directly bind to metformin but whose expression is altered by metformin, and identified network biomarkers of phenotypic response to metformin. The molecular targets and the key nodes in genetic networks are largely consistent with the existing experimental evidence. Their interactions can be affected by the observed cancer mutations. This study will shed new light on repurposing metformin for safe, effective, personalized therapies. PMID:26841718
Low incidence of SNVs and indels in trio genomes of Cas9-mediated multiplex edited sheep.

PubMed

Wang, Xiaolong; Liu, Jing; Niu, Yiyuan; Li, Yan; Zhou, Shiwei; Li, Chao; Ma, Baohua; Kou, Qifang; Petersen, Bjoern; Sonstegard, Tad; Huang, Xingxu; Jiang, Yu; Chen, Yulin

2018-05-25

The simplicity of the CRISPR/Cas9 system has enabled its widespread application in generating animal models, in functional genomic screening and in treating genetic and infectious diseases. However, unintended mutations produced by off-target CRISPR/Cas9 nuclease activity may lead to negative consequences. In particular, a very recent study reporting that gene editing can introduce hundreds of unintended mutations into the genome has attracted wide attention. To address these off-target concerns, characterization of CRISPR/Cas9-mediated off-target mutagenesis is urgently needed. Here we took advantage of our previously generated gene-edited sheep and performed family trio-based whole genome sequencing, which is capable of discriminating variants in the edited progenies that are inherited, naturally generated, or induced by genetic modification. Three family trios were re-sequenced at a high average depth of genomic coverage (~ 25.8×). After developing a pipeline to comprehensively analyze the sequence data for de novo single nucleotide variants, indels and structural variations from the genome, we found only a single unintended event, in the form of a 2.4 kb inversion induced by site-specific double-strand breaks between two sgRNA targeting sites at the MSTN locus, with a low incidence. We provide the first report on the fidelity of CRISPR-based modification for sheep genomes targeted simultaneously for gene breaks at three coding sequence locations. The trio-based sequencing approach revealed almost negligible off-target modifications, providing timely evidence of the safe application of genome editing in vivo with CRISPR/Cas9.
Mms1 is an assistant for regulating G-quadruplex DNA structures.

PubMed

Schwindt, Eike; Paeschke, Katrin

2018-06-01

The preservation of genome stability is fundamental for every cell. Genomic integrity is constantly challenged. Among those challenges are non-canonical nucleic acid structures. In recent years, scientists became aware of the impact of G-quadruplex (G4) structures on genome stability. It has been shown that folded G4-DNA structures cause changes in the cell, such as transcriptional up/down-regulation, replication stalling, or enhanced genome instability. Multiple helicases have been identified that regulate G4 structures and thereby preserve genome stability. Interestingly, although these helicases are mostly ubiquitously expressed, they show specificity for G4 regulation in certain cellular processes (e.g., DNA replication). To date, it is not clear how this process and target specificity of helicases are achieved. Recently, Mms1, a ubiquitin ligase complex protein, was identified as a novel G4-DNA-binding protein that supports genome stability by aiding Pif1 helicase binding to these regions. In this perspective review, we discuss the question of whether G4-DNA interacting proteins are fundamental for helicase function and specificity at G4-DNA structures.
The Druggable Pocketome of Corynebacterium diphtheriae: A New Approach for in silico Putative Druggable Targets

PubMed Central

Hassan, Syed S.; Jamal, Syed B.; Radusky, Leandro G.; Tiwari, Sandeep; Ullah, Asad; Ali, Javed; Behramand; de Carvalho, Paulo V. S. D.; Shams, Rida; Khan, Sabir; Figueiredo, Henrique C. P.; Barh, Debmalya; Ghosh, Preetam; Silva, Artur; Baumbach, Jan; Röttger, Richard; Turjanski, Adrián G.; Azevedo, Vasco A. C.

2018-01-01

Diphtheria, an acute and highly infectious disease previously regarded as endemic in nature but vaccine-preventable, is caused by Corynebacterium diphtheriae (Cd). In this work, we used an in silico approach across the 13 complete genome sequences of C. diphtheriae followed by a computational assessment of structural information of the binding sites to characterize the “pocketome druggability.” To this end, we first computed the “modelome” (3D structures of a complete genome) of a randomly selected reference strain, Cd NCTC13129, which had 13,763 open reading frames (ORFs) and resulted in 1,253 (∼9%) structure models. The amino acid sequences of these modeled structures were compared with the remaining 12 genomes and consequently, 438 conserved protein sequences were obtained. The RCSB-PDB database was consulted to check the template structures for these conserved proteins and as a result, 401 adequate 3D models were obtained. We subsequently predicted the protein pockets for the obtained set of models and kept only the conserved pockets that had highly druggable (HD) values (137 across all strains). Later, an off-target host homology analysis was performed considering the human proteome using the NCBI database. Furthermore, a gene essentiality analysis was carried out that gave a final set of 10 conserved targets possessing highly druggable protein pockets. To check the target identification robustness of the pipeline used in this work, we crosschecked the final target list with another in-house target identification approach for C. diphtheriae, thereby obtaining three common targets: hisE-phosphoribosyl-ATP pyrophosphatase, glpX-fructose 1,6-bisphosphatase II, and rpsH-30S ribosomal protein S8. Our predicted results suggest that the in silico approach used could potentially aid in experimental polypharmacological target determination in C. diphtheriae and other pathogens, and thereby might complement the existing and new drug-discovery pipelines. PMID:29487617
Chemical biology on the genome.

PubMed

Balasubramanian, Shankar

2014-08-15

In this article I discuss studies towards understanding the structure and function of DNA in the context of genomes from the perspective of a chemist. The first area I describe concerns the studies that led to the invention and subsequent development of a method for sequencing DNA on a genome scale at high speed and low cost, now known as Solexa/Illumina sequencing. The second theme will feature the four-stranded DNA structure known as a G-quadruplex with a focus on its fundamental properties, its presence in cellular genomic DNA and the prospects for targeting such a structure in cells with small molecules. The final topic for discussion is naturally occurring chemically modified DNA bases with an emphasis on chemistry for decoding (or sequencing) such modifications in genomic DNA. The genome is a fruitful topic to be further elucidated by the creation and application of chemical approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.
Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome.

PubMed

Wang, Yi; Liu, Xianju; Ren, Chong; Zhong, Gan-Yuan; Yang, Long; Li, Shaohua; Liang, Zhenchang

2016-04-21

CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of humans, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific target sites for CRISPR/Cas9 have been computationally identified for several annual model and crop species, but such sites have not been reported for perennial, woody fruit species. In this study, we identified and characterized five types of CRISPR/Cas9 target sites in the widely cultivated grape species Vitis vinifera and developed a user-friendly database for editing grape genomes in the future. A total of 35,767,960 potential CRISPR/Cas9 target sites were identified from grape genomes in this study. Among them, 22,597,817 target sites were mapped to specific genomic locations and 7,269,788 were found to be highly specific. Protospacers and PAMs were found to distribute uniformly and abundantly in the grape genomes. They were present in all the structural elements of genes with the coding region having the highest abundance. Five PAM types, TGG, AGG, GGG, CGG and NGG, were observed. With the exception of the NGG type, they were abundantly present in the grape genomes. Synteny analysis of similar genes revealed that the synteny of protospacers matched the synteny of homologous genes. A user-friendly database containing protospacers and detailed information of the sites was developed and is available for public use at the Grape-CRISPR website ( http://biodb.sdau.edu.cn/gc/index.html ). Grape genomes harbour millions of potential CRISPR/Cas9 target sites. These sites are widely distributed among and within chromosomes with predominant abundance in the coding regions of genes. We developed a publicly-accessible Grape-CRISPR database for facilitating the use of the CRISPR/Cas9 system as a genome editing tool for functional studies and molecular breeding of grapes. 
Among other functions, the database allows users to identify and select multi-protospacers for editing similar sequences in grape genomes simultaneously.
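
The abstract above rests on computationally locating protospacers next to NGG-type PAMs. As a rough illustration of that idea, the sketch below scans the forward strand of a sequence for every NGG motif preceded by a full-length protospacer. The function name, the 20-nt protospacer length, and the forward-strand-only scan are assumptions made for this example; the abstract does not describe the authors' actual pipeline, and this is not the Grape-CRISPR implementation.

```python
import re


def find_target_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each forward-strand NGG PAM
    that is preceded by a complete protospacer.

    protospacer_len=20 is an assumed default, not a value from the abstract.
    """
    seq = seq.upper()
    sites = []
    # A lookahead pattern lets overlapping PAMs (e.g. in "GGGG") all be found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - protospacer_len
        if start >= 0:
            sites.append((start, seq[start:pam_start], match.group(1)))
    return sites


# Hypothetical 24-nt sequence: a 20-nt protospacer, a TGG PAM, and one extra base.
example = "ACGTACGTACGTACGTACGTTGGA"
sites = find_target_sites(example)
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

A fuller search would also scan the reverse complement and then score each candidate for specificity against the rest of the genome, which is the step that separates "potential" from "highly specific" sites.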
Ab Initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment

PubMed Central

Xu, Dong; Zhang, Yang

2013-01-01

Genome-wide protein structure prediction and structure-based function annotation have been long-term goals in molecular biology but have not yet become possible due to difficulties in modeling distant-homology targets. We developed a hybrid pipeline combining ab initio folding and template-based modeling for genome-wide structure prediction applied to the Escherichia coli genome. The pipeline was tested on 43 known sequences, where QUARK-based ab initio folding simulation generated models with TM-score 17% higher than that by traditional comparative modeling methods. For 495 unknown hard sequences, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 have a substantial portion of structure correctly modeled (TM-score > 0.35). 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms. PMID:23719418
Schistosoma comparative genomics: integrating genome structure, parasite biology and anthelmintic discovery

PubMed Central

Swain, Martin T.; Larkin, Denis M.; Caffrey, Conor R.; Davies, Stephen J.; Loukas, Alex; Skelly, Patrick J.; Hoffmann, Karl F.

2011-01-01

Schistosoma genomes provide a comprehensive resource for identifying the molecular processes that shape parasite evolution and for discovering novel chemotherapeutic or immunoprophylactic targets. Here, we demonstrate how intra- and intergenus comparative genomics can be used to drive these investigations forward, illustrate the advantages and limitations of these approaches and review how post genomic technologies offer complementary strategies for genome characterisation. While sequencing and functional characterisation of other schistosome/platyhelminth genomes continues to expedite anthelmintic discovery, we contend that future priorities should equally focus on improving assembly quality, and chromosomal assignment, of existing schistosome/platyhelminth genomes. PMID:22024648
Genome engineering in human cells.

PubMed

Song, Minjung; Kim, Young-Hoon; Kim, Jin-Soo; Kim, Hyongbum

2014-01-01

Genome editing in human cells is of great value in research, medicine, and biotechnology. Programmable nucleases including zinc-finger nucleases, transcription activator-like effector nucleases, and RNA-guided engineered nucleases recognize a specific target sequence and make a double-strand break at that site, which can result in gene disruption, gene insertion, gene correction, or chromosomal rearrangements. The target sequence complexities of these programmable nucleases are higher than 3.2 giga base pairs, the size of the haploid human genome. Here, we briefly introduce the structure of the human genome and the characteristics of each programmable nuclease, and review their applications in human cells including pluripotent stem cells. In addition, we discuss various delivery methods for nucleases, programmable nickases, and enrichment of gene-edited human cells, all of which facilitate efficient and precise genome editing in human cells.
Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

Cancer.gov

Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.
The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

PubMed

Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

2015-01-01

A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Structure of Lmaj006129AAA, a hypothetical protein from Leishmania major

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arakaki, Tracy; Le Trong, Isolde; Structural Genomics of Pathogenic Protozoa

2006-03-01

The crystal structure of a conserved hypothetical protein from L. major, Pfam sequence family PF04543, structural genomics target ID Lmaj006129AAA, has been determined at a resolution of 1.6 Å. The gene product of structural genomics target Lmaj006129 from Leishmania major codes for a 164-residue protein of unknown function. When SeMet expression of the full-length gene product failed, several truncation variants were created with the aid of Ginzu, a domain-prediction method. Eleven truncations were selected for expression, purification and crystallization based upon secondary-structure elements and disorder. The structure of one of these variants, Lmaj006129AAH, was solved by multiple-wavelength anomalous diffraction (MAD) using ELVES, an automatic protein crystal structure-determination system. This model was then successfully used as a molecular-replacement probe for the parent full-length target, Lmaj006129AAA. The final structure of Lmaj006129AAA was refined to an R value of 0.185 (R_free = 0.229) at 1.60 Å resolution. Structure and sequence comparisons based on Lmaj006129AAA suggest that proteins belonging to Pfam sequence families PF04543 and PF01878 may share a common ligand-binding motif.
Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

PubMed Central

2009-01-01

Background: Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results: We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion: We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome.

PubMed

Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg

2009-08-06

Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 
The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes.
Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome

PubMed Central

Baugh, Loren; Gallagher, Larry A.; Patrapuvich, Rapatbhorn; Clifton, Matthew C.; Gardberg, Anna S.; Edwards, Thomas E.; Armour, Brianna; Begley, Darren W.; Dieterich, Shellie H.; Dranow, David M.; Abendroth, Jan; Fairman, James W.; Fox, David; Staker, Bart L.; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W.; Stacy, Robin; Myler, Peter J.; Stewart, Lance J.; Manoil, Colin; Van Voorhis, Wesley C.

2013-01-01

Background: The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. Methodology/Principal Findings: We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an “ortholog rescue” strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. Conclusions/Significance: This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases caused by Burkholderia. All expression clones and proteins created in this study are freely available by request. PMID:23382856
Multi-target parallel processing approach for gene-to-structure determination of the influenza polymerase PB2 subunit.

PubMed

Armour, Brianna L; Barnes, Steve R; Moen, Spencer O; Smith, Eric; Raymond, Amy C; Fairman, James W; Stewart, Lance J; Staker, Bart L; Begley, Darren W; Edwards, Thomas E; Lorimer, Donald D

2013-06-28

Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year (1). Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans (2). Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Diseases (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains.
Targeting RNA–Protein Interactions within the Human Immunodeficiency Virus Type 1 Lifecycle

PubMed Central

2013-01-01

RNA–protein interactions are vital throughout the HIV-1 life cycle for the successful production of infectious virus particles. One such essential RNA–protein interaction occurs between the full-length genomic viral RNA and the major structural protein of the virus. The initial interaction is between the Gag polyprotein and the viral RNA packaging signal (psi or Ψ), a highly conserved RNA structural element within the 5′-UTR of the HIV-1 genome, which has gained attention as a potential therapeutic target. Here, we report the application of a target-based assay to identify small molecules, which modulate the interaction between Gag and Ψ. We then demonstrate that one such molecule exhibits potent inhibitory activity in a viral replication assay. The mode of binding of the lead molecules to the RNA target was characterized by 1H NMR spectroscopy. PMID:24358934
Transposon-like properties of the major, long repetitive sequence family in the genome of Physarum polycephalum

PubMed Central

Pearston, Douglas H.; Gordon, Mairi; Hardman, Norman

1985-01-01

A family of long, highly-repetitive sequences, referred to previously as 'HpaII-repeats', dominates the genome of the eukaryotic slime mould Physarum polycephalum. These sequences are found exclusively in scrambled clusters. They account for about one-half of the total complement of repetitive DNA in Physarum, and represent the major sequence component found in hypermethylated, 20-50 kb segments of Physarum genomic DNA that fail to be cleaved using the restriction endonuclease HpaII. The structure of this abundant repetitive element was investigated by analysing cloned segments derived from the hypermethylated genomic DNA compartment. We show that the 'HpaII-repeat' forms part of a larger repetitive DNA structure, ∼8.6 kb in length, with several structural features in common with recognised eukaryotic transposable genetic elements. Scrambled clusters of the sequence probably arise as a result of transposition-like events, during which the element preferentially recombines in either orientation with target sites located in other copies of the same repeated sequence. The target sites for transposition/recombination are not related in sequence but in all cases studied they are potentially capable of promoting the formation of small 'cruciforms' or 'Z-DNA' structures which might be recognised during the recombination process. PMID:16453652

BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers.

PubMed

Abo, Ryan P; Ducar, Matthew; Garcia, Elizabeth P; Thorner, Aaron R; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M; Hahn, William C; Meyerson, Matthew; Lindeman, Neal I; Van Hummelen, Paul; MacConaill, Laura E

2015-02-18

Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
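
The k-mer idea mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not BreaKmer's implementation: the function names, the toy sequences and the choice of k are assumptions made here purely for illustration. The sketch collects the k-mers of a reference target region and lists the k-mers of a read that are absent from it, the kind of signal that might flag a read spanning a breakpoint.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(read, reference, k):
    """Return the k-mers of read that do not occur in reference, in read order."""
    ref_kmers = kmers(reference, k)
    return [read[i:i + k]
            for i in range(len(read) - k + 1)
            if read[i:i + k] not in ref_kmers]


# Toy data (hypothetical): the read matches the reference, then diverges.
reference = "ACGTACGGTCA"
read = "ACGTACTTTT"
sample_kmers = novel_kmers(read, reference, k=4)
```

In this toy case the first three k-mers of the read are shared with the reference and the remaining ones are not; a real caller would go on to assemble and realign such reads, as the abstract describes.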
Herbicide targets and detoxification proteins in sugarcane: from gene assembly to structure modelling.

PubMed

Lloyd Evans, Dyfed; Joshi, Shailesh Vinay

2017-07-01

In a genome context, sugarcane is a classic orphan crop, in that no genome and only very few genes have been assembled. We have devised a novel exome assembly methodology that has allowed us to assemble and characterize 49 genes that serve as herbicide targets, safener interacting proteins, and members of herbicide detoxification pathways within the sugarcane genome. We have structurally modelled the products of each of these genes, as well as determining allelic, genomic, and RNA-Seq based polymorphisms for each gene. This study provides the largest collection of sugarcane structures modelled to date. We demonstrate that sugarcane genes are highly polymorphic, revealing that each genotype is evolving both uniquely and independently. In addition, we present an exome assembly system for orphan crops that can be executed on commodity infrastructure, making exome assembly practical for any group. In terms of knowledge about herbicide modes of action and detoxification, we have advanced sugarcane from a crop where no information about any herbicide-associated gene was available to the situation where sugarcane is now a species with the single largest collection of known and annotated herbicide-associated genes.
[Three-dimensional genome organization: a lesson from the Polycomb-Group proteins].

PubMed

Bantignies, Frédéric

2013-01-01

As more and more genomes are being explored and annotated, important features of three-dimensional (3D) genome organization are just being uncovered. In the light of what we know about Polycomb group (PcG) proteins, we will present the latest findings on this topic. The PcG proteins are well-conserved chromatin factors that repress transcription of numerous target genes. They bind the genome at specific sites, forming chromatin domains of associated histone modifications as well as higher-order chromatin structures. These 3D chromatin structures involve the interactions between PcG-bound regulatory regions at short- and long-range distances, and may significantly contribute to PcG function. Recent high throughput "Chromosome Conformation Capture" (3C) analyses have revealed many other higher order structures along the chromatin fiber, partitioning the genomes into well demarcated topological domains. This revealed an unprecedented link between linear epigenetic domains and chromosome architecture, which might be intimately connected to genome function. © Société de Biologie, 2013.
The molecular landscape of pediatric acute myeloid leukemia reveals recurrent structural alterations and age-specific mutational interactions | Office of Cancer Genomics

Cancer.gov

We present the molecular landscape of pediatric acute myeloid leukemia (AML) and characterize nearly 1,000 participants in Children’s Oncology Group (COG) AML trials. The COG–National Cancer Institute (NCI) TARGET AML initiative assessed cases by whole-genome, targeted DNA, mRNA and microRNA sequencing and CpG methylation profiling. Validated DNA variants corresponded to diverse, infrequent mutations, with fewer than 40 genes mutated in >2% of cases.
DNA methylation pathways and their crosstalk with histone methylation

PubMed Central

Du, Jiamu; Johnson, Lianna M.; Jacobsen, Steven E.; Patel, Dinshaw J.

2015-01-01

Methylation of DNA and of histone 3 at Lys 9 (H3K9) are highly correlated with gene silencing in eukaryotes from fungi to humans. Both of these epigenetic marks need to be established at specific regions of the genome and then maintained at these sites through cell division. Protein structural domains that specifically recognize methylated DNA and methylated histones are key for targeting enzymes that catalyse these marks to appropriate genome sites. Genetic, genomic, structural and biochemical data reveal connections between these two epigenetic marks, and these domains mediate much of the crosstalk. PMID:26296162
Progress towards mapping the universe of protein folds

PubMed Central

Grant, Alastair; Lee, David; Orengo, Christine

2004-01-01

Although the precise aims differ between the various international structural genomics initiatives currently aiming to illuminate the universe of protein folds, many selectively target protein families for which the fold is unknown. How well can the current set of known protein families and folds be used to estimate the total number of folds in nature, and will structural genomics initiatives yield representatives for all the major protein families within a reasonable time scale? PMID:15128436
Structural and sequence diversity of the transposon Galileo in the Drosophila willistoni genome.

PubMed

Gonçalves, Juliana W; Valiati, Victor Hugo; Delprat, Alejandra; Valente, Vera L S; Ruiz, Alfredo

2014-09-13

Galileo is one of three members of the P superfamily of DNA transposons. It was originally discovered in Drosophila buzzatii, in which three segregating chromosomal inversions were shown to have been generated by ectopic recombination between Galileo copies. Subsequently, Galileo was identified in six of 12 sequenced Drosophila genomes, indicating its widespread distribution within this genus. Galileo is strikingly abundant in Drosophila willistoni, a neotropical species that is highly polymorphic for chromosomal inversions, suggesting a role for this transposon in the evolution of its genome. We carried out a detailed characterization of all Galileo copies present in the D. willistoni genome. A total of 191 copies, including 133 with two terminal inverted repeats (TIRs), were classified according to structure in six groups. The TIRs exhibited remarkable variation in their length and structure compared to the most complete copy. Three copies showed extended TIRs due to internal tandem repeats, the insertion of other transposable elements (TEs), or the incorporation of non-TIR sequences into the TIRs. Phylogenetic analyses of the transposase (TPase)-encoding and TIR segments yielded two divergent clades, which we termed Galileo subfamilies V and W. Target-site duplications (TSDs) in D. willistoni Galileo copies were 7- or 8-bp in length, with the consensus sequence GTATTAC. Analysis of the region around the TSDs revealed a target site motif (TSM) with a 15-bp palindrome that may give rise to a stem-loop secondary structure. There is a remarkable abundance and diversity of Galileo copies in the D. willistoni genome, although no functional copies were found. The TIRs in particular have a dynamic structure and extend in different ways, but their ends (required for transposition) are more conserved than the rest of the element. The D. willistoni genome harbors two Galileo subfamilies (V and W) that diverged ~9 million years ago and may have descended from an ancestral element in the genome. Galileo shows a significant insertion preference for a 15-bp palindromic TSM.
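
What a palindromic target site means at the sequence level can be shown with a short Python sketch. The abstract does not give the 15-bp motif itself, so the 15-bp sequence below is a made-up example, and the helper names and the `loop` parameter are assumptions introduced here. An odd-length DNA palindrome cannot equal its own reverse complement exactly, so the check allows a central unpaired base, which is also what a stem-loop would need.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq, loop=0):
    """True if the two arms of seq, ignoring `loop` central bases,
    are reverse complements of each other."""
    arm = (len(seq) - loop) // 2
    if arm == 0 or 2 * arm + loop != len(seq):
        return False
    return seq[:arm] == reverse_complement(seq[len(seq) - arm:])


# Hypothetical 15-bp site: a 7-bp arm, one central base, and the
# reverse complement of the arm.
left_arm = "GATTACA"
site = left_arm + "T" + reverse_complement(left_arm)
```

With these definitions `is_palindromic(site, loop=1)` holds for the constructed 15-bp site, and the same call also holds for the 7-bp TSD consensus GTATTAC quoted in the abstract.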
Multi-target Parallel Processing Approach for Gene-to-structure Determination of the Influenza Polymerase PB2 Subunit

PubMed Central

Moen, Spencer O.; Smith, Eric; Raymond, Amy C.; Fairman, James W.; Stewart, Lance J.; Staker, Bart L.; Begley, Darren W.; Edwards, Thomas E.; Lorimer, Donald D.

2013-01-01

Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year. Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans. Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Diseases (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains. PMID:23851357
Behind Every Good Metabolite there is a Great Enzyme (and perhaps a structure)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Buchko, Garry W.; Phan, Isabelle; Cron, Lisabeth

Today, due to great technological advancements, it is possible to study everything at the same time. This ability has given birth to “totality” studies in the fields of genomics, transcriptomics, proteomics, and metabolomics. In turn, the combined study of all these global analyses gave birth to the field of systems biology. Another “totality” field brought to life with new emerging technologies is structural genomics, an effort to determine the three-dimensional structure of every protein encoded in a genome. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a specialized structural genomics effort composed of academic (University of Washington), government (Pacific Northwest National Laboratory), not-for-profit (Seattle BioMed), and commercial (Emerald BioStructures) institutions that is funded by the National Institute of Allergy and Infectious Diseases (Federal Contract: HHSN272200700057C and HHSN27220120025C) to apply genome-scale approaches in solving protein structures from biodefense organisms, as well as those causing emerging and re-emerging disease. In five years over 540 structures have been deposited into the Protein Data Bank (PDB) by SSGCID. About one third of all SSGCID structures contain bound ligands, many of which are metabolites or metabolite analogues present in the cell. These protein structures are the blueprints for the structure-based design of the next generation of drugs against bacterial pathogens and other infectious diseases. Many of the selected SSGCID targets are annotated enzymes from known metabolomic pathways essential to cellular vitality since selectively “knocking-out” one of the enzymes in an important pathway with a drug may be fatal to the organism. One reason metabolomic pathways are important is because of the small molecules, or metabolites, produced at various steps in these pathways and identified by metabolomic studies. Unlike genomics, transcriptomics, and proteomics that may be influenced by epigenetic, post-transcriptional, and post-translational modifications, respectively, the metabolites present in the cell at any one time represent downstream biochemical endproducts, and therefore, metabolite profiles may be most closely associated with a phenotype and provide valuable information for infectious disease research. Metabolomic data would be even more useful if it could be linked to the vast amount of structural genomics data. Towards this goal SSGCID has created an automated website (http://apps.sbri.org/SSGCIDTargetStatus/Pathway) that assigns selected SSGCID target proteins to MetaCyc pathways (http://metacyc.org/). Details of this website will be provided here. The SSGCID-Pathway website represents a first big step towards linking metabolites and metabolic pathways to structural genomic data with the goal of accelerating the discovery of new agents to battle infectious diseases.
Convergent evolution of adenosine aptamers spanning bacterial, human, and random sequences revealed by structure-based bioinformatics and genomic SELEX

PubMed Central

Vu, Michael M. K.; Jameson, Nora E.; Masuda, Stuart J.; Lin, Dana; Larralde-Ridaura, Rosa; Lupták, Andrej

2012-01-01

SUMMARY Aptamers are structured macromolecules in vitro evolved to bind molecular targets, whereas in nature they form the ligand-binding domains of riboswitches. Adenosine aptamers of a single structural family were isolated several times from random pools but they have not been identified in genomic sequences. We used two unbiased methods, structure-based bioinformatics and human genome-based in vitro selection, to identify aptamers that form the same adenosine-binding structure in a bacterium, and several vertebrates, including humans. Two of the human aptamers map to introns of RAB3C and FGD3 genes. The RAB3C aptamer binds ATP with dissociation constants about ten times lower than physiological ATP concentration, while the minimal FGD3 aptamer binds ATP only co-transcriptionally. PMID:23102219
Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

USDA-ARS's Scientific Manuscript database

The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the...
Functional Information Stored in the Conserved Structural RNA Domains of Flavivirus Genomes

PubMed Central

Fernández-Sanlés, Alba; Ríos-Marco, Pablo; Romero-López, Cristina; Berzal-Herranz, Alfredo

2017-01-01

The genus Flavivirus comprises a large number of small, positive-sense single-stranded, RNA viruses able to replicate in the cytoplasm of certain arthropod and/or vertebrate host cells. The genus, which has some 70 member species, includes a number of emerging and re-emerging pathogens responsible for outbreaks of human disease around the world, such as the West Nile, dengue, Zika, yellow fever, Japanese encephalitis, St. Louis encephalitis, and tick-borne encephalitis viruses. Like other RNA viruses, flaviviruses have a compact RNA genome that efficiently stores all the information required for the completion of the infectious cycle. The efficiency of this storage system is attributable to supracoding elements, i.e., discrete, structural units with essential functions. This information storage system overlaps and complements the protein coding sequence and is highly conserved across the genus. It therefore offers interesting potential targets for novel therapeutic strategies. This review summarizes our knowledge of the features of flavivirus genome functional RNA domains. It also provides a brief overview of the main achievements reported in the design of antiviral nucleic acid-based drugs targeting functional genomic RNA elements. PMID:28421048
A sequence-based survey of the complex structural organization of tumor genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav

2008-04-03

The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.
Insights into structural variations and genome rearrangements in prokaryotic genomes.

PubMed

Periwal, Vinita; Scaria, Vinod

2015-01-01

Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
GAP Final Technical Report 12-14-04

DOE Office of Scientific and Technical Information (OSTI.GOV)

Andrew J. Bordner, PhD, Senior Research Scientist

2004-12-14

The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, benchmarking and application of those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education, and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.
Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor

DOE PAGES

Faulon, Jean-Loup; Misra, Milind; Martin, Shawn; ...

2007-11-23

Motivation: Identifying protein enzymatic or pharmacological activities is an important area of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. Additionally, there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Lastly, such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.
Genomes2Drugs: Identifies Target Proteins and Lead Drugs from Proteome Data

PubMed Central

Toomey, David; Hoppe, Heinrich C.; Brennan, Marian P.; Nolan, Kevin B.; Chubb, Anthony J.

2009-01-01

Background: Genome sequencing and bioinformatics have provided the full hypothetical proteome of many pathogenic organisms. Advances in microarray and mass spectrometry have also yielded large output datasets of possible target proteins/genes. However, the challenge remains to identify new targets for drug discovery from this wealth of information. Further analysis includes bioinformatics and/or molecular biology tools to validate the findings. This is time consuming and expensive, and could fail to yield novel drugs if protein purification and crystallography is impossible. To pre-empt this, a researcher may want to rapidly filter the output datasets for proteins that show good homology to proteins that have already been structurally characterised or proteins that are already targets for known drugs. Critically, those researchers developing novel antibiotics need to select out the proteins that show close homology to any human proteins, as future inhibitors are likely to cross-react with the host protein, causing off-target toxicity effects later in clinical trials. Methodology/Principal Findings: To solve many of these issues, we have developed a free online resource called Genomes2Drugs which ranks sequences to identify proteins that are (i) homologous to previously crystallized proteins or (ii) targets of known drugs, but are (iii) not homologous to human proteins. When tested using the Plasmodium falciparum malarial genome the program correctly enriched the ranked list of proteins with known drug target proteins. Conclusions/Significance: Genomes2Drugs rapidly identifies proteins that are likely to succeed in drug discovery pipelines. This free online resource helps in the identification of potential drug targets. Importantly, the program further highlights proteins that are likely to be inhibited by FDA-approved drugs. These drugs can then be rapidly moved into Phase IV clinical studies under ‘change-of-application’ patents. PMID:19593435
A Global Comparison of the Human and T. brucei Degradomes Gives Insights about Possible Parasite Drug Targets

PubMed Central

Mashiyama, Susan T.; Koupparis, Kyriacos; Caffrey, Conor R.; McKerrow, James H.; Babbitt, Patricia C.

2012-01-01

We performed a genome-level computational study of sequence and structure similarity, the latter using crystal structures and models, of the proteases of Homo sapiens and the human parasite Trypanosoma brucei. Using sequence and structure similarity networks to summarize the results, we constructed global views that show visually the relative abundance and variety of proteases in the degradome landscapes of these two species, and provide insights into evolutionary relationships between proteases. The results also indicate how broadly these sequence sets are covered by three-dimensional structures. These views facilitate cross-species comparisons and offer clues for drug design from knowledge about the sequences and structures of potential drug targets and their homologs. Two protease groups (“M32” and “C51”) that are very different in sequence from human proteases are examined in structural detail, illustrating the application of this global approach in mining new pathogen genomes for potential drug targets. Based on our analyses, a human ACE2 inhibitor was selected for experimental testing on one of these parasite proteases, TbM32, and was shown to inhibit it. These sequence and structure data, along with interactive versions of the protein similarity networks generated in this study, are available at http://babbittlab.ucsf.edu/resources.html. PMID:23236535
TDR Targets: a chemogenomics resource for neglected diseases.

PubMed

Magariños, María P; Carmona, Santiago J; Crowther, Gregory J; Ralph, Stuart A; Roos, David S; Shanmugam, Dhanasekaran; Van Voorhis, Wesley C; Agüero, Fernán

2012-01-01

The TDR Targets Database (http://tdrtargets.org) has been designed and developed as an online resource to facilitate the rapid identification and prioritization of molecular targets for drug development, focusing on pathogens responsible for neglected human diseases. The database integrates pathogen specific genomic information with functional data (e.g. expression, phylogeny, essentiality) for genes collected from various sources, including literature curation. This information can be browsed and queried using an extensive web interface with functionalities for combining, saving, exporting and sharing the query results. Target genes can be ranked and prioritized using numerical weights assigned to the criteria used for querying. In this report we describe recent updates to the TDR Targets database, including the addition of new genomes (specifically helminths), and integration of chemical structure, property and bioactivity information for biological ligands, drugs and inhibitors and cheminformatic tools for querying and visualizing these chemical data. These changes greatly facilitate exploration of linkages (both known and predicted) between genes and small molecules, yielding insight into whether particular proteins may be druggable, effectively allowing the navigation of chemical space in a genomics context.
TDR Targets: a chemogenomics resource for neglected diseases

PubMed Central

Magariños, María P.; Carmona, Santiago J.; Crowther, Gregory J.; Ralph, Stuart A.; Roos, David S.; Shanmugam, Dhanasekaran; Van Voorhis, Wesley C.; Agüero, Fernán

2012-01-01

The TDR Targets Database (http://tdrtargets.org) has been designed and developed as an online resource to facilitate the rapid identification and prioritization of molecular targets for drug development, focusing on pathogens responsible for neglected human diseases. The database integrates pathogen specific genomic information with functional data (e.g. expression, phylogeny, essentiality) for genes collected from various sources, including literature curation. This information can be browsed and queried using an extensive web interface with functionalities for combining, saving, exporting and sharing the query results. Target genes can be ranked and prioritized using numerical weights assigned to the criteria used for querying. In this report we describe recent updates to the TDR Targets database, including the addition of new genomes (specifically helminths), and integration of chemical structure, property and bioactivity information for biological ligands, drugs and inhibitors and cheminformatic tools for querying and visualizing these chemical data. These changes greatly facilitate exploration of linkages (both known and predicted) between genes and small molecules, yielding insight into whether particular proteins may be druggable, effectively allowing the navigation of chemical space in a genomics context. PMID:22116064
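
The weighted prioritization described in the abstract above can be sketched in a few lines of Python. This is not the TDR Targets interface or its scoring code: the gene identifiers, criterion names, weights and the additive scoring rule are all assumptions chosen here for illustration. Each gene receives the sum of the weights of the criteria it satisfies, and genes are then sorted by that score.

```python
def rank_targets(targets, weights):
    """Rank genes by the summed weights of the criteria each one satisfies.

    targets: dict mapping gene id -> set of satisfied criterion names
    weights: dict mapping criterion name -> numerical weight
    Returns (gene id, score) pairs, highest score first; ties are broken
    alphabetically by gene id.
    """
    scores = {gene: sum(weights.get(criterion, 0) for criterion in satisfied)
              for gene, satisfied in targets.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical criteria and genes, for illustration only.
weights = {"essential": 3, "has_structure": 2, "expressed": 1}
targets = {
    "geneA": {"essential"},
    "geneB": {"essential", "has_structure"},
    "geneC": {"expressed"},
}
ranking = rank_targets(targets, weights)
```

Raising or lowering a weight changes the ordering without changing the underlying query results, which is the appeal of keeping the weights separate from the criteria.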

Targeted gene insertion for molecular medicine.

PubMed

Voigt, Katrin; Izsvák, Zsuzsanna; Ivics, Zoltán

2008-11-01

Genomic insertion of a functional gene together with suitable transcriptional regulatory elements is often required for long-term therapeutical benefit in gene therapy for several genetic diseases. A variety of integrating vectors for gene delivery exist. Some of them exhibit random genomic integration, whereas others have integration preferences based on attributes of the targeted site, such as primary DNA sequence and physical structure of the DNA, or through tethering to certain DNA sequences by host-encoded cellular factors. Uncontrolled genomic insertion bears the risk of the transgene being silenced due to chromosomal position effects, and can lead to genotoxic effects due to mutagenesis of cellular genes. None of the vector systems currently used in either preclinical experiments or clinical trials displays sufficient preferences for target DNA sequences that would ensure appropriate and reliable expression of the transgene and simultaneously prevent hazardous side effects. We review in this paper the advantages and disadvantages of both viral and non-viral gene delivery technologies, discuss mechanisms of target site selection of integrating genetic elements (viruses and transposons), and suggest distinct molecular strategies for targeted gene delivery.
Normalization of Complete Genome Characteristics: Application to Evolution from Primitive Organisms to Homo sapiens.

PubMed

Sorimachi, Kenji; Okayasu, Teiji; Ohhira, Shuji

2015-04-01

Normalized nucleotide and amino acid contents of complete genome sequences can be visualized as radar charts. The shapes of these charts depict the characteristics of an organism's genome. The normalized values calculated from the genome sequence theoretically exclude experimental errors. Further, because normalization is independent of both target size and kind, this procedure is applicable not only to single genes but also to whole genomes, which consist of a huge number of different genes. In this review, we discuss the applications of the normalization of the nucleotide and predicted amino acid contents of complete genomes to the investigation of genome structure and to evolutionary research from primitive organisms to Homo sapiens. Some of the results could never have been obtained from the analysis of individual nucleotide or amino acid sequences but were revealed only after the normalization of nucleotide and amino acid contents was applied to genome research. The discovery that genome structure was homogeneous was obtained only after normalization methods were applied to the nucleotide or predicted amino acid contents of genome sequences. Normalization procedures are also applicable to evolutionary research. Thus, normalization of the contents of whole genomes is a useful procedure that can help to characterize organisms.
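
A minimal Python sketch shows why normalized contents are independent of target size. The abstract does not spell out the authors' exact normalization, so the rule used here (each nucleotide count divided by the total count) and the function name are assumptions. Under that assumption a short sequence and a much longer one with the same composition yield identical values.

```python
from collections import Counter


def normalized_contents(seq, alphabet="ACGT"):
    """Return the fraction of each symbol of `alphabet` in seq.

    Symbols outside the alphabet (for example N) are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[symbol] for symbol in alphabet)
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return {symbol: counts[symbol] / total for symbol in alphabet}


# The same composition at two very different sizes.
single_gene = normalized_contents("AACG")
whole_genome = normalized_contents("AACG" * 100)
```

The four values for each sequence could then be plotted as the axes of a radar chart, as the abstract describes.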
Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

PubMed

Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

2015-08-29

The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment

PubMed Central

2010-01-01

Background: Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. Results: We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. Conclusions: In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets. PMID:21034488
Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment.

PubMed

Capriles, Priscila V S Z; Guimarães, Ana C R; Otto, Thomas D; Miranda, Antonio B; Dardenne, Laurent E; Degrave, Wim M

2010-10-29

Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets.
An orthologous transcriptional signature differentiates responses towards closely related chemicals in Arabidopsis thaliana and Brassica napus

EPA Science Inventory

Herbicides are structurally diverse chemicals that inhibit plant-specific targets, however their off-target and potentially differentiating side-effects are less well defined. In this study, genome-wide expression profiling based on Affymetrix AtH1 arrays was used to identify dis...
Translating the "Banana Genome" to Delineate Stress Resistance, Dwarfing, Parthenocarpy and Mechanisms of Fruit Ripening.

PubMed

Dash, Prasanta K; Rai, Rhitu

2016-01-01

Evolutionarily frozen, genetically sterile and globally iconic, the banana remained untouched by the green revolution and, as of today, researchers face intrinsic impediments to its varietal improvement. Recently, this wonder crop entered the genomics era with the decoding of the structural genome of the double haploid Pahang (AA genome constitution) genotype of Musa acuminata. Its complex genome, decoded by hybrid sequencing strategies, revealed a panoply of genes and transcription factors involved in the process of sucrose conversion that imparts sweetness to its fruit. Historically, banana has faced the wrath of pandemic bacterial, fungal, and viral diseases and a multitude of abiotic stresses that have ruined the livelihoods of small and marginal farmers and destroyed commercial plantations. Decoding the structural genome of this climacteric fruit has given impetus to a deeper understanding of the repertoire of genes involved in disease resistance, of the mechanism of dwarfing needed to develop an ideal plant type, and of the processes of parthenocarpy and fruit ripening for better fruit quality. Further, the use of comparative genomics will usher in the integration of information from its decoded genome and other monocots into field applications in banana related, but not limited, to yield enhancement, food security, livelihood assurance, and energy sustainability. In this mini review, we discuss pre- and post-genomic discoveries and highlight accomplishments in structural genomics, genetic engineering and forward genetics, with the aim of targeting genes and transcription factors for translational research in banana.
Repetitive elements dynamics in cell identity programming, maintenance and disease.

PubMed

Bodega, Beatrice; Orlando, Valerio

2014-12-01

The days of 'junk DNA' seem to be over. The rapid progress of genomics technologies has been unveiling unexpected mechanisms by which repetitive DNA, and in particular transposable elements (TEs), have evolved, becoming key issues in understanding genome structure and function. Indeed, rather than being 'parasites', recent findings strongly suggest that TEs may have a positive function by contributing to tissue-specific transcriptional programs, in particular as enhancer-like elements and/or modules for regulation of higher-order chromatin structure. Further, it appears that during development and aging genomes experience several waves of TE activation, and this contributes to individual genome shaping during a lifetime. Interestingly, TE activity is a major target of epigenomic regulation. These findings are shedding new light on the genome-phenotype relationship and lay the groundwork for explaining complex disease manifestation as a consequence of deregulated TE activity. Copyright © 2014. Published by Elsevier Ltd.
Identification of potential drug targets by subtractive genome analysis of Escherichia coli O157:H7: an in silico approach

PubMed Central

Mondal, Shakhinur Islam; Ferdous, Sabiha; Jewel, Nurnabi Azad; Akter, Arzuba; Mahmud, Zabed; Islam, Md Muzahidul; Afrin, Tanzila; Karim, Nurul

2015-01-01

Bacterial enteric infections resulting in diarrhea, dysentery, or enteric fever constitute a huge public health problem, with more than a billion episodes of disease annually in developing and developed countries. In this study, the deadly agent of hemorrhagic diarrhea and hemolytic uremic syndrome, Escherichia coli O157:H7 was investigated with extensive computational approaches aimed at identifying novel and broad-spectrum antibiotic targets. A systematic in silico workflow consisting of comparative genomics, metabolic pathways analysis, and additional drug prioritizing parameters was used to identify novel drug targets that were essential for the pathogen’s survival but absent in its human host. Comparative genomic analysis of Kyoto Encyclopedia of Genes and Genomes annotated metabolic pathways identified 350 putative target proteins in E. coli O157:H7 which showed no similarity to human proteins. Further bio-informatic approaches including prediction of subcellular localization, calculation of molecular weight, and web-based investigation of 3D structural characteristics greatly aided in filtering the potential drug targets from 350 to 120. Ultimately, 44 non-homologous essential proteins of E. coli O157:H7 were prioritized and proved to have the eligibility to become novel broad-spectrum antibiotic targets and DNA polymerase III alpha (dnaE) was the top-ranked among these targets. Moreover, druggability of each of the identified drug targets was evaluated by the DrugBank database. In addition, 3D structure of the dnaE was modeled and explored further for in silico docking with ligands having potential druggability. Finally, we confirmed that the compounds N-coeleneterazine and N-(1,4-dihydro-5H-tetrazol-5-ylidene)-9-oxo-9H-xanthene-2-sulfon-amide were the most suitable ligands of dnaE and hence proposed as the potential inhibitors of this target protein. The results of this study could facilitate the discovery and release of new and effective drugs against E. coli O157:H7 and other deadly human bacterial pathogens. PMID:26677339
CRISPR system for genome engineering: the application for autophagy study.

PubMed

Cui, Jianzhou; Chew, Shirley Jia Li; Shi, Yin; Gong, Zhiyuan; Shen, Han-Ming

2017-05-01

CRISPR/Cas9 is the latest tool introduced in the field of genome engineering and is so far the best genome-editing tool compared with its predecessors, such as meganucleases, zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). The simple design and assembly of the CRISPR/Cas9 system makes genome editing easy to perform, as it uses small guide RNAs that correspond to their DNA targets for high-efficiency editing. This has helped open the doors for multiplexable genome targeting in many species that were intractable using older genetic perturbation techniques. Currently, the CRISPR system is revolutionizing the way biological research is conducted and promises a bright future not only in research but also in medicine and biotechnology. In this review, we evaluate the history, types and structure, and mechanism of action of the CRISPR/Cas system. In particular, we focus on the application of this powerful tool in autophagy research. [BMB Reports 2017; 50(5): 247-256].
The Aspergillus Genome Database: multispecies curation and incorporation of RNA-Seq data to improve structural gene annotations.

PubMed

Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R

2014-01-01

The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
Organizational heterogeneity of vertebrate genomes.

PubMed

Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

2012-01-01

Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
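The k-mer scoring at the heart of the abstract above can be sketched roughly as follows. This is a simplified illustration that counts exact, overlapping k-mers and compares two normalized spectra with a sum of absolute differences; the published Compositional Spectra method may use a different word set, mismatch tolerance and distance measure. The names `kmer_spectrum` and `spectrum_distance` are assumptions.

```python
from collections import Counter


def kmer_spectrum(sequence, k):
    """Return normalized frequencies of overlapping k-mers (exact matches only)."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # Normalizing by the total makes spectra comparable across sequence lengths.
    return {kmer: n / total for kmer, n in counts.items()}


def spectrum_distance(spectrum_a, spectrum_b):
    """Sum of absolute frequency differences (an assumed, simple distance)."""
    kmers = set(spectrum_a) | set(spectrum_b)
    return sum(abs(spectrum_a.get(m, 0.0) - spectrum_b.get(m, 0.0)) for m in kmers)


# Toy sequences (assumed): two regions with very different composition.
region_1 = kmer_spectrum("AAAA", 2)
region_2 = kmer_spectrum("CCCC", 2)
print(spectrum_distance(region_1, region_1))
print(spectrum_distance(region_1, region_2))
```

Computing such spectra in windows along a chromosome and comparing them to one another is one plausible way to turn the spectra into a heterogeneity score.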
Mapping the malaria parasite druggable genome by using in vitro evolution and chemogenomics.

PubMed

Cowell, Annie N; Istvan, Eva S; Lukens, Amanda K; Gomez-Lorenzo, Maria G; Vanaerschot, Manu; Sakata-Kato, Tomoyo; Flannery, Erika L; Magistrado, Pamela; Owen, Edward; Abraham, Matthew; LaMonte, Gregory; Painter, Heather J; Williams, Roy M; Franco, Virginia; Linares, Maria; Arriaga, Ignacio; Bopp, Selina; Corey, Victoria C; Gnädig, Nina F; Coburn-Flynn, Olivia; Reimer, Christin; Gupta, Purva; Murithi, James M; Moura, Pedro A; Fuchs, Olivia; Sasaki, Erika; Kim, Sang W; Teng, Christine H; Wang, Lawrence T; Akidil, Aslı; Adjalley, Sophie; Willis, Paul A; Siegel, Dionicio; Tanaseichuk, Olga; Zhong, Yang; Zhou, Yingyao; Llinás, Manuel; Ottilie, Sabine; Gamo, Francisco-Javier; Lee, Marcus C S; Goldberg, Daniel E; Fidock, David A; Wirth, Dyann F; Winzeler, Elizabeth A

2018-01-12

Chemogenetic characterization through in vitro evolution combined with whole-genome analysis can identify antimalarial drug targets and drug-resistance genes. We performed a genome analysis of 262 Plasmodium falciparum parasites resistant to 37 diverse compounds. We found 159 gene amplifications and 148 nonsynonymous changes in 83 genes associated with drug-resistance acquisition, where gene amplifications contributed to one-third of resistance acquisition events. Beyond confirming previously identified multidrug-resistance mechanisms, we discovered hitherto unrecognized drug target-inhibitor pairs, including thymidylate synthase and a benzoquinazolinone, farnesyltransferase and a pyrimidinedione, and a dipeptidylpeptidase and an arylurea. This exploration of the P. falciparum resistome and druggable genome will likely guide drug discovery and structural biology efforts, while also advancing our understanding of resistance mechanisms available to the malaria parasite. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Engineering of a target site-specific recombinase by a combined evolution- and structure-guided approach

PubMed Central

Abi-Ghanem, Josephine; Chusainow, Janet; Karimova, Madina; Spiegel, Christopher; Hofmann-Sieber, Helga; Hauber, Joachim; Buchholz, Frank; Pisabarro, M. Teresa

2013-01-01

Site-specific recombinases (SSRs) can perform DNA rearrangements, including deletions, inversions and translocations when their naive target sequences are placed strategically into the genome of an organism. Hence, in order to employ SSRs in heterologous hosts, their target sites have to be introduced into the genome of an organism before the enzyme can be practically employed. Engineered SSRs hold great promise for biotechnology and advanced biomedical applications, as they promise to extend the usefulness of SSRs to allow efficient and specific recombination of pre-existing, natural genomic sequences. However, the generation of enzymes with desired properties remains challenging. Here, we use substrate-linked directed evolution in combination with molecular modeling to rationally engineer an efficient and specific recombinase (sTre) that readily and specifically recombines a sequence present in the HIV-1 genome. We elucidate the role of key residues implicated in the molecular recognition mechanism and we present a rationale for sTre’s enhanced specificity. Combining evolutionary and rational approaches should help in accelerating the generation of enzymes with desired properties for use in biotechnology and biomedicine. PMID:23275541
Structure and possible function of a G-quadruplex in the long terminal repeat of the proviral HIV-1 genome

PubMed Central

De Nicola, Beatrice; Lech, Christopher J.; Heddi, Brahim; Regmi, Sagar; Frasson, Ilaria; Perrone, Rosalba; Richter, Sara N.; Phan, Anh Tuân

2016-01-01

The long terminal repeat (LTR) of the proviral human immunodeficiency virus (HIV)-1 genome is integral to virus transcription and host cell infection. The guanine-rich U3 region within the LTR promoter, previously shown to form G-quadruplex structures, represents an attractive target to inhibit HIV transcription and replication. In this work, we report the structure of a biologically relevant G-quadruplex within the LTR promoter region of HIV-1. The guanine-rich sequence designated LTR-IV forms a well-defined structure in physiological cationic solution. The nuclear magnetic resonance (NMR) structure of this sequence reveals a parallel-stranded G-quadruplex containing a single-nucleotide thymine bulge, which participates in a conserved stacking interaction with a neighboring single-nucleotide adenine loop. Transcription analysis in an HIV-1 replication-competent cell indicates that the LTR-IV region may act as a modulator of G-quadruplex formation in the LTR promoter. Consequently, the LTR-IV G-quadruplex structure presented within this work could represent a valuable target for the design of HIV therapeutics. PMID:27298260
Structural Basis for the Altered PAM Recognition by Engineered CRISPR-Cpf1.

PubMed

Nishimasu, Hiroshi; Yamano, Takashi; Gao, Linyi; Zhang, Feng; Ishitani, Ryuichiro; Nureki, Osamu

2017-07-06

The RNA-guided Cpf1 nuclease cleaves double-stranded DNA targets complementary to the CRISPR RNA (crRNA), and it has been harnessed for genome editing technologies. Recently, Acidaminococcus sp. BV3L6 (AsCpf1) was engineered to recognize altered DNA sequences as the protospacer adjacent motif (PAM), thereby expanding the target range of Cpf1-mediated genome editing. Whereas wild-type AsCpf1 recognizes the TTTV PAM, the RVR (S542R/K548V/N552R) and RR (S542R/K607R) variants can efficiently recognize the TATV and TYCV PAMs, respectively. However, their PAM recognition mechanisms remained unknown. Here we present the 2.0 Å resolution crystal structures of the RVR and RR variants bound to a crRNA and its target DNA. The structures revealed that the RVR and RR variants primarily recognize the PAM-complementary nucleotides via the substituted residues. Our high-resolution structures delineated the altered PAM recognition mechanisms of the AsCpf1 variants, providing a basis for the further engineering of CRISPR-Cpf1. Copyright © 2017 Elsevier Inc. All rights reserved.
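The PAM preferences quoted in the abstract above (TTTV, TATV, TYCV) use IUPAC ambiguity codes, where V stands for A, C or G and Y stands for C or T. The sketch below expands those codes into regular expressions and reports which variant's stated PAM matches a four-base site. It encodes only the preferences given in the abstract and treats recognition as a yes/no match, which is a simplification; the names `PAMS`, `pam_regex` and `variants_recognizing` are assumptions.

```python
import re

# Subset of IUPAC nucleotide codes needed for the PAMs in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "[ACG]", "Y": "[CT]"}

# PAM preferences as stated in the abstract for AsCpf1 and its variants.
PAMS = {"wild-type": "TTTV", "RVR": "TATV", "RR": "TYCV"}


def pam_regex(pam):
    """Translate an IUPAC PAM string into a compiled regular expression."""
    return re.compile("".join(IUPAC[code] for code in pam))


def variants_recognizing(site):
    """List the variants whose stated PAM matches the given 4-nt site."""
    site = site.upper()
    return [name for name, pam in PAMS.items() if pam_regex(pam).fullmatch(site)]


for candidate in ("TTTA", "TATC", "TCCG", "TTTT"):
    print(candidate, variants_recognizing(candidate))
```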
Glycogenomics as a mass spectrometry-guided genome-mining method for microbial glycosylated molecules.

PubMed

Kersten, Roland D; Ziemert, Nadine; Gonzalez, David J; Duggan, Brendan M; Nizet, Victor; Dorrestein, Pieter C; Moore, Bradley S

2013-11-19

Glycosyl groups are an essential mediator of molecular interactions in cells and on cellular surfaces. There are very few methods that directly relate sugar-containing molecules to their biosynthetic machineries. Here, we introduce glycogenomics as an experiment-guided genome-mining approach for fast characterization of glycosylated natural products (GNPs) and their biosynthetic pathways from genome-sequenced microbes by targeting glycosyl groups in microbial metabolomes. Microbial GNPs consist of aglycone and glycosyl structure groups in which the sugar unit(s) are often critical for the GNP's bioactivity, e.g., by promoting binding to a target biomolecule. GNPs are a structurally diverse class of molecules with important pharmaceutical and agrochemical applications. Herein, O- and N-glycosyl groups are characterized in their sugar monomers by tandem mass spectrometry (MS) and matched to corresponding glycosylation genes in secondary metabolic pathways by a MS-glycogenetic code. The associated aglycone biosynthetic genes of the GNP genotype then classify the natural product to further guide structure elucidation. We highlight the glycogenomic strategy by the characterization of several bioactive glycosylated molecules and their gene clusters, including the anticancer agent cinerubin B from Streptomyces sp. SPB74 and an antibiotic, arenimycin B, from Salinispora arenicola CNB-527.
An integrated clinical and genomic information system for cancer precision medicine.

PubMed

Jang, Yeongjun; Choi, Taekjin; Kim, Jongho; Park, Jisub; Seo, Jihae; Kim, Sangok; Kwon, Yeajee; Lee, Seungjae; Lee, Sanghyuk

2018-04-20

Increasing affordability of next-generation sequencing (NGS) has created an opportunity for realizing genomically-informed personalized cancer therapy as a path to precision oncology. However, the complex nature of genomic information presents a huge challenge for clinicians in interpreting the patient's genomic alterations and selecting the optimum approved or investigational therapy. An elaborate and practical information system is urgently needed to support clinical decision as well as to test clinical hypotheses quickly. Here, we present an integrated clinical and genomic information system (CGIS) based on NGS data analyses. Major components include modules for handling clinical data, NGS data processing, variant annotation and prioritization, drug-target-pathway analysis, and population cohort explorer. We built a comprehensive knowledgebase of genes, variants, drugs by collecting annotated information from public and in-house resources. Structured reports for molecular pathology are generated using standardized terminology in order to help clinicians interpret genomic variants and utilize them for targeted cancer therapy. We also implemented many features useful for testing hypotheses to develop prognostic markers from mutation and gene expression data. Our CGIS software is an attempt to provide useful information for both clinicians and scientists who want to explore genomic information for precision oncology.
Improvement of Predictive Ability by Uniform Coverage of the Target Genetic Space

PubMed Central

Bustos-Korts, Daniela; Malosetti, Marcos; Chapman, Scott; Biddulph, Ben; van Eeuwijk, Fred

2016-01-01

Genome-enabled prediction provides breeders with the means to increase the number of genotypes that can be evaluated for selection. One of the major challenges in genome-enabled prediction is how to construct a training set of genotypes from a calibration set that represents the target population of genotypes, where the calibration set is composed of a training and validation set. A random sampling protocol of genotypes from the calibration set will lead to low quality coverage of the total genetic space by the training set when the calibration set contains population structure. As a consequence, predictive ability will be affected negatively, because some parts of the genotypic diversity in the target population will be under-represented in the training set, whereas other parts will be over-represented. Therefore, we propose a training set construction method that uniformly samples the genetic space spanned by the target population of genotypes, thereby increasing predictive ability. To evaluate our method, we constructed training sets along with the identification of corresponding genomic prediction models for four genotype panels that differed in the amount of population structure they contained (maize Flint, maize Dent, wheat, and rice). Training sets were constructed using uniform sampling, stratified-uniform sampling, stratified sampling and random sampling. We compared these methods with a method that maximizes the generalized coefficient of determination (CD). Several training set sizes were considered. We investigated four genomic prediction models: multi-locus QTL models, GBLUP models, combinations of QTL and GBLUPs, and Reproducing Kernel Hilbert Space (RKHS) models. For the maize and wheat panels, construction of the training set under uniform sampling led to a larger predictive ability than under stratified and random sampling. The results of our methods were similar to those of the CD method. For the rice panel, all training set construction methods led to similar predictive ability, a reflection of the very strong population structure in this panel. PMID:27672112
Chompy: an infestation of MITE-like repetitive elements in the crocodilian genome.

PubMed

Ray, David A; Hedges, Dale J; Herke, Scott W; Fowlkes, Justin D; Barnes, Erin W; LaVie, Daniel K; Goodwin, Lindsey M; Densmore, Llewellyn D; Batzer, Mark A

2005-12-05

Interspersed repeats are a major component of most eukaryotic genomes and have an impact on genome size and stability, but the repetitive element landscape of crocodilian genomes has not yet been fully investigated. In this report, we provide the first detailed characterization of an interspersed repeat element in any crocodilian genome. Chompy is a putative miniature inverted-repeat transposable element (MITE) family initially recovered from the genome of Alligator mississippiensis (American alligator) but also present in the genomes of Crocodylus moreletii (Morelet's crocodile) and Gavialis gangeticus (Indian gharial). The element has all of the hallmarks of MITEs including terminal inverted repeats, possible target site duplications, and a tendency to form secondary structures. We estimate the copy number in the alligator genome to be approximately 46,000 copies. As a result of their size and unique properties, Chompy elements may provide a useful source of genomic variation for crocodilian comparative genomics.
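One hallmark mentioned in the abstract above, terminal inverted repeats, can be checked with a short sketch: the first bases of an element should equal the reverse complement of its last bases. The example requires an exact match over a fixed length, whereas real terminal inverted repeats are often imperfect, and the sequences are made-up toy strings rather than actual Chompy sequence. The function names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_terminal_inverted_repeats(element, length):
    """True if the first `length` bases mirror the last `length` bases.

    Exact matching only; tolerating mismatches would need an alignment step.
    """
    element = element.upper()
    if length <= 0 or 2 * length > len(element):
        return False
    return element[:length] == reverse_complement(element[-length:])


# Toy elements (assumed): the first ends in the reverse complement of its start.
print(has_terminal_inverted_repeats("GGCATAAAAATGCC", 5))
print(has_terminal_inverted_repeats("GGCATAAAAGGCAT", 5))
```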

A machine learning approach for predicting CRISPR-Cas9 cleavage efficiencies and patterns underlying its mechanism of action.

PubMed

Abadi, Shiran; Yan, Winston X; Amar, David; Mayrose, Itay

2017-10-01

The adaptation of the CRISPR-Cas9 system as a genome editing technique has generated much excitement in recent years owing to its ability to manipulate targeted genes and genomic regions that are complementary to a programmed single guide RNA (sgRNA). However, the efficacy of a specific sgRNA is not uniquely defined by exact sequence homology to the target site, thus unintended off-targets might additionally be cleaved. Current methods for sgRNA design are mainly concerned with predicting off-targets for a given sgRNA using basic sequence features and employ elementary rules for ranking possible sgRNAs. Here, we introduce CRISTA (CRISPR Target Assessment), a novel algorithm within the machine learning framework that determines the propensity of a genomic site to be cleaved by a given sgRNA. We show that the predictions made with CRISTA are more accurate than other available methodologies. We further demonstrate that the occurrence of bulges is not a rare phenomenon and should be accounted for in the prediction process. Beyond predicting cleavage efficiencies, the learning process provides inferences regarding patterns that underlie the mechanism of action of the CRISPR-Cas9 system. We discover that attributes that describe the spatial structure and rigidity of the entire genomic site as well as those surrounding the PAM region are a major component of the prediction capabilities.
Bioinformatics prediction of siRNAs as potential antiviral agents against dengue viruses

PubMed Central

Villegas-Rosales, Paula M; Méndez-Tenorio, Alfonso; Ortega-Soto, Elizabeth; Barrón, Blanca L

2012-01-01

Dengue virus (DENV 1-4) represents the major emerging arthropod-borne viral infection in the world. Currently, there is neither an available vaccine nor a specific treatment. Hence, there is a need for antiviral drugs against these viral infections; we describe the prediction of short interfering RNAs (siRNAs) as potential therapeutic agents against the four DENV serotypes. Our strategy was to carry out a series of multiple alignments using the ClustalX program to find conserved sequences among the four DENV serotype genomes and obtain a consensus sequence for siRNA design. A highly conserved sequence among the four DENV serotypes, located in the coding sequence for the NS4B and NS5 proteins, was found. A total of 2,893 complete DENV genomes were downloaded from the NCBI, and after a clean-up procedure to identify and remove identical sequences, 220 complete DENV genomes were left. They were edited to select the NS4B and NS5 sequences, which were aligned to obtain a consensus sequence. Three different servers were used for siRNA design, and the resulting siRNAs were aligned to identify the most prevalent sequences. Three siRNAs were chosen: one targeted the genome region that codes for the NS4B protein and the other two targeted the region for the NS5 protein. The predicted secondary structure of DENV genomes was used to demonstrate that the siRNAs were able to target the viral genome, forming the double-stranded structures necessary to activate the RNA silencing machinery. PMID:22829722
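The consensus step described in the abstract above can be sketched as a column-by-column vote over sequences that have already been aligned (for example by ClustalX). In this minimal illustration a position is reported only when the most common base reaches a chosen fraction of the sequences, and is otherwise masked with N. The function name `conserved_consensus`, the `min_fraction` parameter and the toy alignment are assumptions, not the authors' procedure.

```python
from collections import Counter


def conserved_consensus(aligned, min_fraction=1.0):
    """Build a consensus from equal-length aligned sequences.

    A column contributes its most common character if that character occurs
    in at least `min_fraction` of the sequences; otherwise 'N' is emitted.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*(seq.upper() for seq in aligned)):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


# Toy alignment (assumed), with one variable position at the end.
alignment = ["ACGT", "ACGA", "ACGT"]
print(conserved_consensus(alignment))
print(conserved_consensus(alignment, min_fraction=0.6))
```

Runs of unmasked positions in the strict consensus are the kind of fully conserved stretch that would be a natural starting point for siRNA design; with thresholds at or below 0.5, ties between bases are possible and would need an explicit rule.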
FACT complex is required for DNA demethylation at heterochromatin during reproduction in Arabidopsis.

PubMed

Frost, Jennifer M; Kim, M Yvonne; Park, Guen Tae; Hsieh, Ping-Hung; Nakamura, Miyuki; Lin, Samuel J H; Yoo, Hyunjin; Choi, Jaemyung; Ikeda, Yoko; Kinoshita, Tetsu; Choi, Yeonhee; Zilberman, Daniel; Fischer, Robert L

2018-05-15

The DEMETER (DME) DNA glycosylase catalyzes genome-wide DNA demethylation and is required for endosperm genomic imprinting and embryo viability. Targets of DME-mediated DNA demethylation reside in small, euchromatic, AT-rich transposons and at the boundaries of large transposons, but how DME interacts with these diverse chromatin states is unknown. The STRUCTURE SPECIFIC RECOGNITION PROTEIN 1 (SSRP1) subunit of the chromatin remodeler FACT (facilitates chromatin transactions) was previously shown to be involved in the DME-dependent regulation of genomic imprinting in Arabidopsis endosperm. Therefore, to investigate the interaction between DME and chromatin, we focused on the activity of the two FACT subunits, SSRP1 and SUPPRESSOR of TY16 (SPT16), during reproduction in Arabidopsis. We found that FACT colocalizes with nuclear DME in vivo, and that DME has two classes of target sites, the first being euchromatic and accessible to DME, but the second, representing over half of DME targets, requiring the action of FACT for DME-mediated DNA demethylation genome-wide. Our results show that the FACT-dependent DME targets are GC-rich heterochromatin domains with high nucleosome occupancy enriched with H3K9me2 and H3K27me1. Further, we demonstrate that heterochromatin-associated linker histone H1 specifically mediates the requirement for FACT at a subset of DME-target loci. Overall, our results demonstrate that FACT is required for DME targeting by facilitating its access to heterochromatin. Copyright © 2018 the Author(s). Published by PNAS.
Identification and characterization of potential druggable targets among hypothetical proteins of extensively drug resistant Mycobacterium tuberculosis (XDR KZN 605) through subtractive genomics approach.

PubMed

Uddin, Reaz; Siddiqui, Quratulain Nehal; Azam, Syed Sikander; Saima, Bibi; Wadood, Abdul

2018-03-01

Among resistant isolates of tuberculosis (TB), multidrug-resistant tuberculosis (MDR-TB) and extensively drug-resistant tuberculosis (XDR-TB) are of growing concern because front-line antibiotics are no longer effective against them. As a result, the search for new therapeutic targets against TB is urgently needed. Target identification is a prerequisite step in drug discovery research. Furthermore, the availability of the complete proteomic data of extensively drug-resistant Mycobacterium tuberculosis (XDR-MTB) has made it possible to carry out in silico analysis for the discovery of new drug targets. In the current study, we aimed to prioritize potential drug targets among the hypothetical proteins of XDR-TB via a subtractive genomics approach. We reduced the complete proteome of XDR-MTB stepwise to only two hypothetical proteins and propose them as new therapeutic targets. The 3D structure of one of the two target proteins was predicted via homology modeling and subsequently validated with various analysis tools. Our study suggests that the domains identified and the motif hits found in the sequences of the shortlisted drug targets are crucial for the survival of XDR-MTB. To the best of our knowledge, the current study is the first attempt in which the complete proteomic data of XDR-MTB have been subjected to a computational subtractive genomics approach, and it therefore provides an opportunity to identify unique therapeutic targets against deadly XDR-MTB. Copyright © 2017 Elsevier B.V. All rights reserved.
Post-genomics nanotechnology is gaining momentum: nanoproteomics and applications in life sciences.

PubMed

Kobeissy, Firas H; Gulbakan, Basri; Alawieh, Ali; Karam, Pierre; Zhang, Zhiqun; Guingab-Cagmat, Joy D; Mondello, Stefania; Tan, Weihong; Anagli, John; Wang, Kevin

2014-02-01

The post-genomics era has brought about new Omics biotechnologies, such as proteomics and metabolomics, as well as their novel applications to personal genomics and the quantified self. These advances are now also catalyzing other and newer post-genomics innovations, leading to convergences between Omics and nanotechnology. In this work, we systematically contextualize and exemplify an emerging strand of post-genomics life sciences, namely, nanoproteomics and its applications in health and integrative biological systems. Nanotechnology has been utilized as a complementary component to revolutionize proteomics through different kinds of nanotechnology applications, including nanoporous structures, functionalized nanoparticles, quantum dots, and polymeric nanostructures. Those applications, though still in their infancy, have led to several highly sensitive diagnostics and new methods of drug delivery and targeted therapy for clinical use. The present article differs from previous analyses of nanoproteomics in that it offers an in-depth and comparative evaluation of the attendant biotechnology portfolio and their applications as seen through the lens of post-genomics life sciences and biomedicine. These include: (1) immunosensors for inflammatory, pathogenic, and autoimmune markers for infectious and autoimmune diseases, (2) amplified immunoassays for detection of cancer biomarkers, and (3) methods for targeted therapy and automatically adjusted drug delivery such as in experimental stroke and brain injury studies. As nanoproteomics becomes available both to the clinician at the bedside and the citizens who are increasingly interested in access to novel post-genomics diagnostics through initiatives such as the quantified self, we anticipate further breakthroughs in personalized and targeted medicine.
Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

PubMed Central

Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

2016-01-01

Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes combined with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome with the Frequency Counter programme. This step is iterated until the frequency of the extended motif in the genome becomes unity. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs that, in theory, bind to an exclusive site in the genome with no off-target effects. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
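The extend-and-count loop that the abstract attributes to USP can be sketched roughly as follows. This is not the Onco-Regulon code: the function names are hypothetical, only one strand is searched, "forward" and "backward" are assumed to mean growth at the right and left ends of the given strand, and in "both" mode one nucleotide is assumed to be added on each side per step.

```python
def count_occurrences(genome: str, motif: str) -> int:
    """Count occurrences of motif in genome, including overlapping ones."""
    count = 0
    pos = genome.find(motif)
    while pos != -1:
        count += 1
        pos = genome.find(motif, pos + 1)
    return count


def extend_until_unique(genome: str, start: int, end: int, direction: str = "both"):
    """Grow genome[start:end] until it occurs exactly once in genome.

    Returns the unique extended sequence, or None if the sequence
    cannot be extended any further in the requested direction.
    """
    if direction not in ("forward", "backward", "both"):
        raise ValueError("direction must be 'forward', 'backward' or 'both'")
    while count_occurrences(genome, genome[start:end]) > 1:
        grow_left = direction in ("backward", "both") and start > 0
        grow_right = direction in ("forward", "both") and end < len(genome)
        if not (grow_left or grow_right):
            return None
        if grow_left:
            start -= 1
        if grow_right:
            end += 1
    return genome[start:end]


# Toy genome (not real data): the motif ACGT occurs at 0, 8 and 17.
toy_genome = "ACGTTGCAACGTAGGCTACGTC"
unique_forward = extend_until_unique(toy_genome, 8, 12, "forward")
unique_backward = extend_until_unique(toy_genome, 8, 12, "backward")
unique_both = extend_until_unique(toy_genome, 8, 12, "both")
```

Running the three modes on the same motif instance gives three candidate unique sequences, which matches the abstract's statement that each motif yields three possibilities. A real genome would need an indexed search rather than repeated string scans.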
Evolutionary and Comparative Genomics to Drive Rational Drug Design, with Particular Focus on Neuropeptide Seven-Transmembrane Receptors

PubMed Central

Furlong, Michael; Seong, Jae Young

2017-01-01

Seven transmembrane receptors (7TMRs), also known as G protein-coupled receptors, are popular targets of drug development, particularly 7TMR systems that are activated by peptide ligands. Although many pharmaceutical drugs have been discovered via conventional bulk analysis techniques, the increasing availability of structural and evolutionary data is facilitating a change to rational, targeted drug design. This article discusses the appeal of neuropeptide-7TMR systems as drug targets and provides an overview of concepts in the evolution of vertebrate genomes and gene families. Subsequently, methods that use evolutionary concepts and comparative analysis techniques to aid in gene discovery, gene function identification, and novel drug design are provided along with case study examples. PMID:28035082
Engineered Cpf1 variants with altered PAM specificities increase genome targeting range

PubMed Central

Gao, Linyi; Cox, David B.T.; Yan, Winston X.; Manteiga, John C.; Schneider, Martin W.; Yamano, Takashi; Nishimasu, Hiroshi; Nureki, Osamu; Crosetto, Nicola; Zhang, Feng

2017-01-01

The RNA-guided endonuclease Cpf1 is a promising tool for genome editing in eukaryotic cells. However, the utility of the commonly used Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) and Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1) is limited by their requirement of a TTTV protospacer adjacent motif (PAM) in the DNA substrate. To address this limitation, we performed a structure-guided mutagenesis screen to increase the targeting range of Cpf1. We engineered two AsCpf1 variants carrying the mutations S542R/K607R and S542R/K548V/N552R, which recognize TYCV and TATV PAMs, respectively, with enhanced activities in vitro and in human cells. Genome-wide assessment of off-target activity using the BLISS assay indicated that these variants retain high DNA targeting specificity, which we further improved by introducing an additional non-PAM-interacting mutation. Introducing the identified mutations at their corresponding positions in LbCpf1 similarly altered its PAM specificity. Together, these variants increase the targeting range of Cpf1 by approximately three-fold in human coding sequences to one cleavage site per ~11 bp. PMID:28581492
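To make the notion of targeting range concrete, the sketch below counts PAM occurrences for the three motifs named in the abstract (TTTV, TYCV, TATV) on both strands of a sequence. It uses the standard IUPAC codes V = A/C/G and Y = C/T. The function names and the toy sequence are illustrative assumptions, and the sketch only locates PAMs; it does not check for a usable protospacer next to each one.

```python
import re

# IUPAC codes needed for the three PAMs in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "V": "[ACG]"}


def pam_positions(seq: str, pam: str):
    """Start positions of a PAM on the given strand (overlaps allowed)."""
    body = "".join(IUPAC[code] for code in pam)
    # A lookahead lets overlapping matches be reported.
    return [m.start() for m in re.finditer("(?=" + body + ")", seq)]


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_pam_sites(seq: str, pams) -> int:
    """Total PAM occurrences over both strands for a set of PAMs."""
    total = 0
    for strand in (seq, revcomp(seq)):
        for pam in pams:
            total += len(pam_positions(strand, pam))
    return total


# Toy sequence (not real data) with one site for each PAM.
toy_seq = "GGTTTACCTCCAGGTATGCC"
wild_type_sites = count_pam_sites(toy_seq, ["TTTV"])
all_variant_sites = count_pam_sites(toy_seq, ["TTTV", "TYCV", "TATV"])
```

In this toy case the extra PAMs triple the number of sites, but that ratio is an artifact of the example sequence and says nothing about the genome-wide figure reported in the abstract.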
Modular assembly of transposable element arrays by microsatellite targeting in the guayule and rice genomes.

PubMed

Valdes Franco, José A; Wang, Yi; Huo, Naxin; Ponciano, Grisel; Colvin, Howard A; McMahan, Colleen M; Gu, Yong Q; Belknap, William R

2018-04-19

Guayule (Parthenium argentatum A. Gray) is a rubber-producing desert shrub native to Mexico and the United States. Guayule represents an alternative to Hevea brasiliensis as a source for commercial natural rubber. The efficient application of modern molecular/genetic tools to guayule improvement requires characterization of its genome. The 1.6 Gb guayule genome was sequenced, assembled and annotated. The final 1.5 Gb assembly, while fragmented (N50 = 22 kb), maps >95% of the shotgun reads and is essentially complete. Approximately 40,000 transcribed, protein encoding genes were annotated on the assembly. Further characterization of this genome revealed 15 families of small, microsatellite-associated, transposable elements (TEs) with unexpected chromosomal distribution profiles. These SaTar (Satellite Targeted) elements, which are non-autonomous Mu-like elements (MULEs), were frequently observed in multimeric linear arrays of unrelated individual elements within which no individual element is interrupted by another. This uniformly non-nested TE multimer architecture has not been previously described in either eukaryotic or prokaryotic genomes. Five families of similarly distributed non-autonomous MULEs (microsatellite associated, modularly assembled) were characterized in the rice genome. Families of TEs with similar structures and distribution profiles were identified in sorghum and citrus. The sequencing and assembly of the guayule genome provides a foundation for application of current crop improvement technologies to this plant. In addition, characterization of this genome revealed SaTar elements with distribution profiles unique among TEs. SaTar targeting appears based on an alternative MULE recombination mechanism with the potential to impact gene evolution.
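The N50 value quoted for the assembly is a standard contiguity statistic: the contig length such that contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy lengths are illustrative and are not guayule data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembled length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 needs at least one contig with positive length")


# Toy example: total length 32, and the two 8-unit contigs reach 16.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

A low N50 relative to genome size, as in the abstract, indicates many short contigs even when the total assembled length is close to the expected genome size.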
Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis

PubMed Central

Gianola, Daniel; Fariello, Maria I.; Naya, Hugo; Schön, Chris-Carolin

2016-01-01

Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. PMID:27520956
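For readers unfamiliar with the setup, the model behind the generalized least-squares (GLS) estimator mentioned in this abstract is commonly written as below. The notation is an assumption chosen for illustration (X holds the intercept and the tested marker or markers, g is the genomic effect with relationship matrix G, and e is the residual) and may differ from the symbols used in the paper.

```latex
\[
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \mathbf{e},
\qquad
\mathbf{g} \sim N(\mathbf{0}, \mathbf{G}\sigma_g^2),
\quad
\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)
\]
\[
\mathbf{V} = \operatorname{Var}(\mathbf{y}) = \mathbf{G}\sigma_g^2 + \mathbf{I}\sigma_e^2
\]
\[
\hat{\boldsymbol{\beta}}
  = \left(\mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{X}\right)^{-1}
    \mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{y}
\]
```

In this notation, the abstract's invariance result concerns whether the estimate changes when the tested marker(s) are left out of the construction of G, given that the variance components stay the same.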
Precision medicine driven by cancer systems biology.

PubMed

Filipp, Fabian V

2017-03-01

Molecular insights from genome and systems biology are influencing how cancer is diagnosed and treated. We critically evaluate big data challenges in precision medicine. The melanoma research community has identified distinct subtypes involving chronic sun-induced damage and the mitogen-activated protein kinase driver pathway. In addition, despite low mutation burden, non-genomic mitogen-activated protein kinase melanoma drivers are found in membrane receptors, metabolism, or epigenetic signaling with the ability to bypass central mitogen-activated protein kinase molecules and activate a similar program of mitogenic effectors. Mutation hotspots, structural modeling, UV signature, and genomic as well as non-genomic mechanisms of disease initiation and progression are taken into consideration to identify resistance mutations and novel drug targets. A comprehensive precision medicine profile of a malignant melanoma patient illustrates future rational drug targeting strategies. Network analysis emphasizes an important role of epigenetic and metabolic master regulators in oncogenesis. Co-occurrence of driver mutations in signaling, metabolic, and epigenetic factors highlights how cumulative alterations of our genomes and epigenomes progressively lead to uncontrolled cell proliferation. Precision insights have the ability to identify independent molecular pathways suitable for drug targeting. Synergistic treatment combinations of orthogonal modalities including immunotherapy, mitogen-activated protein kinase inhibitors, epigenetic inhibitors, and metabolic inhibitors have the potential to overcome immune evasion, side effects, and drug resistance.
The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods.

PubMed

Gabanyi, Margaret J; Adams, Paul D; Arnold, Konstantin; Bordoli, Lorenza; Carter, Lester G; Flippen-Andersen, Judith; Gifford, Lida; Haas, Juergen; Kouranov, Andrei; McLaughlin, William A; Micallef, David I; Minor, Wladek; Shah, Raship; Schwede, Torsten; Tao, Yi-Ping; Westbrook, John D; Zimmerman, Matthew; Berman, Helen M

2011-07-01

The Protein Structure Initiative's Structural Biology Knowledgebase (SBKB, URL: http://sbkb.org ) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples on how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI's high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology.
Experimental approaches to identify cellular G-quadruplex structures and functions.

PubMed

Di Antonio, Marco; Rodriguez, Raphaël; Balasubramanian, Shankar

2012-05-01

Guanine-rich nucleic acids can fold into non-canonical DNA secondary structures called G-quadruplexes. The formation of these structures can interfere with the biology that is crucial to sustain cellular homeostasis and metabolism via mechanisms that include transcription, translation, splicing, telomere maintenance and DNA recombination. Thus, due to their implication in several biological processes and possible role in promoting genomic instability, G-quadruplex forming sequences have emerged as potential therapeutic targets. There has been a growing interest in the development of synthetic molecules and biomolecules for sensing G-quadruplex structures in cellular DNA. In this review, we summarise and discuss recent methods developed for cellular imaging of G-quadruplexes, and the application of experimental genomic approaches to detect G-quadruplexes throughout genomic DNA. In particular, we will discuss the use of engineered small molecules and natural proteins to enable pull-down, ChIP-Seq, ChIP-chip and fluorescence imaging of G-quadruplex structures in cellular DNA. Copyright © 2012 Elsevier Inc. All rights reserved.
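As a computational counterpart to the experimental methods reviewed here, putative G-quadruplex forming sequences are often located with a simple sequence heuristic: four runs of at least three guanines separated by loops of one to seven nucleotides. The sketch below applies that heuristic; the pattern is a widely used convention and is not taken from this review, the function name is hypothetical, and a match only indicates folding potential on the scanned strand.

```python
import re

# Four G-runs (>= 3 G each) separated by three loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_g4_motifs(seq: str):
    """Return (start, matched_sequence) for non-overlapping matches.

    Only the given strand is scanned; scanning the reverse complement
    (or searching for C-runs) would be needed to cover the other strand.
    """
    return [(m.start(), m.group(0)) for m in G4_PATTERN.finditer(seq.upper())]


# Toy sequence (not real data) containing one putative motif.
toy_hits = find_g4_motifs("TTGGGAGGGTGGGAGGGTT")
```

Sequence matches of this kind are predictions only, which is why the cellular detection methods discussed in the review are needed to confirm that such structures actually form.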
Single-Stranded γPNAs for In Vivo Site-Specific Genome Editing via Watson-Crick Recognition

PubMed Central

Bahal, Raman; Quijano, Elias; McNeer, Nicole Ali; Liu, Yanfeng; Bhunia, Dinesh C.; López-Giráldez, Francesco; Fields, Rachel J.; Saltzman, W. Mark; Ly, Danith H.; Glazer, Peter M.

2014-01-01

Triplex-forming peptide nucleic acids (PNAs) facilitate gene editing by stimulating recombination of donor DNAs within genomic DNA via site-specific formation of altered helical structures that further stimulate DNA repair. However, PNAs designed for triplex formation are sequence restricted to homopurine sites. Herein we describe a novel strategy where next generation single-stranded gamma PNAs (γPNAs) containing miniPEG substitutions at the gamma position can target genomic DNA in mouse bone marrow at mixed-sequence sites to induce targeted gene editing. In addition to enhanced binding, γPNAs confer increased solubility and improved formulation into poly(lactic-co-glycolic acid) (PLGA) nanoparticles for efficient intracellular delivery. Single-stranded γPNAs induce targeted gene editing at frequencies of 0.8% in mouse bone marrow cells treated ex vivo and 0.1% in vivo via IV injection, without detectable toxicity. These results suggest that γPNAs may provide a new tool for induced gene editing based on Watson-Crick recognition without sequence restriction. PMID:25174576
The Vigna Genome Server, 'VigGS': A Genomic Knowledge Base of the Genus Vigna Based on High-Quality, Annotated Genome Sequence of the Azuki Bean, Vigna angularis (Willd.) Ohwi & Ohashi.

PubMed

Sakai, Hiroaki; Naito, Ken; Takahashi, Yu; Sato, Toshiyuki; Yamamoto, Toshiya; Muto, Isamu; Itoh, Takeshi; Tomooka, Norihiko

2016-01-01

The genus Vigna includes legume crops such as cowpea, mungbean and azuki bean, as well as >100 wild species. A number of the wild species are highly tolerant to severe environmental conditions including high-salinity, acid or alkaline soil; drought; flooding; and pests and diseases. These features of the genus Vigna make it a good target for investigation of genetic diversity in adaptation to stressful environments; however, a lack of genomic information has hindered such research in this genus. Here, we present a genome database of the genus Vigna, Vigna Genome Server ('VigGS', http://viggs.dna.affrc.go.jp), based on the recently sequenced azuki bean genome, which incorporates annotated exon-intron structures, along with evidence for transcripts and proteins, visualized in GBrowse. VigGS also facilitates user construction of multiple alignments between azuki bean genes and those of six related dicot species. In addition, the database displays sequence polymorphisms between azuki bean and its wild relatives and enables users to design primer sequences targeting any variant site. VigGS offers a simple keyword search in addition to sequence similarity searches using BLAST and BLAT. To incorporate up to date genomic information, VigGS automatically receives newly deposited mRNA sequences of pre-set species from the public database once a week. Users can refer to not only gene structures mapped on the azuki bean genome on GBrowse but also relevant literature of the genes. VigGS will contribute to genomic research into plant biotic and abiotic stresses and to the future development of new stress-tolerant crops. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Cytochrome P450 monooxygenase CYP53 family in fungi: comparative structural and evolutionary analysis and its role as a common alternative anti-fungal drug target.

PubMed

Jawallapersand, Poojah; Mashele, Samson Sitheni; Kovačič, Lidija; Stojan, Jure; Komel, Radovan; Pakala, Suresh Babu; Kraševec, Nada; Syed, Khajamohiddin

2014-01-01

Cytochrome P450 monooxygenases (CYPs/P450s) are heme-thiolate proteins whose role as a drug target against pathogenic microbes has been explored because of their stereo- and regio-specific oxidation activity. We aimed to assess the CYP53 family's role as a common alternative drug target against animal (including human) and plant pathogenic fungi and its role in fungal-mediated wood degradation. Genome-wide analysis of fungal species revealed the presence of CYP53 members in ascomycetes and basidiomycetes. Basidiomycetes had a higher number of CYP53 members in their genomes than ascomycetes. Only two CYP53 subfamilies were found in ascomycetes and six subfamilies in basidiomycetes, suggesting that during the divergence of phyla ascomycetes lost CYP53 P450s. According to phylogenetic and gene-structure analysis, enrichment of CYP53 P450s in basidiomycetes occurred due to the extensive duplication of CYP53 P450s in their genomes. Numerous amino acids (103) were found to be conserved in the ascomycetes CYP53 P450s, against only seven in basidiomycetes CYP53 P450s. 3D-modelling and active-site cavity mapping data revealed that the ascomycetes CYP53 P450s have a highly conserved protein structure whereby 78% amino acids in the active-site cavity were found to be conserved. Because of this rigid nature of ascomycetes CYP53 P450s' active site cavity, any inhibitor directed against this P450 family can serve as a common anti-fungal drug target, particularly toward pathogenic ascomycetes. The dynamic nature of basidiomycetes CYP53 P450s at a gene and protein level indicates that these P450s are destined to acquire novel functions. Functional analysis of CYP53 P450s strongly supported our hypothesis that the ascomycetes CYP53 P450s ability is limited for detoxification of toxic molecules, whereas basidiomycetes CYP53 P450s play an additional role, i.e. involvement in degradation of wood and its derived components. This study is the first report on genome-wide comparative structural (gene and protein structure-level) and evolutionary analysis of a fungal P450 family.
Translating the “Banana Genome” to Delineate Stress Resistance, Dwarfing, Parthenocarpy and Mechanisms of Fruit Ripening

PubMed Central

Dash, Prasanta K.; Rai, Rhitu

2016-01-01

The evolutionarily frozen, genetically sterile and globally iconic fruit “banana” remained untouched by the green revolution and, as of today, researchers face intrinsic impediments to its varietal improvement. Recently, this wonder crop entered the genomics era with the decoding of the structural genome of the double haploid Pahang (AA genome constitution) genotype of Musa acuminata. Its complex genome, decoded by hybrid sequencing strategies, revealed a panoply of genes and transcription factors involved in the process of sucrose conversion that imparts sweetness to its fruit. Historically, banana has faced the wrath of pandemic bacterial, fungal, and viral diseases and a multitude of abiotic stresses that have ruined the livelihoods of small/marginal farmers and destroyed commercial plantations. Decoding the structural genome of this climacteric fruit has given impetus to a deeper understanding of the repertoire of genes involved in disease resistance, the mechanism of dwarfing to develop an ideal plant type, the process of parthenocarpy, and fruit ripening for better fruit quality. Further, the use of comparative genomics will usher in integration of information from its decoded genome and other monocots into field applications in banana related, but not limited, to yield enhancement, food security, livelihood assurance, and energy sustainability. In this mini review, we discuss pre- and post-genomic discoveries and highlight accomplishments in structural genomics, genetic engineering and forward genetics, with the aim of targeting genes and transcription factors for translational research in banana. PMID:27833619
Roles of the nuclear lamina in stable nuclear association and assembly of a herpesviral transactivator complex on viral immediate-early genes.

PubMed

Silva, Lindsey; Oh, Hyung Suk; Chang, Lynne; Yan, Zhipeng; Triezenberg, Steven J; Knipe, David M

2012-01-01

Little is known about the mechanisms of gene targeting within the nucleus and its effect on gene expression, but most studies have concluded that genes located near the nuclear periphery are silenced by heterochromatin. In contrast, we found that early herpes simplex virus (HSV) genome complexes localize near the nuclear lamina and that this localization is associated with reduced heterochromatin on the viral genome and increased viral immediate-early (IE) gene transcription. In this study, we examined the mechanism of this effect and found that input virion transactivator protein, virion protein 16 (VP16), targets sites adjacent to the nuclear lamina and is required for targeting of the HSV genome to the nuclear lamina, exclusion of heterochromatin from viral replication compartments, and reduction of heterochromatin on the viral genome. Because cells infected with the VP16 mutant virus in1814 showed a phenotype similar to that of lamin A/C(-/-) cells infected with wild-type virus, we hypothesized that the nuclear lamina is required for VP16 activator complex formation. In lamin A/C(-/-) mouse embryo fibroblasts, VP16 and Oct-1 showed reduced association with the viral IE gene promoters, the levels of VP16 and HCF-1 stably associated with the nucleus were lower than in wild-type cells, and the association of VP16 with HCF-1 was also greatly reduced. These results show that the nuclear lamina is required for stable nuclear localization and formation of the VP16 activator complex and provide evidence for the nuclear lamina being the site of assembly of the VP16 activator complex. The targeting of chromosomes in the cell nucleus is thought to be important in the regulation of expression of genes on the chromosomes. The major documented effect of intranuclear targeting has been silencing of chromosomes at sites near the nuclear periphery. In this study, we show that targeting of the herpes simplex virus DNA genome to the nuclear periphery promotes formation of transcriptional activator complexes on the viral genome, demonstrating that the nuclear periphery also has sites for activation of transcription. These results highlight the importance of the nuclear lamina, the structure that lines the inner nuclear membrane, in both transcriptional activation and repression. Future studies defining the molecular structures of these two types of nuclear sites should define new levels of gene regulation.
Structure and possible function of a G-quadruplex in the long terminal repeat of the proviral HIV-1 genome.

PubMed

De Nicola, Beatrice; Lech, Christopher J; Heddi, Brahim; Regmi, Sagar; Frasson, Ilaria; Perrone, Rosalba; Richter, Sara N; Phan, Anh Tuân

2016-07-27

The long terminal repeat (LTR) of the proviral human immunodeficiency virus (HIV)-1 genome is integral to virus transcription and host cell infection. The guanine-rich U3 region within the LTR promoter, previously shown to form G-quadruplex structures, represents an attractive target to inhibit HIV transcription and replication. In this work, we report the structure of a biologically relevant G-quadruplex within the LTR promoter region of HIV-1. The guanine-rich sequence designated LTR-IV forms a well-defined structure in physiological cationic solution. The nuclear magnetic resonance (NMR) structure of this sequence reveals a parallel-stranded G-quadruplex containing a single-nucleotide thymine bulge, which participates in a conserved stacking interaction with a neighboring single-nucleotide adenine loop. Transcription analysis in a HIV-1 replication competent cell indicates that the LTR-IV region may act as a modulator of G-quadruplex formation in the LTR promoter. Consequently, the LTR-IV G-quadruplex structure presented within this work could represent a valuable target for the design of HIV therapeutics. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Stella, Stefano; University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen; Molina, Rafael

Crystal structures of BurrH and the BurrH–DNA complex are reported. DNA editing offers new possibilities in synthetic biology and biomedicine for modulation or modification of cellular functions to organisms. However, inaccuracy in this process may lead to genome damage. To address this important problem, a strategy allowing specific gene modification has been achieved through the addition, removal or exchange of DNA sequences using customized proteins and the endogenous DNA-repair machinery. Therefore, the engineering of specific protein–DNA interactions in protein scaffolds is key to providing ‘toolkits’ for precise genome modification or regulation of gene expression. In a search for putative DNA-binding domains, BurrH, a protein that recognizes a 19 bp DNA target, was identified. Here, its apo and DNA-bound crystal structures are reported, revealing a central region containing 19 repeats of a helix–loop–helix modular domain (BurrH domain; BuD), which identifies the DNA target by a single residue-to-nucleotide code, thus facilitating its redesign for gene targeting. New DNA-binding specificities have been engineered in this template, showing that BuD-derived nucleases (BuDNs) induce high levels of gene targeting in a locus of the human haemoglobin β (HBB) gene close to mutations responsible for sickle-cell anaemia. Hence, the unique combination of high efficiency and specificity of the BuD arrays can push forward diverse genome-modification approaches for cell or organism redesign, opening new avenues for gene editing.
Covering complete proteomes with X-ray structures: A current snapshot

DOE PAGES

Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; ...

2014-10-23

Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
Recurrent DNA inversion rearrangements in the human genome

PubMed Central

Flores, Margarita; Morales, Lucía; Gonzaga-Jauregui, Claudia; Domínguez-Vidaña, Rocío; Zepeda, Cinthya; Yañez, Omar; Gutiérrez, María; Lemus, Tzitziki; Valle, David; Avila, Ma. Carmen; Blanco, Daniel; Medina-Ruiz, Sofía; Meza, Karla; Ayala, Erandi; García, Delfino; Bustos, Patricia; González, Víctor; Girard, Lourdes; Tusie-Luna, Teresa; Dávila, Guillermo; Palacios, Rafael

2007-01-01

Several lines of evidence suggest that reiterated sequences in the human genome are targets for nonallelic homologous recombination (NAHR), which facilitates genomic rearrangements. We have used a PCR-based approach to identify breakpoint regions of rearranged structures in the human genome. In particular, we have identified intrachromosomal identical repeats that are located in reverse orientation, which may lead to chromosomal inversions. A bioinformatic workflow pathway to select appropriate regions for analysis was developed. Three such regions overlapping with known human genes, located on chromosomes 3, 15, and 19, were analyzed. The relative proportion of wild-type to rearranged structures was determined in DNA samples from blood obtained from different, unrelated individuals. The results obtained indicate that recurrent genomic rearrangements occur at relatively high frequency in somatic cells. Interestingly, the rearrangements studied were significantly more abundant in adults than in newborn individuals, suggesting that such DNA rearrangements might start to appear during embryogenesis or fetal life and continue to accumulate after birth. The relevance of our results in regard to human genomic variation is discussed. PMID:17389356
Causal gene identification using combinatorial V-structure search.

PubMed

Cai, Ruichu; Zhang, Zhenjie; Hao, Zhifeng

2013-07-01

With the advances of biomedical techniques in the last decade, the costs of human genomic sequencing and genomic activity monitoring are coming down rapidly. To support the huge genome-based business in the near future, researchers are eager to find killer applications based on human genome information. Causal gene identification is one of the most promising applications, which may help potential patients to estimate the risk of certain genetic diseases and locate the target gene for further genetic therapy. Unfortunately, existing pattern recognition techniques, such as Bayesian networks, cannot be directly applied to find the accurate causal relationship between genes and diseases. This is mainly due to the insufficient number of samples and the extremely high dimensionality of the gene space. In this paper, we present the first practical solution to causal gene identification, utilizing a new combinatorial formulation over V-Structures commonly used in conventional Bayesian networks, by exploring the combinations of significant V-Structures. We prove the NP-hardness of the combinatorial search problem under general settings of the significance measure on the V-Structures, and present a greedy algorithm to find sub-optimal results. Extensive experiments show that our proposal is both scalable and effective, particularly with interesting findings on the causal genes over real human genome data. Copyright © 2013 Elsevier Ltd. All rights reserved.
Structural analysis of a set of proteins resulting from a bacterial genomics project.

PubMed

Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

2005-09-01

The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.
Minireview: DNA Replication in Plant Mitochondria

PubMed Central

Cupp, John D.; Nielsen, Brent L.

2014-01-01

Higher plant mitochondrial genomes exhibit much greater structural complexity as compared to most other organisms. Unlike well-characterized metazoan mitochondrial DNA (mtDNA) replication, an understanding of the mechanism(s) and proteins involved in plant mtDNA replication remains unclear. Several plant mtDNA replication proteins, including DNA polymerases, DNA primase/helicase, and accessory proteins have been identified. Mitochondrial dynamics, genome structure, and the complexity of dual-targeted and dual-function proteins that provide at least partial redundancy suggest that plants have a unique model for maintaining and replicating mtDNA when compared to the replication mechanism utilized by most metazoan organisms. PMID:24681310
A Biophysical Model of CRISPR/Cas9 Activity for Rational Design of Genome Editing and Gene Regulation

PubMed Central

Farasat, Iman; Salis, Howard M.

2016-01-01

The ability to precisely modify genomes and regulate specific genes will greatly accelerate several medical and engineering applications. The CRISPR/Cas9 (Type II) system binds and cuts DNA using guide RNAs, though the variables that control its on-target and off-target activity remain poorly characterized. Here, we develop and parameterize a system-wide biophysical model of Cas9-based genome editing and gene regulation to predict how changing guide RNA sequences, DNA superhelical densities, Cas9 and crRNA expression levels, organisms and growth conditions, and experimental conditions collectively control the dynamics of dCas9-based binding and Cas9-based cleavage at all DNA sites with both canonical and non-canonical PAMs. We combine statistical thermodynamics and kinetics to model Cas9:crRNA complex formation, diffusion, site selection, reversible R-loop formation, and cleavage, using large amounts of structural, biochemical, expression, and next-generation sequencing data to determine kinetic parameters and develop free energy models. Our results identify DNA supercoiling as a novel mechanism controlling Cas9 binding. Using the model, we predict Cas9 off-target binding frequencies across the lambdaphage and human genomes, and explain why Cas9’s off-target activity can be so high. With this improved understanding, we propose several rules for designing experiments for minimizing off-target activity. We also discuss the implications for engineering dCas9-based genetic circuits. PMID:26824432
Uridine monophosphate kinase as potential target for tuberculosis: from target to lead identification.

PubMed

Arvind, Akanksha; Jain, Vaibhav; Saravanan, Parameswaran; Mohan, C Gopi

2013-12-01

Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB) disease, which has affected approximately 2 billion people worldwide. Due to the emergence of resistance towards the existing drugs, discovery of new anti-TB drugs is an important global healthcare challenge. To address this problem, there is an urgent need to identify new drug targets in Mtb. In the present study, the subtractive genomics approach has been employed for the identification of new drug targets against TB. Screening the Mtb proteome using the Database of Essential Genes (DEG) and human proteome resulted in the identification of 60 key proteins which have no eukaryotic counterparts. Critical analysis of these proteins using the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways database revealed the uridine monophosphate kinase (UMPK) enzyme as a potential drug target for developing novel anti-TB drugs. A homology model of Mtb-UMPK was constructed for the first time on the basis of the crystal structure of E. coli-UMPK, in order to understand its structure-function relationships and thereby facilitate structure-based inhibitor design. Furthermore, a structural similarity search was carried out using UTP, the physiological inhibitor of Mtb-UMPK, to virtually screen the ZINC database. Retrieved hits were further screened by implementing several filters such as ADME and toxicity, followed by molecular docking. Finally, on the basis of the Glide docking score and the mode of binding, 6 putative leads were identified as inhibitors of this enzyme which can potentially emerge as future drugs for the treatment of TB.
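The subtractive step described in this abstract (keep pathogen proteins that are essential and have no host counterpart) can be pictured as a pair of set filters. The sketch below is only an illustration under stated assumptions: the function name `subtractive_filter`, the protein identifiers, and the idea that essentiality and host homology have already been reduced to ID lists are all hypothetical, and the abstract does not describe how the authors implemented their screen.

```python
def subtractive_filter(pathogen_proteins, essential_ids, host_homolog_ids):
    """Return pathogen proteins that are essential and lack a host homolog.

    All three arguments are iterables of protein identifiers. In a real
    screen the two lookup sets would come from sequence searches against
    an essential-gene database and the host proteome; here they are
    assumed to be precomputed (hypothetical inputs).
    """
    essential = set(essential_ids)
    host_like = set(host_homolog_ids)
    return sorted(
        p for p in pathogen_proteins
        if p in essential and p not in host_like
    )


# Toy, made-up identifiers used purely for illustration.
proteome = ["P1", "P2", "P3", "P4"]
essential_hits = ["P1", "P2", "P4"]
host_hits = ["P2"]
candidates = subtractive_filter(proteome, essential_hits, host_hits)
```

With these toy inputs, only the proteins that pass both filters remain as candidate targets; downstream steps such as pathway analysis and docking are outside the scope of this sketch.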
Structural homologies between phenformin, lipitor and gleevec aim the same metabolic oncotarget in leukemia and melanoma.

PubMed

Somlyai, Gábor; Collins, T Que; Meuillet, Emmanuelle J; Hitendra, Patel; D'Agostino, Dominic P; Boros, László G

2017-07-25

Phenformin's recently demonstrated efficacy in melanoma and Gleevec's demonstrated anti-proliferative action in chronic myeloid leukemia may lie within these drugs' significant pharmacokinetics, pharmacodynamics and structural homologies, which are reviewed herein. Gleevec's success in turning a fatal leukemia into a manageable chronic disease has been trumpeted in medical, economic, political and social circles because it is considered the first successful targeted therapy. Investments have been immense in omics analyses and while in some cases they greatly helped the management of patients, in others targeted therapies failed to achieve clinically stable recurrence-free disease course or to substantially extend survival. Nevertheless protein kinase controlling approaches have persisted despite early warnings that the targeted genomics narrative is overblown. Experimental and clinical observations with Phenformin suggest an alternative explanation for Gleevec's mode of action. Using 13C-guided precise flux measurements, a comparative multiple cell line study demonstrated the drug's downstream impact on submolecular fatty acid processing metabolic events that occurred independent of Gleevec's molecular target. Clinical observations that hyperlipidemia and diabetes are both reversed in mice and in patients taking Gleevec support the drugs' primary metabolic targets by biguanides and statins. This is evident by structural data demonstrating that Gleevec shows pyridine- and phenyl-guanidine homology with Phenformin and identical phenylcarbamoyl structural and ligand binding homology with Lipitor. The misunderstood mechanism of action of Gleevec is emblematic of the pervasive flawed reasoning that genomic analysis will lead to targeted, personalized diagnosis and therapy. The alternative perspective for Gleevec's mode of action may turn oncotargets towards metabolic channel reaction architectures in leukemia and melanoma, as well as in other cancers.
The Crystal Structure of the RNA-Dependent RNA Polymerase from Human Rhinovirus: A Dual Function Target for Common Cold Antiviral Therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Love, Robert A.; Maegley, Karen A.; Yu, Xiu

Human rhinoviruses (HRV), the predominant members of the Picornaviridae family of positive-strand RNA viruses, are the major causative agents of the common cold. Given the lack of effective treatments for rhinoviral infections, virally encoded proteins have become attractive therapeutic targets. The HRV genome encodes an RNA-dependent RNA polymerase (RdRp) denoted 3Dpol, which is responsible for replicating the viral genome and for synthesizing a protein primer used in the replication. Here the crystal structures for three viral serotypes (1B, 14, and 16) of HRV 3Dpol have been determined. The three structures are very similar to one another, and to the closely related poliovirus (PV) 3Dpol enzyme. Because the reported PV crystal structure shows significant disorder, HRV 3Dpol provides the first complete view of a picornaviral RdRp. The folding topology of HRV 3Dpol also resembles that of RdRps from hepatitis C virus (HCV) and rabbit hemorrhagic disease virus (RHDV) despite very low sequence homology.
Post-Genomics Nanotechnology Is Gaining Momentum: Nanoproteomics and Applications in Life Sciences

PubMed Central

Kobeissy, Firas H.; Gulbakan, Basri; Alawieh, Ali; Karam, Pierre; Zhang, Zhiqun; Guingab-Cagmat, Joy D.; Mondello, Stefania; Tan, Weihong; Anagli, John

2014-01-01

The post-genomics era has brought about new Omics biotechnologies, such as proteomics and metabolomics, as well as their novel applications to personal genomics and the quantified self. These advances are now also catalyzing other and newer post-genomics innovations, leading to convergences between Omics and nanotechnology. In this work, we systematically contextualize and exemplify an emerging strand of post-genomics life sciences, namely, nanoproteomics and its applications in health and integrative biological systems. Nanotechnology has been utilized as a complementary component to revolutionize proteomics through different kinds of nanotechnology applications, including nanoporous structures, functionalized nanoparticles, quantum dots, and polymeric nanostructures. Those applications, though still in their infancy, have led to several highly sensitive diagnostics and new methods of drug delivery and targeted therapy for clinical use. The present article differs from previous analyses of nanoproteomics in that it offers an in-depth and comparative evaluation of the attendant biotechnology portfolio and their applications as seen through the lens of post-genomics life sciences and biomedicine. These include: (1) immunosensors for inflammatory, pathogenic, and autoimmune markers for infectious and autoimmune diseases, (2) amplified immunoassays for detection of cancer biomarkers, and (3) methods for targeted therapy and automatically adjusted drug delivery such as in experimental stroke and brain injury studies. As nanoproteomics becomes available both to the clinician at the bedside and the citizens who are increasingly interested in access to novel post-genomics diagnostics through initiatives such as the quantified self, we anticipate further breakthroughs in personalized and targeted medicine. PMID:24410486
Analysis of piRNA-mediated silencing of active TEs in Drosophila melanogaster suggests limits on the evolution of host genome defense.

PubMed

Kelleher, Erin S; Barbash, Daniel A

2013-08-01

The Piwi-interacting RNA (piRNA) pathway defends animal genomes against the harmful consequences of transposable element (TE) infection by imposing small-RNA-mediated silencing. Because silencing is targeted by TE-derived piRNAs, piRNA production is posited to be central to the evolution of genome defense. We harnessed genomic data sets from Drosophila melanogaster, including genome-wide measures of piRNA, mRNA, and genomic abundance, along with estimates of age structure and risk of ectopic recombination, to address fundamental questions about the functional and evolutionary relationships between TE families and their regulatory piRNAs. We demonstrate that mRNA transcript abundance, robustness of "ping-pong" amplification, and representation in piRNA clusters together explain the majority of variation in piRNA abundance between TE families, providing the first robust statistical support for the prevailing model of piRNA biogenesis. Intriguingly, we also discover that the most transpositionally active TE families, with the greatest capacity to induce harmful mutations or disrupt gametogenesis, are not necessarily the most abundant among piRNAs. Rather, the level of piRNA targeting is largely independent of recent transposition rate for active TE families, but is rapidly lost for inactive TEs. These observations are consistent with population genetic theory that suggests a limited selective advantage for host repression of transposition. Additionally, we find no evidence that piRNA targeting responds to selection against a second major cost of TE infection: ectopic recombination between TE insertions. Our observations confirm the pivotal role of piRNA-mediated silencing in defending the genome against selfish transposition, yet also suggest limits to the optimization of host genome defense.
Selective whole genome amplification for resequencing target microbial species from complex natural samples.

PubMed

Leichty, Aaron R; Brisson, Dustin

2014-10-01

Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA and are therefore ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
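The motif-selection idea behind SWGA (motifs common in the target genome but rare in the background) can be illustrated with a simple k-mer frequency ratio. This is a minimal sketch under assumptions, not the published procedure: the function names, the choice of k, the pseudocount, and the ranking score are all hypothetical, and the sketch ignores reverse complements, primer melting temperature, and the spacing of binding sites.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in an uppercase copy of seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def select_motifs(target, background, k=4, top=3):
    """Rank k-mers by target frequency divided by background frequency.

    A pseudocount of 1 is added to the background count so that motifs
    absent from the background do not cause a division by zero.
    """
    t = kmer_counts(target, k)
    b = kmer_counts(background, k)
    t_total = max(sum(t.values()), 1)
    b_total = max(sum(b.values()), 1)
    scored = []
    for kmer, n in t.items():
        t_freq = n / t_total
        b_freq = (b.get(kmer, 0) + 1) / (b_total + 1)
        scored.append((t_freq / b_freq, kmer))
    scored.sort(reverse=True)  # highest enrichment first
    return [kmer for _, kmer in scored[:top]]


# Toy sequences, invented for illustration only.
toy_target = "ACGTACGTTTTT"
toy_background = "TTTTTTTTTT"
best = select_motifs(toy_target, toy_background, k=4, top=3)
```

In this toy case the poly-T motif is frequent in both sequences, so it scores poorly even though it is common in the target, which is the behaviour the selection criterion is meant to capture.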
Solution NMR structures of homeodomains from human proteins ALX4, ZHX1, and CASP8AP2 contribute to the structural coverage of the Human Cancer Protein Interaction Network.

PubMed

Xu, Xianzhong; Pulavarti, Surya V S R K; Eletsky, Alexander; Huang, Yuanpeng Janet; Acton, Thomas B; Xiao, Rong; Everett, John K; Montelione, Gaetano T; Szyperski, Thomas

2014-12-01

High-quality solution NMR structures of three homeodomains from human proteins ALX4, ZHX1 and CASP8AP2 were solved. These domains were chosen as targets of a biomedical theme project pursued by the Northeast Structural Genomics Consortium. This project focuses on increasing the structural coverage of human proteins associated with cancer.
Genetic characterization of a core collection of flax (Linum usitatissimum L.) suitable for association mapping studies and evidence of divergent selection between fiber and linseed types

PubMed Central

2013-01-01

Background: Flax is valued for its fiber, seed oil and nutraceuticals. Recently, the fiber industry has invested in the development of products made from linseed stems, making it a dual purpose crop. Simultaneous targeting of genomic regions controlling stem fiber and seed quality traits could enable the development of dual purpose cultivars. However, the genetic diversity, population structure and linkage disequilibrium (LD) patterns necessary for association mapping (AM) have not yet been assessed in flax because genomic resources have only recently been developed. We characterized 407 globally distributed flax accessions using 448 microsatellite markers. The data was analyzed to assess the suitability of this core collection for AM. Genomic scans to identify candidate genes selected during the divergent breeding process of fiber flax and linseed were conducted using the whole genome shotgun sequence of flax.
Results: Combined genetic structure analysis assigned all accessions to two major groups with six sub-groups. Population differentiation was weak between the major groups (FST = 0.094) and for most of the pairwise comparisons among sub-groups. The molecular coancestry analysis indicated weak relatedness (mean = 0.287) for most individual pairs. Abundant genetic diversity was observed in the total panel (5.32 alleles per locus), and some sub-groups showed a high proportion of private alleles. The average genome-wide LD (r²) was 0.036, with a relatively fast decay of 1.5 cM. Genomic scans between fiber flax and linseed identified candidate genes involved in cell-wall biogenesis/modification, xylem identity and fatty acid biosynthesis congruent with genes previously identified in flax and other plant species.
Conclusions: Based on the abundant genetic diversity, weak population structure and relatedness and relatively fast LD decay, we concluded that this core collection is suitable for AM studies targeting multiple agronomic and quality traits aiming at the improvement of flax as a true dual purpose crop. Our genomic scans provide the first insights into candidate regions affected by divergent selection in flax. In combination with AM, genomic scans have the ability to increase the power to detect loci influencing complex traits. PMID:23647851
Genetic characterization of a core collection of flax (Linum usitatissimum L.) suitable for association mapping studies and evidence of divergent selection between fiber and linseed types.

PubMed

Soto-Cerda, Braulio J; Diederichsen, Axel; Ragupathy, Raja; Cloutier, Sylvie

2013-05-06

Flax is valued for its fiber, seed oil and nutraceuticals. Recently, the fiber industry has invested in the development of products made from linseed stems, making it a dual purpose crop. Simultaneous targeting of genomic regions controlling stem fiber and seed quality traits could enable the development of dual purpose cultivars. However, the genetic diversity, population structure and linkage disequilibrium (LD) patterns necessary for association mapping (AM) have not yet been assessed in flax because genomic resources have only recently been developed. We characterized 407 globally distributed flax accessions using 448 microsatellite markers. The data was analyzed to assess the suitability of this core collection for AM. Genomic scans to identify candidate genes selected during the divergent breeding process of fiber flax and linseed were conducted using the whole genome shotgun sequence of flax. Combined genetic structure analysis assigned all accessions to two major groups with six sub-groups. Population differentiation was weak between the major groups (F(ST) = 0.094) and for most of the pairwise comparisons among sub-groups. The molecular coancestry analysis indicated weak relatedness (mean = 0.287) for most individual pairs. Abundant genetic diversity was observed in the total panel (5.32 alleles per locus), and some sub-groups showed a high proportion of private alleles. The average genome-wide LD (r²) was 0.036, with a relatively fast decay of 1.5 cM. Genomic scans between fiber flax and linseed identified candidate genes involved in cell-wall biogenesis/modification, xylem identity and fatty acid biosynthesis congruent with genes previously identified in flax and other plant species. 
Based on the abundant genetic diversity, weak population structure and relatedness and relatively fast LD decay, we concluded that this core collection is suitable for AM studies targeting multiple agronomic and quality traits aiming at the improvement of flax as a true dual purpose crop. Our genomic scans provide the first insights into candidate regions affected by divergent selection in flax. In combination with AM, genomic scans have the ability to increase the power to detect loci influencing complex traits.
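For readers unfamiliar with the LD statistic r² quoted in this abstract, the sketch below computes it for a single pair of loci from phased haplotypes. It assumes biallelic loci coded 0/1, which is a simplification: the study used multi-allelic microsatellite markers and its exact estimator is not given in the abstract. The function name `ld_r2` and the toy haplotypes are hypothetical.

```python
def ld_r2(haplotypes):
    """Squared allelic correlation r^2 between two biallelic loci.

    haplotypes is a list of (a, b) tuples, one per phased haplotype,
    with each allele coded 0 or 1. Uses D = p_AB - p_A * p_B and
    r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic locus carries no LD information
    return d * d / denom


# Toy haplotype sets, invented for illustration only.
complete_ld = ld_r2([(0, 0), (1, 1), (0, 0), (1, 1)])
no_ld = ld_r2([(0, 0), (0, 1), (1, 0), (1, 1)])
```

A genome-wide average such as the 0.036 reported above would be obtained by averaging a pairwise statistic of this kind over many marker pairs, and the decay distance by examining how it falls off with genetic distance.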
BuD, a helix–loop–helix DNA-binding domain for genome modification

PubMed Central

Stella, Stefano; Molina, Rafael; López-Méndez, Blanca; Juillerat, Alexandre; Bertonati, Claudia; Daboussi, Fayza; Campos-Olivas, Ramon; Duchateau, Phillippe; Montoya, Guillermo

2014-01-01

DNA editing offers new possibilities in synthetic biology and biomedicine for modulation or modification of cellular functions to organisms. However, inaccuracy in this process may lead to genome damage. To address this important problem, a strategy allowing specific gene modification has been achieved through the addition, removal or exchange of DNA sequences using customized proteins and the endogenous DNA-repair machinery. Therefore, the engineering of specific protein–DNA interactions in protein scaffolds is key to providing ‘toolkits’ for precise genome modification or regulation of gene expression. In a search for putative DNA-binding domains, BurrH, a protein that recognizes a 19 bp DNA target, was identified. Here, its apo and DNA-bound crystal structures are reported, revealing a central region containing 19 repeats of a helix–loop–helix modular domain (BurrH domain; BuD), which identifies the DNA target by a single residue-to-nucleotide code, thus facilitating its redesign for gene targeting. New DNA-binding specificities have been engineered in this template, showing that BuD-derived nucleases (BuDNs) induce high levels of gene targeting in a locus of the human haemoglobin β (HBB) gene close to mutations responsible for sickle-cell anaemia. Hence, the unique combination of high efficiency and specificity of the BuD arrays can push forward diverse genome-modification approaches for cell or organism redesign, opening new avenues for gene editing. PMID:25004980
Multiple roles of genome-attached bacteriophage terminal proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Redrejo-Rodríguez, Modesto; Salas, Margarita, E-mail: msalas@cbm.csic.es

2014-11-15

Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid primes replication and becomes covalently linked to the genome ends. Despite the fact that TPs lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. - Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes by their primary function priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer.
Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis.

PubMed

Gianola, Daniel; Fariello, Maria I; Naya, Hugo; Schön, Chris-Carolin

2016-10-13

Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. Copyright © 2016 Gianola et al.

The JCSG high-throughput structural biology pipeline.

PubMed

Elsliger, Marc André; Deacon, Ashley M; Godzik, Adam; Lesley, Scott A; Wooley, John; Wüthrich, Kurt; Wilson, Ian A

2010-10-01

The Joint Center for Structural Genomics high-throughput structural biology pipeline has delivered more than 1000 structures to the community over the past ten years. The JCSG has made a significant contribution to the overall goal of the NIH Protein Structure Initiative (PSI) of expanding structural coverage of the protein universe, as well as making substantial inroads into structural coverage of an entire organism. Targets are processed through an extensive combination of bioinformatics and biophysical analyses to efficiently characterize and optimize each target prior to selection for structure determination. The pipeline uses parallel processing methods at almost every step in the process and can adapt to a wide range of protein targets from bacterial to human. The construction, expansion and optimization of the JCSG gene-to-structure pipeline over the years have resulted in many technological and methodological advances and developments. The vast number of targets and the enormous amounts of associated data processed through the multiple stages of the experimental pipeline required the development of a variety of valuable resources that, wherever feasible, have been converted to free-access web-based tools and applications.
Single molecule sequencing-guided scaffolding and correction of draft assemblies.

PubMed

Zhu, Shenglong; Chen, Danny Z; Emrich, Scott J

2017-12-06

Although single molecule sequencing is still improving, the lengths of the generated sequences are inevitably an advantage in genome assembly. Prior work that utilizes long reads to conduct genome assembly has mostly focused on correcting sequencing errors and improving contiguity of de novo assemblies. We propose a disassembling-reassembling approach for both correcting structural errors in the draft assembly and scaffolding a target assembly based on error-corrected single molecule sequences. To achieve this goal, we formulate a maximum alternating path cover problem. We prove that this problem is NP-hard, and solve it by a 2-approximation algorithm. Our experimental results show that our approach can improve the structural correctness of target assemblies at the cost of some contiguity, even with smaller amounts of long reads. In addition, our reassembling process can also serve as a competitive scaffolder relative to well-established assembly benchmarks.
The molecular genetic makeup of acute lymphoblastic leukemia | Office of Cancer Genomics

Cancer.gov

Abstract: Genomic profiling has transformed our understanding of the genetic basis of acute lymphoblastic leukemia (ALL). Recent years have seen a shift from microarray analysis and candidate gene sequencing to next-generation sequencing. Together, these approaches have shown that many ALL subtypes are characterized by constellations of structural rearrangements, submicroscopic DNA copy number alterations, and sequence mutations, several of which have clear implications for risk stratification and targeted therapeutic intervention.
Structural systems pharmacology: a new frontier in discovering novel drug targets.

PubMed

Tan, Hepan; Ge, Xiaoxia; Xie, Lei

2013-08-01

The modern target-based drug discovery process, characterized by the one-drug-one-gene paradigm, has been of limited success. In contrast, phenotype-based screening produces thousands of active compounds but gives no hint as to what their molecular targets are or which ones merit further research. This presents a question: What is a suitable target for an efficient and safe drug? In this paper, we argue that target selection should take into account the proteome-wide energetic and kinetic landscape of drug-target interactions, as well as their cellular and organismal consequences. We propose a new paradigm of structural systems pharmacology to deconvolute the molecular targets of successful drugs as well as to identify druggable targets and their drug-like binders. Here we face two major challenges in structural systems pharmacology: How do we characterize and analyze the structural and energetic origins of drug-target interactions on a proteome scale? How do we correlate the dynamic molecular interactions to their in vivo activity? We will review recent advances in developing new computational tools for biophysics, bioinformatics, chemoinformatics, and systems biology related to the identification of genome-wide target profiles. We believe that the integration of these tools will realize structural systems pharmacology, enabling us to both efficiently develop effective therapeutics for complex diseases and combat drug resistance.
A method for simultaneously delineating multiple targets in 3D-FISH using limited channels, lasers, and fluorochromes.

PubMed

Zhao, F Y; Yang, X; Chen, D Y; Ma, W Y; Zheng, J G; Zhang, X M

2014-01-01

Many studies have suggested a link between the spatial organization of genomes and fundamental biological processes such as genome reprogramming, gene expression, and differentiation. Multicolor fluorescence in situ hybridization on three-dimensionally preserved nuclei (3D-FISH), in combination with confocal microscopy, has become an effective technique for analyzing 3D genome structure and spatial patterns of defined nucleus targets including entire chromosome territories and single gene loci. This technique usually requires the simultaneous visualization of numerous targets labeled with different colored fluorochromes. Thus, the number of channels and lasers must be sufficient for the commonly used labeling scheme of 3D-FISH, "one probe-one target". However, these channels and lasers are usually restricted by a given microscope system. This paper presents a method for simultaneously delineating multiple targets in 3D-FISH using limited channels, lasers, and fluorochromes. In contrast to other labeling schemes, this method is convenient and simple for multicolor 3D-FISH studies, which may result in widespread adoption of the technique. Lastly, as an application of the method, the nucleus locations of chromosome territory 18/21 and centromere 18/21/13 in normal human lymphocytes were analyzed, which might present evidence of a radial higher order chromatin arrangement.
Rotifer rDNA-specific R9 retrotransposable elements generate an exceptionally long target site duplication upon insertion.

PubMed

Gladyshev, Eugene A; Arkhipova, Irina R

2009-12-15

Ribosomal DNA genes in many eukaryotes contain insertions of non-LTR retrotransposable elements belonging to the R2 clade. These elements persist in the host genomes by inserting site-specifically into multicopy target sites, thereby avoiding random disruption of single-copy host genes. Here we describe R9 retrotransposons from the R2 clade in the 28S RNA genes of bdelloid rotifers, small freshwater invertebrate animals best known for their long-term asexuality and for their ability to survive repeated cycles of desiccation and rehydration. While the structural organization of R9 elements is highly similar to that of other members of the R2 clade, they are characterized by two distinct features: site-specific insertion into a previously unreported target sequence within the 28S gene, and an unusually long target site duplication of 126 bp. We discuss the implications of these findings in the context of bdelloid genome organization and the mechanisms of target-primed reverse transcription.
Identification of KasA as the cellular target of an anti-tubercular scaffold

PubMed Central

Abrahams, Katherine A.; Chung, Chun-wa; Ghidelli-Disse, Sonja; Rullas, Joaquín; Rebollo-López, María José; Gurcha, Sudagar S.; Cox, Jonathan A. G.; Mendoza, Alfonso; Jiménez-Navarro, Elena; Martínez-Martínez, María Santos; Neu, Margarete; Shillings, Anthony; Homes, Paul; Argyrou, Argyrides; Casanueva, Ruth; Loman, Nicholas J.; Moynihan, Patrick J.; Lelièvre, Joël; Selenski, Carolyn; Axtman, Matthew; Kremer, Laurent; Bantscheff, Marcus; Angulo-Barturen, Iñigo; Izquierdo, Mónica Cacho; Cammack, Nicholas C.; Drewes, Gerard; Ballell, Lluis; Barros, David; Besra, Gurdyal S.; Bates, Robert H.

2016-01-01

Phenotypic screens for bactericidal compounds are starting to yield promising hits against tuberculosis. In this regard, whole-genome sequencing of spontaneous resistant mutants generated against an indazole sulfonamide (GSK3011724A) identifies several specific single-nucleotide polymorphisms in the essential Mycobacterium tuberculosis β-ketoacyl synthase (kas) A gene. Here, this genomic-based target assignment is confirmed by biochemical assays, chemical proteomics and structural resolution of a KasA-GSK3011724A complex by X-ray crystallography. Finally, M. tuberculosis GSK3011724A-resistant mutants increase the in vitro minimum inhibitory concentration and the in vivo 99% effective dose in mice, establishing in vitro and in vivo target engagement. Surprisingly, the lack of target engagement of the related β-ketoacyl synthases (FabH and KasB) suggests a different mode of inhibition when compared with other Kas inhibitors of fatty acid biosynthesis in bacteria. These results clearly identify KasA as the biological target of GSK3011724A and validate this enzyme for further drug discovery efforts against tuberculosis. PMID:27581223
Splicing-Related Features of Introns Serve to Propel Evolution

PubMed Central

Luo, Yuping; Li, Chun; Gong, Xi; Wang, Yanlu; Zhang, Kunshan; Cui, Yaru; Sun, Yi Eve; Li, Siguang

2013-01-01

The role that spliceosomal intronic structures have played in evolution has only begun to be elucidated. Comparative genomic analyses of fungal snoRNA sequences, which are often contained within introns and/or exons, revealed that about one-third of snoRNA-associated introns in three major snoRNA gene clusters manifested polymorphisms, likely resulting from intron loss and gain events during fungal evolution. Genomic deletions can clearly be observed as one mechanism underlying intron and exon loss, as well as generation of complex introns where several introns lie in juxtaposition without intercalating exons. Strikingly, by tracking conserved snoRNAs in introns, we found that some introns had moved from one position to another by excision from donor sites and insertion into target sites elsewhere in the genome without needing transposon structures. This study revealed the origin of many newly gained introns. Moreover, our analyses suggested that intron-containing sequences were more prone to sustainable structural changes than DNA sequences without introns due to the ability of introns to jump within the genome via unknown mechanisms. We propose that splicing-related structural features of introns serve as an additional motor to propel evolution. PMID:23516505
Screening a fragment cocktail library using ultrafiltration

PubMed Central

Shibata, Sayaka; Zhang, Zhongsheng; Korotkov, Konstantin V.; Delarosa, Jaclyn; Napuli, Alberto; Kelley, Angela M.; Mueller, Natasha; Ross, Jennifer; Zucker, Frank H.; Buckner, Frederick S.; Merritt, Ethan A.; Verlinde, Christophe L. M. J.; Van Voorhis, Wesley C.; Hol, Wim G. J.; Fan, Erkang

2011-01-01

Ultrafiltration provides a generic method to discover ligands for protein drug targets with millimolar to micromolar Kd, the typical range of fragment-based drug discovery. This method was tailored to a 96-well format, and cocktails of fragment-sized molecules, with molecular masses between 150 and 300 Da, were screened against medical structural genomics target proteins. The validity of the method was confirmed through competitive binding assays in the presence of ligands known to bind the target proteins. PMID:21750879
TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS

PubMed Central

Jones, Matthew R.; Good, Jeffrey M.

2016-01-01

The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993
How gene order is influenced by the biophysics of transcription regulation

PubMed Central

Kolesov, Grigory; Wunderlich, Zeba; Laikova, Olga N.; Gelfand, Mikhail S.; Mirny, Leonid A.

2007-01-01

What are the forces that shape the structure of prokaryotic genomes: the order of genes, their proximity, and their orientation? Coregulation and coordinated horizontal gene transfer are believed to promote the proximity of functionally related genes and the formation of operons. However, forces that influence the structure of the genome beyond the level of a single operon remain unknown. Here, we show that the biophysical mechanism by which regulatory proteins search for their sites on DNA can impose constraints on genome structure. Using simulations, we demonstrate that rapid and reliable gene regulation requires that the transcription factor (TF) gene be close to the site on DNA the TF has to bind, thus promoting the colocalization of TF genes and their targets on the genome. We use parameters that have been measured in recent experiments to estimate the relevant length and time scales of this process and demonstrate that the search for a cognate site may be prohibitively slow if a TF has a low copy number and is not colocalized. We also analyze TFs and their sites in a number of bacterial genomes, confirm that they are colocalized significantly more often than expected, and show that this observation cannot be attributed to the pressure for coregulation or formation of selfish gene clusters, thus supporting the role of the biophysical constraint in shaping the structure of prokaryotic genomes. Our results demonstrate how spatial organization can influence timing and noise in gene expression. PMID:17709750
The High-Throughput Protein Sample Production Platform of the Northeast Structural Genomics Consortium

PubMed Central

Xiao, Rong; Anderson, Stephen; Aramini, James; Belote, Rachel; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John K.; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Jiang, Mei; Kornhaber, Gregory J.; Lee, Dong Yup; Locke, Jessica Y.; Ma, Li-Chung; Maglaqui, Melissa; Mao, Lei; Mitra, Saheli; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Sharma, Seema; Shastry, Ritu; Swapna, G.V.T.; Tong, Saichu N.; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.; Acton, Thomas B.

2014-01-01

We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar purification procedures for most targets and implementation of high-throughput (HTP) parallel methods. In most cases, the 6X-His-tagged proteins are sufficiently purified (> 97% homogeneity) using a HTP two-step purification protocol for most structural studies. Using this platform, the open reading frames of over 16,000 different targeted proteins (or domains) have been cloned as > 26,000 constructs. Over the past nine years, more than 16,000 of these expressed protein, and more than 4,400 proteins (or domains) have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html). Using these samples, the NESG has deposited more than 900 new protein structures to the Protein Data Bank (PDB). The methods described here are effective in producing eukaryotic and prokaryotic protein samples in E. coli. This paper summarizes some of the updates made to the protein production pipeline in the last five years, corresponding to phase 2 of the NIGMS Protein Structure Initiative (PSI-2) project. The NESG Protein Production Platform is suitable for implementation in a large individual laboratory or by a small group of collaborating investigators. These advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are of broad value to the structural biology, functional proteomics, and structural genomics communities. PMID:20688167
Small molecules targeting viral RNA.

PubMed

Hermann, Thomas

2016-11-01

Highly conserved noncoding RNA (ncRNA) elements in viral genomes and transcripts offer new opportunities to expand the repertoire of drug targets for the development of antiinfective therapy. Ligands binding to ncRNA architectures are able to affect interactions, structural stability or conformational changes and thereby block processes essential for viral replication. Proof of concept for targeting functional RNA by small molecule inhibitors has been demonstrated for multiple viruses with RNA genomes. Strategies to identify antiviral compounds as inhibitors of ncRNA are increasingly emphasizing consideration of drug-like properties of candidate molecules emerging from screening and ligand design. Recent efforts of antiviral lead discovery for RNA targets have provided drug-like small molecules that inhibit viral replication and include inhibitors of human immunodeficiency virus (HIV), hepatitis C virus (HCV), severe acute respiratory syndrome coronavirus (SARS CoV), and influenza A virus. While target selectivity remains a challenge for the discovery of useful RNA-binding compounds, a better understanding is emerging of properties that define RNA targets amenable for inhibition by small molecule ligands. Insight from successful approaches of targeting viral ncRNA in HIV, HCV, SARS CoV, and influenza A will provide a basis for the future exploration of RNA targets for therapeutic intervention in other viral pathogens which create urgent, unmet medical needs. Viruses for which targeting ncRNA components in the genome or transcripts may be promising include insect-borne flaviviruses (Dengue, Zika, and West Nile) and filoviruses (Ebola and Marburg). WIREs RNA 2016, 7:726-743. doi: 10.1002/wrna.1373 © 2016 Wiley Periodicals, Inc.
Whole-genome sequencing of an aggressive BRAF wild-type papillary thyroid cancer identified EML4-ALK translocation as a therapeutic target.

PubMed

Demeure, Michael J; Aziz, Meraj; Rosenberg, Richard; Gurley, Steven D; Bussey, Kimberly J; Carpten, John D

2014-06-01

Recent advances in the treatment of cancer have focused on targeting genomic aberrations with selective therapeutic agents. In radioiodine resistant aggressive papillary thyroid cancers, there remain few effective therapeutic options. A 62-year-old man who underwent multiple operations for papillary thyroid cancer and whose metastases progressed despite standard treatments provided tumor tissue. We analyzed tumor and whole blood DNA by whole genome sequencing, achieving 80× or greater coverage over 94 % of the exome and 90 % of the genome. We determined somatic mutations and structural alterations. We found a total of 57 somatic mutations in 55 genes of the cancer genome. There was notably a lack of mutations in NRAS and BRAF, and no RET/PTC rearrangement. There was a mutation in the TRAPP oncogene and a loss of heterozygosity of the p16, p18, and RB1 tumor suppressor genes. The oncogenic driver for this tumor is a translocation involving the genes for anaplastic lymphoma receptor tyrosine kinase (ALK) and echinoderm microtubule associated protein like 4 (EML4). The EML4-ALK translocation has been reported in approximately 5 % of lung cancers, as well as in pediatric neuroblastoma, and is a therapeutic target for crizotinib. This is the first report of the whole genomic sequencing of a papillary thyroid cancer in which we identified an EML4-ALK translocation and a TRAPP oncogene mutation. These findings suggest that this tumor has a more distinct oncogenesis than BRAF mutant papillary thyroid cancer. Whole genome sequencing can elucidate an oncogenic context and expose potential therapeutic vulnerabilities in rare cancers.
Genome Editing with CRISPR-Cas9: Can It Get Any Better?

PubMed Central

Haeussler, Maximilian; Concordet, Jean-Paul

2017-01-01

The CRISPR-Cas revolution is taking place in virtually all fields of life sciences. Harnessing DNA cleavage with the CRISPR-Cas9 system of Streptococcus pyogenes has proven to be extraordinarily simple and efficient, relying only on the design of a synthetic single guide RNA (sgRNA) and its co-expression with Cas9. Here, we review the progress in the design of sgRNA from the original dual RNA guide for S. pyogenes and Staphylococcus aureus Cas9 (SpCas9 and SaCas9). New assays for genome-wide identification of off-targets have provided important insights into the issue of cleavage specificity in vivo. At the same time, the on-target activity of thousands of guides has been determined. These data have led to numerous online tools that facilitate the selection of guide RNAs in target sequences. It appears that for most basic research applications, cleavage activity can be maximized and off-targets minimized by carefully choosing guide RNAs based on computational predictions. Moreover, recent studies of Cas proteins have further improved the flexibility and precision of the CRISPR-Cas toolkit for genome editing. Inspired by the crystal structure of the complex of sgRNA-SpCas9 bound to target DNA, several variants of SpCas9 have recently been engineered, either with novel protospacer adjacent motifs (PAMs) or with drastically reduced off-targets. Novel Cas9 and Cas9-like proteins called Cpf1 have also been characterized from other bacteria and will benefit from the insights obtained from SpCas9. Genome editing with CRISPR-Cas9 may also progress with better understanding and control of cellular DNA repair pathways activated after Cas9-induced DNA cleavage. PMID:27210042
Expansion of the CRISPR-Cas9 genome targeting space through the use of H1 promoter-expressed guide RNAs.

PubMed

Ranganathan, Vinod; Wahlin, Karl; Maruotti, Julien; Zack, Donald J

2014-08-08

The repurposed CRISPR-Cas9 system has recently emerged as a revolutionary genome-editing tool. Here we report a modification in the expression of the guide RNA (gRNA) required for targeting that greatly expands the targetable genome. gRNA expression through the commonly used U6 promoter requires a guanosine nucleotide to initiate transcription, thus constraining genomic-targeting sites to GN19NGG. We demonstrate the ability to modify endogenous genes using H1 promoter-expressed gRNAs, which can be used to target both AN19NGG and GN19NGG genomic sites. AN19NGG sites occur ~15% more frequently than GN19NGG sites in the human genome and the increase in targeting space is also enriched at human genes and disease loci. Together, our results enhance the versatility of the CRISPR technology by more than doubling the number of targetable sites within the human genome and other eukaryotic species.
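
The site constraint described in this abstract can be illustrated with a short script. The following is only a minimal sketch, not the authors' analysis: the function names `reverse_complement` and `count_sites` are hypothetical, scanning both strands is an assumption, and the toy sequences are made up for illustration. It counts 23-nt sites of the form GN19NGG (reachable with U6-expressed gRNAs) and AN19NGG (additionally reachable with H1-expressed gRNAs).

```python
import re

# Lookahead patterns so that overlapping 23-nt sites are all counted.
# First base (G or A), 19 arbitrary bases, then an NGG PAM (N = any base).
GN19NGG = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")
AN19NGG = re.compile(r"(?=(A[ACGT]{19}[ACGT]GG))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(seq):
    """Count GN19NGG and AN19NGG sites on both strands (assumption: both are scanned)."""
    seq = seq.upper()
    counts = {"GN19NGG": 0, "AN19NGG": 0}
    for strand in (seq, reverse_complement(seq)):
        counts["GN19NGG"] += len(GN19NGG.findall(strand))
        counts["AN19NGG"] += len(AN19NGG.findall(strand))
    return counts


if __name__ == "__main__":
    # Toy 23-nt sequences: one G-initiated site and one A-initiated site.
    print(count_sites("G" + "T" * 19 + "TGG"))  # {'GN19NGG': 1, 'AN19NGG': 0}
    print(count_sites("A" + "C" * 19 + "AGG"))  # {'GN19NGG': 0, 'AN19NGG': 1}
```

Applied to a real genome, the ratio of the two counts would give a rough estimate of how much the targetable space grows when A-initiated sites are allowed; bases other than A, C, G and T (for example N) are simply never matched in this sketch.
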
Genome-wide Analyses of the Structural Gene Families Involved in the Legume-specific 5-Deoxyisoflavonoid Biosynthesis of Lotus japonicus

PubMed Central

Shimada, Norimoto; Sato, Shusei; Akashi, Tomoyoshi; Nakamura, Yasukazu; Tabata, Satoshi; Ayabe, Shin-ichi; Aoki, Toshio

2007-01-01

A model legume Lotus japonicus (Regel) K. Larsen is one of the subjects of genome sequencing and functional genomics programs. In the course of targeted approaches to the legume genomics, we analyzed the genes encoding enzymes involved in the biosynthesis of the legume-specific 5-deoxyisoflavonoid of L. japonicus, which produces isoflavan phytoalexins on elicitor treatment. The paralogous biosynthetic genes were assigned as comprehensively as possible by biochemical experiments, similarity searches, comparison of the gene structures, and phylogenetic analyses. Among the 10 biosynthetic genes investigated, six comprise multigene families, and in many cases they form gene clusters in the chromosomes. Semi-quantitative reverse transcriptase–PCR analyses showed coordinate up-regulation of most of the genes during phytoalexin induction and complex accumulation patterns of the transcripts in different organs. Some paralogous genes exhibited similar expression specificities, suggesting their genetic redundancy. The molecular evolution of the biosynthetic genes is discussed. The results presented here provide reliable annotations of the genes and genetic markers for comparative and functional genomics of leguminous plants. PMID:17452423
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

PubMed

Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
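
Two of the gap properties named in this abstract, gap length and GC content, are simple to compute once a gap sequence has been extracted. The sketch below is illustrative only and does not reproduce the study's pipeline: the function names `gc_content` and `summarize_gap` are hypothetical, and excluding ambiguous bases (such as N) from the GC denominator is an assumption.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous A/C/G/T bases (0.0 if there are none)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def summarize_gap(seq):
    """Return the length and GC fraction of one gap sequence."""
    return {"length": len(seq), "gc": gc_content(seq)}


if __name__ == "__main__":
    # Toy gap sequence, made up for illustration.
    print(summarize_gap("GGCCAATT"))  # {'length': 8, 'gc': 0.5}
```

Read coverage and secondary-structure propensity, the other properties mentioned, need alignment data and a folding tool respectively, so they are left out of this sketch.
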
The organization and evolution of the Responder satellite in species of the Drosophila melanogaster group: dynamic evolution of a target of meiotic drive.

PubMed

Larracuente, Amanda M

2014-11-25

Satellite DNA can make up a substantial fraction of eukaryotic genomes and has roles in genome structure and chromosome segregation. The rapid evolution of satellite DNA can contribute to genomic instability and genetic incompatibilities between species. Despite its ubiquity and its contribution to genome evolution, we currently know little about the dynamics of satellite DNA evolution. The Responder (Rsp) satellite DNA family is found in the pericentric heterochromatin of chromosome 2 of Drosophila melanogaster. Rsp is well-known for being the target of Segregation Distorter (SD), an autosomal meiotic drive system in D. melanogaster. I present an evolutionary genetic analysis of the Rsp family of repeats in D. melanogaster and its closely-related species in the melanogaster group (D. simulans, D. sechellia, D. mauritiana, D. erecta, and D. yakuba) using a combination of available BAC sequences, whole genome shotgun Sanger reads, Illumina short read deep sequencing, and fluorescence in situ hybridization. I show that Rsp repeats have euchromatic locations throughout the D. melanogaster genome, that Rsp arrays show evidence for concerted evolution, and that Rsp repeats exist outside of D. melanogaster, in the melanogaster group. The repeats in these species are considerably diverged at the sequence level compared to D. melanogaster, and have a strikingly different genomic distribution, even between closely-related sister taxa. The genomic organization of the Rsp repeat in the D. melanogaster genome is complex: it consists of large blocks of tandem repeats in the heterochromatin and small blocks of tandem repeats in the euchromatin. My discovery of heterochromatic Rsp-like sequences outside of D. melanogaster suggests that SD evolved after its target satellite and that the evolution of the Rsp satellite family is highly dynamic over a short evolutionary time scale (<240,000 years).
Genomic Target Database (GTD): A database of potential targets in human pathogenic bacteria

PubMed Central

Barh, Debmalya; Kumar, Anil; Misra, Amarendra Narayana

2009-01-01

A Genomic Target Database (GTD) has been developed having putative genomic drug targets for human bacterial pathogens. The selected pathogens are either drug resistant or vaccines are yet to be developed against them. The drug targets have been identified using subtractive genomics approaches and these are subsequently classified into (1) drug targets in pathogen-specific unique metabolic pathways, (2) drug targets in host-pathogen common metabolic pathways, and (3) membrane-localized drug targets. HTML code is used to link each target to its various properties and other available public resources. Essential resources and tools for subtractive genomic analysis, sub-cellular localization, vaccine and drug designing are also mentioned. To the best of the authors' knowledge, no such database (DB) is presently available that has listed metabolic pathways and membrane specific genomic drug targets based on subtractive genomics. Listed targets in GTD are a readily available resource in developing drugs and vaccines against the respective pathogen, its subtypes, and other family members. Currently GTD contains 58 drug targets for four pathogens. Shortly, drug targets for six more pathogens will be listed. Availability: GTD is available at IIOAB website http://www.iioab.webs.com/GTD.htm. It can also be accessed at http://www.iioabdgd.webs.com. GTD is free for academic research and non-commercial use only. Commercial use is strictly prohibited without prior permission from IIOAB. PMID:20011153

A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
Microbial genome analysis: the COG approach.

PubMed

Galperin, Michael Y; Kristensen, David M; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

2017-09-14

For the past 20 years, the Clusters of Orthologous Genes (COG) database has been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COGs have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
An Emerging Tick-Borne Disease of Humans Is Caused by a Subset of Strains with Conserved Genome Structure

PubMed Central

Barbet, Anthony F.; Al-Khedery, Basima; Stuen, Snorre; Granquist, Erik G.; Felsheim, Roderick F.; Munderloh, Ulrike G.

2013-01-01

The prevalence of tick-borne diseases is increasing worldwide. One such emerging disease is human anaplasmosis. The causative organism, Anaplasma phagocytophilum, is known to infect multiple animal species and cause human fatalities in the U.S., Europe and Asia. Although long known to infect ruminants, it is unclear why there are increasing numbers of human infections. We analyzed the genome sequences of strains infecting humans, animals and ticks from diverse geographic locations. Despite extensive variability amongst these strains, those infecting humans had conserved genome structure including the pfam01617 superfamily that encodes the major, neutralization-sensitive, surface antigen. These data provide potential targets to identify human-infective strains and have significance for understanding the selective pressures that lead to emergence of disease in new species. PMID:25437207
The Enzyme Function Initiative

PubMed Central

Gerlt, John A.; Allen, Karen N.; Almo, Steven C.; Armstrong, Richard N.; Babbitt, Patricia C.; Cronan, John E.; Dunaway-Mariano, Debra; Imker, Heidi J.; Jacobson, Matthew P.; Minor, Wladek; Poulter, C. Dale; Raushel, Frank M.; Sali, Andrej; Shoichet, Brian K.; Sweedler, Jonathan V.

2011-01-01

The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily-specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include: 1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation); 2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia; 3) computational and bioinformatic tools for using the strategy; 4) provision of experimental protocols and/or reagents for enzyme production and characterization; and 5) dissemination of data via the EFI’s website, enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal and pharmaceutical efforts. PMID:21999478
The Enzyme Function Initiative.

PubMed

Gerlt, John A; Allen, Karen N; Almo, Steven C; Armstrong, Richard N; Babbitt, Patricia C; Cronan, John E; Dunaway-Mariano, Debra; Imker, Heidi J; Jacobson, Matthew P; Minor, Wladek; Poulter, C Dale; Raushel, Frank M; Sali, Andrej; Shoichet, Brian K; Sweedler, Jonathan V

2011-11-22

The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic, we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include (1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation), (2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia, (3) computational and bioinformatic tools for using the strategy, (4) provision of experimental protocols and/or reagents for enzyme production and characterization, and (5) dissemination of data via the EFI's Website, http://enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal, and pharmaceutical efforts. 
© 2011 American Chemical Society
Polyploidy and the relationship between leaf structure and function: implications for correlated evolution of anatomy, morphology, and physiology in Brassica.

PubMed

Baker, Robert L; Yarkhunova, Yulia; Vidal, Katherine; Ewers, Brent E; Weinig, Cynthia

2017-01-05

Polyploidy is well studied from a genetic and genomic perspective, but the morphological, anatomical, and physiological consequences of polyploidy remain relatively uncharacterized. Whether these potential changes bear on functional integration or are idiosyncratic remains an open question. Repeated allotetraploid events and multiple genomic combinations as well as overlapping targets of artificial selection make the Brassica triangle an excellent system for exploring variation in the connection between plant structure (anatomy and morphology) and function (physiology). We examine phenotypic integration among structural aspects of leaves including external morphology and internal anatomy with leaf-level physiology among several species of Brassica. We compare diploid and allotetraploid species to ascertain patterns of phenotypic correlations among structural and functional traits and test the hypothesis that allotetraploidy results in trait disintegration allowing for transgressive phenotypes and additional evolutionary and crop improvement potential. Among six Brassica species, we found significant effects of species and ploidy level for morphological, anatomical and physiological traits. We identified three suites of intercorrelated traits in both diploid parents and allotetraploids: morphological traits (such as leaf area and perimeter), anatomical traits (including abaxial and adaxial epidermis), and aspects of physiology. In general, there were more correlations between structural and functional traits for allotetraploid hybrids than diploid parents. Parents and hybrids did not have any significant structure-function correlations in common. Of particular note, there were no significant correlations between morphological structure and physiological function in the diploid parents. Increased phenotypic integration in the allotetraploid hybrids may be due, in part, to increased trait ranges or simply different structure-function relationships. 
Genomic and chromosomal instability in early generation allotetraploids may allow Brassica species to explore new trait space and potentially reach higher adaptive peaks than their progenitor species could, despite temporary fitness costs associated with unstable genomes. The trait correlations that disappear after hybridization as well as the novel trait correlations observed in allotetraploid hybrids may represent relatively evolutionarily labile associations and therefore could be ideal targets for artificial selection and crop improvement.
TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

PubMed

Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

2018-04-11

Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. 
It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
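
To make the idea of user-tunable SNP island filters concrete, here is a minimal sketch of one way such a search could work. It is not the published TIA implementation: the function name, the parameters (`island_len`, `min_snps`, `min_maf`, `flank_len`, `max_flank_snps`), and the window logic are assumptions for illustration. An island is reported when enough sufficiently frequent SNPs fall within a maximum span and the flanking regions on both sides contain few known SNPs.

```python
from bisect import bisect_left, bisect_right


def find_snp_islands(snps, island_len=100, min_snps=3, min_maf=0.2,
                     flank_len=50, max_flank_snps=0):
    """Return (start, end) islands from a list of (position, minor_allele_freq).

    Hypothetical filter set: at least `min_snps` SNPs with frequency >= `min_maf`
    inside a span of at most `island_len` bases, and at most `max_flank_snps`
    SNPs of any frequency in each `flank_len`-base flank.
    """
    snps = sorted(snps)
    positions = [pos for pos, _ in snps]
    informative = [pos for pos, maf in snps if maf >= min_maf]
    islands = []
    for i, start in enumerate(informative):
        # Informative SNPs inside the window [start, start + island_len - 1].
        j = bisect_right(informative, start + island_len - 1)
        members = informative[i:j]
        if len(members) < min_snps:
            continue
        end = members[-1]
        # Count every known SNP in the left flank [start - flank_len, start - 1].
        left = bisect_left(positions, start) - bisect_left(positions, start - flank_len)
        # Count every known SNP in the right flank [end + 1, end + flank_len].
        right = bisect_right(positions, end + flank_len) - bisect_right(positions, end)
        if left <= max_flank_snps and right <= max_flank_snps:
            islands.append((start, end))
    return islands
```

Because a window is opened at every informative SNP, overlapping islands can be reported; a real tool would likely merge or rank them before primer design.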
A Roadmap for Functional Structural Variants in the Soybean Genome

PubMed Central

Anderson, Justin E.; Kantar, Michael B.; Kono, Thomas Y.; Fu, Fengli; Stec, Adrian O.; Song, Qijian; Cregan, Perry B.; Specht, James E.; Diers, Brian W.; Cannon, Steven B.; McHale, Leah K.; Stupar, Robert M.

2014-01-01

Gene structural variation (SV) has recently emerged as a key genetic mechanism underlying several important phenotypic traits in crop species. We screened a panel of 41 soybean (Glycine max) accessions serving as parents in a soybean nested association mapping population for deletions and duplications in more than 53,000 gene models. Array hybridization and whole genome resequencing methods were used as complementary technologies to identify SV in 1528 genes, or approximately 2.8%, of the soybean gene models. Although SV occurs throughout the genome, SV enrichment was noted in families of biotic defense response genes. Among accessions, SV was nearly eightfold less frequent for gene models that have retained paralogs since the last whole genome duplication event, compared with genes that have not retained paralogs. Increases in gene copy number, similar to that described at the Rhg1 resistance locus, account for approximately one-fourth of the genic SV events. This assessment of soybean SV occurrence presents a target list of genes potentially responsible for rapidly evolving and/or adaptive traits. PMID:24855315
CRISPR/Cas9 in Genome Editing and Beyond.

PubMed

Wang, Haifeng; La Russa, Marie; Qi, Lei S

2016-06-02

The Cas9 protein (CRISPR-associated protein 9), derived from type II CRISPR (clustered regularly interspaced short palindromic repeats) bacterial immune systems, is emerging as a powerful tool for engineering the genome in diverse organisms. As an RNA-guided DNA endonuclease, Cas9 can be easily programmed to target new sites by altering its guide RNA sequence, and its development as a tool has made sequence-specific gene editing several orders of magnitude easier. The nuclease-deactivated form of Cas9 further provides a versatile RNA-guided DNA-targeting platform for regulating and imaging the genome, as well as for rewriting the epigenetic status, all in a sequence-specific manner. With all of these advances, we have just begun to explore the possible applications of Cas9 in biomedical research and therapeutics. In this review, we describe the current models of Cas9 function and the structural and biochemical studies that support them. We focus on the applications of Cas9 for genome editing, regulation, and imaging, discuss other possible applications and some technical considerations, and highlight the many advantages that CRISPR/Cas9 technology offers.
Genome-wide high-throughput SNP discovery and genotyping for understanding natural (functional) allelic diversity and domestication patterns in wild chickpea

PubMed Central

Bajaj, Deepak; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

2015-01-01

We identified 82489 high-quality genome-wide SNPs from 93 wild and cultivated Cicer accessions through integrated reference genome- and de novo-based GBS assays. High intra- and inter-specific polymorphic potential (66–85%) and broader natural allelic diversity (6–64%) detected by genome-wide SNPs among accessions signify their efficacy for monitoring introgression and transferring target trait-regulating genomic (gene) regions/allelic variants from wild to cultivated Cicer gene pools for genetic improvement. The population-specific assignment of wild Cicer accessions pertaining to the primary gene pool is more influenced by geographical origin/phenotypic characteristics than by species/gene pools of origination. The functional significance of allelic variants (non-synonymous and regulatory SNPs) scanned from transcription factors and stress-responsive genes in differentiating wild accessions (with potential known sources of yield-contributing and stress tolerance traits) from cultivated desi and kabuli accessions, fine-mapping/map-based cloning of QTLs and determination of LD patterns across wild and cultivated gene-pools are suitably elucidated. The correlation between phenotypic (agromorphological traits) and molecular diversity-based admixed domestication patterns within six structured populations of wild and cultivated accessions via genome-wide SNPs was apparent. This suggests utility of whole genome SNPs as a potential resource for identifying naturally selected trait-regulating genomic targets/functional allelic variants adaptive to diverse agroclimatic regions for genetic enhancement of cultivated gene-pools. PMID:26208313
Structural basis for genome wide recognition of 5-bp GC motifs by SMAD transcription factors.

PubMed

Martin-Malpartida, Pau; Batet, Marta; Kaczmarska, Zuzanna; Freier, Regina; Gomes, Tiago; Aragón, Eric; Zou, Yilong; Wang, Qiong; Xi, Qiaoran; Ruiz, Lidia; Vea, Angela; Márquez, José A; Massagué, Joan; Macias, Maria J

2017-12-12

Smad transcription factors activated by TGF-β or by BMP receptors form trimeric complexes with Smad4 to target specific genes for cell fate regulation. The CAGAC motif has been considered as the main binding element for Smad2/3/4, whereas Smad1/5/8 have been thought to preferentially bind GC-rich elements. However, chromatin immunoprecipitation analysis in embryonic stem cells showed extensive binding of Smad2/3/4 to GC-rich cis-regulatory elements. Here, we present the structural basis for specific binding of Smad3 and Smad4 to GC-rich motifs in the goosecoid promoter, a nodal-regulated differentiation gene. The structures revealed a 5-bp consensus sequence GGC(GC)|(CG) as the binding site for both TGF-β and BMP-activated Smads and for Smad4. These 5GC motifs are highly represented as clusters in Smad-bound regions genome-wide. Our results provide a basis for understanding the functional adaptability of Smads in different cellular contexts, and their dependence on lineage-determining transcription factors to target specific genes in TGF-β and BMP pathways.
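
As an illustration of how such 5GC motifs might be located in a sequence, the following minimal sketch scans both strands of a DNA string. It assumes the consensus GGC(GC)|(CG) is read as the two 5-mers GGCGC and GGCCG; the function names are hypothetical and this is not the authors' genome-wide analysis.

```python
import re

# Assumption: the consensus GGC(GC)|(CG) denotes the 5-mers GGCGC or GGCCG.
# The lookahead lets overlapping occurrences be reported.
MOTIF = re.compile(r"(?=(GGC(?:GC|CG)))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_5gc_motifs(seq):
    """Return sorted (forward_start, strand, motif) hits on both strands."""
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        # Convert the reverse-strand start to the leftmost forward-strand coordinate.
        hits.append((n - m.start() - 5, "-", m.group(1)))
    return sorted(hits)
```

Counting hits per window along a chromosome would then give a rough picture of motif clustering in a region of interest.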
DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo.

PubMed

Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M; Weissman, Jonathan S; Rouskin, Silvi

2017-01-01

Coupling of structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structure studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduce biases and necessitate population-average assessments of RNA structure. Here we present dimethyl sulfate (DMS) mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase. DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low-abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in noncanonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs with their mature isoforms. These applications illustrate DMS-MaPseq's capacity to dramatically expand in vivo analysis of RNA structure.
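
Because the method encodes DMS modifications as mismatches, a per-position mismatch rate is the natural readout. The sketch below is a simplified illustration rather than the authors' pipeline: it assumes gapless alignments given as hypothetical `(start, sequence)` pairs with non-negative starts, and it reports rates only at A and C reference positions, the bases DMS is generally understood to modify when unpaired (a background assumption, not stated in the abstract).

```python
def mismatch_rates(reference, reads, bases="AC"):
    """Return {position: mismatch_rate} for covered reference positions in `bases`.

    reads: iterable of (start, sequence) gapless alignments to `reference`,
    with 0-based, non-negative start coordinates (hypothetical input format).
    """
    coverage = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            # Skip read bases past the reference end and uncalled bases.
            if pos >= len(reference) or base == "N":
                continue
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    return {
        i: mismatches[i] / coverage[i]
        for i in range(len(reference))
        if reference[i] in bases and coverage[i] > 0
    }
```

Comparing these rates between a DMS-treated and an untreated sample would be needed to separate modification signal from background sequencing error.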
Natural Allelic Variations in Highly Polyploidy Saccharum Complex

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.

Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with highly polyploid and complex genomes. The Saccharum complex, comprising the Saccharum genus and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-MEM and the sorghum genome proved to be an acceptable aligner and reference for sugarcane target enrichment sequence analysis, respectively. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated as largely homozygous. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
Natural Allelic Variations in Highly Polyploidy Saccharum Complex

DOE PAGES

Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.; ...

2016-06-08

Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with highly polyploid and complex genomes. The Saccharum complex, comprising the Saccharum genus and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-MEM and the sorghum genome proved to be an acceptable aligner and reference for sugarcane target enrichment sequence analysis, respectively. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated as largely homozygous. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
Personalized oncogenomic analysis of metastatic adenoid cystic carcinoma: using whole-genome sequencing to inform clinical decision-making

PubMed Central

Chahal, Manik; Pleasance, Erin; Grewal, Jasleen; Zhao, Eric; Ng, Tony; Chapman, Erin; Jones, Martin R.; Shen, Yaoqing; Mungall, Karen L.; Bonakdar, Melika; Taylor, Gregory A.; Ma, Yussanne; Mungall, Andrew J.; Moore, Richard A.; Lim, Howard; Renouf, Daniel; Yip, Stephen; Jones, Steven J.M.; Marra, Marco A.; Laskin, Janessa

2018-01-01

Metastatic adenoid cystic carcinomas (ACCs) can cause significant morbidity and mortality. Because of their slow growth and relative rarity, there is limited evidence for systemic therapy regimens. Recently, molecular profiling studies have begun to reveal the genetic landscape of these poorly understood cancers, and new treatment possibilities are beginning to emerge. The objective is to use whole-genome and transcriptome sequencing and analysis to better understand the genetic alterations underlying the pathology of metastatic and rare ACCs and determine potentially actionable therapeutic targets. We report five cases of metastatic ACC, not originating in the salivary glands, in patients enrolled in the Personalized Oncogenomics (POG) Program at the BC Cancer Agency. Genomic workup included whole-genome and transcriptome sequencing, detailed analysis of tumor alterations, and integration with existing knowledge of drug–target combinations to identify potential therapeutic targets. Analysis reveals low mutational burden in these five ACC cases, and mutation signatures that are commonly observed in multiple cancer types. Notably, the only recurrent structural aberration identified was the well-described MYB-NFIB fusion that was present in four of five cases, and one case exhibited a closely related MYBL1-NFIB fusion. Recurrent mutations were also identified in BAP1 and BCOR, with additional mutations in individual samples affecting NOTCH1 and the epigenetic regulators ARID2, SMARCA2, and SMARCB1. Copy changes were rare, and they included amplification of MYC and homozygous loss of CDKN2A in individual samples. Genomic analysis revealed therapeutic targets in all five cases and served to inform a therapeutic choice in three of the cases to date. PMID:29610392
Bigfoot: a new family of MITE elements characterized from the Medicago genus.

PubMed

Charrier, B; Foucher, F; Kondorosi, E; d'Aubenton-Carafa, Y; Thermes, C; Kondorosi, A; Ratet, P

1999-05-01

We have characterized from the legume plant Medicago a new family of miniature inverted-repeat transposable elements (MITE), called the Bigfoot transposable elements. Two of these insertion elements are present only in a single allele of two different M. sativa genes. Using a PCR strategy we have isolated 19 other Bigfoot elements from the M. sativa and M. truncatula genomes. They differ from the previously characterized MITEs by their sequence, a target site of 9 bp and a partially clustered genomic distribution. In addition, we show that they exhibit a significantly stable secondary structure. These elements may represent up to 0.1% of the genome of the outcrossing Medicago sativa but are present at a reduced copy number in the genome of the autogamous M. truncatula plant, revealing major differences in the genome organization of these two plants.
Refolding strategies from inclusion bodies in a structural genomics project.

PubMed

Trésaugues, Lionel; Collinet, Bruno; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Liger, Dominique; Zhou, Cong-Zhao; Janin, Joël; Van Tilbeurgh, Herman; Quevillon-Cheruel, Sophie

2004-01-01

The South-Paris Yeast Structural Genomics Project aims at systematically expressing, purifying and determining the structure of S. cerevisiae proteins with no detectable homology to proteins of known structure. We brought 250 yeast ORFs to expression in E. coli, but 37% of them formed inclusion bodies. This large fraction of proteins that are well expressed but lost for structural studies prompted us to test methodologies to recover these proteins. Three different strategies were explored in parallel on a set of 20 proteins: (1) refolding from solubilized inclusion bodies using an original and fast 96-well plate screening test, (2) co-expression of the targets in E. coli with DnaK-DnaJ-GrpE and GroEL-GroES chaperones, and (3) use of a cell-free expression system. Most of the tested proteins (17/20) could be resolubilized by at least one approach, but the subsequent purification proved to be difficult for most of them.
Novel and viable acetylcholinesterase target site for developing effective and environmentally safe insecticides.

PubMed

Pang, Yuan-Ping; Brimijoin, Stephen; Ragsdale, David W; Zhu, Kun Yan; Suranyi, Robert

2012-04-01

Insect pests are responsible for human suffering and financial losses worldwide. New and environmentally safe insecticides are urgently needed to cope with these serious problems. Resistance to current insecticides has resulted in a resurgence of insect pests, and growing concerns about insecticide toxicity to humans discourage the use of insecticides for pest control. The small market for insecticides has hampered insecticide development; however, advances in genomics and structural genomics offer new opportunities to develop insecticides that are less dependent on the insecticide market. This review summarizes the literature data that support the hypothesis that an insect-specific cysteine residue located at the opening of the acetylcholinesterase active site is a promising target site for developing new insecticides with reduced off-target toxicity and low propensity for insect resistance. These data are used to discuss the differences between targeting the insect-specific cysteine residue and targeting the ubiquitous catalytic serine residue of acetylcholinesterase from the perspective of reducing off-target toxicity and insect resistance. Also discussed is the prospect of developing cysteine-targeting anticholinesterases as effective and environmentally safe insecticides for control of disease vectors, crop damage, and residential insect pests within the financial confines of the present insecticide market.
Novel and Viable Acetylcholinesterase Target Site for Developing Effective and Environmentally Safe Insecticides

PubMed Central

Pang, Yuan-Ping; Brimijoin, Stephen; Ragsdale, David W; Zhu, Kun Yan; Suranyi, Robert

2012-01-01

Insect pests are responsible for human suffering and financial losses worldwide. New and environmentally safe insecticides are urgently needed to cope with these serious problems. Resistance to current insecticides has resulted in a resurgence of insect pests, and growing concerns about insecticide toxicity to humans discourage the use of insecticides for pest control. The small market for insecticides has hampered insecticide development; however, advances in genomics and structural genomics offer new opportunities to develop insecticides that are less dependent on the insecticide market. This review summarizes the literature data that support the hypothesis that an insect-specific cysteine residue located at the opening of the acetylcholinesterase active site is a promising target site for developing new insecticides with reduced off-target toxicity and low propensity for insect resistance. These data are used to discuss the differences between targeting the insect-specific cysteine residue and targeting the ubiquitous catalytic serine residue of acetylcholinesterase from the perspective of reducing off-target toxicity and insect resistance. Also discussed is the prospect of developing cysteine-targeting anticholinesterases as effective and environmentally safe insecticides for control of disease vectors, crop damage, and residential insect pests within the financial confines of the present insecticide market. PMID:22280344
Specific and selective target detection of supra-genome 21 Mers Salmonella via silicon nanowires biosensor

NASA Astrophysics Data System (ADS)

Mustafa, Mohammad Razif Bin; Dhahi, Th S.; Ehfaed, Nuri. A. K. H.; Adam, Tijjani; Hashim, U.; Azizah, N.; Mohammed, Mohammed; Noriman, N. Z.

2017-09-01

Silicon-based nanostructures can be surface-modified for use as label-free biosensors that allow real-time measurements. The silicon nanowire surface was functionalized using 3-aminopropyltrimethoxysilane (APTES), which facilitates the immobilization of biomolecules on the silicon nanowire surface. The process is simple and economical, which paves the way for point-of-care applications. However, the surface modification and the subsequent detection mechanism are still not clear. This study therefore proposes a step-by-step process for silicon nanowire surface modification and examines its potential for specific and selective target detection of Supra-genome 21 Mers Salmonella. The device captured the target molecule precisely; the approach took advantage of the strong binding chemistry between APTES and the biomolecule. The results indicate how modifications of the nanowires provide sensing capability with strong surface chemistries that can lead to specific and selective target detection.

Postgenomic strategies in antibacterial drug discovery.

PubMed

Brötz-Oesterhelt, Heike; Sass, Peter

2010-10-01

During the last decade the field of antibacterial drug discovery has changed in many aspects including bacterial organisms of primary interest, discovery strategies applied and pharmaceutical companies involved. Target-based high-throughput screening had been disappointingly unsuccessful for antibiotic research. Understanding of this lack of success has increased substantially and the lessons learned refer to characteristics of targets, screening libraries and screening strategies. The 'genomics' approach was replaced by a diverse array of discovery strategies, for example, searching for new natural product leads among previously abandoned compounds or new microbial sources, screening for synthetic inhibitors by targeted approaches including structure-based design and analyses of focused libraries and designing resistance-breaking properties into antibiotics of established classes. Furthermore, alternative treatment options are being pursued including anti-virulence strategies and immunotherapeutic approaches. This article summarizes the lessons learned from the genomics era and describes discovery strategies resulting from that knowledge.
Are highly morphed peptide frameworks lurking silently in microbial genomes valuable as next generation antibiotic scaffolds?

PubMed

Walsh, Christopher T

2017-07-01

Antibiotics are a therapeutic class that, once deployed, select for resistant bacterial pathogens and so shorten their useful life cycles. As a consequence new versions of antibiotics are constantly needed. Among the antibiotic natural products, morphed peptide scaffolds, converting conformationally mobile, short-lived linear peptides into compact, rigidified small molecule frameworks, act on a wide range of bacterial targets. Advances in bacterial genome mining, biosynthetic gene cluster prediction and expression, and mass spectroscopic structure analysis suggest many more peptides, modified both in side chains and peptide backbones, await discovery. Such molecules may turn up new bacterial targets and be starting points for combinatorial or semisynthetic manipulations to optimize activity and pharmacology parameters.
NMR-based investigations into target DNA search processes of proteins.

PubMed

Iwahara, Junji; Zandarashvili, Levani; Kemme, Catherine A; Esadze, Alexandre

2018-05-10

To perform their function, transcription factors and DNA-repair/modifying enzymes must first locate their targets in the vast presence of nonspecific, but structurally similar sites on genomic DNA. Before reaching their targets, these proteins stochastically scan DNA and dynamically move from one site to another on DNA. Solution NMR spectroscopy provides unique atomic-level insights into the dynamic DNA-scanning processes, which are difficult to gain by any other experimental means. In this review, we provide an introductory overview on the NMR methods for the structural, dynamic, and kinetic investigations of target DNA search by proteins. We also discuss advantages and disadvantages of these NMR methods over other methods such as single-molecule techniques and biochemical approaches. Copyright © 2018 Elsevier Inc. All rights reserved.
Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics

PubMed Central

Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron

2014-01-01

• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. PMID:25225629
Nuclease Target Site Selection for Maximizing On-target Activity and Minimizing Off-target Effects in Genome Editing

PubMed Central

Lee, Ciaran M; Cradick, Thomas J; Fine, Eli J; Bao, Gang

2016-01-01

The rapid advancement in targeted genome editing using engineered nucleases such as ZFNs, TALENs, and CRISPR/Cas9 systems has resulted in a suite of powerful methods that allows researchers to target any genomic locus of interest. A complementary set of design tools has been developed to aid researchers with nuclease design, target site selection, and experimental validation. Here, we review the various tools available for target selection in designing engineered nucleases, and for quantifying nuclease activity and specificity, including web-based search tools and experimental methods. We also elucidate challenges in target selection, especially in predicting off-target effects, and discuss future directions in precision genome editing and its applications. PMID:26750397
Structure and Engineering of Francisella novicida Cas9

PubMed Central

Hirano, Hisato; Gootenberg, Jonathan S.; Horii, Takuro; Abudayyeh, Omar O.; Kimura, Mika; Hsu, Patrick D.; Nakane, Takanori; Ishitani, Ryuichiro; Hatada, Izuho; Zhang, Feng; Nishimasu, Hiroshi; Nureki, Osamu

2016-01-01

Summary The RNA-guided endonuclease Cas9 cleaves double-stranded DNA targets complementary to the guide RNA, and has been applied to programmable genome editing. Cas9-mediated cleavage requires a protospacer adjacent motif (PAM) juxtaposed with the DNA target sequence, thus constricting the range of targetable sites. Here, we report the 1.7 Å resolution crystal structures of Cas9 from Francisella novicida (FnCas9), one of the largest Cas9 orthologs, in complex with a guide RNA and its PAM-containing DNA targets. A structural comparison of FnCas9 with other Cas9 orthologs revealed striking conserved and divergent features among distantly related CRISPR-Cas9 systems. We found that FnCas9 recognizes the 5′-NGG-3′ PAM, and used the structural information to create a variant that can recognize the more relaxed 5′-YG-3′ PAM. Furthermore, we demonstrated that pre-assembled FnCas9 ribonucleoprotein complexes can be microinjected into mouse zygotes to edit endogenous sites with the 5′-YG-3′ PAMs, thus expanding the target space of the CRISPR-Cas9 toolbox. PMID:26875867
Structure and Engineering of Francisella novicida Cas9.

PubMed

Hirano, Hisato; Gootenberg, Jonathan S; Horii, Takuro; Abudayyeh, Omar O; Kimura, Mika; Hsu, Patrick D; Nakane, Takanori; Ishitani, Ryuichiro; Hatada, Izuho; Zhang, Feng; Nishimasu, Hiroshi; Nureki, Osamu

2016-02-25

The RNA-guided endonuclease Cas9 cleaves double-stranded DNA targets complementary to the guide RNA and has been applied to programmable genome editing. Cas9-mediated cleavage requires a protospacer adjacent motif (PAM) juxtaposed with the DNA target sequence, thus constricting the range of targetable sites. Here, we report the 1.7 Å resolution crystal structures of Cas9 from Francisella novicida (FnCas9), one of the largest Cas9 orthologs, in complex with a guide RNA and its PAM-containing DNA targets. A structural comparison of FnCas9 with other Cas9 orthologs revealed striking conserved and divergent features among distantly related CRISPR-Cas9 systems. We found that FnCas9 recognizes the 5'-NGG-3' PAM, and used the structural information to create a variant that can recognize the more relaxed 5'-YG-3' PAM. Furthermore, we demonstrated that the FnCas9-ribonucleoprotein complex can be microinjected into mouse zygotes to edit endogenous sites with the 5'-YG-3' PAM, thus expanding the target space of the CRISPR-Cas9 toolbox. Copyright © 2016 Elsevier Inc. All rights reserved.
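The effect of relaxing the PAM from 5'-NGG-3' to 5'-YG-3' can be illustrated with a small Python sketch. It is not taken from the paper: the function name `pam_positions`, the IUPAC lookup table, and the example sequence are all hypothetical, and only the strand given is scanned (a real search would also cover the reverse complement and the protospacer orientation).

```python
import re

# IUPAC nucleotide codes needed for the two PAMs named in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}

def pam_positions(seq, pam):
    """Return 0-based start indices of every PAM match on the given strand."""
    pattern = "".join(IUPAC[base] for base in pam.upper())
    # A zero-width lookahead lets overlapping matches be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq.upper())]

example = "ACGGTTGCCG"  # made-up sequence, for illustration only
ngg_sites = pam_positions(example, "NGG")
yg_sites = pam_positions(example, "YG")
print(ngg_sites, yg_sites)  # [1] [1, 5, 8]
```

In this toy sequence the shorter, degenerate YG motif matches at more positions than NGG, which is the sense in which a relaxed PAM enlarges the set of targetable sites.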
Subclonal diversification of primary breast cancer revealed by multiregion sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian

Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.
Subclonal diversification of primary breast cancer revealed by multiregion sequencing

DOE PAGES

Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian; ...

2015-06-22

Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.
Chromatin accessibility and guide sequence secondary structure affect CRISPR-Cas9 gene editing efficiency.

PubMed

Jensen, Kristopher Torp; Fløe, Lasse; Petersen, Trine Skov; Huang, Jinrong; Xu, Fengping; Bolund, Lars; Luo, Yonglun; Lin, Lin

2017-07-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein 9 (CRISPR-Cas9) systems have emerged as the method of choice for genome editing, but large variations in on-target efficiencies continue to limit their applicability. Here, we investigate the effect of chromatin accessibility on Cas9-mediated gene editing efficiency for 20 gRNAs targeting 10 genomic loci in HEK293T cells using both SpCas9 and the eSpCas9(1.1) variant. Our study indicates that gene editing is more efficient in euchromatin than in heterochromatin, and we validate this finding in HeLa cells and in human fibroblasts. Furthermore, we investigate the gRNA sequence determinants of CRISPR-Cas9 activity using a surrogate reporter system and find that the efficiency of Cas9-mediated gene editing is dependent on guide sequence secondary structure formation. This knowledge can aid in the further improvement of tools for gRNA design. © 2017 Federation of European Biochemical Societies.
TARGET Research Goals

Cancer.gov

TARGET researchers use various sequencing and array-based methods to examine the genomes, transcriptomes, and for some diseases epigenomes of select childhood cancers. This “multi-omic” approach generates a comprehensive profile of molecular alterations for each cancer type. Alterations are changes in DNA or RNA, such as rearrangements in chromosome structure or variations in gene expression, respectively. Through computational analyses and assays to validate biological function, TARGET researchers predict which alterations disrupt the function of a gene or pathway and promote cancer growth, progression, and/or survival. Researchers identify candidate therapeutic targets and/or prognostic markers from the cancer-associated alterations.
Imaging analysis of nuclear antiviral factors through direct detection of incoming adenovirus genome complexes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Komatsu, Tetsuro; Department of Infection Biology, Faculty of Medicine, University of Tsukuba, Tsukuba 305-8575; Will, Hans

2016-04-22

Recent studies involving several viral systems have highlighted the importance of cellular intrinsic defense mechanisms through nuclear antiviral proteins that restrict viral propagation. These factors include among others components of PML nuclear bodies, the nuclear DNA sensor IFI16, and a potential restriction factor PHF13/SPOC1. For several nuclear replicating DNA viruses, it was shown that these factors sense and target viral genomes immediately upon nuclear import. In contrast to the anticipated view, we recently found that incoming adenoviral genomes are not targeted by PML nuclear bodies. Here we further explored cellular responses against adenoviral infection by focusing on specific conditions as well as additional nuclear antiviral factors. In line with our previous findings, we show that neither interferon treatment nor the use of specific isoforms of PML nuclear body components results in co-localization between incoming adenoviral genomes and the subnuclear domains. Furthermore, our imaging analyses indicated that neither IFI16 nor PHF13/SPOC1 are likely to target incoming adenoviral genomes. Thus our findings suggest that incoming adenoviral genomes may be able to escape from a large repertoire of nuclear antiviral mechanisms, providing a rationale for the efficient initiation of lytic replication cycle. - Highlights: • Host nuclear antiviral factors were analyzed upon adenovirus genome delivery. • Interferon treatments fail to permit PML nuclear bodies to target adenoviral genomes. • Neither Sp100A nor B targets adenoviral genomes despite potentially opposite roles. • The nuclear DNA sensor IFI16 does not target incoming adenoviral genomes. • PHF13/SPOC1 targets neither incoming adenoviral genomes nor genome-bound protein VII.
Identification and characterization of a class of MALAT1-like genomic loci

DOE PAGES

Zhang, Bin; Mao, Yuntao S.; Diermeier, Sarah D.; ...

2017-05-23

The MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) gene encodes a noncoding RNA that is processed into a long nuclear retained transcript (MALAT1) and a small cytoplasmic tRNA-like transcript (mascRNA). Using an RNA sequence- and structure-based covariance model, we identified more than 130 genomic loci in vertebrate genomes containing the MALAT1 3' end triple-helix structure and its immediate downstream tRNA-like structure, including 44 in the green lizard Anolis carolinensis. Structural and computational analyses revealed a co-occurrence of components of the 3' end module. MALAT1-like genes in Anolis carolinensis are highly expressed in adult testis, thus we named them testis-abundant long noncoding RNAs (tancRNAs). MALAT1-like loci also produce multiple small RNA species, including PIWI-interacting RNAs (piRNAs), from the antisense strand. The 3' ends of tancRNAs serve as potential targets for the PIWI-piRNA complex. Furthermore, we have identified an evolutionarily conserved class of long noncoding RNAs (lncRNAs) with similar structural constraints, post-transcriptional processing, and subcellular localization and a distinct function in spermatocytes.
Identification and characterization of a class of MALAT1-like genomic loci

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Bin; Mao, Yuntao S.; Diermeier, Sarah D.

The MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) gene encodes a noncoding RNA that is processed into a long nuclear retained transcript (MALAT1) and a small cytoplasmic tRNA-like transcript (mascRNA). Using an RNA sequence- and structure-based covariance model, we identified more than 130 genomic loci in vertebrate genomes containing the MALAT1 3' end triple-helix structure and its immediate downstream tRNA-like structure, including 44 in the green lizard Anolis carolinensis. Structural and computational analyses revealed a co-occurrence of components of the 3' end module. MALAT1-like genes in Anolis carolinensis are highly expressed in adult testis, thus we named them testis-abundant long noncoding RNAs (tancRNAs). MALAT1-like loci also produce multiple small RNA species, including PIWI-interacting RNAs (piRNAs), from the antisense strand. The 3' ends of tancRNAs serve as potential targets for the PIWI-piRNA complex. Furthermore, we have identified an evolutionarily conserved class of long noncoding RNAs (lncRNAs) with similar structural constraints, post-transcriptional processing, and subcellular localization and a distinct function in spermatocytes.
Mechanisms Used for Genomic Proliferation by Thermophilic Group II Introns

PubMed Central

Mohr, Georg; Ghanem, Eman; Lambowitz, Alan M.

2010-01-01

Mobile group II introns, which are found in bacterial and organellar genomes, are site-specific retroelements hypothesized to be evolutionary ancestors of spliceosomal introns and retrotransposons in higher organisms. Most bacteria, however, contain no more than one or a few group II introns, making it unclear how introns could have proliferated to higher copy numbers in eukaryotic genomes. An exception is the thermophilic cyanobacterium Thermosynechococcus elongatus, which contains 28 closely related copies of a group II intron, constituting ∼1.3% of the genome. Here, by using a combination of bioinformatics and mobility assays at different temperatures, we identified mechanisms that contribute to the proliferation of T. elongatus group II introns. These mechanisms include divergence of DNA target specificity to avoid target site saturation; adaptation of some intron-encoded reverse transcriptases to splice and mobilize multiple degenerate introns that do not encode reverse transcriptases, leading to a common splicing apparatus; and preferential insertion within other mobile introns or insertion elements, which provide new unoccupied sites in expanding non-essential DNA regions. Additionally, unlike mesophilic group II introns, the thermophilic T. elongatus introns rely on elevated temperatures to help promote DNA strand separation, enabling access to a larger number of DNA target sites by base pairing of the intron RNA, with minimal constraint from the reverse transcriptase. Our results provide insight into group II intron proliferation mechanisms and show that higher temperatures, which are thought to have prevailed on Earth during the emergence of eukaryotes, favor intron proliferation by increasing the accessibility of DNA target sites. We also identify actively mobile thermophilic introns, which may be useful for structural studies, gene targeting in thermophiles, and as a source of thermostable reverse transcriptases. PMID:20543989
Early developmental gene enhancers affect subcortical volumes in the adult human brain.

PubMed

Becker, Martin; Guadalupe, Tulio; Franke, Barbara; Hibar, Derrek P; Renteria, Miguel E; Stein, Jason L; Thompson, Paul M; Francks, Clyde; Vernes, Sonja C; Fisher, Simon E

2016-05-01

Genome-wide association screens aim to identify common genetic variants contributing to the phenotypic variability of complex traits, such as human height or brain morphology. The identified genetic variants are mostly within noncoding genomic regions and the biology of the genotype-phenotype association typically remains unclear. In this article, we propose a complementary targeted strategy to reveal the genetic underpinnings of variability in subcortical brain volumes, by specifically selecting genomic loci that are experimentally validated forebrain enhancers, active in early embryonic development. We hypothesized that genetic variation within these enhancers may affect the development and ultimately the structure of subcortical brain regions in adults. We tested whether variants in forebrain enhancer regions showed an overall enrichment of association with volumetric variation in subcortical structures of >13,000 healthy adults. We observed significant enrichment of genomic loci that affect the volume of the hippocampus within forebrain enhancers (empirical P = 0.0015), a finding which robustly passed the adjusted threshold for testing of multiple brain phenotypes (cutoff of P < 0.0083 at an alpha of 0.05). In analyses of individual single nucleotide polymorphisms (SNPs), we identified an association upstream of the ID2 gene with rs7588305 and variation in hippocampal volume. This SNP-based association survived multiple-testing correction for the number of SNPs analyzed but not for the number of subcortical structures. Targeting known regulatory regions offers a way to understand the underlying biology that connects genotypes to phenotypes, particularly in the context of neuroimaging genetics. This biology-driven approach generates testable hypotheses regarding the functional biology of identified associations. Hum Brain Mapp 37:1788-1800, 2016. © 2016 Wiley Periodicals, Inc.
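The adjusted threshold quoted in this abstract can be reproduced with a short Python sketch. The abstract does not state the correction method or the number of phenotypes tested; the sketch assumes a Bonferroni correction over six tests, chosen only because 0.05 / 6 ≈ 0.0083 matches the reported cutoff. The function name `bonferroni_cutoff` is hypothetical.

```python
def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

ALPHA = 0.05          # family-wise alpha stated in the abstract
N_PHENOTYPES = 6      # assumption: inferred from the reported 0.0083 cutoff
EMPIRICAL_P = 0.0015  # hippocampal enrichment P value from the abstract

cutoff = bonferroni_cutoff(ALPHA, N_PHENOTYPES)
passes = EMPIRICAL_P < cutoff
print(f"cutoff = {cutoff:.4f}, passes = {passes}")
```

Under these assumptions the reported empirical P value falls below the per-test cutoff, consistent with the abstract's statement that the enrichment passed the adjusted threshold.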
The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers

PubMed Central

Sahl, Jason W.; Vazquez, Adam J.; Hall, Carina M.; Busch, Joseph D.; Tuanyok, Apichai; Mayo, Mark; Schupp, James M.; Lummis, Madeline; Pearson, Talima; Shippy, Kenzie; Allender, Christopher J.; Theobald, Vanessa; Hutcheson, Alex; Korlach, Jonas; LiPuma, John J.; Ladner, Jason; Lovett, Sean; Koroleva, Galina; Palacios, Gustavo; Limmathurotsakul, Direk; Wuthiekanun, Vanaporn; Wongsuwan, Gumphol; Currie, Bart J.

2016-01-01

ABSTRACT Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. PMID:27651357
An Adenovirus DNA Replication Factor, but Not Incoming Genome Complexes, Targets PML Nuclear Bodies.

PubMed

Komatsu, Tetsuro; Nagata, Kyosuke; Wodrich, Harald

2016-02-01

Promyelocytic leukemia protein nuclear bodies (PML-NBs) are subnuclear domains implicated in cellular antiviral responses. Despite the antiviral activity, several nuclear replicating DNA viruses use the domains as deposition sites for the incoming viral genomes and/or as sites for viral DNA replication, suggesting that PML-NBs are functionally relevant during early viral infection to establish productive replication. Although PML-NBs and their components have also been implicated in the adenoviral life cycle, it remains unclear whether incoming adenoviral genome complexes target PML-NBs. Here we show using immunofluorescence and live-cell imaging analyses that incoming adenovirus genome complexes neither localize at nor recruit components of PML-NBs during early phases of infection. We further show that the viral DNA binding protein (DBP), an early expressed viral gene and essential DNA replication factor, independently targets PML-NBs. We show that DBP oligomerization is required to selectively recruit the PML-NB components Sp100 and USP7. Depletion experiments suggest that the absence of one PML-NB component might not affect the recruitment of other components toward DBP oligomers. Thus, our findings suggest a model in which an adenoviral DNA replication factor, but not incoming viral genome complexes, targets and modulates PML-NBs to support a conducive state for viral DNA replication and argue against a generalized concept that PML-NBs target incoming viral genomes. The immediate fate upon nuclear delivery of genomes of incoming DNA viruses is largely unclear. Early reports suggested that incoming genomes of herpesviruses are targeted and repressed by PML-NBs immediately upon nuclear import. Genome localization and/or viral DNA replication has also been observed at PML-NBs for other DNA viruses. 
Thus, it was suggested that PML-NBs may immediately sense and target nuclear viral genomes and hence serve as sites for deposition of incoming viral genomes and/or subsequent viral DNA replication. Here we performed a detailed analysis of the spatiotemporal distribution of incoming adenoviral genome complexes and found, in contrast to the expectation, that an adenoviral DNA replication factor, but not incoming genomes, targets PML-NBs. Thus, our findings may explain why adenoviral genomes could be observed at PML-NBs in earlier reports but argue against a generalized role for PML-NBs in targeting invading viral genomes. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
An analysis of possible off target effects following CAS9/CRISPR targeted deletions of neuropeptide gene enhancers from the mouse genome.

PubMed

Hay, Elizabeth Anne; Khalaf, Abdulla Razak; Marini, Pietro; Brown, Andrew; Heath, Karyn; Sheppard, Darrin; MacKenzie, Alasdair

2017-08-01

We have successfully used comparative genomics to identify putative regulatory elements within the human genome that contribute to the tissue specific expression of neuropeptides such as galanin and receptors such as CB1. However, a previous inability to rapidly delete these elements from the mouse genome has prevented optimal assessment of their function in vivo. This has been solved using CAS9/CRISPR genome editing technology, which uses a bacterial endonuclease called CAS9 that, in combination with specifically designed guide RNA (gRNA) molecules, cuts specific regions of the mouse genome. However, reports of "off target" effects, whereby the CAS9 endonuclease is able to cut sites other than those targeted, limit the appeal of this technology. We used cytoplasmic microinjection of gRNA and CAS9 mRNA into 1-cell mouse embryos to rapidly generate enhancer knockout mouse lines. The current study describes our analysis of the genomes of these enhancer knockout lines to detect possible off-target effects. Bioinformatic analysis was used to identify the most likely putative off-target sites and to design PCR primers that would amplify these sequences from genomic DNA of founder enhancer deletion mouse lines. Amplified DNA was then sequenced and blasted against the mouse genome sequence to detect off-target effects. Using this approach we were unable to detect any evidence of off-target effects in the genomes of three founder lines using any of the four gRNAs used in the analysis. This study suggests that the problem of off-target effects in transgenic mice has been exaggerated and that CAS9/CRISPR represents a highly effective and accurate method of deleting putative neuropeptide gene enhancer sequences from the mouse genome. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Hyb-Seq: combining target enrichment and genome skimming for plant phylogenomics

Treesearch

Kevin Weitemier; Shannon C.K. Straub; Richard C. Cronn; Mark Fishbein; Roswitha Schmickl; Angela McDonnell; Aaron Liston

2014-01-01

• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385...

Enhancing Targeted Genomic DNA Editing in Chicken Cells Using the CRISPR/Cas9 System

PubMed Central

Wang, Ling; Yang, Likai; Guo, Yijie; Du, Weili; Yin, Yajun; Zhang, Tao; Lu, Hongzhao

2017-01-01

The CRISPR/Cas9 system has enabled highly efficient genome targeted editing for various organisms. However, few studies have focused on CRISPR/Cas9 nuclease-mediated chicken genome editing compared with mammalian genomes. The current study combined CRISPR with yeast Rad52 (yRad52) to enhance targeted genomic DNA editing in chicken DF-1 cells. The efficiency of CRISPR/Cas9 nuclease-induced targeted mutations in the chicken genome was increased to 41.9% via the enrichment of the dual-reporter surrogate system. In addition, the combined effect of CRISPR nuclease and yRad52 dramatically increased the efficiency of the targeted substitution in the myostatin gene using 50-mer oligodeoxynucleotides (ssODN) as the donor DNA, resulting in a 36.7% editing efficiency after puromycin selection. Furthermore, based on the effect of yRad52, the frequency of exogenous gene integration in the chicken genome was more than 3-fold higher than that without yRad52. Collectively, these results suggest that ssODN is an ideal donor DNA for targeted substitution and that CRISPR/Cas9 combined with yRad52 significantly enhances chicken genome editing. These findings could be extensively applied in other organisms. PMID:28068387
Insights into the genetic structure and diversity of 38 South Asian Indians from deep whole-genome sequencing.

PubMed

Wong, Lai-Ping; Lai, Jason Kuan-Han; Saw, Woei-Yuh; Ong, Rick Twee-Hee; Cheng, Anthony Youzhi; Pillai, Nisha Esakimuthu; Liu, Xuanyao; Xu, Wenting; Chen, Peng; Foo, Jia-Nee; Tan, Linda Wei-Lin; Koo, Seok-Hwee; Soong, Richie; Wenk, Markus Rene; Lim, Wei-Yen; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

2014-05-01

South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Epigenomics in cancer management

PubMed Central

Costa, Fabricio F

2010-01-01

The identification of all epigenetic modifications implicated in gene expression is the next step for a better understanding of human biology in both normal and pathological states. This field is referred to as epigenomics, and it is defined as epigenetic changes (ie, DNA methylation, histone modifications and regulation by noncoding RNAs such as microRNAs) on a genomic scale rather than a single gene. Epigenetics modulate the structure of the chromatin, thereby affecting the transcription of genes in the genome. Different studies have already identified changes in epigenetic modifications in a few genes in specific pathways in cancers. Based on these epigenetic changes, drugs against different types of tumors were developed, which mainly target epimutations in the genome. Examples include DNA methylation inhibitors, histone modification inhibitors, and small molecules that target chromatin-remodeling proteins. However, these drugs are not specific, and side effects are a major problem; therefore, new DNA sequencing technologies combined with epigenomic tools have the potential to identify novel biomarkers and better molecular targets to treat cancers. The purpose of this review is to discuss current and emerging epigenomic tools and to address how these new technologies may impact the future of cancer management. PMID:21188117
Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization.

PubMed

Nora, Elphège P; Goloborodko, Anton; Valton, Anne-Laure; Gibcus, Johan H; Uebersohn, Alec; Abdennur, Nezar; Dekker, Job; Mirny, Leonid A; Bruneau, Benoit G

2017-05-18

The molecular mechanisms underlying folding of mammalian chromosomes remain poorly understood. The transcription factor CTCF is a candidate regulator of chromosomal structure. Using the auxin-inducible degron system in mouse embryonic stem cells, we show that CTCF is absolutely and dose-dependently required for looping between CTCF target sites and insulation of topologically associating domains (TADs). Restoring CTCF reinstates proper architecture on altered chromosomes, indicating a powerful instructive function for CTCF in chromatin folding. CTCF remains essential for TAD organization in non-dividing cells. Surprisingly, active and inactive genome compartments remain properly segregated upon CTCF depletion, revealing that compartmentalization of mammalian chromosomes emerges independently of proper insulation of TADs. Furthermore, our data support that CTCF mediates transcriptional insulator function through enhancer blocking but not as a direct barrier to heterochromatin spreading. Beyond defining the functions of CTCF in chromosome folding, these results provide new fundamental insights into the rules governing mammalian genome organization. Copyright © 2017 Elsevier Inc. All rights reserved.
Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization

PubMed Central

Nora, Elphège P.; Goloborodko, Anton; Valton, Anne-Laure; Gibcus, Johan H.; Uebersohn, Alec; Abdennur, Nezar; Dekker, Job; Mirny, Leonid A.; Bruneau, Benoit G.

2017-01-01

Summary: The molecular mechanisms underlying folding of mammalian chromosomes remain poorly understood. The transcription factor CTCF is a candidate regulator of chromosomal structure. Using the auxin-inducible degron system in mouse embryonic stem cells, we show that CTCF is absolutely and dose-dependently required for looping between CTCF target sites and insulation of topologically associating domains (TADs). Restoring CTCF reinstates proper architecture on altered chromosomes, indicating a powerful instructive function for CTCF in chromatin folding. CTCF remains essential for TAD organization in non-dividing cells. Surprisingly, active and inactive genome compartments remain properly segregated upon CTCF depletion, revealing that compartmentalization of mammalian chromosomes emerges independently of proper insulation of TADs. Further, our data support that CTCF mediates transcriptional insulator function through enhancer-blocking but not as a direct barrier to heterochromatin spreading. Beyond defining the functions of CTCF in chromosome folding, these results provide new fundamental insights into the rules governing mammalian genome organization. PMID:28525758
Structural insights into ligand recognition and selectivity for class A, B, and C GPCRs

PubMed Central

Lee, Sang-Min; Booe, Jason M.; Pioszak, Augen A.

2015-01-01

The G protein-coupled receptor (GPCR) superfamily constitutes the largest collection of cell surface signaling proteins with approximately 800 members in the human genome. GPCRs regulate virtually all aspects of physiology and they are an important class of drug targets with ~30% of drugs on the market targeting a GPCR. Breakthroughs in GPCR structural biology in recent years have significantly expanded our understanding of GPCR structure and function and ushered in a new era of structure-based drug design for GPCRs. Crystal structures for nearly thirty distinct GPCRs are now available including receptors from each of the major classes, A, B, C, and F. These structures provide a foundation for understanding the molecular basis of GPCR pharmacology. Here, we review structural mechanisms of ligand recognition and selectivity of GPCRs with a focus on selected examples from classes A, B, and C, and we highlight major unresolved questions for future structural studies. PMID:25981303
A review of the prevalence, utility, and caveats of using chloroplast simple sequence repeats for studies of plant biology

PubMed Central

Wheeler, Gregory L.; Dorman, Hanna E.; Buchanan, Alenda; Challagundla, Lavanya; Wallace, Lisa E.

2014-01-01

Microsatellites occur in all plant genomes and provide useful markers for studies of genetic diversity and structure. Chloroplast microsatellites (cpSSRs) are frequently targeted because they are more easily isolated than nuclear microsatellites. Here, we quantified the frequency and uses of cpSSRs based on a literature review of over 400 studies published 1995–2013. These markers are an important and economical tool for plant biologists and continue to be used alongside modern genomics approaches to study genetic diversity and structure, evolutionary history, and hybridization in native and agricultural species. Studies using species-specific primers reported a greater number of polymorphic loci than those employing universal primers. A major disadvantage to cpSSRs is fragment size homoplasy; therefore, we documented its occurrence at several cpSSR loci within and between species of Acmispon (Fabaceae). Based on our empirical data set, we recommend targeted sequencing of a subset of samples combined with fragment genotyping as a cost-efficient, data-rich approach to the use of cpSSRs and as a test of homoplasy. The availability of genomic resources for plants aids in the development of primers for new study systems, thereby enhancing the utility of cpSSRs across plant biology. PMID:25506520
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...

2017-07-18

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

PubMed Central

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
Targeted activation of diverse CRISPR-Cas systems for mammalian genome editing via proximal CRISPR targeting.

PubMed

Chen, Fuqiang; Ding, Xiao; Feng, Yongmei; Seebeck, Timothy; Jiang, Yanfang; Davis, Gregory D

2017-04-07

Bacterial CRISPR-Cas systems comprise diverse effector endonucleases with different targeting ranges, specificities and enzymatic properties, but many of them are inactive in mammalian cells and are thus precluded from genome-editing applications. Here we show that the type II-B FnCas9 from Francisella novicida possesses novel properties, but its nuclease function is frequently inhibited at many genomic loci in living human cells. Moreover, we develop a proximal CRISPR (termed proxy-CRISPR) targeting method that restores FnCas9 nuclease activity in a target-specific manner. We further demonstrate that this proxy-CRISPR strategy is applicable to diverse CRISPR-Cas systems, including type II-C Cas9 and type V Cpf1 systems, and can facilitate precise gene editing even between identical genomic sites within the same genome. Our findings provide a novel strategy to enable use of diverse otherwise inactive CRISPR-Cas systems for genome-editing applications and a potential path to modulate the impact of chromatin microenvironments on genome modification.
Targeted activation of diverse CRISPR-Cas systems for mammalian genome editing via proximal CRISPR targeting

PubMed Central

Chen, Fuqiang; Ding, Xiao; Feng, Yongmei; Seebeck, Timothy; Jiang, Yanfang; Davis, Gregory D.

2017-01-01

Bacterial CRISPR–Cas systems comprise diverse effector endonucleases with different targeting ranges, specificities and enzymatic properties, but many of them are inactive in mammalian cells and are thus precluded from genome-editing applications. Here we show that the type II-B FnCas9 from Francisella novicida possesses novel properties, but its nuclease function is frequently inhibited at many genomic loci in living human cells. Moreover, we develop a proximal CRISPR (termed proxy-CRISPR) targeting method that restores FnCas9 nuclease activity in a target-specific manner. We further demonstrate that this proxy-CRISPR strategy is applicable to diverse CRISPR–Cas systems, including type II-C Cas9 and type V Cpf1 systems, and can facilitate precise gene editing even between identical genomic sites within the same genome. Our findings provide a novel strategy to enable use of diverse otherwise inactive CRISPR–Cas systems for genome-editing applications and a potential path to modulate the impact of chromatin microenvironments on genome modification. PMID:28387220
CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites

PubMed Central

Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko

2015-01-01

Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
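The two target requirements stated in this abstract (a ~20 nt target immediately followed by a 5′-NGG PAM, and specificity within the genome) can be illustrated with a minimal Python sketch. This is not CRISPRdirect's implementation: the function names are hypothetical, the "genome" is an ordinary string, and specificity is approximated by counting exact matches on both strands only.

```python
import re

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T/N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_candidates(seq, target_len=20):
    """List (strand, start, target, pam) for every target followed by NGG.

    For the '-' strand, `start` is an offset into the reverse-complemented
    input, not a coordinate on the original sequence.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


def count_exact_sites(target, genome):
    """Count exact occurrences of `target` on both strands of `genome`.

    Assumption: exact matching only; a real off-target search would also
    consider mismatches and partial (seed) matches.
    """
    pattern = re.compile("(?=%s)" % re.escape(target.upper()))
    forward = genome.upper()
    return len(pattern.findall(forward)) + len(pattern.findall(reverse_complement(forward)))


if __name__ == "__main__":
    example = "GATTACAGATTACAGATTAC" + "AGG"  # hypothetical input sequence
    for hit in find_candidates(example):
        print(hit)
```

A candidate whose exact-match count in the genome is greater than one would be a poor choice under the second requirement, since the guide could direct cleavage to more than one locus.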
Read count-based method for high-throughput allelic genotyping of transposable elements and structural variants.

PubMed

Kuhn, Alexandre; Ong, Yao Min; Quake, Stephen R; Burkholder, William F

2015-07-08

Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
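The abstract does not give the calling rule, so the following Python sketch only illustrates the general idea of a read count-based genotype call at one breakpoint. The function name, the minimum depth, and the allele-fraction thresholds are all assumptions chosen for illustration, not values from the paper.

```python
def call_genotype(ins_reads, ref_reads, min_depth=10, low=0.2, high=0.8):
    """Call a genotype from read counts at a single insertion site.

    ins_reads: reads from the amplicon spanning the insertion breakpoint
    ref_reads: reads from the amplicon spanning the empty (reference) site
    min_depth, low, high: hypothetical thresholds, not taken from the source
    """
    depth = ins_reads + ref_reads
    if depth < min_depth:
        return "no-call"  # too few reads to make a confident call
    fraction = ins_reads / depth  # share of reads supporting the insertion
    if fraction < low:
        return "ref/ref"
    if fraction > high:
        return "ins/ins"
    return "ref/ins"


if __name__ == "__main__":
    # Hypothetical read counts for three samples at one site.
    for counts in [(0, 120), (55, 61), (98, 2)]:
        print(counts, call_genotype(*counts))
```

In practice the thresholds would need to be calibrated against samples of known genotype, because the two amplicons may not amplify with equal efficiency.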
Psoralen interstrand cross-link repair is specifically altered by an adjacent triple-stranded structure

PubMed Central

Guillonneau, F.; Guieysse, A. L.; Nocentini, S.; Giovannangeli, C.; Praseuth, D.

2004-01-01

Targeting DNA-damaging agents to specific DNA sites by using sequence-specific DNA ligands has been successful in directing genomic modifications. The understanding of repair processing of such targeted damage and the influence of the adjacent complex is largely unknown. In this way, directed interstrand cross-links (ICLs) have already been generated by psoralen targeting. The mechanisms responsible for ICL removal are far from being understood in mammalian cells, with the proposed involvement of both mutagenic and recombinogenic pathways. Here, a unique ICL was introduced at a selected site by photoactivation of a psoralen moiety with the use of psoralen conjugates of triplex-forming oligonucleotides. The processing of psoralen ICL was evaluated in vitro and in cells for two types of cross-linked substrates, either containing a psoralen ICL alone or with an adjacent triple-stranded structure. We show that the presence of a neighbouring triplex structure interferes with different stages of psoralen ICL processing: (i) the ICL-induced DNA repair synthesis in HeLa cell extracts is inhibited by the triplex structure, as measured by the efficiency of ‘true’ and futile repair synthesis, stopping at the ICL site; (ii) in HeLa cells, the ICL removal via a nucleotide excision repair (NER) pathway is delayed in the presence of a neighbouring triplex; and (iii) the binding to ICL of recombinant xeroderma pigmentosum A protein, which is involved in pre-incision recruitment of NER factors is impaired by the presence of the third DNA strand. These data characterize triplex-induced modulation of ICL repair pathways at specific steps, which might have implications for the controlled induction of targeted genomic modifications and for the associated cellular responses. PMID:14966263
RNAiFOLD: a constraint programming algorithm for RNA inverse folding and molecular design.

PubMed

Garcia-Martin, Juan Antonio; Clote, Peter; Dotu, Ivan

2013-04-01

Synthetic biology is a rapidly emerging discipline with long-term ramifications that range from single-molecule detection within cells to the creation of synthetic genomes and novel life forms. Truly phenomenal results have been obtained by pioneering groups--for instance, the combinatorial synthesis of genetic networks, genome synthesis using BioBricks, and hybridization chain reaction (HCR), in which stable DNA monomers assemble only upon exposure to a target DNA fragment, biomolecular self-assembly pathways, etc. Such work strongly suggests that nanotechnology and synthetic biology together seem poised to constitute the most transformative development of the 21st century. In this paper, we present a Constraint Programming (CP) approach to solve the RNA inverse folding problem. Given a target RNA secondary structure, we determine an RNA sequence which folds into the target structure; i.e. whose minimum free energy structure is the target structure. Our approach represents a step forward in RNA design--we produce the first complete RNA inverse folding approach which allows for the specification of a wide range of design constraints. We also introduce a Large Neighborhood Search approach which allows us to tackle larger instances at the cost of losing completeness, while retaining the advantages of meeting design constraints (motif, GC-content, etc.). Results demonstrate that our software, RNAiFold, performs as well or better than all state-of-the-art approaches; nevertheless, our approach is unique in terms of completeness, flexibility, and the support of various design constraints. The algorithms presented in this paper are publicly available via the interactive webserver http://bioinformatics.bc.edu/clotelab/RNAiFold; additionally, the source code can be downloaded from that site.
Natural Allelic Diversity, Genetic Structure and Linkage Disequilibrium Pattern in Wild Chickpea

PubMed Central

Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

2014-01-01

Characterization of natural allelic diversity and understanding the genetic structure and linkage disequilibrium (LD) pattern in wild germplasm accessions by large-scale genotyping of informative microsatellite and single nucleotide polymorphism (SNP) markers is requisite to facilitate chickpea genetic improvement. Large-scale validation and high-throughput genotyping of genome-wide physically mapped 478 genic and genomic microsatellite markers and 380 transcription factor gene-derived SNP markers using gel-based assay, fluorescent dye-labelled automated fragment analyser and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass array have been performed. Outcome revealed their high genotyping success rate (97.5%) and existence of a high level of natural allelic diversity among 94 wild and cultivated Cicer accessions. High intra- and inter-specific polymorphic potential and wider molecular diversity (11–94%) along with a broader genetic base (13–78%) specifically in the functional genic regions of wild accessions was assayed by mapped markers. It suggested their utility in monitoring introgression and transferring target trait-specific genomic (gene) regions from wild to cultivated gene pool for the genetic enhancement. Distinct species/gene pool-wise differentiation, admixed domestication pattern, and differential genome-wide recombination and LD estimates/decay observed in a six structured population of wild and cultivated accessions using mapped markers further signifies their usefulness in chickpea genetics, genomics and breeding. PMID:25222488
A genomic scan for selection reveals candidates for genes involved in the evolution of cultivated sunflower (Helianthus annuus).

PubMed

Chapman, Mark A; Pashley, Catherine H; Wenzler, Jessica; Hvala, John; Tang, Shunxue; Knapp, Steven J; Burke, John M

2008-11-01

Genomic scans for selection are a useful tool for identifying genes underlying phenotypic transitions. In this article, we describe the results of a genome scan designed to identify candidates for genes targeted by selection during the evolution of cultivated sunflower. This work involved screening 492 loci derived from ESTs on a large panel of wild, primitive (i.e., landrace), and improved sunflower (Helianthus annuus) lines. This sampling strategy allowed us to identify candidates for selectively important genes and investigate the likely timing of selection. Thirty-six genes showed evidence of selection during either domestication or improvement based on multiple criteria, and a sequence-based test of selection on a subset of these loci confirmed this result. In view of what is known about the structure of linkage disequilibrium across the sunflower genome, these genes are themselves likely to have been targeted by selection, rather than being merely linked to the actual targets. While the selection candidates showed a broad range of putative functions, they were enriched for genes involved in amino acid synthesis and protein catabolism. Given that a similar pattern has been detected in maize (Zea mays), this finding suggests that selection on amino acid composition may be a general feature of the evolution of crop plants. In terms of genomic locations, the selection candidates were significantly clustered near quantitative trait loci (QTL) that contribute to phenotypic differences between wild and cultivated sunflower, and specific instances of QTL colocalization provide some clues as to the roles that these genes may have played during sunflower evolution.
A Polyamide Inhibits Replication of Vesicular Stomatitis Virus by Targeting RNA in the Nucleocapsid

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gumpper, Ryan H.; Li, Weike; Castañeda, Carlos H.

Polyamides have been shown to bind double-stranded DNA by complementing the curvature of the minor groove and forming various hydrogen bonds with DNA. Several polyamide molecules have been found to have potent antiviral activities against papillomavirus, a double-stranded DNA virus. By analogy, we reason that polyamides may also interact with the structured RNA bound in the nucleocapsid of a negative-strand RNA virus. Vesicular stomatitis virus (VSV) was selected as a prototype virus to test this possibility since its genomic RNA encapsidated in the nucleocapsid forms a structure resembling one strand of an A-form RNA duplex. One polyamide molecule, UMSL1011, was found to inhibit infection of VSV. To confirm that the polyamide targeted the nucleocapsid, a nucleocapsid-like particle (NLP) was incubated with UMSL1011. The encapsidated RNA in the polyamide-treated NLP was protected from thermo-release and digestion by RNase A. UMSL1011 also inhibits viral RNA synthesis in the intracellular activity assay for the viral RNA-dependent RNA polymerase. The crystal structure revealed that UMSL1011 binds the structured RNA in the nucleocapsid. The conclusion of our studies is that the RNA in the nucleocapsid is a viable antiviral target of polyamides. Since the RNA structure in the nucleocapsid is similar in all negative-strand RNA viruses, polyamides may be optimized to target the specific RNA genome of a negative-strand RNA virus, such as respiratory syncytial virus and Ebola virus. IMPORTANCE: Negative-strand RNA viruses (NSVs) include several life-threatening pathogens, such as rabies virus, respiratory syncytial virus, and Ebola virus. There are no effective antiviral drugs against these viruses. Polyamides offer an exceptional opportunity because they may be optimized to target each NSV. Our studies on vesicular stomatitis virus, an NSV, demonstrated that a polyamide molecule could specifically target the viral RNA in the nucleocapsid and inhibit viral growth. The target specificity of the polyamide molecule was proved by its inhibition of thermo-release and RNA nuclease digestion of the RNA bound in a model nucleocapsid, and a crystal structure of the polyamide inside the nucleocapsid. This encouraging observation provided the proof-of-concept rationale for designing polyamides as antiviral drugs against NSVs.
A Polyamide Inhibits Replication of Vesicular Stomatitis Virus by Targeting RNA in the Nucleocapsid.

PubMed

Gumpper, Ryan H; Li, Weike; Castañeda, Carlos H; Scuderi, M José; Bashkin, James K; Luo, Ming

2018-04-15

Polyamides have been shown to bind double-stranded DNA by complementing the curvature of the minor groove and forming various hydrogen bonds with DNA. Several polyamide molecules have been found to have potent antiviral activities against papillomavirus, a double-stranded DNA virus. By analogy, we reason that polyamides may also interact with the structured RNA bound in the nucleocapsid of a negative-strand RNA virus. Vesicular stomatitis virus (VSV) was selected as a prototype virus to test this possibility since its genomic RNA encapsidated in the nucleocapsid forms a structure resembling one strand of an A-form RNA duplex. One polyamide molecule, UMSL1011, was found to inhibit infection of VSV. To confirm that the polyamide targeted the nucleocapsid, a nucleocapsid-like particle (NLP) was incubated with UMSL1011. The encapsidated RNA in the polyamide-treated NLP was protected from thermo-release and digestion by RNase A. UMSL1011 also inhibits viral RNA synthesis in the intracellular activity assay for the viral RNA-dependent RNA polymerase. The crystal structure revealed that UMSL1011 binds the structured RNA in the nucleocapsid. The conclusion of our studies is that the RNA in the nucleocapsid is a viable antiviral target of polyamides. Since the RNA structure in the nucleocapsid is similar in all negative-strand RNA viruses, polyamides may be optimized to target the specific RNA genome of a negative-strand RNA virus, such as respiratory syncytial virus and Ebola virus. IMPORTANCE: Negative-strand RNA viruses (NSVs) include several life-threatening pathogens, such as rabies virus, respiratory syncytial virus, and Ebola virus. There are no effective antiviral drugs against these viruses. Polyamides offer an exceptional opportunity because they may be optimized to target each NSV. Our studies on vesicular stomatitis virus, an NSV, demonstrated that a polyamide molecule could specifically target the viral RNA in the nucleocapsid and inhibit viral growth. The target specificity of the polyamide molecule was proved by its inhibition of thermo-release and RNA nuclease digestion of the RNA bound in a model nucleocapsid, and a crystal structure of the polyamide inside the nucleocapsid. This encouraging observation provided the proof-of-concept rationale for designing polyamides as antiviral drugs against NSVs. Copyright © 2018 American Society for Microbiology.

From Bioengineering to CRISPR/Cas9 – A Personal Retrospective of 20 Years of Research in Programmable Genome Targeting

PubMed Central

Jeltsch, Albert

2018-01-01

Genome targeting of restriction enzymes and DNA methyltransferases has many important applications including genome and epigenome editing. 15–20 years ago, my group was involved in the development of approaches for programmable genome targeting, aiming to connect enzymes with an oligodeoxynucleotide (ODN), which could form a sequence-specific triple helix at the genomic target site. Importantly, the target site of such an enzyme-ODN conjugate could be varied simply by altering the ODN sequence, promising great practical value. However, this approach faced many problems including the preparation and purification of the enzyme-ODN conjugates, their efficient delivery into cells, slow kinetics of triple helix formation and the requirement of a poly-purine target site sequence. Hence, for several years genome and epigenome editing approaches were mainly based on zinc fingers and TAL proteins as targeting devices. More recently, CRISPR/Cas systems were discovered, which use a bound RNA for genome targeting that forms an RNA/DNA duplex with one DNA strand of the target site. These systems combine all potential advantages of the once imagined enzyme-ODN conjugates and avoid all main disadvantages. Consequently, the application of CRISPR/Cas in genome and epigenome editing has exploded in recent years. We can draw two important conclusions from this example of research history. First, evolution still is the better bioengineer than humans and, whenever tested in parallel, natural solutions outcompete engineered ones. Second, CRISPR/Cas systems were discovered in pure, curiosity-driven, basic research, highlighting that it is basic, bottom-up research that paves the way for fundamental innovation. PMID:29434619
Quantifying on- and off-target genome editing.

PubMed

Hendel, Ayal; Fine, Eli J; Bao, Gang; Porteus, Matthew H

2015-02-01

Genome editing with engineered nucleases is a rapidly growing field thanks to transformative technologies that allow researchers to precisely alter genomes for numerous applications including basic research, biotechnology, and human gene therapy. While the ability to make precise and controlled changes at specified sites throughout the genome has grown tremendously in recent years, we still lack a comprehensive and standardized battery of assays for measuring the different genome editing outcomes created at endogenous genomic loci. Here we review the existing assays for quantifying on- and off-target genome editing and describe their utility in advancing the technology. We also highlight unmet assay needs for quantifying on- and off-target genome editing outcomes and discuss their importance for the genome editing field. Copyright © 2014 Elsevier Ltd. All rights reserved.
Structure of a Trypanosoma Brucei Alpha/Beta-Hydrolase Fold Protein With Unknown Function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Merritt, E.A.; Holmes, M.; Buckner, F.S.

2009-05-26

The structure of a structural genomics target protein, Tbru020260AAA from Trypanosoma brucei, has been determined to a resolution of 2.2 Å using multiple-wavelength anomalous diffraction at the Se K edge. This protein belongs to Pfam sequence family PF08538 and is only distantly related to previously studied members of the α/β-hydrolase fold family. Structural superposition onto representative α/β-hydrolase fold proteins of known function indicates that a possible catalytic nucleophile, Ser116 in the T. brucei protein, lies at the expected location. However, the present structure and by extension the other trypanosomatid members of this sequence family have neither sequence nor structural similarity at the location of other active-site residues typical for proteins with this fold. Together with the presence of an additional domain between strands β6 and β7 that is conserved in trypanosomatid genomes, this suggests that the function of these homologs has diverged from other members of the fold family.
BAC sequencing using pooled methods.

PubMed

Saski, Christopher A; Feltus, F Alex; Parida, Laxmi; Haiminen, Niina

2015-01-01

Shotgun sequencing and assembly of a large, complex genome can be expensive, and accurately reconstructing the true genome sequence is challenging. Repetitive DNA arrays, paralogous sequences, polyploidy, and heterozygosity are the main factors that plague de novo genome sequencing projects, which typically yield highly fragmented assemblies from which biological meaning is difficult to extract. Targeted, sub-genomic sequencing offers complexity reduction by removing distal segments of the genome and a systematic mechanism for exploring prioritized genomic content through BAC sequencing. If one isolates and sequences the genome fraction that encodes the relevant biological information, then it is possible to reduce overall sequencing costs and efforts that target a genomic segment. This chapter describes the sub-genome assembly protocol for an organism based upon a BAC tiling path derived from a genome-scale physical map or from fine mapping using BACs to target sub-genomic regions. Methods that are described include BAC isolation and mapping, DNA sequencing, and sequence assembly.
Novel proteases from the genome of the carnivorous plant Drosera capensis: structural prediction and comparative analysis

PubMed Central

Butts, Carter T.; Bierma, Jan C.; Martin, Rachel W.

2016-01-01

In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a “ferment” similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. PMID:27353064
ZifBASE: a database of zinc finger proteins and associated resources.

PubMed

Jayakanthan, Mannu; Muthukumaran, Jayaraman; Chandrasekar, Sanniyasi; Chawla, Konika; Punetha, Ankita; Sundar, Durai

2009-09-09

Information on the occurrence of zinc finger protein motifs in genomes is crucial to the developing field of molecular genome engineering. The knowledge of their target DNA-binding sequences is vital to develop chimeric proteins for targeted genome engineering and site-specific gene correction. There is a need to develop a computational resource of zinc finger proteins (ZFP) to identify the potential binding sites and its location, which reduce the time of in vivo task, and overcome the difficulties in selecting the specific type of zinc finger protein and the target site in the DNA sequence. ZifBASE provides an extensive collection of various natural and engineered ZFP. It uses standard names and a genetic and structural classification scheme to present data retrieved from UniProtKB, GenBank, Protein Data Bank, ModBase, Protein Model Portal and the literature. It also incorporates specialized features of ZFP including finger sequences and positions, number of fingers, physiochemical properties, classes, framework, PubMed citations with links to experimental structures (PDB, if available) and modeled structures of natural zinc finger proteins. ZifBASE provides information on zinc finger proteins (both natural and engineered ones), the number of finger units in each of the zinc finger proteins (with multiple fingers), the synergy between the adjacent fingers and their positions. Additionally, it gives the individual finger sequence and their target DNA site to which it binds for better and clear understanding on the interactions of adjacent fingers. The current version of ZifBASE contains 139 entries of which 89 are engineered ZFPs, containing 3-7F totaling to 296 fingers. There are 50 natural zinc finger protein entries ranging from 2-13F, totaling to 307 fingers. It has sequences and structures from literature, Protein Data Bank, ModBase and Protein Model Portal. 
The interface is cross linked to other public databases like UniprotKB, PDB, ModBase and Protein Model Portal and PubMed for making it more informative. A database is established to maintain the information of the sequence features, including the class, framework, number of fingers, residues, position, recognition site and physio-chemical properties (molecular weight, isoelectric point) of both natural and engineered zinc finger proteins and dissociation constant of few. ZifBASE can provide more effective and efficient way of accessing the zinc finger protein sequences and their target binding sites with the links to their three-dimensional structures. All the data and functions are available at the advanced web-based search interface http://web.iitd.ac.in/~sundar/zifbase.
Whole genome analysis of CRISPR Cas9 sgRNA off-target homologies via an efficient computational algorithm.

PubMed

Zhou, Hong; Zhou, Michael; Li, Daisy; Manthey, Joseph; Lioutikova, Ekaterina; Wang, Hong; Zeng, Xiao

2017-11-17

The beauty and power of the genome editing mechanism, the CRISPR Cas9 endonuclease system, lies in the fact that it is RNA-programmable such that Cas9 can be guided to any genomic locus complementary to a 20-nt single guide RNA (sgRNA) to cleave double-stranded DNA, allowing the introduction of wanted mutations. Unfortunately, it has been reported repeatedly that the sgRNA can also guide Cas9 to off-target sites where the DNA sequence is homologous to the sgRNA. Using the human genome and Streptococcus pyogenes Cas9 (SpCas9) as an example, this article mathematically analyzed the probabilities of off-target homologies of sgRNAs and discovered that for a large genome such as the human genome, potential off-target homologies are inevitable in sgRNA selection. A highly efficient computational algorithm was developed for whole genome sgRNA design and off-target homology searches. By means of a dynamically constructed sequence-indexed database and a simplified sequence alignment method, this algorithm achieves very high efficiency while guaranteeing the identification of all existing potential off-target homologies. Via this algorithm, 1,876,775 sgRNAs were designed for the 19,153 human mRNA genes and only two sgRNAs were found to be free of off-target homology. By means of the novel and efficient sgRNA homology search algorithm introduced in this article, genome-wide sgRNA design and off-target analysis were conducted, and the results confirmed the mathematical analysis that for an sgRNA sequence, it is almost impossible to escape potential off-target homologies. Future innovations on the CRISPR Cas9 gene editing technology need to focus on how to eliminate the Cas9 off-target activity.
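As a rough illustration of why off-target homologies become hard to avoid in a large genome, the sketch below estimates the expected number of near-matches to a 20-nt guide under a deliberately simplified model. It is not the algorithm or the mathematical analysis from the article. It assumes independent, uniformly distributed bases, an NGG PAM occurring with probability 1/16, and a search of both strands; the function names and the genome size used in the example are illustrative assumptions.

```python
from math import comb


def p_within_mismatches(length: int, max_mm: int) -> float:
    """Probability that a random site of `length` bases differs from a
    fixed guide at no more than `max_mm` positions, assuming each base is
    independent and uniform over A, C, G, T (a simplifying assumption)."""
    return sum(
        comb(length, i) * (0.75 ** i) * (0.25 ** (length - i))
        for i in range(max_mm + 1)
    )


def expected_sites(genome_size: float, length: int = 20,
                   max_mm: int = 3, pam_prob: float = 1 / 16) -> float:
    """Expected number of PAM-adjacent sites within `max_mm` mismatches of
    one guide. The factor 2 accounts for both strands; pam_prob = 1/16
    corresponds to an NGG motif under the uniform-base assumption."""
    return 2 * genome_size * pam_prob * p_within_mismatches(length, max_mm)


if __name__ == "__main__":
    # Hypothetical genome size of 3.1e9 bp, chosen only for illustration.
    for mm in range(6):
        print(mm, expected_sites(3.1e9, max_mm=mm))
```

The expectation grows quickly with the number of tolerated mismatches, which is the qualitative point of the abstract; real genomes are far from uniform, so the absolute numbers from this model should not be read as predictions.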
The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species

PubMed Central

Park, Inkyu; Kim, Wook-jin; Yang, Sungyu; Yeo, Sang-Min; Li, Hulin

2017-01-01

Aconitum species (belonging to the Ranunculaceae) are well known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined using cp genomes of other families in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC–trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species. PMID:28863163
The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species.

PubMed

Park, Inkyu; Kim, Wook-Jin; Yang, Sungyu; Yeo, Sang-Min; Li, Hulin; Moon, Byeong Cheol

2017-01-01

Aconitum species (belonging to the Ranunculaceae) are well known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined using cp genomes of other families in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC-trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species.
A community effort towards a knowledge-base and mathematical model of the human pathogen Salmonella Typhimurium LT2

USDA-ARS's Scientific Manuscript database

Metabolic reconstructions (MRs) are common denominators in systems biology and represent biochemical, genetic, and genomic (BiGG) knowledge-bases for target organisms by capturing currently available information in a consistent, structured manner. Salmonella enterica subspecies I serovar Typhimurium...
CRISPR-Cas9 Targeting of PCSK9 in Human Hepatocytes In Vivo-Brief Report.

PubMed

Wang, Xiao; Raghavan, Avanthi; Chen, Tao; Qiao, Lyon; Zhang, Yongxian; Ding, Qiurong; Musunuru, Kiran

2016-05-01

Although early proof-of-concept studies of somatic in vivo genome editing of the mouse ortholog of proprotein convertase subtilisin/kexin type 9 (Pcsk9) in mice have established its therapeutic potential for the prevention of cardiovascular disease, the unique nature of genome-editing technology-permanent alteration of genomic DNA sequences-mandates that it be tested in vivo against human genes in normal human cells with human genomes to give reliable preclinical insights into the efficacy (on-target mutagenesis) and safety (lack of off-target mutagenesis) of genome-editing therapy before it can be used in patients. We used a clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) 9 genome-editing system to target the human PCSK9 gene in chimeric liver-humanized mice bearing human hepatocytes. We demonstrated high on-target mutagenesis (approaching 50%), greatly reduced blood levels of human PCSK9 protein, and minimal off-target mutagenesis. This work yields important information on the efficacy and safety of CRISPR-Cas9 therapy targeting the human PCSK9 gene in human hepatocytes in vivo, and it establishes humanized mice as a useful platform for the preclinical assessment of applications of somatic in vivo genome editing. © 2016 American Heart Association, Inc.
A DEK Domain-Containing Protein Modulates Chromatin Structure and Function in Arabidopsis

PubMed Central

Waidmann, Sascha; Kusenda, Branislav; Mayerhofer, Juliane; Mechtler, Karl; Jonak, Claudia

2014-01-01

Chromatin is a major determinant in the regulation of virtually all DNA-dependent processes. Chromatin architectural proteins interact with nucleosomes to modulate chromatin accessibility and higher-order chromatin structure. The evolutionarily conserved DEK domain-containing protein is implicated in important chromatin-related processes in animals, but little is known about its DNA targets and protein interaction partners. In plants, the role of DEK has remained elusive. In this work, we identified DEK3 as a chromatin-associated protein in Arabidopsis thaliana. DEK3 specifically binds histones H3 and H4. Purification of other proteins associated with nuclear DEK3 also established DNA topoisomerase 1α and proteins of the cohesion complex as in vivo interaction partners. Genome-wide mapping of DEK3 binding sites by chromatin immunoprecipitation followed by deep sequencing revealed enrichment of DEK3 at protein-coding genes throughout the genome. Using DEK3 knockout and overexpressor lines, we show that DEK3 affects nucleosome occupancy and chromatin accessibility and modulates the expression of DEK3 target genes. Furthermore, functional levels of DEK3 are crucial for stress tolerance. Overall, data indicate that DEK3 contributes to modulation of Arabidopsis chromatin structure and function. PMID:25387881
Inference of Expanded Lrp-Like Feast/Famine Transcription Factor Targets in a Non-Model Organism Using Protein Structure-Based Prediction

PubMed Central

Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.

2014-01-01

Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272
Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction.

PubMed

Ashworth, Justin; Plaisier, Christopher L; Lo, Fang Yin; Reiss, David J; Baliga, Nitin S

2014-01-01

Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.
Genome Editing with CRISPR-Cas9: Can It Get Any Better?

PubMed

Haeussler, Maximilian; Concordet, Jean-Paul

2016-05-20

The CRISPR-Cas revolution is taking place in virtually all fields of life sciences. Harnessing DNA cleavage with the CRISPR-Cas9 system of Streptococcus pyogenes has proven to be extraordinarily simple and efficient, relying only on the design of a synthetic single guide RNA (sgRNA) and its co-expression with Cas9. Here, we review the progress in the design of sgRNA from the original dual RNA guide for S. pyogenes and Staphylococcus aureus Cas9 (SpCas9 and SaCas9). New assays for genome-wide identification of off-targets have provided important insights into the issue of cleavage specificity in vivo. At the same time, the on-target activity of thousands of guides has been determined. These data have led to numerous online tools that facilitate the selection of guide RNAs in target sequences. It appears that for most basic research applications, cleavage activity can be maximized and off-targets minimized by carefully choosing guide RNAs based on computational predictions. Moreover, recent studies of Cas proteins have further improved the flexibility and precision of the CRISPR-Cas toolkit for genome editing. Inspired by the crystal structure of the complex of sgRNA-SpCas9 bound to target DNA, several variants of SpCas9 have recently been engineered, either with novel protospacer adjacent motifs (PAMs) or with drastically reduced off-targets. Novel Cas9 and Cas9-like proteins called Cpf1 have also been characterized from other bacteria and will benefit from the insights obtained from SpCas9. Genome editing with CRISPR-Cas9 may also progress with better understanding and control of cellular DNA repair pathways activated after Cas9-induced DNA cleavage. Copyright © 2016 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
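To make the notion of choosing guide RNAs in a target sequence concrete, here is a minimal sketch that lists candidate SpCas9 sites: 20-nt protospacers immediately followed by an NGG PAM, on either strand. It does not reproduce any of the online tools the review mentions and does no activity or off-target scoring; the function names are hypothetical, and input is assumed to contain only A, C, G, and T.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Zero-width lookahead so that overlapping candidate sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_guides(seq: str):
    """Return (strand, start, protospacer) tuples for every 20-nt
    protospacer followed by NGG. For the '-' strand, `start` is an index
    into the reverse-complemented sequence, not the original one."""
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in _SITE.finditer(s):
            hits.append((strand, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Toy input: twenty A's followed by a TGG PAM.
    for hit in find_candidate_guides("A" * 20 + "TGG"):
        print(hit)
```

A practical pipeline would follow this enumeration step with on-target activity prediction and a genome-wide off-target search, which are the parts the review discusses.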
Systems genetics for drug target discovery

PubMed Central

Penrod, Nadia M.; Cowper-Sal·lari, Richard; Moore, Jason H.

2011-01-01

The collection and analysis of genomic data has the potential to reveal novel druggable targets by providing insight into the genetic basis of disease. However, the number of drugs, targeting new molecular entities, approved by the US Food and Drug Administration (FDA) has not increased in the years since the collection of genomic data has become commonplace. The paucity of translatable results can be partly attributed to conventional analysis methods that test one gene at a time in an effort to identify disease-associated factors as candidate drug targets. By disengaging genetic factors from their position within the genetic regulatory system, much of the information stored within the genomic data set is lost. Here we discuss how genomic data is used to identify disease-associated genes or genomic regions, how disease-associated regions are validated as functional targets, and the role network analysis can play in bridging the gap between data generation and effective drug target identification. PMID:21862141
DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo

PubMed Central

Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M.; Weissman, Jonathan S.; Rouskin, Silvi

2017-01-01

Coupling structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structural studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduces biases and necessitates population-average assessments of RNA structure. Here we present dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase (TGIRT). DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in non-canonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs to their mature isoforms. These applications illustrate DMS-MaPseq’s capacity to dramatically expand in vivo analysis of RNA structure. PMID:27819661
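Because mutational profiling records DMS modifications as mismatches in the sequenced reads, the basic readout can be thought of as a per-position mismatch rate. The sketch below shows only that counting step under strong simplifying assumptions: reads are already aligned to the reference without insertions or deletions, every read spans the full reference, and '-' or 'N' marks an uncovered base. The function name and input layout are hypothetical, and this is not the authors' analysis pipeline.

```python
def mismatch_rates(reference: str, reads):
    """Per-position fraction of covering reads whose base differs from the
    reference. Positions with no coverage are reported as None."""
    mismatches = [0] * len(reference)
    coverage = [0] * len(reference)
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be pre-aligned to the full reference")
        for i, (ref_base, read_base) in enumerate(zip(reference, read)):
            if read_base in "-N":  # treated as no coverage at this position
                continue
            coverage[i] += 1
            if read_base != ref_base:
                mismatches[i] += 1
    return [m / c if c else None for m, c in zip(mismatches, coverage)]


if __name__ == "__main__":
    # Toy reads: one mismatch at position 0, one uncovered base at position 1.
    print(mismatch_rates("ACGT", ["ACGT", "TCGT", "A-GT"]))
```

In a real experiment the rates at A and C positions would be compared against an untreated control before being interpreted as structural signal.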
Deep sequencing of foot-and-mouth disease virus reveals RNA sequences involved in genome packaging.

PubMed

Logan, Grace; Newman, Joseph; Wright, Caroline F; Lasecka-Dykes, Lidia; Haydon, Daniel T; Cottam, Eleanor M; Tuthill, Tobias J

2017-10-18

Non-enveloped viruses protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. Packaging and capsid assembly in RNA viruses can involve interactions between capsid proteins and secondary structures in the viral genome as exemplified by the RNA bacteriophage MS2 and as proposed for other RNA viruses of plants, animals and human. In the picornavirus family of non-enveloped RNA viruses, the requirements for genome packaging remain poorly understood. Here we show a novel and simple approach to identify predicted RNA secondary structures involved in genome packaging in the picornavirus foot-and-mouth disease virus (FMDV). By interrogating deep sequencing data generated from both packaged and unpackaged populations of RNA we have determined multiple regions of the genome with constrained variation in the packaged population. Predicted secondary structures of these regions revealed stem loops with conservation of structure and a common motif at the loop. Disruption of these features resulted in attenuation of virus growth in cell culture due to a reduction in assembly of mature virions. This study provides evidence for the involvement of predicted RNA structures in picornavirus packaging and offers a readily transferable methodology for identifying packaging requirements in many other viruses. Importance In order to transmit their genetic material to a new host, non-enveloped viruses must protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. For many non-enveloped RNA viruses the requirements for this critical part of the viral life cycle remain poorly understood. We have identified RNA sequences involved in genome packaging of the picornavirus foot-and-mouth disease virus. This virus causes an economically devastating disease of livestock affecting both the developed and developing world. 
The experimental methods developed to carry out this work are novel, simple and transferable to the study of packaging signals in other RNA viruses. Improved understanding of RNA packaging may lead to novel vaccine approaches or targets for antiviral drugs with broad spectrum activity. Copyright © 2017 Logan et al.
Fruitful research: drug target discovery for neurodegenerative diseases in Drosophila.

PubMed

Konsolaki, Mary

2013-12-01

Although vertebrate model systems have obvious advantages in the study of human disease, invertebrate organisms have contributed enormously to this field as well. The conservation of genome structure and physiology among organisms poses unexpected peculiarities, and the redundancy in certain gene families or the presence of polymorphisms that can slightly alter gene expression can, in certain instances, bring invertebrate systems, such as Drosophila, closer to humans than mice and vice versa. This necessitates the analysis of disease pathways in multiple model organisms. The author highlights findings from Drosophila models of neurodegenerative diseases that have occurred in the past few years. She also highlights and discusses various molecular, genetic and genomic tools used in flies, as well as methods for generating disease models. Finally, the author describes Drosophila models of Alzheimer's, Parkinson's, tri-nucleotide repeat diseases, and Fragile X syndrome and summarizes insights in disease mechanisms that have been discovered directly in fly models. Full genome genetic screens in Drosophila can lead to the rapid identification of drug target candidates that can be subsequently validated in a vertebrate system. In addition, the Drosophila models of neurodegeneration may often show disease phenotypes that are absent in equivalent mouse models. The author believes that the extensive contribution of Drosophila to both new disease drug target discovery and target validation makes them indispensable to drug discovery and development.
Spy: a new group of eukaryotic DNA transposons without target site duplications.

PubMed

Han, Min-Jin; Xu, Hong-En; Zhang, Hua-Hao; Feschotte, Cédric; Zhang, Ze

2014-06-24

Class 2 or DNA transposons populate the genomes of most eukaryotes and like other mobile genetic elements have a profound impact on genome evolution. Most DNA transposons belong to the cut-and-paste types, which are relatively simple elements characterized by terminal-inverted repeats (TIRs) flanking a single gene encoding a transposase. All eukaryotic cut-and-paste transposons so far described are also characterized by target site duplications (TSDs) of host DNA generated upon chromosomal insertion. Here, we report a new group of evolutionarily related DNA transposons called Spy, which also include TIRs and DDE motif-containing transposase but surprisingly do not create TSDs upon insertion. Instead, Spy transposons appear to transpose precisely between 5'-AAA and TTT-3' host nucleotides, without duplication or modification of the AAATTT target sites. Spy transposons were identified in the genomes of diverse invertebrate species based on transposase homology searches and structure-based approaches. Phylogenetic analyses indicate that Spy transposases are distantly related to IS5, ISL2EU, and PIF/Harbinger transposases. However, Spy transposons are distinct from these and other DNA transposon superfamilies by their lack of TSD and their target site preference. Our findings expand the known diversity of DNA transposons and reveal a new group of eukaryotic DDE transposases with unusual catalytic properties. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
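The target-site preference described above, insertion precisely between 5'-AAA and TTT-3', can be illustrated with a short scan for AAATTT motifs. This is a minimal sketch of the motif search only, not the homology-based or structure-based identification used in the study; the function name is hypothetical.

```python
def spy_insertion_points(seq: str):
    """Return 0-based indices at which an element could sit between AAA and
    TTT, i.e. the position of the first T in each AAATTT occurrence."""
    seq = seq.upper()
    motif = "AAATTT"
    points = []
    start = seq.find(motif)
    while start != -1:
        points.append(start + 3)  # boundary between AAA and TTT
        start = seq.find(motif, start + 1)
    return points


if __name__ == "__main__":
    print(spy_insertion_points("GGAAATTTCC"))
```

Since AAATTT is its own reverse complement, scanning one strand is enough to find the motif on both.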

Structure and Distribution of Centromeric Retrotransposons at Diploid and Allotetraploid Coffea Centromeric and Pericentromeric Regions

PubMed Central

de Castro Nunes, Renata; Orozco-Arias, Simon; Crouzillat, Dominique; Mueller, Lukas A.; Strickler, Suzy R.; Descombes, Patrick; Fournier, Coralie; Moine, Deborah; de Kochko, Alexandre; Yuyama, Priscila M.; Vanzela, André L. L.; Guyot, Romain

2018-01-01

Centromeric regions of plants are generally composed of large arrays of satellites from a specific lineage of Gypsy LTR-retrotransposons, called Centromeric Retrotransposons. Repeated sequences interact with a specific H3 histone, playing a crucial function in kinetochore formation. To study the structure and composition of centromeric regions in the genus Coffea, we annotated and classified Centromeric Retrotransposons sequences from the allotetraploid C. arabica genome and its two diploid ancestors: Coffea canephora and C. eugenioides. Ten distinct CRC (Centromeric Retrotransposons in Coffea) families were found. The sequence mapping and FISH experiments of CRC Reverse Transcriptase domains in C. canephora, C. eugenioides, and C. arabica clearly indicate a strong and specific targeting mainly onto proximal chromosome regions, which can be associated also with heterochromatin. PacBio genome sequence analyses of putative centromeric regions on C. arabica and C. canephora chromosomes showed an exceptional density of one family of CRC elements, and the complete absence of satellite arrays, contrasting with the usual structure of plant centromeres. Altogether, our data suggest a specific centromere organization in Coffea, contrasting with other plant genomes. PMID:29497436
A Single Multiplex crRNA Array for FnCpf1-Mediated Human Genome Editing.

PubMed

Sun, Huihui; Li, Fanfan; Liu, Jie; Yang, Fayu; Zeng, Zhenhai; Lv, Xiujuan; Tu, Mengjun; Liu, Yeqing; Ge, Xianglian; Liu, Changbao; Zhao, Junzhao; Zhang, Zongduan; Qu, Jia; Song, Zongming; Gu, Feng

2018-06-15

Cpf1 has been harnessed as a tool for genome manipulation in various species because of its simplicity and high efficiency. Our recent study demonstrated that FnCpf1 could be utilized for human genome editing with notable advantages for target sequence selection due to the flexibility of the protospacer adjacent motif (PAM) sequence. Multiplex genome editing provides a powerful tool for targeting members of multigene families, dissecting gene networks, modeling multigenic disorders in vivo, and applying gene therapy. However, there are no reports at present that show FnCpf1-mediated multiplex genome editing via a single customized CRISPR RNA (crRNA) array. In the present study, we utilize a single customized crRNA array to simultaneously target multiple genes in human cells. In addition, we also demonstrate that a single customized crRNA array to target multiple sites in one gene could be achieved. Collectively, FnCpf1, a powerful genome-editing tool for multiple genomic targets, can be harnessed for effective manipulation of the human genome. Copyright © 2018 The American Society of Gene and Cell Therapy. Published by Elsevier Inc. All rights reserved.
Cas9 versus Cas12a/Cpf1: Structure-function comparisons and implications for genome editing.

PubMed

Swarts, Daan C; Jinek, Martin

2018-05-22

Cas9 and Cas12a are multidomain CRISPR-associated nucleases that can be programmed with a guide RNA to bind and cleave complementary DNA targets. The guide RNA sequence can be varied, making these effector enzymes versatile tools for genome editing and gene regulation applications. While Cas9 is currently the best-characterized and most widely used nuclease for such purposes, Cas12a (previously named Cpf1) has recently emerged as an alternative for Cas9. Cas9 and Cas12a have distinct evolutionary origins and exhibit different structural architectures, resulting in distinct molecular mechanisms. Here we compare the structural and mechanistic features that distinguish Cas9 and Cas12a, and describe how these features modulate their activity. We discuss implications for genome editing, and how they may influence the choice of Cas9 or Cas12a for specific applications. Finally, we review recent studies in which Cas12a has been utilized as a genome editing tool. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications Regulatory RNAs/RNAi/Riboswitches > Biogenesis of Effector Small RNAs RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes. © 2018 Wiley Periodicals, Inc.
Directed evolution approach to a structural genomics project: Rv2002 from Mycobacterium tuberculosis.

PubMed

Yang, Jin Kuk; Park, Min S; Waldo, Geoffrey S; Suh, Se Won

2003-01-21

One of the serious bottlenecks in structural genomics projects is overexpression of the target proteins in soluble form. We have applied the directed evolution technique and prepared soluble mutants of the Mycobacterium tuberculosis Rv2002 gene product, the wild type of which had been expressed as inclusion bodies in Escherichia coli. A triple mutant I6T/V47M/T69K (Rv2002-M3) was chosen for structural and functional characterizations. Enzymatic assays indicate that the Rv2002-M3 protein has a high catalytic activity as a NADH-dependent 3α,20β-hydroxysteroid dehydrogenase. We have determined the crystal structures of a binary complex with NAD(+) and a ternary complex with androsterone and NADH. The structure reveals that Asp-38 determines the cofactor specificity. The catalytic site includes the triad Ser-140/Tyr-153/Lys-157. Additionally, it has an unusual feature, Glu-142. Enzymatic assays of the E142A mutant of Rv2002-M3 indicate that Glu-142 reverses the effect of Lys-157 in influencing the pKa of Tyr-153. This study suggests that the Rv2002 gene product is a unique member of the SDR family and is likely to be involved in steroid metabolism in M. tuberculosis. Our work demonstrates the power of the directed evolution technique as a general way of overcoming the difficulties in overexpressing the target proteins in soluble form.
Revisiting the structure/function relationships of H/ACA(-like) RNAs: a unified model for Euryarchaea and Crenarchaea

PubMed Central

Toffano-Nioche, Claire; Gautheret, Daniel; Leclerc, Fabrice

2015-01-01

A structural and functional classification of H/ACA and H/ACA-like motifs is obtained from the analysis of the H/ACA guide RNAs which have been identified previously in the genomes of Euryarchaea (Pyrococcus) and Crenarchaea (Pyrobaculum). A unified structure/function model is proposed based on the common structural determinants shared by H/ACA and H/ACA-like motifs in both Euryarchaea and Crenarchaea. Using a computational approach, structural and energetic rules for the guide:target RNA-RNA interactions are derived from structural and functional data on the H/ACA RNP particles. H/ACA(-like) motifs found in Pyrococcus are evaluated through the classification and their biological relevance is discussed. Extra-ribosomal targets found in both Pyrococcus and Pyrobaculum might support the hypothesis of a gene regulation mediated by H/ACA(-like) guide RNAs in archaea. PMID:26240384
High-throughput Cloning and Expression of Integral Membrane Proteins in Escherichia coli

PubMed Central

Bruni, Renato

2014-01-01

Recently, several structural genomics centers have been established and a remarkable number of three-dimensional structures of soluble proteins have been solved. For membrane proteins, the number of structures solved has been significantly trailing those for their soluble counterparts, not least because over-expression and purification of membrane proteins is a much more arduous process. By using high throughput technologies, a large number of membrane protein targets can be screened simultaneously and a greater number of expression and purification conditions can be employed, leading to a higher probability of successfully determining the structure of membrane proteins. This unit describes the cloning, expression and screening of membrane proteins using high throughput methodologies developed in our laboratory. Basic Protocol 1 deals with the cloning of inserts into expression vectors by ligation-independent cloning. Basic Protocol 2 describes the expression and purification of the target proteins on a miniscale. Lastly, for the targets that express at the miniscale, Basic Protocols 3 and 4 outline the methods employed for the expression and purification of targets at the midi-scale, as well as a procedure for detergent screening and identification of detergent(s) in which the target protein is stable. PMID:24510647
TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data

PubMed Central

Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M.; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A.; Gilks, C. Blake; Huntsman, David G.; McAlpine, Jessica N.; Aparicio, Samuel

2014-01-01

The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. PMID:25060187
Sequencing Needs for Viral Diagnostics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, S N; Lam, M; Mulakken, N J

2004-01-26

We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
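
The two selection criteria named in this abstract, conservation among target strains and uniqueness relative to other species, can be sketched as set operations on k-mers. This is only an illustrative simplification under stated assumptions: the function names, the exact-match k-mer approach and the toy sequences are hypothetical, and the authors' actual pipeline is not described in enough detail here to reproduce.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All substrings of length k in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_candidates(targets: list[str], near_neighbors: list[str], k: int) -> set[str]:
    """k-mers present in every target genome and absent from every near neighbor."""
    if not targets:
        return set()
    # Conserved: shared by all target strains (guards against false negatives).
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    # Background: anything seen in a near neighbor (guards against false positives).
    background = set().union(*(kmers(n, k) for n in near_neighbors))
    return conserved - background


# Toy sequences, invented for illustration only.
toy_targets = ["ACGTACGGA", "TTACGGACC", "GACGGATAT"]
unique_case = signature_candidates(toy_targets, ["CCACGGTTT"], k=5)
shared_case = signature_candidates(toy_targets, ["TTACGGAAA"], k=5)
```

In the first call the 5-mer ACGGA survives because no near neighbor contains it; in the second call a near neighbor shares it, so no candidate remains. Adding target genomes can only shrink the conserved set, and adding near neighbors can only shrink the unique set, which is the trade-off the simulations in the abstract explore.
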
GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases

PubMed Central

Nguyen, Nhu T.; Liebers, Matthew; Topkar, Ved V.; Thapar, Vishal; Wyvekens, Nicolas; Khayter, Cyd; Iafrate, A. John; Le, Long P.; Aryee, Martin J.; Joung, J. Keith

2014-01-01

CRISPR RNA-guided nucleases (RGNs) are widely used genome-editing reagents, but methods to delineate their genome-wide off-target cleavage activities have been lacking. Here we describe an approach for global detection of DNA double-stranded breaks (DSBs) introduced by RGNs and potentially other nucleases. This method, called Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-Seq), relies on capture of double-stranded oligodeoxynucleotides into breaks. Application of GUIDE-Seq to thirteen RGNs in two human cell lines revealed wide variability in RGN off-target activities and unappreciated characteristics of off-target sequences. The majority of identified sites were not detected by existing computational methods or ChIP-Seq. GUIDE-Seq also identified RGN-independent genomic breakpoint ‘hotspots’. Finally, GUIDE-Seq revealed that truncated guide RNAs exhibit substantially reduced RGN-induced off-target DSBs. Our experiments define the most rigorous framework for genome-wide identification of RGN off-target effects to date and provide a method for evaluating the safety of these nucleases prior to clinical use. PMID:25513782
A Dual-Specific Targeting Approach Based on the Simultaneous Recognition of Duplex and Quadruplex Motifs.

PubMed

Nguyen, Thi Quynh Ngoc; Lim, Kah Wai; Phan, Anh Tuân

2017-09-20

Small-molecule ligands targeting nucleic acids have been explored as potential therapeutic agents. Duplex groove-binding ligands have been shown to recognize DNA in a sequence-specific manner. On the other hand, quadruplex-binding ligands exhibit high selectivity between quadruplex and duplex, but show limited discrimination between different quadruplex structures. Here we propose a dual-specific approach through the simultaneous application of duplex- and quadruplex-binders. We demonstrated that a quadruplex-specific ligand and a duplex-specific ligand can simultaneously interact at two separate binding sites of a quadruplex-duplex hybrid harbouring both quadruplex and duplex structural elements. Such a dual-specific targeting strategy would combine the sequence specificity of duplex-binders and the strong binding affinity of quadruplex-binders, potentially allowing the specific targeting of unique quadruplex structures. Future research can be directed towards the development of conjugated compounds targeting specific genomic quadruplex-duplex sites, for which the linker would be highly context-dependent in terms of length and flexibility, as well as the attachment points onto both ligands.
Whole Genome Analyses of a Well-Differentiated Liposarcoma Reveals Novel SYT1 and DDR2 Rearrangements

PubMed Central

Egan, Jan B.; Barrett, Michael T.; Champion, Mia D.; Middha, Sumit; Lenkiewicz, Elizabeth; Evers, Lisa; Francis, Princy; Schmidt, Jessica; Shi, Chang-Xin; Van Wier, Scott; Badar, Sandra; Ahmann, Gregory; Kortuem, K. Martin; Boczek, Nicole J.; Fonseca, Rafael; Craig, David W.; Carpten, John D.; Borad, Mitesh J.; Stewart, A. Keith

2014-01-01

Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498, which resides within a gene cluster (NAV3, SYT1, PAWR) where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dasatinib, which are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2. PMID:24505276
CSAR-web: a web server of contig scaffolding using algebraic rearrangements.

PubMed

Chen, Kun-Tze; Lu, Chin Lung

2018-05-04

CSAR-web is a web-based tool that allows the users to efficiently and accurately scaffold (i.e. order and orient) the contigs of a target draft genome based on a complete or incomplete reference genome from a related organism. It takes as input a target genome in multi-FASTA format and a reference genome in FASTA or multi-FASTA format, depending on whether the reference genome is complete or incomplete, respectively. In addition, it requires the users to choose either 'NUCmer on nucleotides' or 'PROmer on translated amino acids' for CSAR-web to identify conserved genomic markers (i.e. matched sequence regions) between the target and reference genomes, which are used by the rearrangement-based scaffolding algorithm in CSAR-web to order and orient the contigs of the target genome based on the reference genome. In the output page, CSAR-web displays its scaffolding result in a graphical mode (i.e. scalable dotplot) allowing the users to visually validate the correctness of scaffolded contigs and in a tabular mode allowing the users to view the details of scaffolds. CSAR-web is available online at http://genome.cs.nthu.edu.tw/CSAR-web.
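
To make the idea of ordering and orienting contigs from conserved markers concrete, here is a minimal sketch. It deliberately swaps in a much simpler heuristic (sort contigs by the median reference position of their markers, orient by majority strand) and is not CSAR's algebraic-rearrangement algorithm. The marker tuple layout and all names are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def scaffold(markers):
    """Order and orient contigs from marker matches against a reference.

    markers: iterable of (contig_id, reference_position, strand),
    with strand given as '+' or '-'. This tuple layout is hypothetical.
    """
    positions = defaultdict(list)
    strand_votes = defaultdict(int)
    for contig, ref_pos, strand in markers:
        positions[contig].append(ref_pos)
        strand_votes[contig] += 1 if strand == "+" else -1
    # Order: contigs sorted by the median reference coordinate of their markers.
    ordered = sorted(positions, key=lambda c: median(positions[c]))
    # Orientation: majority vote over marker strands (ties default to '+').
    return [(c, "+" if strand_votes[c] >= 0 else "-") for c in ordered]


# Toy marker matches, invented for illustration.
toy_markers = [
    ("c2", 900, "-"), ("c1", 100, "+"), ("c2", 950, "-"), ("c1", 150, "+"),
    ("c3", 500, "+"), ("c3", 520, "-"), ("c3", 540, "-"),
]
layout = scaffold(toy_markers)
```

In practice the markers would come from an aligner such as NUCmer or PROmer, as the abstract notes; parsing their output is omitted here.
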
CRISPR-based screening of genomic island excision events in bacteria.

PubMed

Selle, Kurt; Klaenhammer, Todd R; Barrangou, Rodolphe

2015-06-30

Genomic analysis of Streptococcus thermophilus revealed that mobile genetic elements (MGEs) likely contributed to gene acquisition and loss during evolutionary adaptation to milk. Clustered regularly interspaced short palindromic repeats-CRISPR-associated genes (CRISPR-Cas), the adaptive immune system in bacteria, limits genetic diversity by targeting MGEs including bacteriophages, transposons, and plasmids. CRISPR-Cas systems are widespread in streptococci, suggesting that the interplay between CRISPR-Cas systems and MGEs is one of the driving forces governing genome homeostasis in this genus. To investigate the genetic outcomes resulting from CRISPR-Cas targeting of integrated MGEs, in silico prediction revealed four genomic islands without essential genes in lengths from 8 to 102 kbp, totaling 7% of the genome. In this study, the endogenous CRISPR3 type II system was programmed to target the four islands independently through plasmid-based expression of engineered CRISPR arrays. Targeting lacZ within the largest 102-kbp genomic island was lethal to wild-type cells and resulted in a reduction of up to 2.5-log in the surviving population. Genotyping of Lac(-) survivors revealed variable deletion events between the flanking insertion-sequence elements, all resulting in elimination of the Lac-encoding island. Chimeric insertion sequence footprints were observed at the deletion junctions after targeting all of the four genomic islands, suggesting a common mechanism of deletion via recombination between flanking insertion sequences. These results established that self-targeting CRISPR-Cas systems may direct significant evolution of bacterial genomes on a population level, influencing genome homeostasis and remodeling.
Packaging signals in two single-stranded RNA viruses imply a conserved assembly mechanism and geometry of the packaged genome.

PubMed

Dykeman, Eric C; Stockley, Peter G; Twarock, Reidun

2013-09-09

The current paradigm for assembly of single-stranded RNA viruses is based on a mechanism involving non-sequence-specific packaging of genomic RNA driven by electrostatic interactions. Recent experiments, however, provide compelling evidence for sequence specificity in this process both in vitro and in vivo. The existence of multiple RNA packaging signals (PSs) within viral genomes has been proposed, which facilitates assembly by binding coat proteins in such a way that they promote the protein-protein contacts needed to build the capsid. The binding energy from these interactions enables the confinement or compaction of the genomic RNAs. Identifying the nature of such PSs is crucial for a full understanding of assembly, which is an as yet untapped potential drug target for this important class of pathogens. Here, for two related bacterial viruses, we determine the sequences and locations of their PSs using Hamiltonian paths, a concept from graph theory, in combination with bioinformatics and structural studies. Their PSs have a common secondary structure motif but distinct consensus sequences and positions within the respective genomes. Despite these differences, the distributions of PSs in both viruses imply defined conformations for the packaged RNA genomes in contact with the protein shell in the capsid, consistent with a recent asymmetric structure determination of the MS2 virion. The PS distributions identified moreover imply a preferred, evolutionarily conserved assembly pathway with respect to the RNA sequence with potentially profound implications for other single-stranded RNA viruses known to have RNA PSs, including many animal and human pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.
Seamless Genome Editing in Rice via Gene Targeting and Precise Marker Elimination.

PubMed

Nishizawa-Yokoi, Ayako; Saika, Hiroaki; Toki, Seiichi

2016-01-01

Positive-negative selection using hygromycin phosphotransferase (hpt) and diphtheria toxin A-fragment (DT-A) as positive and negative selection markers, respectively, allows enrichment of cells harboring target genes modified via gene targeting (GT). We have developed a successful GT system employing positive-negative selection and subsequent precise marker excision via the piggyBac transposon derived from the cabbage looper moth to introduce desired modifications into target genes in the rice genome. This approach could be applied to the precision genome editing of almost all endogenous genes throughout the genome, at least in rice.
[Genome-editing: focus on the off-target effects].

PubMed

He, Xiubin; Gu, Feng

2017-10-25

Breakthroughs in genome editing in recent years have paved the way to develop new therapeutic strategies. These genome-editing tools mainly include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas-based RNA-guided DNA endonucleases. However, off-target effects are still the major issue in genome editing, and limit the application in gene therapy. Here, we summarize the causes of off-target effects and compare different methods for detecting them.
Suppression of HBV replication by the expression of nickase- and nuclease dead-Cas9.

PubMed

Kurihara, Takeshi; Fukuhara, Takasuke; Ono, Chikako; Yamamoto, Satomi; Uemura, Kentaro; Okamoto, Toru; Sugiyama, Masaya; Motooka, Daisuke; Nakamura, Shota; Ikawa, Masato; Mizokami, Masashi; Maehara, Yoshihiko; Matsuura, Yoshiharu

2017-07-21

Complete removal of hepatitis B virus (HBV) DNA from nuclei is difficult with current therapies. Recent reports have shown that a novel genome-editing tool using Cas9 with a single-guide RNA (sgRNA) system can cleave the HBV genome in vitro and in vivo. However, induction of a double-strand break (DSB) on the targeted genome by Cas9 risks undesirable off-target cleavage on the host genome. Nickase-Cas9 cleaves a single strand of DNA, and thereby two sgRNAs are required for inducing DSBs. To avoid Cas9-induced off-target mutagenesis, we examined the effects of the expression of nickase-Cas9 and nuclease dead Cas9 (d-Cas9) with sgRNAs on HBV replication. The expression of nickase-Cas9 with a pair of sgRNAs cleaved the target HBV genome and suppressed the viral-protein expression and HBV replication in vitro. Moreover, nickase-Cas9 with the sgRNA pair cleaved the targeted HBV genome in mouse liver. Interestingly, d-Cas9 expression with the sgRNAs also suppressed HBV replication in vitro without cleaving the HBV genome. These results suggest the possible use of nickase-Cas9 and d-Cas9 with a pair of sgRNAs for eliminating HBV DNA from the livers of chronic hepatitis B patients with low risk of undesirable off-target mutation on the host genome.
Whole Genome Amplification of Labeled Viable Single Cells Suited for Array-Comparative Genomic Hybridization.

PubMed

Kroneis, Thomas; El-Heliebi, Amin

2015-01-01

Understanding details of a complex biological system makes it necessary to dismantle it down to its components. Immunostaining techniques allow identification of several distinct cell types thereby giving an inside view of intercellular heterogeneity. Often staining reveals that the most remarkable cells are the rarest. To further characterize the target cells on a molecular level, single cell techniques are necessary. Here, we describe the immunostaining, micromanipulation, and whole genome amplification of single cells for the purpose of genomic characterization. First, we exemplify the preparation of cell suspensions from cultured cells as well as the isolation of peripheral mononucleated cells from blood. The target cell population is then subjected to immunostaining. After cytocentrifugation target cells are isolated by micromanipulation and forwarded to whole genome amplification. For whole genome amplification, we use GenomePlex® technology allowing downstream genomic analysis such as array-comparative genomic hybridization.
DNA targeting specificity of RNA-guided Cas9 nucleases.

PubMed

Hsu, Patrick D; Scott, David A; Weinstein, Joshua A; Ran, F Ann; Konermann, Silvana; Agarwala, Vineeta; Li, Yinqing; Fine, Eli J; Wu, Xuebing; Shalem, Ophir; Cradick, Thomas J; Marraffini, Luciano A; Bao, Gang; Zhang, Feng

2013-09-01

The Streptococcus pyogenes Cas9 (SpCas9) nuclease can be efficiently targeted to genomic loci by means of single-guide RNAs (sgRNAs) to enable genome editing. Here, we characterize SpCas9 targeting specificity in human cells to inform the selection of target sites and avoid off-target effects. Our study evaluates >700 guide RNA variants and SpCas9-induced indel mutation levels at >100 predicted genomic off-target loci in 293T and 293FT cells. We find that SpCas9 tolerates mismatches between guide RNA and target DNA at different positions in a sequence-dependent manner, sensitive to the number, position and distribution of mismatches. We also show that SpCas9-mediated cleavage is unaffected by DNA methylation and that the dosage of SpCas9 and sgRNA can be titrated to minimize off-target modification. To facilitate mammalian genome engineering applications, we provide a web-based software tool to guide the selection and validation of target sequences as well as off-target analyses.
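
The abstract reports that mismatch tolerance depends on the number, position and distribution of mismatches between guide and target. A minimal sketch of how those three quantities could be tabulated for one guide/site pair follows. It is not the scoring used by the authors' web tool; the function names, the 1-based numbering from the 5' end of the guide, and the toy sequences are assumptions for illustration.

```python
def mismatch_positions(guide: str, site: str) -> list[int]:
    """1-based positions (from the 5' end of the guide) where guide and site differ.

    Both sequences are written in the DNA alphabet and must have equal length.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    pairs = zip(guide.upper(), site.upper())
    return [i + 1 for i, (g, s) in enumerate(pairs) if g != s]


def summarize_mismatches(guide: str, site: str) -> dict:
    """Number, positions and mean spacing of mismatches for one candidate site."""
    pos = mismatch_positions(guide, site)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return {
        "count": len(pos),
        "positions": pos,
        # Mean distance between consecutive mismatches; None if fewer than two.
        "mean_spacing": sum(gaps) / len(gaps) if gaps else None,
    }


# Toy 20-nt guide and candidate off-target site, invented for illustration.
toy_guide = "GACGTTACCGGATCAAGCTT"
toy_site = "GACCTTACCGGATCAAGCTA"
summary = summarize_mismatches(toy_guide, toy_site)
```

A table of such summaries across candidate loci is one simple starting point for ranking sites before any empirical weighting is applied.
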
The evolutionary fate of the chloroplast and nuclear rps16 genes as revealed through the sequencing and comparative analyses of four novel legume chloroplast genomes from Lupinus

PubMed Central

Keller, J.; Rousseau-Gueutin, M.; Martin, G.E.; Morice, J.; Boutte, J.; Coissac, E.; Ourari, M.; Aïnouche, M.; Salmon, A.; Cabello-Hurtado, F.

2017-01-01

The Fabaceae family is considered as a model system for understanding chloroplast genome evolution due to the presence of extensive structural rearrangements, gene losses and localized hypermutable regions. Here, we provide sequences of four chloroplast genomes from the Lupinus genus, belonging to the underinvestigated Genistoid clade. Notably, we found in Lupinus species the functional loss of the essential rps16 gene, which was most likely replaced by the nuclear rps16 gene that encodes chloroplast and mitochondrion targeted RPS16 proteins. To study the evolutionary fate of the rps16 gene, we explored all available plant chloroplast, mitochondrial and nuclear genomes. Whereas no plant mitochondrial genomes carry an rps16 gene, many plants still have a functional nuclear and chloroplast rps16 gene. Ka/Ks ratios revealed that both chloroplast and nuclear rps16 copies were under purifying selection. However, due to the dual targeting of the nuclear rps16 gene product and the absence of a mitochondrial copy, the chloroplast gene may be lost. We also performed comparative analyses of lupine plastomes (SNPs, indels and repeat elements), identified the most variable regions and examined their phylogenetic utility. The markers identified here will help to reveal the evolutionary history of lupines, Genistoids and closely related clades. PMID:28338826

Robust one-Tube Ω-PCR Strategy Accelerates Precise Sequence Modification of Plasmids for Functional Genomics

PubMed Central

Chen, Letian; Wang, Fengpin; Wang, Xiaoyu; Liu, Yao-Guang

2013-01-01

Functional genomics requires vector construction for protein expression and functional characterization of target genes; therefore, a simple, flexible and low-cost molecular manipulation strategy will be highly advantageous for genomics approaches. Here, we describe an Ω-PCR strategy that enables multiple types of sequence modification, including precise insertion, deletion and substitution, in any position of a circular plasmid. Ω-PCR is based on an overlap extension site-directed mutagenesis technique, and is named for its characteristic Ω-shaped secondary structure during PCR. Ω-PCR can be performed either in two steps, or in one tube in combination with exonuclease I treatment. These strategies have wide applications for protein engineering, gene function analysis and in vitro gene splicing. PMID:23335613
The Therapeutic Effect of the Antitumor Drug 11 Beta and Related Molecules on Polycystic Kidney Disease

DTIC Science & Technology

2017-10-01

structure central to the pathogenesis of ADPKD. The team at Yale employed CRISPR-Cas9 genome editing technology to generate two isogenic cell lines...possibilities may have contributed to this result, including a clonal effect perhaps from off target CRISPR/Cas9 activity leading to a rescue...Pkd1 knockout background; this cell line is another control for any off-target genetic in the CRISPR/Cas9 editing procedure. This will establish
Stapled peptide inhibitors of RAB25 target context-specific phenotypes in cancer | Office of Cancer Genomics

Cancer.gov

Recent evidence has established a role for the small GTPase RAB25, as well as related effector proteins, in enacting both pro-oncogenic and anti-oncogenic phenotypes in specific cellular contexts. Here we report the development of all-hydrocarbon stabilized peptides derived from the RAB-binding FIP-family of proteins to target RAB25. Relative to unmodified peptides, optimized stapled peptides exhibit increased structural stability, binding affinity, cell permeability, and inhibition of RAB25:FIP complex formation.
Genome and transcriptome sequencing identifies breeding targets in the orphan crop tef (Eragrostis tef).

PubMed

Cannarozzi, Gina; Plaza-Wüthrich, Sonia; Esfeld, Korinna; Larti, Stéphanie; Wilson, Yi Song; Girma, Dejene; de Castro, Edouard; Chanyalew, Solomon; Blösch, Regula; Farinelli, Laurent; Lyons, Eric; Schneider, Michel; Falquet, Laurent; Kuhlemeier, Cris; Assefa, Kebebew; Tadele, Zerihun

2014-07-09

Tef (Eragrostis tef), an indigenous cereal critical to food security in the Horn of Africa, is rich in minerals and protein, resistant to many biotic and abiotic stresses and safe for diabetics as well as sufferers of immune reactions to wheat gluten. We present the genome of tef, the first species in the grass subfamily Chloridoideae and the first allotetraploid assembled de novo. We sequenced the tef genome for marker-assisted breeding, to shed light on the molecular mechanisms conferring tef's desirable nutritional and agronomic properties, and to make its genome publicly available as a community resource. The draft genome contains 672 Mbp representing 87% of the genome size estimated from flow cytometry. We also sequenced two transcriptomes, one from a normalized RNA library and another from unnormalized RNASeq data. The normalized RNA library revealed around 38000 transcripts that were then annotated by the SwissProt group. The CoGe comparative genomics platform was used to compare the tef genome to other genomes, notably sorghum. Scaffolds comprising approximately half of the genome size were ordered by syntenic alignment to sorghum producing tef pseudo-chromosomes, which were sorted into A and B genomes as well as compared to the genetic map of tef. The draft genome was used to identify novel SSR markers, investigate target genes for abiotic stress resistance studies, and understand the evolution of the prolamin family of proteins that are responsible for the immune response to gluten. It is highly plausible that breeding targets previously identified in other cereal crops will also be valuable breeding targets in tef. The draft genome and transcriptome will be of great use for identifying these targets for genetic improvement of this orphan crop that is vital for feeding 50 million people in the Horn of Africa.
New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

PubMed Central

2011-01-01

Background Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
Conclusions The construction of the first switchgrass BAC library and comparative analysis of homoeologous genomic regions harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy. PMID:21767393
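The roughly 10-fold coverage quoted in this abstract can be checked from the figures it reports. The following is a minimal sketch of that arithmetic, assuming coverage is simply the total cloned DNA (clones with inserts times mean insert size) divided by the effective genome size; the function name and parameter names are illustrative and do not come from the paper.

```python
def library_coverage(n_clones, insert_fraction, mean_insert_bp, genome_bp):
    """Fold coverage = total cloned DNA / genome size (a simplifying assumption)."""
    return n_clones * insert_fraction * mean_insert_bp / genome_bp


# Figures taken from the abstract: 147,456 clones, 95% with inserts,
# 120 kb average insert, ~1.6 Gb effective genome size.
coverage = library_coverage(
    n_clones=147_456,
    insert_fraction=0.95,
    mean_insert_bp=120_000,
    genome_bp=1.6e9,
)
print(f"{coverage:.1f}x")  # prints "10.5x"
```

Under these assumptions the estimate agrees with the "approximately 10 times" stated in the abstract.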
New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits.

PubMed

Saski, Christopher A; Li, Zhigang; Feltus, Frank A; Luo, Hong

2011-07-18

Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
The construction of the first switchgrass BAC library and comparative analysis of homoeologous genomic regions harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy.
Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria

PubMed Central

Dy, Ron L; Pitman, Andrew R; Fineran, Peter C

2013-01-01

The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution. PMID:24251073
Integrated Genomic Characterization Reveals Novel, Therapeutically Relevant Drug Targets in FGFR and EGFR Pathways in Sporadic Intrahepatic Cholangiocarcinoma

PubMed Central

Liang, Winnie S.; Fonseca, Rafael; Bryce, Alan H.; McCullough, Ann E.; Barrett, Michael T.; Hunt, Katherine; Patel, Maitray D.; Young, Scott W.; Collins, Joseph M.; Silva, Alvin C.; Condjella, Rachel M.; Block, Matthew; McWilliams, Robert R.; Lazaridis, Konstantinos N.; Klee, Eric W.; Bible, Keith C.; Harris, Pamela; Oliver, Gavin R.; Bhavsar, Jaysheel D.; Nair, Asha A.; Middha, Sumit; Asmann, Yan; Kocher, Jean-Pierre; Schahl, Kimberly; Kipp, Benjamin R.; Barr Fritcher, Emily G.; Baker, Angela; Aldrich, Jessica; Kurdoglu, Ahmet; Izatt, Tyler; Christoforides, Alexis; Cherni, Irene; Nasser, Sara; Reiman, Rebecca; Phillips, Lori; McDonald, Jackie; Adkins, Jonathan; Mastrian, Stephen D.; Placek, Pamela; Watanabe, Aprill T.; LoBello, Janine; Han, Haiyong; Von Hoff, Daniel; Craig, David W.; Stewart, A. Keith; Carpten, John D.

2014-01-01

Advanced cholangiocarcinoma continues to carry a poor prognosis, and therapeutic options have been limited. During the course of a clinical trial of whole genomic sequencing seeking druggable targets, we examined six patients with advanced cholangiocarcinoma. Integrated genome-wide and whole transcriptome sequence analyses were performed on tumors from six patients with advanced, sporadic intrahepatic cholangiocarcinoma (SIC) to identify potential therapeutically actionable events. Among the somatic events captured in our analysis, we uncovered two novel therapeutically relevant genomic contexts that, when acted upon, resulted in preliminary evidence of anti-tumor activity. Genome-wide structural analysis of sequence data revealed recurrent translocation events involving the FGFR2 locus in three of six assessed patients. These observations and supporting evidence triggered the use of FGFR inhibitors in these patients. In one example, preliminary anti-tumor activity of pazopanib (in vitro FGFR2 IC50≈350 nM) was noted in a patient with an FGFR2-TACC3 fusion. After progression on pazopanib, the same patient also had stable disease on ponatinib, a pan-FGFR inhibitor (in vitro, FGFR2 IC50≈8 nM). In an independent non-FGFR2 translocation patient, exome and transcriptome analysis revealed an allele specific somatic nonsense mutation (E384X) in ERRFI1, a direct negative regulator of EGFR activation. Rapid and robust disease regression was noted in this ERRFI1 inactivated tumor when treated with erlotinib, an EGFR kinase inhibitor. FGFR2 fusions and ERRFI1 mutations may represent novel targets in sporadic intrahepatic cholangiocarcinoma and trials should be characterized in larger cohorts of patients with these aberrations. PMID:24550739
The molecular genetic makeup of acute lymphoblastic leukemia.

PubMed

Mullighan, Charles G

2012-01-01

Genomic profiling has transformed our understanding of the genetic basis of acute lymphoblastic leukemia (ALL). Recent years have seen a shift from microarray analysis and candidate gene sequencing to next-generation sequencing. Together, these approaches have shown that many ALL subtypes are characterized by constellations of structural rearrangements, submicroscopic DNA copy number alterations, and sequence mutations, several of which have clear implications for risk stratification and targeted therapeutic intervention. Mutations in genes regulating lymphoid development are a hallmark of ALL, and alterations of the lymphoid transcription factor gene IKZF1 (IKAROS) are associated with a high risk of treatment failure in B-ALL. Approximately 20% of B-ALL cases harbor genetic alterations that activate kinase signaling that may be amenable to treatment with tyrosine kinase inhibitors, including rearrangements of the cytokine receptor gene CRLF2; rearrangements of ABL1, JAK2, and PDGFRB; and mutations of JAK1 and JAK2. Whole-genome sequencing has also identified novel targets of mutation in aggressive T-lineage ALL, including hematopoietic regulators (ETV6 and RUNX1), tyrosine kinases, and epigenetic regulators. Challenges for the future are to comprehensively identify and experimentally validate all genetic alterations driving leukemogenesis and treatment failure in childhood and adult ALL and to implement genomic profiling into the clinical setting to guide risk stratification and targeted therapy.
MicroRNAs in Honey Bee Caste Determination

PubMed Central

Ashby, Regan; Forêt, Sylvain; Searle, Iain; Maleszka, Ryszard

2016-01-01

The cellular mechanisms employed by some organisms to produce contrasting morphological and reproductive phenotypes from the same genome remain one of the key unresolved issues in biology. Honeybees (Apis mellifera) use differential feeding and a haplodiploid sex determination system to generate three distinct organismal outcomes from the same genome. Here we investigate the honeybee female and male caste-specific microRNA and transcriptomic molecular signatures during a critical time of larval development. Both previously undetected and novel miRNAs have been discovered, expanding the inventory of these genomic regulators in invertebrates. We show significant differences in the microRNA and transcriptional profiles of diploid females relative to haploid drone males as well as between reproductively distinct females (queens and workers). Queens and drones show gene enrichment in physio-metabolic pathways, whereas workers show enrichment in processes associated with neuronal development, cell signalling and caste biased structural differences. Interestingly, predicted miRNA targets are primarily associated with non-physio-metabolic genes, especially neuronal targets, suggesting a mechanistic disjunction from DNA methylation that regulates physio-metabolic processes. Accordingly, miRNA targets are under-represented in methylated genes. Our data show how a common set of genetic elements are differentially harnessed by an organism, which may provide the remarkable level of developmental flexibility required. PMID:26739502
PhytoCRISP-Ex: a web-based and stand-alone application to find specific target sequences for CRISPR/CAS editing.

PubMed

Rastogi, Achal; Murik, Omer; Bowler, Chris; Tirichine, Leila

2016-07-01

With the emerging interest in phytoplankton research, the need to establish genetic tools for the functional characterization of genes is indispensable. The CRISPR/Cas9 system is now well recognized as an efficient and accurate reverse genetic tool for genome editing. Several computational tools have been published allowing researchers to find candidate target sequences for the engineering of the CRISPR vectors, while searching possible off-targets for the predicted candidates. These tools provide built-in genome databases of common model organisms that are used for CRISPR target prediction. Although their predictions are highly sensitive, their design is inadequate for non-model genomes, most notably those of protists. This motivated us to design a new CRISPR target finding tool, PhytoCRISP-Ex. Our software offers CRISPR target predictions using an extended list of phytoplankton genomes and also delivers a user-friendly standalone application that can be used for any genome. The software attempts to integrate, for the first time, most available phytoplankton genome information and to provide a web-based platform for Cas9 target prediction within these genomes with high sensitivity. By offering a standalone version, PhytoCRISP-Ex retains the independence to be used with any organism and widens its applicability in high throughput pipelines. PhytoCRISP-Ex outperforms all the existing tools by computing the availability of restriction sites over the most probable Cas9 cleavage sites, which can be ideal for mutant screens. PhytoCRISP-Ex is a simple, fast and accurate web interface with 13 pre-indexed phytoplankton genomes, a set that is still being updated. The software was also designed as a UNIX-based standalone application that allows the user to search for target sequences in the genomes of a variety of other species.
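As a rough illustration of what a Cas9 target search involves, the sketch below scans both strands of a DNA sequence for a 20-nt protospacer followed by an NGG protospacer adjacent motif, the motif commonly used with Streptococcus pyogenes Cas9. It is not the PhytoCRISP-Ex algorithm: the function names are hypothetical, and off-target screening and restriction-site analysis are deliberately left out.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_targets(seq, protospacer_len=20):
    """List (strand, start, protospacer, pam) for every protospacer + NGG site.

    Assumption: an SpCas9-style NGG PAM directly 3' of the protospacer.
    Start positions on the "-" strand are relative to the reverse complement.
    """
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


example = "A" * 20 + "TGG"
print(find_cas9_targets(example))
```

A real tool would additionally compare each candidate against the whole genome to rule out near-identical off-target sites.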
Identification of Factors Promoting HBV Capsid Self-Assembly by Assembly-Promoting Antivirals.

PubMed

Rath, Soumya Lipsa; Liu, Huihui; Okazaki, Susumu; Shinoda, Wataru

2018-02-26

Around 270 million individuals currently live with hepatitis B virus (HBV) infection. Heteroaryldihydropyrimidines (HAPs) are a family of antivirals that target the HBV capsid protein and induce aberrant self-assembly. The capsids formed resemble the native capsid structure but are unable to propagate the virus progeny because of a lack of RNA/DNA. Under normal conditions, self-assembly is initiated by the viral genome. The mode of action of HAPs, however, remains largely unknown. In this work, using molecular dynamics simulations, we attempted to understand the action of HAP by comparing the dynamics of capsid proteins with and without HAPs. We found that the inhibitor is more stable in higher oligomers. It retains its stability in the hexamer throughout 1 μs of simulation. Our results also show that the inhibitor might help in stabilizing the C-terminus, the HBc 149-183 arginine-rich domain of the capsid protein. The C-termini of dimers interact with each other, assisted by the HAP inhibitor. During capsid assembly, the termini are supposed to directly interact with the viral genome, thereby suggesting that the viral genome might work in a similar way to stabilize the capsid protein. Our results may help in understanding the underlying molecular mechanism of HBV capsid self-assembly, which should be crucial for exploring new drug targets and structure-based drug design.
Recovery of known T-cell epitopes by computational scanning of a viral genome

NASA Astrophysics Data System (ADS)

Logean, Antoine; Rognan, Didier

2002-04-01

A new computational method (EpiDock) is proposed for predicting peptide binding to class I MHC proteins, from the amino acid sequence of any protein of immunological interest. Starting from the primary structure of the target protein, individual three-dimensional structures of all possible MHC-peptide (8-, 9- and 10-mers) complexes are obtained by homology modelling. A free energy scoring function (Fresno) is then used to predict the absolute binding free energy of all possible peptides to the class I MHC restriction protein. Assuming that immunodominant epitopes are usually found among the top MHC binders, the method can thus be applied to predict the location of immunogenic peptides on the sequence of the protein target. When applied to the prediction of HLA-A*0201-restricted T-cell epitopes from the Hepatitis B virus, EpiDock was able to recover 92% of known high affinity binders and 80% of known epitopes within a filtered subset of all possible nonapeptides corresponding to about one tenth of the full theoretical list. The proposed method is fully automated and fast enough to scan a viral genome in less than an hour on a parallel computing architecture. As it requires very few starting experimental data, EpiDock can be used (i) to predict potential T-cell epitopes from viral genomes and (ii) to roughly predict still unknown peptide binding motifs for novel class I MHC alleles.
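The scanning step described in this abstract, enumerating every 8-, 9- and 10-mer of a protein and keeping the best-scoring tenth, can be sketched as follows. This is only an outline under stated assumptions: the `score` callable is a hypothetical stand-in for a binding free energy predictor (lower meaning tighter binding), and neither the homology modelling nor the Fresno scoring function is reproduced here.

```python
def enumerate_peptides(protein, lengths=(8, 9, 10)):
    """Yield (start, peptide) for every window of the given lengths."""
    for k in lengths:
        for start in range(len(protein) - k + 1):
            yield start, protein[start:start + k]


def top_fraction(peptides, score, fraction=0.1):
    """Keep the best-scoring fraction of candidates (lowest score first).

    `score` is a hypothetical callable taking a (start, peptide) pair;
    a real pipeline would plug in a predicted binding free energy.
    """
    ranked = sorted(peptides, key=score)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


candidates = list(enumerate_peptides("ACDEFGHIKLMN"))
print(len(candidates))  # 5 octamers + 4 nonamers + 3 decamers = 12
```

Because every window is scored independently, the enumeration parallelizes naturally, which is consistent with the abstract's remark about scanning a viral genome on a parallel architecture.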
Discovery of Nigri/nox and Panto/pox site-specific recombinase systems facilitates advanced genome engineering.

PubMed

Karimova, Madina; Splith, Victoria; Karpinski, Janet; Pisabarro, M Teresa; Buchholz, Frank

2016-07-22

Precise genome engineering is instrumental for biomedical research and holds great promise for future therapeutic applications. Site-specific recombinases (SSRs) are valuable tools for genome engineering due to their exceptional ability to mediate precise excision, integration and inversion of genomic DNA in living systems. The ever-increasing complexity of genome manipulations and the desire to understand the DNA-binding specificity of these enzymes are driving efforts to identify novel SSR systems with unique properties. Here, we describe two novel tyrosine site-specific recombination systems designated Nigri/nox and Panto/pox. Nigri originates from Vibrio nigripulchritudo (plasmid VIBNI_pA) and recombines its target site nox with high efficiency and high target-site selectivity, without recombining target sites of the well established SSRs Cre, Dre, Vika and VCre. Panto, derived from Pantoea sp. aB, is less specific and, in addition to its native target site pox, also recombines the target site for Dre recombinase, called rox. This relaxed specificity allowed the identification of residues that are involved in target site selectivity, thereby advancing our understanding of how SSRs recognize their respective DNA targets.
How many genomics targets can a portfolio afford?

PubMed

Betz, Ulrich A K

2005-08-01

The pharmaceutical industry can look back at a history of successful innovations. Although genomics technologies have provided drug discovery pipelines with a plethora of new potential drug targets, solid target validation is crucial to avoiding high attrition rates. Biomarkers for patient stratification and approaches for personalized medicine will further help to reduce the risk associated with new targets. To achieve an overall risk balance, portfolios have to be supplemented with precedented targets, me-too approaches and line extensions of existing drugs. However, capitalizing on genomics investments and working on unprecedented targets is essential for a continuous stream of innovative drugs.
RNA-guided genome editing for target gene mutations in wheat.

PubMed

Upadhyay, Santosh Kumar; Kumar, Jitesh; Alok, Anshu; Tuli, Rakesh

2013-12-09

The clustered, regularly interspaced, short palindromic repeats (CRISPR) and CRISPR-associated protein (Cas) system has been used as an efficient tool for genome editing. We report the application of CRISPR-Cas-mediated genome editing to wheat (Triticum aestivum), the most important food crop plant with a very large and complex genome. The mutations were targeted in the inositol oxygenase (inox) and phytoene desaturase (pds) genes using cell suspension culture of wheat and in the pds gene in leaves of Nicotiana benthamiana. The expression of chimeric guide RNAs (cgRNA) targeting single and multiple sites resulted in indel mutations in all the tested samples. The expression of Cas9 or sgRNA alone did not cause any mutation. The expression of duplex cgRNA with Cas9 targeting two sites in the same gene resulted in deletion of the DNA fragment between the targeted sequences. Multiplexing the cgRNA could target two genes at one time. Target specificity analysis of cgRNA showed that mismatches at the 3' end of the target site abolished the cleavage activity completely. The mismatches at the 5' end reduced cleavage, suggesting that off-target effects can be abolished in vivo by selecting target sites with unique sequences at the 3' end. This approach provides a powerful method for genome engineering in plants.
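The specificity finding above suggests a simple check when comparing a guide with a candidate off-target site: record where the mismatches fall relative to the 3' end. The sketch below does only that. The 8-nt window treated as "the 3' end" (`three_prime_len`) is an arbitrary illustrative choice, not a value from the study, and the function name is hypothetical.

```python
def mismatch_profile(guide, site, three_prime_len=8):
    """Split guide/site mismatch positions into 5'-side and 3'-side lists.

    Positions are 0-based from the 5' end. `three_prime_len` is an assumed
    window size for what counts as the 3' end of the target site.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    mismatches = [i for i, (g, s) in enumerate(zip(guide, site)) if g != s]
    boundary = len(guide) - three_prime_len
    return {
        "five_prime": [i for i in mismatches if i < boundary],
        "three_prime": [i for i in mismatches if i >= boundary],
    }


guide = "A" * 20
site = "C" + "A" * 18 + "C"  # one mismatch at each end
print(mismatch_profile(guide, site))
```

Following the abstract's reasoning, a candidate off-target site with mismatches in the `three_prime` list would be expected to escape cleavage, whereas one with only `five_prime` mismatches might still be cut at a reduced rate.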
Constructing failure in big biology: The socio-technical anatomy of Japan's Protein 3000 Project.

PubMed

Fukushima, Masato

2016-02-01

This study focuses on the 5-year Protein 3000 Project launched in 2002, the largest biological project in Japan. The project aimed to overcome Japan's alleged failure to contribute fully to the Human Genome Project, by determining 3000 protein structures, 30 percent of the global target. Despite its achievement of this goal, the project was fiercely criticized in various sectors of society and was often branded an awkward failure. This article tries to solve the mystery of why such failure discourse was prevalent. Three explanatory factors are offered: first, because some goals were excluded during project development, there was a dynamic of failed expectations; second, structural genomics, while promoting collaboration with the international community, became an 'anti-boundary object', only the absence of which bound heterogeneous domestic actors; third, there developed an urgent sense of international competition in order to obtain patents on such structural information.
Systematic discovery and characterization of fly microRNAs using 12 Drosophila genomes

PubMed Central

Stark, Alexander; Kheradpour, Pouya; Parts, Leopold; Brennecke, Julius; Hodges, Emily; Hannon, Gregory J.; Kellis, Manolis

2007-01-01

MicroRNAs (miRNAs) are short regulatory RNAs that inhibit target genes by complementary binding in 3′ untranslated regions (3′ UTRs). They are one of the most abundant classes of regulators, targeting a large fraction of all genes, making their comprehensive study a requirement for understanding regulation and development. Here we use 12 Drosophila genomes to define structural and evolutionary signatures of miRNA hairpins, which we use for their de novo discovery. We predict >41 novel miRNA genes, which encompass many unique families, and 28 of which are validated experimentally. We also define signals for the precise start position of mature miRNAs, which suggest corrections of previously known miRNAs, often leading to drastic changes in their predicted target spectrum. We show that miRNA discovery power scales with the number and divergence of species compared, suggesting that such approaches can be successful in human as dozens of mammalian genomes become available. Interestingly, for some miRNAs sense and anti-sense hairpins score highly and mature miRNAs from both strands can indeed be found in vivo. Similarly, miRNAs with weak 5′ end predictions show increased in vivo processing of multiple alternate 5′ ends and have fewer predicted targets. Lastly, we show that several miRNA star sequences score highly and are likely functional. For mir-10 in particular, both arms show abundant processing, and both show highly conserved target sites in Hox genes, suggesting a possible cooperation of the two arms, and their role as a master Hox regulator. PMID:17989255
Genome-wide specificity of DNA binding, gene regulation, and chromatin remodeling by TALE- and CRISPR/Cas9-based transcriptional activators

PubMed Central

Polstein, Lauren R.; Perez-Pinera, Pablo; Kocak, D. Dewran; Vockley, Christopher M.; Bledsoe, Peggy; Song, Lingyun; Safi, Alexias; Crawford, Gregory E.; Reddy, Timothy E.; Gersbach, Charles A.

2015-01-01

Genome engineering technologies based on the CRISPR/Cas9 and TALE systems are enabling new approaches in science and biotechnology. However, the specificity of these tools in complex genomes and the role of chromatin structure in determining DNA binding are not well understood. We analyzed the genome-wide effects of TALE- and CRISPR-based transcriptional activators in human cells using ChIP-seq to assess DNA-binding specificity and RNA-seq to measure the specificity of perturbing the transcriptome. Additionally, DNase-seq was used to assess genome-wide chromatin remodeling that occurs as a result of their action. Our results show that these transcription factors are highly specific in both DNA binding and gene regulation and are able to open targeted regions of closed chromatin independent of gene activation. Collectively, these results underscore the potential for these technologies to make precise changes to gene expression for gene and cell therapies or fundamental studies of gene function. PMID:26025803
DNA Breaks and End Resection Measured Genome-wide by End Sequencing.

PubMed

Canela, Andres; Sridharan, Sriram; Sciascia, Nicholas; Tubbs, Anthony; Meltzer, Paul; Sleckman, Barry P; Nussenzweig, André

2016-09-01

DNA double-strand breaks (DSBs) arise during physiological transcription, DNA replication, and antigen receptor diversification. Mistargeting or misprocessing of DSBs can result in pathological structural variation and mutation. Here we describe a sensitive method (END-seq) to monitor DNA end resection and DSBs genome-wide at base-pair resolution in vivo. We utilized END-seq to determine the frequency and spectrum of restriction-enzyme-, zinc-finger-nuclease-, and RAG-induced DSBs. Beyond sequence preference, chromatin features dictate the repertoire of these genome-modifying enzymes. END-seq can detect at least one DSB per cell among 10,000 cells not harboring DSBs, and we estimate that up to one out of 60 cells contains off-target RAG cleavage. In addition to site-specific cleavage, we detect DSBs distributed over extended regions during immunoglobulin class-switch recombination. Thus, END-seq provides a snapshot of DNA ends genome-wide, which can be utilized for understanding genome-editing specificities and the influence of chromatin on DSB pathway choice. Published by Elsevier Inc.

Early experience with formalin-fixed paraffin-embedded (FFPE) based commercial clinical genomic profiling of gliomas-robust and informative with caveats.

PubMed

Movassaghi, Masoud; Shabihkhani, Maryam; Hojat, Seyed A; Williams, Ryan R; Chung, Lawrance K; Im, Kyuseok; Lucey, Gregory M; Wei, Bowen; Mareninov, Sergey; Wang, Michael W; Ng, Denise W; Tashjian, Randy S; Magaki, Shino; Perez-Rosendahl, Mari; Yang, Isaac; Khanlou, Negar; Vinters, Harry V; Liau, Linda M; Nghiemphu, Phioanh L; Lai, Albert; Cloughesy, Timothy F; Yong, William H

2017-08-01

Commercial targeted genomic profiling with next generation sequencing using formalin-fixed paraffin embedded (FFPE) tissue has recently entered into clinical use for diagnosis and for the guiding of therapy. However, there is limited independent data regarding the accuracy or robustness of commercial genomic profiling in gliomas. As part of patient care, FFPE samples of gliomas from 71 patients were submitted for targeted genomic profiling to one commonly used commercial vendor, Foundation Medicine. Genomic alterations were determined for the following grades or groups of gliomas: Grade I/II, Grade III, primary glioblastomas (GBMs), recurrent primary GBMs, and secondary GBMs. In addition, FFPE samples from the same patients were independently assessed with conventional methods such as immunohistochemistry (IHC), Quantitative real-time PCR (qRT-PCR), or Fluorescence in situ hybridization (FISH) for three genetic alterations: IDH1 mutations, EGFR amplification, and EGFRvIII expression. A total of 100 altered genes were detected by the aforementioned targeted genomic profiling assay. The number of different genomic alterations was significantly different between the five groups of gliomas and consistent with the literature. CDKN2A/B, TP53, and TERT were the most common genomic alterations seen in primary GBMs, whereas IDH1, TP53, and PIK3CA were the most common in secondary GBMs. Targeted genomic profiling demonstrated 92.3%-100% concordance with conventional methods. The targeted genomic profiling report provided an average of 5.5 drugs, and listed an average of 8.4 clinical trials for the 71 glioma patients studied but only a third of the trials were appropriate for glioma patients. In this limited comparison study, this commercial next generation sequencing-based targeted genomic profiling showed a high concordance rate with conventional methods for the 3 genetic alterations and identified mutations expected for the type of glioma.
While it may not be feasible to exhaustively independently validate a commercial genomic profiling assay, examination of a few markers provides some reassurance of its robustness. While potential targeted drugs are recommended based on genetic alterations, to date most targeted therapies have failed in glioblasomas so the usefulness of such recommendations will increase with development of novel and efficacious drugs. Copyright © 2017. Published by Elsevier Inc.
PTEN in the maintenance of genome integrity: From DNA replication to chromosome segregation.

PubMed

Hou, Sheng-Qi; Ouyang, Meng; Brandmaier, Andrew; Hao, Hongbo; Shen, Wen H

2017-10-01

Faithful DNA replication and accurate chromosome segregation are the key machineries of genetic transmission. Disruption of these processes represents a hallmark of cancer and often results from loss of tumor suppressors. PTEN is an important tumor suppressor that is frequently mutated or deleted in human cancer. Loss of PTEN has been associated with aneuploidy and poor prognosis in cancer patients. In mice, Pten deletion or mutation drives genomic instability and tumor development. PTEN deficiency induces DNA replication stress, confers stress tolerance, and disrupts mitotic spindle architecture, leading to accumulation of structural and numerical chromosome instability. Therefore, PTEN guards the genome by controlling multiple processes of chromosome inheritance. Here, we summarize current understanding of the PTEN function in promoting high-fidelity transmission of genetic information. We also discuss the PTEN pathways of genome maintenance and highlight potential targets for cancer treatment. © 2017 WILEY Periodicals, Inc.
Kissing-loop interaction between 5′ and 3′ ends of tick-borne Langat virus genome ‘bridges the gap’ between mosquito- and tick-borne flaviviruses in mechanisms of viral RNA cyclization: applications for virus attenuation and vaccine development

PubMed Central

Tsetsarkin, Konstantin A.; Liu, Guangping; Shen, Kui; Pletnev, Alexander G.

2016-01-01

Insertion of microRNA target sequences into the flavivirus genome results in selective tissue-specific attenuation and host-range restriction of live attenuated vaccine viruses. However, previous strategies for miRNA-targeting did not incorporate a mechanism to prevent target elimination under miRNA-mediated selective pressure, restricting their use in vaccine development. To overcome this limitation, we developed a new approach for miRNA-targeting of tick-borne flavivirus (Langat virus, LGTV) in the duplicated capsid gene region (DCGR). Genetic stability of viruses with DCGR was ensured by the presence of multiple cis-acting elements within the N-terminal capsid coding region, including the stem-loop structure (5′SL6) at the 3′ end of the promoter. We found that the 5′SL6 functions as a structural scaffold for the conserved hexanucleotide motif at its tip and engages in a complementary interaction with the region present in the 3′ NCR to enhance viral RNA replication. The resulting kissing-loop interaction, common in tick-borne flaviviruses, supports a single pair of cyclization elements (CYC) and functions as a homolog of the second pair of CYC that is present in the majority of mosquito-borne flaviviruses. Placing miRNA targets into the DCGR results in superior attenuation of LGTV in the CNS and does not interfere with development of protective immunity in immunized mice. PMID:26850640
Large protein as a potential target for use in rabies diagnostics.

PubMed

Santos Katz, I S; Dias, M H; Lima, I F; Chaves, L B; Ribeiro, O G; Scheffer, K C; Iwai, L K

Rabies is a zoonotic viral disease that remains a serious threat to public health worldwide. The rabies lyssavirus (RABV) genome encodes five structural proteins, multifunctional and significant for pathogenicity. The large protein (L) presents well-conserved genomic regions, which may be a good alternative to generate informative datasets for development of new methods for rabies diagnosis. This paper describes the development of a technique for the identification of L protein in several RABV strains from different hosts, demonstrating that MS-based proteomics is a potential method for antigen identification and a good alternative for rabies diagnosis.
Expression of exogenous DNA methyltransferases: application in molecular and cell biology.

PubMed

Dyachenko, O V; Tarlachkov, S V; Marinitch, D V; Shevchuk, T V; Buryanov, Y I

2014-02-01

DNA methyltransferases might be used as powerful tools for studies in molecular and cell biology due to their ability to recognize and modify nitrogen bases in specific sequences of the genome. Methylation of the eukaryotic genome using exogenous DNA methyltransferases appears to be a promising approach for studies on chromatin structure. Currently, the development of new methods for targeted methylation of specific genetic loci using DNA methyltransferases fused with DNA-binding proteins is especially interesting. In the present review, expression of exogenous DNA methyltransferase for purposes of in vivo analysis of the functional chromatin structure along with investigation of the functional role of DNA methylation in cell processes are discussed, as well as future prospects for application of DNA methyltransferases in epigenetic therapy and in plant selection.
Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.

PubMed

Huang, Xin; Li, Hao-ming

2009-08-05

Lovastatin is an effective drug for treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. According to the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed through internet resources and software such as DNAMAN. The target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA, and the secondary and three-dimensional structures of the LovE protein were predicted. In the lovastatin biosynthesis process, lovE is a regulatory gene and the LovE protein is a GAL4-like transcriptional factor.
High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

PubMed

Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

2015-01-01

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
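The idea of finding read pairs that span a genome/T-DNA junction can be illustrated with a minimal Python sketch. This is not the authors' tool: the record type `MateAlignment`, the functions `junction_pairs` and `cluster_sites`, the reference names, and the 1000 bp clustering window are all hypothetical, and the sketch assumes each read pair has already been aligned to a combined genome-plus-T-DNA reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MateAlignment:
    """Hypothetical summary of one aligned read pair."""
    read_id: str
    mate1_ref: str  # reference sequence that mate 1 aligned to
    mate1_pos: int
    mate2_ref: str  # reference sequence that mate 2 aligned to
    mate2_pos: int


def junction_pairs(pairs, tdna_refs):
    """Keep pairs with exactly one mate on T-DNA; report the genomic mate."""
    hits = []
    for p in pairs:
        m1_on_tdna = p.mate1_ref in tdna_refs
        m2_on_tdna = p.mate2_ref in tdna_refs
        if m1_on_tdna == m2_on_tdna:
            continue  # both genomic or both T-DNA: not a junction pair
        if m1_on_tdna:
            hits.append((p.read_id, p.mate2_ref, p.mate2_pos))
        else:
            hits.append((p.read_id, p.mate1_ref, p.mate1_pos))
    return hits


def cluster_sites(hits, window=1000):
    """Group genomic hits within `window` bp (an assumed value) into sites."""
    sites = []
    for _, ref, pos in sorted(hits, key=lambda h: (h[1], h[2])):
        last = sites[-1] if sites else None
        if last and last["ref"] == ref and pos - last["end"] <= window:
            last["end"] = pos
            last["support"] += 1
        else:
            sites.append({"ref": ref, "start": pos, "end": pos, "support": 1})
    return sites


# Toy input; reference names are made up for illustration.
example_pairs = [
    MateAlignment("r1", "Chr1", 1000, "TDNA_LB", 50),
    MateAlignment("r2", "TDNA_RB", 20, "Chr1", 1400),
    MateAlignment("r3", "Chr2", 500, "Chr2", 800),
    MateAlignment("r4", "TDNA_LB", 10, "TDNA_LB", 200),
    MateAlignment("r5", "Chr3", 9000, "TDNA_LB", 5),
]
example_hits = junction_pairs(example_pairs, {"TDNA_LB", "TDNA_RB"})
example_sites = cluster_sites(example_hits)
```

A production pipeline would presumably also use split reads to resolve the exact junction sequence; the sketch only shows the discordant-pair filter and a crude grouping into candidate insertion sites.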
Formation of new chromatin domains determines pathogenicity of genomic duplications.

PubMed

Franke, Martin; Ibrahim, Daniel M; Andrey, Guillaume; Schwarzer, Wibke; Heinrich, Verena; Schöpflin, Robert; Kraft, Katerina; Kempfer, Rieke; Jerković, Ivana; Chan, Wing-Lee; Spielmann, Malte; Timmermann, Bernd; Wittler, Lars; Kurth, Ingo; Cambiaso, Paola; Zuffardi, Orsetta; Houge, Gunnar; Lambie, Lindsay; Brancati, Francesco; Pombo, Ana; Vingron, Martin; Spitz, Francois; Mundlos, Stefan

2016-10-13

Chromosome conformation capture methods have identified subchromosomal structures of higher-order chromatin interactions called topologically associated domains (TADs) that are separated from each other by boundary regions. By subdividing the genome into discrete regulatory units, TADs restrict the contacts that enhancers establish with their target genes. However, the mechanisms that underlie partitioning of the genome into TADs remain poorly understood. Here we show by chromosome conformation capture (capture Hi-C and 4C-seq methods) that genomic duplications in patient cells and genetically modified mice can result in the formation of new chromatin domains (neo-TADs) and that this process determines their molecular pathology. Duplications of non-coding DNA within the mouse Sox9 TAD (intra-TAD) that cause female to male sex reversal in humans, showed increased contact of the duplicated regions within the TAD, but no change in the overall TAD structure. In contrast, overlapping duplications that extended over the next boundary into the neighbouring TAD (inter-TAD), resulted in the formation of a new chromatin domain (neo-TAD) that was isolated from the rest of the genome. As a consequence of this insulation, inter-TAD duplications had no phenotypic effect. However, incorporation of the next flanking gene, Kcnj2, in the neo-TAD resulted in ectopic contacts of Kcnj2 with the duplicated part of the Sox9 regulatory region, consecutive misexpression of Kcnj2, and a limb malformation phenotype. Our findings provide evidence that TADs are genomic regulatory units with a high degree of internal stability that can be sculptured by structural genomic variations. This process is important for the interpretation of copy number variations, as these variations are routinely detected in diagnostic tests for genetic disease and cancer. 
This finding also has relevance in an evolutionary setting because copy-number differences are thought to have a crucial role in the evolution of genome complexity.
Identification, characterization and expression analysis of pigeonpea miRNAs in response to Fusarium wilt.

PubMed

Hussain, Khalid; Mungikar, Kanak; Kulkarni, Abhijeet; Kamble, Avinash

2018-05-05

Upon confrontation with unfavourable conditions, plants invoke a very complex set of biochemical and physiological reactions and alter gene expression patterns to combat the situations. MicroRNAs (miRNAs), a class of small non-coding RNA, contribute extensively to regulation of gene expression through translation inhibition or degradation of their target mRNAs during such conditions. Therefore, identification of miRNAs and their targets holds importance in understanding the regulatory networks triggered during stress. Structure and sequence similarity based in silico prediction of miRNAs in Cajanus cajan L. (Pigeonpea) draft genome sequence has been carried out earlier. These annotations also appear in related GenBank genome sequence entries. However, there are no reports available on context dependent miRNA expression and their targets in pigeonpea. Therefore, in the present study we addressed these questions computationally, using pigeonpea EST sequence information. We identified five novel pigeonpea miRNA precursors, their mature forms and targets. Interestingly, only one of these miRNAs (miR169i-3p) was identified earlier in the draft genome sequence. We then validated expression of these miRNAs experimentally. It was also observed that these miRNAs show differential expression patterns in response to Fusarium inoculation, indicating their biotic stress responsive nature. Overall, these results will help towards a better understanding of the regulatory network of defense during pigeonpea-pathogen interactions and the role of miRNAs in the process. Copyright © 2018 Elsevier B.V. All rights reserved.
Antisense antibiotics: a brief review of novel target discovery and delivery.

PubMed

Bai, Hui; Xue, Xiaoyan; Hou, Zheng; Zhou, Ying; Meng, Jingru; Luo, Xiaoxing

2010-06-01

The nightmare of multi-drug resistant bacteria will continue to haunt us unless a remedy is found. Efforts to find desirable natural products with bactericidal properties and to screen chemically modified derivatives of traditional antibiotics have lagged behind the emergence of new multi-drug resistant bacteria. The concept of using antisense antibiotics, revolutionary but still on the threshold of application, has experienced ups and downs in the past decade. In the past five years, however, significant technological advances in the fields of microbial genomics, structural modification of oligonucleotides and efficient delivery systems have led to fundamental progress in the research and in vivo application of this paradigm. The wealth of information provided in the microbial genomics era has allowed the identification and/or validation of a number of essential genes that may serve as possible targets for antisense inhibition; antisense oligodeoxynucleotides (ODNs) based on the 3rd generation of modified structures, e.g., peptide nucleic acids (PNAs) and phosphorodiamidate morpholino oligomers (PMOs), have shown great potency in gene expression inhibition in a sequence-specific and dose-dependent manner at low micromolar concentrations; and cell-penetrating peptide-mediated delivery systems have enabled effective intracellular antisense inhibition of targeted genes both in vitro and in vivo. The new methods show promise in the discovery of novel gene-specific antisense antibiotics that will be useful in the future battle against drug-resistant bacterial infections. This review describes this promising paradigm, the targets that have been identified and the recent technologies by which it is delivered.
Frnakenstein: multiple target inverse RNA folding.

PubMed

Lyngsø, Rune B; Anderson, James W J; Sizikova, Elena; Badugu, Amarendra; Hyland, Tomas; Hein, Jotun

2012-10-09

RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three-dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of solving the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single structure targets, and on 292 two structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two structure targets it also performed well, generating a perfect design for about 80% of the targets. Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. 
The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein.
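To make the distinction between folding and inverse folding concrete, the following Python sketch pairs a toy folding routine with a toy evolutionary search. It is not Frnakenstein and makes no claim about how that program works: folding here is Nussinov-style base-pair maximisation standing in for a real thermodynamic folding routine, the search is a bare mutate-and-select loop rather than a full genetic algorithm, and every name and parameter (`fold`, `design`, `pop_size`, `generations`, `min_loop`) is an assumption made for illustration.

```python
import random

# Allowed pairs: Watson-Crick plus G-U wobble.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def fold(seq, min_loop=3):
    """Toy folding: maximise the number of nested base pairs (Nussinov-style)."""
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    # Trace back one optimal structure as a dot-bracket string.
    struct = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if left + 1 + dp[k + 1][j - 1] == dp[i][j]:
                    struct[k], struct[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(struct)


def structure_distance(a, b):
    """Number of positions where two equal-length dot-bracket strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mutate(seq, rng):
    """Replace one randomly chosen position with a random base."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice("ACGU") + seq[pos + 1:]


def design(target, pop_size=30, generations=200, seed=0):
    """Search for a sequence whose toy fold matches `target`."""
    rng = random.Random(seed)
    n = len(target)

    def score(s):
        return structure_distance(fold(s), target)

    pop = ["".join(rng.choice("ACGU") for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        if score(pop[0]) == 0:
            break
        survivors = pop[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    best = min(pop, key=score)
    return best, score(best)
```

Calling `design("(((...)))")` returns a candidate sequence and its remaining distance to the target; a distance of 0 means the toy fold reproduces the target exactly. A multi-target variant could, under the same assumptions, sum the distances to several target structures in `score`.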
Frnakenstein: multiple target inverse RNA folding

PubMed Central

2012-01-01

Background RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three-dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. Results In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of solving the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single structure targets, and on 292 two structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two structure targets it also performed well, generating a perfect design for about 80% of the targets. Conclusions Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. 
The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein. PMID:23043260
A computational method for predicting regulation of human microRNAs on the influenza virus genome

PubMed Central

2013-01-01

Background While it has been suggested that host microRNAs (miRNAs) may downregulate viral gene expression as an antiviral defense mechanism, such a mechanism has not been explored in the influenza virus for human flu studies. As it is difficult to conduct related experiments on humans, computational studies can provide some insight. Although many computational tools have been designed for miRNA target prediction, there is a need for cross-species prediction, especially for predicting viral targets of human miRNAs. However, finding putative human miRNAs targeting the influenza virus genome is still challenging. Results We developed machine-learning features and conducted comprehensive data training for predicting interactions between H1N1 genome segments and host miRNA. We defined our seed region as the first ten nucleotides of the miRNA, counted from its 5' end toward its 3' end, and integrated various features including the number of consecutive matching bases in the seed region of 10 bases, a triplet feature in seed regions, thermodynamic energy, penalty of bulges and wobbles at binding sites, and the secondary structure of viral RNA for the prediction. Conclusions Compared to general predictive models, our model fully takes into account the conservation patterns and features of viral RNA secondary structures, and greatly improves the prediction accuracy. Our model identified some key miRNAs including hsa-miR-489, hsa-miR-325, hsa-miR-876-3p and hsa-miR-2117, which target HA, PB2, MP and NS of H1N1, respectively. Our study provided an interesting hypothesis concerning the miRNA-based antiviral defense mechanism against influenza virus in humans, i.e., the binding between human miRNA and viral RNAs may not result in gene silencing but rather may block the viral RNA replication. PMID:24565017
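One of the features named above, the number of consecutive matching bases in the 10-base seed region, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' feature extractor: the function `seed_features`, its return keys, and the example sequences are hypothetical; both sequences are assumed to be written 5' to 3'; and only Watson-Crick pairs count as matches, with G-U wobbles tallied separately as a nod to the wobble penalty the abstract mentions.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def seed_features(mirna, site, seed_len=10):
    """Longest run of Watson-Crick matches and wobble count in the seed region.

    `mirna` and `site` are both given 5'->3'. The duplex is antiparallel, so
    the site is reversed to line its 3' end up with the miRNA 5' end.
    """
    seed = mirna[:seed_len]
    partner = site[::-1][:seed_len]
    longest = run = wobbles = 0
    for a, b in zip(seed, partner):
        if (a, b) in WATSON_CRICK:
            run += 1
            longest = max(longest, run)
        else:
            run = 0  # any non-Watson-Crick position breaks the run
            if (a, b) in WOBBLE:
                wobbles += 1
    return {"longest_match_run": longest, "wobbles": wobbles}


# Arbitrary illustrative sequences (not taken from the paper).
mirna = "UGAGGUAGUAGGUUG"
perfect_site = "UACUACCUCA"    # fully complementary to the 10-nt seed
mismatch_site = "UACUAGCUCA"   # G-G mismatch at seed position 5
wobble_site = "UACUACCUUA"     # G-U wobble at seed position 2
```

In a feature matrix, values such as `longest_match_run` would sit alongside the other descriptors listed in the abstract (triplet features, thermodynamic energy, viral RNA secondary structure), which this sketch does not attempt to compute.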
TARGET Publication Guidelines | Office of Cancer Genomics

Cancer.gov

Like other NCI large-scale genomics initiatives, TARGET is a community resource project and data are made available rapidly after validation for use by other researchers. To act in accord with the Fort Lauderdale principles and support the continued prompt public release of large-scale genomic data prior to publication, researchers who plan to prepare manuscripts containing descriptions of TARGET pediatric cancer data that would be of comparable scope to an initial TARGET disease-specific comprehensive, global analysis publication, and journal editors who receive such manuscripts, are
Translating genomic discoveries to the clinic in pediatric oncology.

PubMed

Glade Bender, Julia; Verma, Anupam; Schiffman, Joshua D

2015-02-01

The present study describes the recent advances in the identification of targetable genomic alterations in pediatric cancers, along with the progress and associated challenges in translating these findings into therapeutic benefit. Each field within pediatric cancer has rapidly and comprehensively begun to define genomic targets in tumors that potentially can improve the clinical outcome of patients, including hematologic malignancies (leukemia and lymphoma), solid malignancies (neuroblastoma, rhabdomyosarcoma, Ewing sarcoma, and osteosarcoma), and brain tumors (gliomas, ependymomas, and medulloblastomas). Although each tumor has specific and sometimes overlapping genomic targets, the translation to the clinic of new targeted trials and precision medicine protocols is still in its infancy. The first clinical tumor profiling studies in pediatric oncology have demonstrated feasibility and patient enthusiasm for the personalized medicine paradigm, but have yet to demonstrate clinical utility. Complexities influencing implementation include rapidly evolving sequencing technologies, tumor heterogeneity, and lack of access to targeted therapies. The return of incidental findings from the germline also remains a challenge, with evolving policy statements and accepted standards. The translation of genomic discoveries to the clinic in pediatric oncology continues to move forward at a brisk pace. Early adoption of genomics for tumor classification, risk stratification, and initial trials of targeted therapeutic agents has led to powerful results. As our experience grows in the integration of genomic and clinical medicine, the outcome for children with cancer should continue to improve.
A Roadmap for Tick-Borne Flavivirus Research in the "Omics" Era.

PubMed

Grabowski, Jeffrey M; Hill, Catherine A

2017-01-01

Tick-borne flaviviruses (TBFs) affect human health globally. Human vaccines provide protection against some TBFs, and antivirals are available, yet TBF-specific control strategies are limited. Advances in genomics offer hope to understand the viral complement transmitted by ticks, and to develop disruptive, data-driven technologies for virus detection, treatment, and control. The genome assemblies of Ixodes scapularis, the North American tick vector of the TBF, Powassan virus, and other tick vectors, are providing insights into tick biology and pathogen transmission and serve as nucleation points for expanded genomic research. Systems biology has yielded insights to the response of tick cells to viral infection at the transcript and protein level, and new protein targets for vaccines to limit virus transmission. Reverse vaccinology approaches have moved candidate tick antigenic epitopes into vaccine development pipelines. Traditional drug and in silico screening have identified candidate antivirals, and target-based approaches have been developed to identify novel acaricides. Yet, additional genomic resources are required to expand TBF research. Priorities include genome assemblies for tick vectors, "omic" studies involving high consequence pathogens and vectors, and emphasizing viral metagenomics, tick-virus metabolomics, and structural genomics of TBF and tick proteins. Also required are resources for forward genetics, including the development of tick strains with quantifiable traits, genetic markers and linkage maps. Here we review the current state of genomic research on ticks and tick-borne viruses with an emphasis on TBFs. We outline an ambitious 10-year roadmap for research in the "omics era," and explore key milestones needed to accomplish the goal of delivering three new vaccines, antivirals and acaricides for TBF control by 2030.
A Roadmap for Tick-Borne Flavivirus Research in the “Omics” Era

PubMed Central

Grabowski, Jeffrey M.; Hill, Catherine A.

2017-01-01

Tick-borne flaviviruses (TBFs) affect human health globally. Human vaccines provide protection against some TBFs, and antivirals are available, yet TBF-specific control strategies are limited. Advances in genomics offer hope to understand the viral complement transmitted by ticks, and to develop disruptive, data-driven technologies for virus detection, treatment, and control. The genome assemblies of Ixodes scapularis, the North American tick vector of the TBF, Powassan virus, and other tick vectors, are providing insights into tick biology and pathogen transmission and serve as nucleation points for expanded genomic research. Systems biology has yielded insights to the response of tick cells to viral infection at the transcript and protein level, and new protein targets for vaccines to limit virus transmission. Reverse vaccinology approaches have moved candidate tick antigenic epitopes into vaccine development pipelines. Traditional drug and in silico screening have identified candidate antivirals, and target-based approaches have been developed to identify novel acaricides. Yet, additional genomic resources are required to expand TBF research. Priorities include genome assemblies for tick vectors, “omic” studies involving high consequence pathogens and vectors, and emphasizing viral metagenomics, tick-virus metabolomics, and structural genomics of TBF and tick proteins. Also required are resources for forward genetics, including the development of tick strains with quantifiable traits, genetic markers and linkage maps. Here we review the current state of genomic research on ticks and tick-borne viruses with an emphasis on TBFs. We outline an ambitious 10-year roadmap for research in the “omics era,” and explore key milestones needed to accomplish the goal of delivering three new vaccines, antivirals and acaricides for TBF control by 2030. PMID:29312896
Analysis of whole genome sequences of 16 strains of rubella virus from the United States, 1961-2009.

PubMed

Abernathy, Emily; Chen, Min-hsin; Bera, Jayati; Shrivastava, Susmita; Kirkness, Ewen; Zheng, Qi; Bellini, William; Icenogle, Joseph

2013-01-25

Rubella virus is the causative agent of rubella, a mild rash illness, and a potent teratogenic agent when contracted by a pregnant woman. Global rubella control programs target the reduction and elimination of congenital rubella syndrome. Phylogenetic analysis of partial sequences of rubella viruses has contributed to virus surveillance efforts and played an important role in demonstrating that indigenous rubella viruses have been eliminated in the United States. Sixteen wild-type rubella viruses were chosen for whole genome sequencing. All 16 viruses were collected in the United States from 1961 to 2009 and are from 8 of the 13 known rubella genotypes. Phylogenetic analysis of 30 whole genome sequences produced a maximum likelihood tree giving high bootstrap values for all genotypes except provisional genotype 1a. Comparison of the 16 new complete sequences and 14 previously sequenced wild-type viruses found regions with clusters of variable amino acids. The 5' 250 nucleotides of the genome are more conserved than any other part of the genome. Genotype specific deletions in the untranslated region between the non-structural and structural open reading frames were observed for genotypes 2B and genotype 1G. No evidence was seen for recombination events among the 30 viruses. The analysis presented here is consistent with previous reports on the genetic characterization of rubella virus genomes. Conserved and variable regions were identified and additional evidence for genotype specific nucleotide deletions in the intergenic region was found. Phylogenetic analysis confirmed genotype groupings originally based on structural protein coding region sequences, which provides support for the WHO nomenclature for genetic characterization of wild-type rubella viruses.
Structural Characterization of H-1 Parvovirus: Comparison of Infectious Virions to Empty Capsids

PubMed Central

Halder, Sujata; Nam, Hyun-Joo; Govindasamy, Lakshmanan; Vogel, Michèle; Dinsart, Christiane; Salomé, Nathalie; McKenna, Robert

2013-01-01

The structure of single-stranded DNA (ssDNA) packaging H-1 parvovirus (H-1PV), which is being developed as an antitumor gene delivery vector, has been determined for wild-type (wt) virions and noninfectious (empty) capsids to 2.7- and 3.2-Å resolution, respectively, using X-ray crystallography. The capsid viral protein (VP) structure consists of an α-helix and an eight-stranded anti-parallel β-barrel with large loop regions between the strands. The β-barrel and loops form the capsid core and surface, respectively. In the wt structure, 600 nucleotides are ordered in an interior DNA binding pocket of the capsid. This accounts for ∼12% of the H-1PV genome. The wt structure is identical to the empty capsid structure, except for side chain conformation variations at the nucleotide binding pocket. Comparison of the H-1PV nucleotides to those observed in canine parvovirus and minute virus of mice, two members of the genus Parvovirus, showed both similarity in structure and analogous interactions. This observation suggests a functional role, such as in capsid stability and/or ssDNA genome recognition for encapsulation. The VP structure differs from those of other parvoviruses in surface loop regions that control receptor binding, tissue tropism, pathogenicity, and antibody recognition, including VP sequences reported to determine tumor cell tropism for oncotropic rodent parvoviruses. These structures of H-1PV provide insight into structural features that dictate capsid stabilization following genome packaging and three-dimensional information applicable for rational design of tumor-targeted recombinant gene delivery vectors. PMID:23449783
Whole genome sequences in pulse crops: a global community resource to expedite translational genomics and knowledge-based crop improvement.

PubMed

Bohra, Abhishek; Singh, Narendra P

2015-08-01

Unprecedented developments in legume genomics over the last decade have resulted in the acquisition of a wide range of modern genomic resources to underpin genetic improvement of grain legumes. The genome enabled insights direct investigators in various ways that primarily include unearthing novel structural variations, retrieving the lost genetic diversity, introducing novel/exotic alleles from wider gene pools, finely resolving the complex quantitative traits and so forth. To this end, ready availability of cost-efficient and high-density genotyping assays allows genome wide prediction to be increasingly recognized as the key selection criterion in crop breeding. Further, the high-dimensional measurements of agronomically significant phenotypes obtained by using new-generation screening techniques will empower reference based resequencing as well as allele mining and trait mapping methods to comprehensively associate genome diversity with the phenome scale variation. Besides stimulating the forward genetic systems, accessibility to precisely delineated genomic segments reveals novel candidates for reverse genetic techniques like targeted genome editing. The shifting paradigm in plant genomics in turn necessitates optimization of crop breeding strategies to enable the most efficient integration of advanced omics knowledge and tools. We anticipate that the crop improvement schemes will be bolstered remarkably with rational deployment of these genome-guided approaches, ultimately resulting in expanded plant breeding capacities and improved crop performance.

Synthetic Zinc Finger Proteins: The Advent of Targeted Gene Regulation and Genome Modification Technologies

PubMed Central

2015-01-01

Conspectus: The understanding of gene regulation and the structure and function of the human genome increased dramatically at the end of the 20th century. Yet the technologies for manipulating the genome have been slower to develop. For instance, the field of gene therapy has been focused on correcting genetic diseases and augmenting tissue repair for more than 40 years. However, with the exception of a few very low efficiency approaches, conventional genetic engineering methods have only been able to add auxiliary genes to cells. This has been a substantial obstacle to the clinical success of gene therapies and has also led to severe unintended consequences in several cases. Therefore, technologies that facilitate the precise modification of cellular genomes have diverse and significant implications in many facets of research and are essential for translating the products of the Genomic Revolution into tangible benefits for medicine and biotechnology. To address this need, in the 1990s, we embarked on a mission to develop technologies for engineering protein–DNA interactions with the aim of creating custom tools capable of targeting any DNA sequence. Our goal has been to allow researchers to reach into genomes to specifically regulate, knock out, or replace any gene. To realize these goals, we initially focused on understanding and manipulating zinc finger proteins. In particular, we sought to create a simple and straightforward method that enables unspecialized laboratories to engineer custom DNA-modifying proteins using only defined modular components, a web-based utility, and standard recombinant DNA technology. Two significant challenges we faced were (i) the development of zinc finger domains that target sequences not recognized by naturally occurring zinc finger proteins and (ii) determining how individual zinc finger domains could be tethered together as polydactyl proteins to recognize unique locations within complex genomes.
We and others have since used this modular assembly method to engineer artificial proteins and enzymes that activate, repress, or create defined changes to user-specified genes in human cells, plants, and other organisms. We have also engineered novel methods for externally controlling protein activity and delivery, as well as developed new strategies for the directed evolution of protein and enzyme function. This Account summarizes our work in these areas and highlights independent studies that have successfully used the modular assembly approach to create proteins with novel function. We also discuss emerging alternative methods for genomic targeting, including transcription activator-like effectors (TALEs) and CRISPR/Cas systems, and how they complement the synthetic zinc finger protein technology. PMID:24877793
ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling | Office of Cancer Genomics

Cancer.gov

Functional genomics (FG) screens, using RNAi or CRISPR technology, have become a standard tool for systematic, genome-wide loss-of-function studies for therapeutic target discovery. As in many large-scale assays, however, off-target effects, variable reagent potency and experimental noise must be accounted for to appropriately control for false positives.
Efficient CRISPR/Cas9-mediated Targeted Mutagenesis in Populus in the First Generation

PubMed Central

Fan, Di; Liu, Tingting; Li, Chaofeng; Jiao, Bo; Li, Shuang; Hou, Yishu; Luo, Keming

2015-01-01

Recently, RNA-guided genome editing using the type II clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein (Cas) system has been applied to edit the plant genome in several herbaceous plant species. However, it remains unknown whether this system can be used for genome editing in woody plants. In this study, we describe genome editing and targeted gene mutation in a woody species, Populus tomentosa Carr., via the CRISPR/Cas9 system. Four guide RNAs (gRNAs) were designed to target distinct poplar genomic sites of the phytoene desaturase gene 8 (PtoPDS) that are followed by the protospacer-adjacent motif (PAM). After Agrobacterium-mediated transformation, an obvious albino phenotype was observed in transgenic poplar plants. Analysis of the RNA-guided genome-editing events showed that 30 out of 59 PCR clones were homozygous mutants and 2 out of 59 were heterozygous mutants, and the mutation efficiency at these target sites was estimated to be 51.7%. Our data demonstrate that the Cas9/sgRNA system can be exploited to precisely edit genomic sequence and effectively create knockout mutations in woody plants. PMID:26193631
Selecting soluble/foldable protein domains through single-gene or genomic ORF filtering: structure of the head domain of Burkholderia pseudomallei antigen BPSL2063.

PubMed

Gourlay, Louise J; Peano, Clelia; Deantonio, Cecilia; Perletti, Lucia; Pietrelli, Alessandro; Villa, Riccardo; Matterazzo, Elena; Lassaux, Patricia; Santoro, Claudio; Puccio, Simone; Sblattero, Daniele; Bolognesi, Martino

2015-11-01

The 1.8 Å resolution crystal structure of a conserved domain of the potential Burkholderia pseudomallei antigen and trimeric autotransporter BPSL2063 is presented as a structural vaccinology target for melioidosis vaccine development. Since BPSL2063 (1090 amino acids) hosts only one conserved domain, and the expression/purification of the full-length protein proved to be problematic, a domain-filtering library was generated using β-lactamase as a reporter gene to select further BPSL2063 domains. As a result, two domains (D1 and D2) were identified and produced in soluble form in Escherichia coli. Furthermore, as a general tool, a genomic open reading frame-filtering library from the B. pseudomallei genome was also constructed to facilitate the selection of domain boundaries from the entire ORFeome. Such an approach allowed the selection of three potential protein antigens that were also produced in soluble form. The results imply the further development of ORF-filtering methods as a tool in protein-based research to improve the selection and production of soluble proteins or domains for downstream applications such as X-ray crystallography.
Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

PubMed

Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

2017-01-01

Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.
Crystal Structure of the Minimal Cas9 from Campylobacter jejuni Reveals the Molecular Diversity in the CRISPR-Cas9 Systems.

PubMed

Yamada, Mari; Watanabe, Yuto; Gootenberg, Jonathan S; Hirano, Hisato; Ran, F Ann; Nakane, Takanori; Ishitani, Ryuichiro; Zhang, Feng; Nishimasu, Hiroshi; Nureki, Osamu

2017-03-16

The RNA-guided endonuclease Cas9 generates a double-strand break at DNA target sites complementary to the guide RNA and has been harnessed for the development of a variety of new technologies, such as genome editing. Here, we report the crystal structures of Campylobacter jejuni Cas9 (CjCas9), one of the smallest Cas9 orthologs, in complex with an sgRNA and its target DNA. The structures provided insights into a minimal Cas9 scaffold and revealed the remarkable mechanistic diversity of the CRISPR-Cas9 systems. The CjCas9 guide RNA contains a triple-helix structure, which is distinct from known RNA triple helices, thereby expanding the natural repertoire of RNA triple helices. Furthermore, unlike the other Cas9 orthologs, CjCas9 contacts the nucleotide sequences in both the target and non-target DNA strands and recognizes the 5'-NNNVRYM-3' as the protospacer-adjacent motif. Collectively, these findings improve our mechanistic understanding of the CRISPR-Cas9 systems and may facilitate Cas9 engineering. Copyright © 2017 Elsevier Inc. All rights reserved.
Network-assisted target identification for haploinsufficiency and homozygous profiling screens

PubMed Central

Wang, Sheng

2017-01-01

Chemical genomic screens have recently emerged as a systematic approach to drug discovery on a genome-wide scale. Drug target identification and elucidation of the mechanism of action (MoA) of hits from these noisy high-throughput screens remain difficult. Here, we present GIT (Genetic Interaction Network-Assisted Target Identification), a network analysis method for drug target identification in haploinsufficiency profiling (HIP) and homozygous profiling (HOP) screens. In addition to the drug-induced phenotypic fitness defect of the deletion of a gene, GIT also incorporates the fitness defects of the gene’s neighbors in the genetic interaction network. On three genome-scale yeast chemical genomic screens, GIT substantially outperforms previous scoring methods on target identification on HIP and HOP assays, respectively. Finally, we showed that by combining HIP and HOP assays, GIT further boosts target identification and reveals the potential mechanism of action of a drug. PMID:28574983
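The idea of scoring a gene by its own fitness defect together with those of its network neighbors can be illustrated with a minimal sketch. The abstract does not give GIT's actual scoring formula, so the additive form, the neighbor averaging, the weight `alpha`, and all gene names and values below are assumptions made purely for illustration.

```python
def network_assisted_scores(fitness_defect, neighbors, alpha=0.5):
    """Combine each gene's own fitness defect with its neighbors' defects.

    Assumed (hypothetical) form: score = own + alpha * mean(neighbor defects).
    This is NOT the published GIT formula, only an illustration of the idea.
    """
    scores = {}
    for gene, own in fitness_defect.items():
        # Keep only neighbors that were actually measured in the screen.
        nbrs = [n for n in neighbors.get(gene, []) if n in fitness_defect]
        if nbrs:
            nbr_mean = sum(fitness_defect[n] for n in nbrs) / len(nbrs)
        else:
            nbr_mean = 0.0
        scores[gene] = own + alpha * nbr_mean
    return scores


# Toy data (made up): genes A and B have the same defect on their own.
fitness_defect = {"A": 2.0, "B": 2.0, "C": 0.5, "D": 0.0}
neighbors = {"A": ["C", "D"], "B": ["A", "C"], "C": ["A", "B"], "D": ["A"]}

scores = network_assisted_scores(fitness_defect, neighbors)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy case the two tied genes are separated by their neighborhoods: B sits next to genes with larger defects than A's neighbors, so B ranks first.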
Genomic deletions created upon LINE-1 retrotransposition.

PubMed

Gilbert, Nicolas; Lutz-Prigge, Sheila; Moran, John V

2002-08-09

LINE-1 (L1) retrotransposition continues to impact the human genome, yet little is known about how L1 integrates into DNA. Here, we developed a plasmid-based rescue system and have used it to recover 37 new L1 retrotransposition events from cultured human cells. Sequencing of the insertions revealed the usual L1 structural hallmarks; however, in four instances, retrotransposition generated large target site deletions. Remarkably, three of those resulted in the formation of chimeric L1s, containing the 5' end of an endogenous L1 fused precisely to our engineered L1. Thus, our data demonstrate multiple pathways for L1 integration in cultured cells, and show that L1 is not simply an insertional mutagen, but that its retrotransposition can result in significant deletions of genomic sequence.
TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data.

PubMed

Ha, Gavin; Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A; Gilks, C Blake; Huntsman, David G; McAlpine, Jessica N; Aparicio, Samuel; Shah, Sohrab P

2014-11-01

The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. © 2014 Ha et al.; Published by Cold Spring Harbor Laboratory Press.
Translational Genomics: Practical Applications of the Genomic Revolution in Breast Cancer.

PubMed

Yates, Lucy R; Desmedt, Christine

2017-06-01

The genomic revolution has fundamentally changed our perception of breast cancer. It is now apparent from DNA-based massively parallel sequencing data that at the genomic level, every breast cancer is unique and shaped by the mutational processes to which it was exposed during its lifetime. More than 90 breast cancer driver genes have been identified as recurrently mutated, and many occur at low frequency across the breast cancer population. Certain cancer genes are associated with traditionally defined histologic subtypes, but genomic intertumoral heterogeneity exists even between cancers that appear the same under the microscope. Most breast cancers contain subclonal populations, many of which harbor driver alterations, and subclonal structure is typically remodeled over time, across metastasis and as a consequence of treatment interventions. Genomics is deepening our understanding of breast cancer biology, contributing to an accelerated phase of targeted drug development and providing insights into resistance mechanisms. Genomics is also providing tools necessary to deliver personalized cancer medicine, but a number of challenges must still be addressed. Clin Cancer Res; 23(11); 2630-9. ©2017 AACR See all articles in this CCR Focus section, "Breast Cancer Research: From Base Pairs to Populations." ©2017 American Association for Cancer Research.
Discovery of Nigri/nox and Panto/pox site-specific recombinase systems facilitates advanced genome engineering

PubMed Central

Karimova, Madina; Splith, Victoria; Karpinski, Janet; Pisabarro, M. Teresa; Buchholz, Frank

2016-01-01

Precise genome engineering is instrumental for biomedical research and holds great promise for future therapeutic applications. Site-specific recombinases (SSRs) are valuable tools for genome engineering due to their exceptional ability to mediate precise excision, integration and inversion of genomic DNA in living systems. The ever-increasing complexity of genome manipulations and the desire to understand the DNA-binding specificity of these enzymes are driving efforts to identify novel SSR systems with unique properties. Here, we describe two novel tyrosine site-specific recombination systems designated Nigri/nox and Panto/pox. Nigri originates from Vibrio nigripulchritudo (plasmid VIBNI_pA) and recombines its target site nox with high efficiency and high target-site selectivity, without recombining target sites of the well established SSRs Cre, Dre, Vika and VCre. Panto, derived from Pantoea sp. aB, is less specific and in addition to its native target site, pox also recombines the target site for Dre recombinase, called rox. This relaxed specificity allowed the identification of residues that are involved in target site selectivity, thereby advancing our understanding of how SSRs recognize their respective DNA targets. PMID:27444945
[sgRNA design for the CRISPR/Cas9 system and evaluation of its off-target effects].

PubMed

Xie, Sheng-song; Zhang, Yi; Zhang, Li-sheng; Li, Guang-lei; Zhao, Chang-zhi; Ni, Pan; Zhao, Shu-hong

2015-11-01

The third generation of CRISPR/Cas9-mediated genome editing technology has been successfully applied to genome modification of various species including animals, plants and microorganisms. How to improve the efficiency of CRISPR/Cas9 genome editing and reduce its off-target effects has been extensively explored in this field. Using sgRNA (small guide RNA) with high efficiency and specificity is one of the critical factors for successful genome editing. Several software tools have been developed for sgRNA design and/or off-target evaluation, each with its own advantages and disadvantages. In this review, we summarize the characteristics of 16 online and standalone software tools for sgRNA design and/or off-target evaluation and compare them using 38 evaluation indexes that we developed. We also summarize 11 experimental approaches for testing genome editing efficiency and off-target effects, as well as how to screen for highly efficient and specific sgRNAs.
Bioinformatics by Example: From Sequence to Target

NASA Astrophysics Data System (ADS)

Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

2002-12-01

With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is aimed to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example the reader will experience this fascinating new discipline.
CRISPR: From Prokaryotic Immune Systems to Plant Genome Editing Tools.

PubMed

Bandyopadhyay, Anindya; Mazumdar, Shamik; Yin, Xiaojia; Quick, William Paul

2017-01-01

The clustered regularly interspaced short palindromic repeats (CRISPR) system is a prokaryotic adaptive immune system that has the ability to identify specific locations on the bacteriophage (phage) genome to create breaks in it, and internalize the phage genome fragments in its own genome as CRISPR arrays for memory-dependent resistance. Although CRISPR has been used in the dairy industry for a long time, it recently gained importance in the field of genome editing because of its ability to precisely target locations in a genome. This system has further been modified to locate and target any region of a genome of choice due to modifications in the components of the system. By changing the nucleotide sequence of the 20-nucleotide target sequence in the guide RNA, targeting any location is possible. It has found an application in the modification of plant genomes with its ability to generate mutations and insertions, thus helping to create new varieties of plants. With the ability to introduce specific sequences into the plant genome after cleavage by the CRISPR system and subsequent DNA repair through homology-directed repair (HDR), CRISPR ensures that genome editing can be successfully applied in plants, thus generating stronger and more improved traits. Also, the use of the CRISPR editing system can generate plants that are transgene-free and have mutations that are stably inherited, thus helping to circumvent current GMO regulations.
The genome editing revolution: A CRISPR-Cas TALE off-target story.

PubMed

Stella, Stefano; Montoya, Guillermo

2016-07-01

In the last 10 years, we have witnessed a blooming of targeted genome editing systems and applications. The area was revolutionized by the discovery and characterization of the transcription activator-like effector proteins, which are easier to engineer to target new DNA sequences than the previously available DNA binding templates, zinc fingers and meganucleases. Recently, the area experienced a quantum leap because of the introduction of the clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein (Cas) system. This ribonucleoprotein complex protects bacteria from invading DNAs, and it was adapted to be used in genome editing. The CRISPR ribonucleic acid (RNA) molecule guides the Cas9 nuclease to the specific DNA site to cleave the DNA target. Two years and more than 1000 publications later, the CRISPR-Cas system has become the main tool for genome editing in many laboratories. Currently the targeted genome editing technology has been used in many fields and may be a possible approach for human gene therapy. Furthermore, it can also be used to modify the genomes of model organisms for studying human pathways or to improve key organisms for biotechnological applications, such as plants and livestock as well as yeasts and bacterial strains. © 2016 The Authors. BioEssays published by WILEY Periodicals, Inc.
[Overview of patents on targeted genome editing technologies and their implications for innovation and entrepreneurship education in universities].

PubMed

Fan, Xiang-yu; Lin, Yan-ping; Liao, Guo-jian; Xie, Jian-ping

2015-12-01

Zinc finger nuclease, transcription activator-like effector nuclease, and clustered regularly interspaced short palindromic repeats/Cas9 nuclease are important targeted genome editing technologies. They have great significance in scientific research and applications on aspects of functional genomics research, species improvement, disease prevention and gene therapy. There are past or ongoing disputes over ownership of the intellectual property behind every technology. In this review, we summarize the patents on these three targeted genome editing technologies in order to provide some reference for developing genome editing technologies with self-owned intellectual property rights and some implications for current innovation and entrepreneurship education in universities.
Engineered Cpf1 variants with altered PAM specificities.

PubMed

Gao, Linyi; Cox, David B T; Yan, Winston X; Manteiga, John C; Schneider, Martin W; Yamano, Takashi; Nishimasu, Hiroshi; Nureki, Osamu; Crosetto, Nicola; Zhang, Feng

2017-08-01

The RNA-guided endonuclease Cpf1 is a promising tool for genome editing in eukaryotic cells. However, the utility of the commonly used Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) and Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1) is limited by their requirement of a TTTV protospacer adjacent motif (PAM) in the DNA substrate. To address this limitation, we performed a structure-guided mutagenesis screen to increase the targeting range of Cpf1. We engineered two AsCpf1 variants carrying the mutations S542R/K607R and S542R/K548V/N552R, which recognize TYCV and TATV PAMs, respectively, with enhanced activities in vitro and in human cells. Genome-wide assessment of off-target activity using BLISS indicated that these variants retain high DNA-targeting specificity, which we further improved by introducing an additional non-PAM-interacting mutation. Introducing the identified PAM-interacting mutations at their corresponding positions in LbCpf1 similarly altered its PAM specificity. Together, these variants increase the targeting range of Cpf1 by approximately threefold in human coding sequences to one cleavage site per ∼11 bp.
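How additional PAMs widen the set of targetable sites can be sketched by scanning a sequence for the degenerate motifs named in the abstract (TTTV, TYCV, TATV). The snippet below is an illustrative assumption, not the authors' pipeline: it expands standard IUPAC codes into a regular expression, scans only the given strand, and uses a made-up 20-nt sequence.

```python
import re

# Standard IUPAC nucleotide codes needed for the PAMs in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "[ACG]",   # any base except T
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def pam_to_regex(pam):
    """Compile a degenerate PAM into a regex; the lookahead lets
    overlapping occurrences be reported."""
    body = "".join(IUPAC[base] for base in pam)
    return re.compile(f"(?=({body}))")


def find_pam_sites(seq, pams):
    """Return sorted (position, matched bases, PAM pattern) tuples
    for the given strand only."""
    hits = []
    for pam in pams:
        for match in pam_to_regex(pam).finditer(seq):
            hits.append((match.start(), match.group(1), pam))
    return sorted(hits)


seq = "ACTTTCGGTATGCATTCCAA"  # hypothetical sequence for illustration

wild_type = find_pam_sites(seq, ["TTTV"])
expanded = find_pam_sites(seq, ["TTTV", "TYCV", "TATV"])
print(len(wild_type), len(expanded))
```

On this toy sequence the TTTV-only scan finds a single site, while adding the TYCV and TATV patterns recovers several more; a fuller version would also scan the reverse complement.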
Comparison of randomly cloned and whole genomic DNA probes for the detection of Porphyromonas gingivalis and Bacteroides forsythus

PubMed Central

Wong, M.; DiRienzo, J.M.; Lai, C.-H.; Listgarten, M. A.

2012-01-01

Whole genomic and randomly-cloned DNA probes for two fastidious periodontal pathogens, Porphyromonas gingivalis and Bacteroides forsythus, were labeled with digoxigenin and detected by a colorimetric method. The specificity and sensitivity of the whole genomic and cloned probes were compared. The cloned probes were highly specific compared to the whole genomic probes. A significant degree of cross-reactivity with Bacteroides species, Capnocytophaga sp. and Prevotella sp. was observed with the whole genomic probes. The cloned probes were less sensitive than the whole genomic probes and required at least 10^6 target cells or a minimum of 10 ng of target DNA to be detected during hybridization. Although a ten-fold increase in sensitivity was obtained with the whole genomic probes, cross-hybridization to closely related species limits their reliability in identifying target bacteria in subgingival plaque samples. PMID:8636873
Functional precision medicine identifies novel druggable targets and therapeutic options in head and neck cancer. | Office of Cancer Genomics

Cancer.gov

Purpose: Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer worldwide with high mortality and a lack of targeted therapies. To identify and prioritize druggable targets, we performed genome analysis together with genome-scale siRNA and oncology drug profiling using low passage tumor cells derived from a patient with a treatment-resistant HPV-negative HNSCC.
Enhanced guide-RNA design and targeting analysis for precise CRISPR genome editing of single and consortia of industrially relevant and non-model organisms.

PubMed

Mendoza, Brian J; Trinh, Cong T

2018-01-01

Genetic diversity of non-model organisms offers a repertoire of unique phenotypic features for exploration and cultivation for synthetic biology and metabolic engineering applications. To realize this enormous potential, it is critical to have an efficient genome editing tool for rapid strain engineering of these organisms to perform novel programmed functions. To accommodate the use of CRISPR/Cas systems for genome editing across organisms, we have developed a novel method, named CRISPR Associated Software for Pathway Engineering and Research (CASPER), for identifying on- and off-targets with enhanced predictability coupled with an analysis of non-unique (repeated) targets to assist in editing any organism with various endonucleases. Utilizing CASPER, we demonstrated a modest 2.4% and significant 30.2% improvement (F-test, P < 0.05) over the conventional methods for predicting on- and off-target activities, respectively. Further we used CASPER to develop novel applications in genome editing: multitargeting analysis (i.e. simultaneous multiple-site modification on a target genome with a sole guide-RNA requirement) and multispecies population analysis (i.e. guide-RNA design for genome editing across a consortium of organisms). Our analysis on a selection of industrially relevant organisms revealed a number of non-unique target sites associated with genes and transposable elements that can be used as potential sites for multitargeting. The analysis also identified shared and unshared targets that enable genome editing of single or multiple genomes in a consortium of interest. We envision CASPER as a useful platform to enhance the precise CRISPR genome editing for metabolic engineering and synthetic biology applications. Availability: https://github.com/TrinhLab/CASPER. Contact: ctrinh@utk.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.

Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea

PubMed Central

2013-01-01

Background: The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes. Results: Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide. Conclusions: Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage. PMID:23586706
CRISPR/Cas9-mediated gene knockout screens and target identification via whole-genome sequencing uncover host genes required for picornavirus infection.

PubMed

Kim, Heon Seok; Lee, Kyungjin; Bae, Sangsu; Park, Jeongbin; Lee, Chong-Kyo; Kim, Meehyein; Kim, Eunji; Kim, Minju; Kim, Seokjoong; Kim, Chonsaeng; Kim, Jin-Soo

2017-06-23

Several groups have used genome-wide libraries of lentiviruses encoding small guide RNAs (sgRNAs) for genetic screens. In most cases, sgRNA expression cassettes are integrated into cells by using lentiviruses, and target genes are statistically estimated by the readout of sgRNA sequences after targeted sequencing. We present a new virus-free method for human gene knockout screens using a genome-wide library of CRISPR/Cas9 sgRNAs based on plasmids and target gene identification via whole-genome sequencing (WGS) confirmation of authentic mutations rather than statistical estimation through targeted amplicon sequencing. We used 30,840 pairs of individually synthesized oligonucleotides to construct the genome-scale sgRNA library, collectively targeting 10,280 human genes ( i.e. three sgRNAs per gene). These plasmid libraries were co-transfected with a Cas9-expression plasmid into human cells, which were then treated with cytotoxic drugs or viruses. Only cells lacking key factors essential for cytotoxic drug metabolism or viral infection were able to survive. Genomic DNA isolated from cells that survived these challenges was subjected to WGS to directly identify CRISPR/Cas9-mediated causal mutations essential for cell survival. With this approach, we were able to identify known and novel genes essential for viral infection in human cells. We propose that genome-wide sgRNA screens based on plasmids coupled with WGS are powerful tools for forward genetics studies and drug target discovery. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Mitotic control of human papillomavirus genome-containing cells is regulated by the function of the PDZ-binding motif of the E6 oncoprotein.

PubMed

Marsh, Elizabeth K; Delury, Craig P; Davies, Nicholas J; Weston, Christopher J; Miah, Mohammed A L; Banks, Lawrence; Parish, Joanna L; Higgs, Martin R; Roberts, Sally

2017-03-21

The function of a conserved PSD95/DLG1/ZO1 (PDZ) binding motif (E6 PBM) at the C-termini of E6 oncoproteins of high-risk human papillomavirus (HPV) types contributes to the development of HPV-associated malignancies. Here, using a primary human keratinocyte-based model of the high-risk HPV18 life cycle, we identify a novel link between the E6 PBM and mitotic stability. In cultures containing a mutant genome in which the E6 PBM was deleted there was an increase in the frequency of abnormal mitoses, including multinucleation, compared to cells harboring the wild type HPV18 genome. The loss of the E6 PBM was associated with a significant increase in the frequency of mitotic spindle defects associated with anaphase and telophase. Furthermore, cells carrying this mutant genome had increased chromosome segregation defects and they also exhibited greater levels of genomic instability, as shown by an elevated level of centromere-positive micronuclei. In wild type HPV18 genome-containing organotypic cultures, the majority of mitotic cells reside in the suprabasal layers, in keeping with the hyperplastic morphology of the structures. However, in mutant genome-containing structures a greater proportion of mitotic cells were retained in the basal layer, which were often of undefined polarity, thus correlating with their reduced thickness. We conclude that the ability of E6 to target cellular PDZ proteins plays a critical role in maintaining mitotic stability of HPV infected cells, ensuring stable episome persistence and vegetative amplification.
The evolutionary fate of the chloroplast and nuclear rps16 genes as revealed through the sequencing and comparative analyses of four novel legume chloroplast genomes from Lupinus.

PubMed

Keller, J; Rousseau-Gueutin, M; Martin, G E; Morice, J; Boutte, J; Coissac, E; Ourari, M; Aïnouche, M; Salmon, A; Cabello-Hurtado, F; Aïnouche, A

2017-08-01

The Fabaceae family is considered as a model system for understanding chloroplast genome evolution due to the presence of extensive structural rearrangements, gene losses and localized hypermutable regions. Here, we provide sequences of four chloroplast genomes from the Lupinus genus, belonging to the underinvestigated Genistoid clade. Notably, we found in Lupinus species the functional loss of the essential rps16 gene, which was most likely replaced by the nuclear rps16 gene that encodes chloroplast and mitochondrion targeted RPS16 proteins. To study the evolutionary fate of the rps16 gene, we explored all available plant chloroplast, mitochondrial and nuclear genomes. Whereas no plant mitochondrial genomes carry an rps16 gene, many plants still have a functional nuclear and chloroplast rps16 gene. Ka/Ks ratios revealed that both chloroplast and nuclear rps16 copies were under purifying selection. However, due to the dual targeting of the nuclear rps16 gene product and the absence of a mitochondrial copy, the chloroplast gene may be lost. We also performed comparative analyses of lupine plastomes (SNPs, indels and repeat elements), identified the most variable regions and examined their phylogenetic utility. The markers identified here will help to reveal the evolutionary history of lupines, Genistoids and closely related clades. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
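The abstract above reads Ka/Ks ratios as evidence of purifying selection on both rps16 copies. As a minimal sketch of that interpretation step only (not the authors' pipeline), the conventional reading is that a ratio well below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. The function name, the tolerance band around 1, and the example values are all assumptions made for illustration.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Interpret a Ka/Ks ratio using the conventional thresholds.

    ka  -- nonsynonymous substitutions per nonsynonymous site
    ks  -- synonymous substitutions per synonymous site
    tol -- hypothetical tolerance band around 1 treated as "neutral"
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"


# Illustrative values only; these are not taken from the study.
print(classify_selection(0.02, 0.20))  # ratio 0.1
```

In practice Ka and Ks are estimated from codon alignments by dedicated software, and significance is assessed statistically rather than with a fixed tolerance.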
HelmCoP: An Online Resource for Helminth Functional Genomics and Drug and Vaccine Targets Prioritization

PubMed Central

Taylor, Christina M.; Mitreva, Makedonka

2011-01-01

A vast majority of the burden from neglected tropical diseases results from helminth infections (nematodes and platyhelminthes). Parasitic helminthes infect over 2 billion people, exerting a high collective burden that rivals high-mortality conditions such as AIDS or malaria, and cause devastation to crops and livestock. The challenges to improve control of parasitic helminth infections are multi-fold and no single category of approaches will meet them all. New information such as helminth genomics, functional genomics and proteomics coupled with innovative bioinformatic approaches provide fundamental molecular information about these parasites, accelerating both basic research as well as development of effective diagnostics, vaccines and new drugs. To facilitate such studies we have developed an online resource, HelmCoP (Helminth Control and Prevention), built by integrating functional, structural and comparative genomic data from plant, animal and human helminthes, to enable researchers to develop strategies for drug, vaccine and pesticide prioritization, while also providing a useful comparative genomics platform. HelmCoP encompasses genomic data from several hosts, including model organisms, along with a comprehensive suite of structural and functional annotations, to assist in comparative analyses and to study host-parasite interactions. The HelmCoP interface, with a sophisticated query engine as a backbone, allows users to search for multi-factorial combinations of properties and serves readily accessible information that will assist in the identification of various genes of interest. HelmCoP is publicly available at: http://www.nematode.net/helmcop.html. PMID:21760913
PAM-Dependent Target DNA Recognition and Cleavage by C2c1 CRISPR-Cas Endonuclease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Hui; Gao, Pu; Rajashankar, Kanagalaghatta R.

C2c1 is a newly identified guide RNA-mediated type V-B CRISPR-Cas endonuclease that site-specifically targets and cleaves both strands of target DNA. We have determined crystal structures of Alicyclobacillus acidoterrestris C2c1 (AacC2c1) bound to sgRNA as a binary complex and to target DNAs as ternary complexes, thereby capturing catalytically competent conformations of AacC2c1 with both target and non-target DNA strands independently positioned within a single RuvC catalytic pocket. Moreover, C2c1-mediated cleavage results in a staggered seven-nucleotide break of target DNA. crRNA adopts a pre-ordered five-nucleotide A-form seed sequence in the binary complex, with release of an inserted tryptophan, facilitating zippering up of 20-bp guide RNA:target DNA heteroduplex on ternary complex formation. Notably, the PAM-interacting cleft adopts a “locked” conformation on ternary complex formation. Structural comparison of C2c1 ternary complexes with their Cas9 and Cpf1 counterparts highlights the diverse mechanisms adopted by these distinct CRISPR-Cas systems, thereby broadening and enhancing their applicability as genome editing tools.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Longenecker, Kenton L.; Stamper, Geoffrey F.; Hajduk, Philip J.

In a broad genomics analysis to find novel protein targets for antibiotic discovery, MurF was identified as an essential gene product for Streptococcus pneumoniae that catalyzes a critical reaction in the biosynthesis of the peptidoglycan in the formation of the cell wall. Lacking close relatives in mammalian biology, MurF presents attractive characteristics as a potential drug target. Initial screening of the Abbott small-molecule compound collection identified several compounds for further validation as pharmaceutical leads. Here we report the integrated efforts of NMR and X-ray crystallography, which reveal the multidomain structure of a MurF-inhibitor complex in a compact conformation that differs dramatically from related structures. The lead molecule is bound in the substrate-binding region and induces domain closure, suggestive of the domain arrangement for the as yet unobserved transition state conformation for MurF enzymes. The results form a basis for directed optimization of the compound lead by structure-based design to explore the suitability of MurF as a pharmaceutical target.
Viral Replication Complexes Are Targeted by LC3-Guided Interferon-Inducible GTPases.

PubMed

Biering, Scott B; Choi, Jayoung; Halstrom, Rachel A; Brown, Hailey M; Beatty, Wandy L; Lee, Sanghyun; McCune, Broc T; Dominici, Erin; Williams, Lelia E; Orchard, Robert C; Wilen, Craig B; Yamamoto, Masahiro; Coers, Jörn; Taylor, Gregory A; Hwang, Seungmin

2017-07-12

All viruses with positive-sense RNA genomes replicate on membranous structures in the cytoplasm called replication complexes (RCs). RCs provide an advantageous microenvironment for viral replication, but it is unknown how the host immune system counteracts these structures. Here we show that interferon-gamma (IFNG) disrupts the RC of murine norovirus (MNV) via evolutionarily conserved autophagy proteins and the induction of IFN-inducible GTPases, which are known to destroy the membrane of vacuoles containing bacteria, protists, or fungi. The MNV RC was marked by the microtubule-associated-protein-1-light-chain-3 (LC3) conjugation system of autophagy and then targeted by immunity-related GTPases (IRGs) and guanylate-binding proteins (GBPs) upon their induction by IFNG. Further, the LC3 conjugation system and the IFN-inducible GTPases were necessary to inhibit MNV replication in mice and human cells. These data suggest that viral RCs can be marked and antagonized by a universal immune defense mechanism targeting diverse pathogens replicating in cytosolic membrane structures. Copyright © 2017 Elsevier Inc. All rights reserved.
Directional genomic hybridization for chromosomal inversion discovery and detection.

PubMed

Ray, F Andrew; Zimmerman, Erin; Robinson, Bruce; Cornforth, Michael N; Bedford, Joel S; Goodwin, Edwin H; Bailey, Susan M

2013-04-01

Chromosomal rearrangements are a source of structural variation within the genome that figure prominently in human disease, where the importance of translocations and deletions is well recognized. In principle, inversions (reversals in the orientation of DNA sequences within a chromosome) should have similar detrimental potential. However, the study of inversions has been hampered by traditional approaches used for their detection, which are not particularly robust. Even with significant advances in whole genome approaches, changes in the absolute orientation of DNA remain difficult to detect routinely. Consequently, our understanding of inversions is still surprisingly limited, as is our appreciation for their frequency and involvement in human disease. Here, we introduce the directional genomic hybridization methodology of chromatid painting, a whole new way of looking at structural features of the genome, that can be employed with high resolution on a cell-by-cell basis, and demonstrate its basic capabilities for genome-wide discovery and targeted detection of inversions. Bioinformatics enabled development of sequence- and strand-specific directional probe sets, which when coupled with single-stranded hybridization, greatly improved the resolution and ease of inversion detection. We highlight examples of the far-ranging applicability of this cytogenomics-based approach, which include confirmation of the alignment of the human genome database and evidence that individuals themselves share similar sequence directionality, as well as use in comparative and evolutionary studies for any species whose genome has been sequenced. In addition to applications related to basic mechanistic studies, the information obtainable with strand-specific hybridization strategies may ultimately enable novel gene discovery, thereby benefitting the diagnosis and treatment of a variety of human disease states and disorders including cancer, autism, and idiopathic infertility.
A Circular Dichroism Reference Database for Membrane Proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wallace, B.; Wien, F.; Stone, T.

2006-01-01

Membrane proteins are a major product of most genomes and the target of a large number of current pharmaceuticals, yet little information exists on their structures because of the difficulty of crystallising them; hence for the most part they have been excluded from structural genomics programme targets. Furthermore, even methods such as circular dichroism (CD) spectroscopy which seek to define secondary structure have not been fully exploited because of technical limitations to their interpretation for membrane embedded proteins. Empirical analyses of circular dichroism (CD) spectra are valuable for providing information on secondary structures of proteins. However, the accuracy of the results depends on the appropriateness of the reference databases used in the analyses. Membrane proteins have different spectral characteristics than do soluble proteins as a result of the low dielectric constants of membrane bilayers relative to those of aqueous solutions (Chen & Wallace (1997) Biophys. Chem. 65:65-74). To date, no CD reference database exists exclusively for the analysis of membrane proteins, and hence empirical analyses based on current reference databases derived from soluble proteins are not adequate for accurate analyses of membrane protein secondary structures (Wallace et al (2003) Prot. Sci. 12:875-884). We have therefore created a new reference database of CD spectra of integral membrane proteins whose crystal structures have been determined. To date it contains more than 20 proteins, and spans the range of secondary structures from mostly helical to mostly sheet proteins. This reference database should enable more accurate secondary structure determinations of membrane embedded proteins and will become one of the reference database options in the CD calculation server DICHROWEB (Whitmore & Wallace (2004) NAR 32:W668-673).
Implementing Genome-Driven Oncology

PubMed Central

Hyman, David M.; Taylor, Barry S.; Baselga, José

2017-01-01

Early successes in identifying and targeting individual oncogenic drivers, together with the increasing feasibility of sequencing tumor genomes, have brought forth the promise of genome-driven oncology care. As we expand the breadth and depth of genomic analyses, the biological and clinical complexity of its implementation will be unparalleled. Challenges include target credentialing and validation, implementing drug combinations, clinical trial designs, targeting tumor heterogeneity, and deploying technologies beyond DNA sequencing, among others. We review how contemporary approaches are tackling these challenges and will ultimately serve as an engine for biological discovery and increase our insight into cancer and its treatment. PMID:28187282
Population structure of eleven Spanish ovine breeds and detection of selective sweeps with BayeScan and hapFLK

PubMed Central

Manunza, A.; Cardoso, T. F.; Noce, A.; Martínez, A.; Pons, A.; Bermejo, L. A.; Landi, V.; Sànchez, A.; Jordana, J.; Delgado, J. V.; Adán, S.; Capote, J.; Vidal, O.; Ugarte, E.; Arranz, J. J.; Calvo, J. H.; Casellas, J.; Amills, M.

2016-01-01

The goals of the current work were to analyse the population structure of 11 Spanish ovine breeds and to detect genomic regions that may have been targeted by selection. A total of 141 individuals were genotyped with the Infinium 50 K Ovine SNP BeadChip (Illumina). We combined this dataset with Spanish ovine data previously reported by the International Sheep Genomics Consortium (N = 229). Multidimensional scaling and Admixture analyses revealed that Canaria de Pelo and, to a lesser extent, Roja Mallorquina, Latxa and Churra are clearly differentiated populations, while the remaining seven breeds (Ojalada, Castellana, Gallega, Xisqueta, Ripollesa, Rasa Aragonesa and Segureña) share a similar genetic background. A genome scan with BayeScan and hapFLK allowed us to identify three genomic regions that are consistently detected with both methods, i.e. Oar3 (150–154 Mb), Oar6 (4–49 Mb) and Oar13 (68–74 Mb). Neighbor-joining trees based on polymorphisms mapping to these three selective sweeps did not show a clustering of breeds according to their predominant productive specialization (except the local tree based on Oar13 SNPs). Such cryptic signatures of selection have also been found in the bovine genome, posing a considerable challenge to understanding the biological consequences of artificial selection. PMID:27272025
Population structure of eleven Spanish ovine breeds and detection of selective sweeps with BayeScan and hapFLK.

PubMed

Manunza, A; Cardoso, T F; Noce, A; Martínez, A; Pons, A; Bermejo, L A; Landi, V; Sànchez, A; Jordana, J; Delgado, J V; Adán, S; Capote, J; Vidal, O; Ugarte, E; Arranz, J J; Calvo, J H; Casellas, J; Amills, M

2016-06-07

The goals of the current work were to analyse the population structure of 11 Spanish ovine breeds and to detect genomic regions that may have been targeted by selection. A total of 141 individuals were genotyped with the Infinium 50 K Ovine SNP BeadChip (Illumina). We combined this dataset with Spanish ovine data previously reported by the International Sheep Genomics Consortium (N = 229). Multidimensional scaling and Admixture analyses revealed that Canaria de Pelo and, to a lesser extent, Roja Mallorquina, Latxa and Churra are clearly differentiated populations, while the remaining seven breeds (Ojalada, Castellana, Gallega, Xisqueta, Ripollesa, Rasa Aragonesa and Segureña) share a similar genetic background. A genome scan with BayeScan and hapFLK allowed us to identify three genomic regions that are consistently detected with both methods, i.e. Oar3 (150-154 Mb), Oar6 (4-49 Mb) and Oar13 (68-74 Mb). Neighbor-joining trees based on polymorphisms mapping to these three selective sweeps did not show a clustering of breeds according to their predominant productive specialization (except the local tree based on Oar13 SNPs). Such cryptic signatures of selection have also been found in the bovine genome, posing a considerable challenge to understanding the biological consequences of artificial selection.
The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation.

PubMed

McNeil, Leslie Klis; Reich, Claudia; Aziz, Ramy K; Bartels, Daniela; Cohoon, Matthew; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Hwang, Kaitlyn; Kubal, Michael; Margaryan, Gohar Rem; Meyer, Folker; Mihalo, William; Olsen, Gary J; Olson, Robert; Osterman, Andrei; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D; Rodionov, Dmitry A; Shi, Xinghua; Vassieva, Olga; Vonstein, Veronika; Zagnitko, Olga; Xia, Fangfang; Zinner, Jenifer; Overbeek, Ross; Stevens, Rick

2007-01-01

The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of approximately 50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes in common or that differentiates two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development.
A massively parallel strategy for STR marker development, capture, and genotyping.

PubMed

Kistler, Logan; Johnson, Stephen M; Irwin, Mitchell T; Louis, Edward E; Ratan, Aakrosh; Perry, George H

2017-09-06

Short tandem repeat (STR) variants are highly polymorphic markers that facilitate powerful population genetic analyses. STRs are especially valuable in conservation and ecological genetic research, yielding detailed information on population structure and short-term demographic fluctuations. Massively parallel sequencing has not previously been leveraged for scalable, efficient STR recovery. Here, we present a pipeline for developing STR markers directly from high-throughput shotgun sequencing data without a reference genome, and an approach for highly parallel target STR recovery. We employed our approach to capture a panel of 5000 STRs from a test group of diademed sifakas (Propithecus diadema, n = 3), endangered Malagasy rainforest lemurs, and we report extremely efficient recovery of targeted loci: 97.3-99.6% of STRs characterized with ≥10x non-redundant sequence coverage. We then tested our STR capture strategy on P. diadema fecal DNA, and report robust initial results and suggestions for future implementations. In addition to STR targets, this approach also generates large, genome-wide single nucleotide polymorphism (SNP) panels from flanking regions. Our method provides a cost-effective and scalable solution for rapid recovery of large STR and SNP datasets in any species without needing a reference genome, and can be used even with suboptimal DNA more easily acquired in conservation and ecological studies. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
Unbiased Combinatorial Genomic Approaches to Identify Alternative Therapeutic Targets within the TSC Signaling Network

DTIC Science & Technology

2014-06-01

Specifically, we combined the CRISPR genome editing system with a novel approach allowing efficient single cell cloning of Drosophila cells with the aim of ... and culture these to produce cultures completely lacking wildtype sequence at the target locus. No robust methods existed to clone single Drosophila ... targeting all kinases and phosphatases (563 genes) in the Drosophila genome. 65 samples that displayed synthetic lethality (15 genes) or synthetic
Global analysis of bacterial transcription factors to predict cellular target processes.

PubMed

Doerks, Tobias; Andrade, Miguel A; Lathe, Warren; von Mering, Christian; Bork, Peer

2004-03-01

Whole-genome sequences are now available for >100 bacterial species, giving unprecedented power to comparative genomics approaches. We have applied genome-context methods to predict target processes that are regulated by transcription factors (TFs). Of 128 orthologous groups of proteins annotated as TFs, to date, 36 are functionally uncharacterized; in our analysis we predict a probable cellular target process or biochemical pathway for half of these functionally uncharacterized TFs.
Genomic selection for quantitative adult plant stem rust resistance in wheat

USDA-ARS's Scientific Manuscript database

Quantitative adult plant resistance (APR) to stem rust (Puccinia graminis f. sp. tritici) is an important breeding target in wheat (Triticum aestivum L.) and a potential target for genomic selection (GS). To evaluate the relative importance of known APR loci in applying genomic selection, we charact...
Genome editing: progress and challenges for medical applications.

PubMed

Carroll, Dana

2016-11-15

The development of the CRISPR-Cas platform for genome editing has greatly simplified the process of making targeted genetic modifications. Applications of genome editing are expected to have a substantial impact on human therapies through the development of better animal models, new target discovery, and direct therapeutic intervention.
CTD² Dashboard: a searchable web interface to connect validated results from the Cancer Target Discovery and Development Network* | Office of Cancer Genomics

Cancer.gov

The Cancer Target Discovery and Development (CTD2) Network aims to use functional genomics to accelerate the translation of high-throughput and high-content genomic and small-molecule data towards use in precision oncology.

Highly efficient gene targeting in Aspergillus oryzae industrial strains under ligD mutation introduced by genome editing: Strain-specific differences in the effects of deleting EcdR, the negative regulator of sclerotia formation.

PubMed

Nakamura, Hidetoshi; Katayama, Takuya; Okabe, Tomoya; Iwashita, Kazuhiro; Fujii, Wataru; Kitamoto, Katsuhiko; Maruyama, Jun-Ichi

2017-07-11

Numerous strains of Aspergillus oryzae are industrially used for Japanese traditional fermentation and for the production of enzymes and heterologous proteins. In A. oryzae, deletion of the ku70 or ligD genes involved in non-homologous end joining (NHEJ) has allowed high gene targeting efficiency. However, this strategy has been mainly applied under the genetic background of the A. oryzae wild strain RIB40, and it would be laborious to delete the NHEJ genes in many A. oryzae industrial strains, probably due to their low gene targeting efficiency. In the present study, we generated ligD mutants from the A. oryzae industrial strains by employing the CRISPR/Cas9 system, which we previously developed as a genome editing method. Uridine/uracil auxotrophic strains were generated by deletion of the pyrG gene, which was subsequently used as a selective marker. We examined the gene targeting efficiency with the ecdR gene, whose deletion was reported to induce sclerotia formation under the genetic background of the strain RIB40. As expected, the deletion efficiencies were high, around 60-80%, in the ligD mutants of industrial strains. Intriguingly, the effects of the ecdR deletion on sclerotia formation varied depending on the strains, and we found sclerotia-like structures under the background of the industrial strains, which have never been reported to form sclerotia. The present study demonstrates that introducing ligD mutation by genome editing is an effective method allowing high gene targeting efficiency in A. oryzae industrial strains.
Genomic characterisation of the effector complement of the potato cyst nematode Globodera pallida.

PubMed

Thorpe, Peter; Mantelin, Sophie; Cock, Peter Ja; Blok, Vivian C; Coke, Mirela C; Eves-van den Akker, Sebastian; Guzeeva, Elena; Lilley, Catherine J; Smant, Geert; Reid, Adam J; Wright, Kathryn M; Urwin, Peter E; Jones, John T

2014-10-23

The potato cyst nematode Globodera pallida has biotrophic interactions with its host. The nematode induces a feeding structure - the syncytium - which it keeps alive for the duration of the life cycle and on which it depends for all nutrients required to develop to the adult stage. Interactions of G. pallida with the host are mediated by effectors, which are produced in two sets of gland cells. These effectors suppress host defences, facilitate migration and induce the formation of the syncytium. The recent completion of the G. pallida genome sequence has allowed us to identify the effector complement from this species. We identify 128 orthologues of effectors from other nematodes as well as 117 novel effector candidates. We have used in situ hybridisation to confirm gland cell expression of a subset of these effectors, demonstrating the validity of our effector identification approach. We have examined the expression profiles of all effector candidates using RNAseq; this analysis shows that the majority of effectors fall into one of three clusters of sequences showing conserved expression characteristics (invasive stage nematode only, parasitic stage only or invasive stage and adult male only). We demonstrate that further diversity in the effector pool is generated by alternative splicing. In addition, we show that effectors target a diverse range of structures in plant cells, including the peroxisome. This is the first identification of effectors from any plant pathogen that target this structure. This is the first genome-scale search for effectors, combined with a life-cycle expression analysis, for any plant-parasitic nematode. We show that, like other phylogenetically unrelated plant pathogens, plant parasitic nematodes deploy hundreds of effectors in order to parasitise plants, with different effectors required for different phases of the infection process.
First genome sequences of Achromobacter phages reveal new members of the N4 family.

PubMed

Wittmann, Johannes; Dreiseikelmann, Brigitte; Rohde, Manfred; Meier-Kolthoff, Jan P; Bunk, Boyke; Rohde, Christine

2014-01-27

Multi-resistant Achromobacter xylosoxidans has been recognized as an emerging pathogen causing nosocomially acquired infections during the last years. Phages as natural opponents could be an alternative to fight such infections. Bacteriophages against this opportunistic pathogen were isolated in a recent study. This study shows a molecular analysis of two podoviruses and reveals first insights into the genomic structure of Achromobacter phages so far. Growth curve experiments and adsorption kinetics were performed for both phages. Adsorption and propagation in cells were visualized by electron microscopy. Both phage genomes were sequenced with the PacBio RS II system based on single molecule, real-time (SMRT) technology and annotated with several bioinformatic tools. To further elucidate the evolutionary relationships between the phage genomes, a phylogenomic analysis was conducted using the genome Blast Distance Phylogeny approach (GBDP). In this study, we present the first detailed analysis of genome sequences of two Achromobacter phages so far. Phages JWAlpha and JWDelta were isolated from two different waste water treatment plants in Germany. Both phages belong to the Podoviridae and contain linear, double-stranded DNA with a length of 72329 bp and 73659 bp, respectively. 92 and 89 putative open reading frames were identified for JWAlpha and JWDelta, respectively, by bioinformatic analysis with several tools. The genomes have nearly the same organization and could be divided into different clusters for transcription, replication, host interaction, head and tail structure and lysis. Detailed annotation via protein comparisons with BLASTP revealed strong similarities to N4-like phages. Analysis of the genomes of Achromobacter phages JWAlpha and JWDelta and comparisons of different gene clusters with other phages revealed that they might be strongly related to other N4-like phages, especially of the Escherichia group. 
Although all these phages show a highly conserved genomic structure and partially strong similarities at the amino acid level, some differences could be identified. Those differences, e.g. the existence of specific genes for replication or host interaction in some N4-like phages, seem to be interesting targets for further examination of function and specific mechanisms, which might enlighten the mechanism of phage establishment in the host cell after infection.
RNA-dependent RNA polymerases from flaviviruses and Picornaviridae.

PubMed

Lescar, Julien; Canard, Bruno

2009-12-01

Flaviviruses and picornaviruses are positive-strand RNA viruses that encode the RNA-dependent RNA polymerase (RdRp) required for replicating the viral genome in infected cells. Because of their specific and essential role in the virus life cycle, RdRps are prime targets for antiviral drugs. Recent structural data have shed light on the different strategies used by RdRps from flaviviruses and Picornaviridae to initiate RNA polymerization. New details about the catalytic mechanism, the role of metal ions, how these RdRps interact with other nonstructural (NS) viral and host-cell proteins as well as with the viral RNA genome have also been published. These advances contribute to give a more complete picture of the 3D structure and mechanism of a membrane-bound viral replication complex for these two classes of medically important human pathogens.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bae, Euiyoung; Bingman, Craig A.; Aceti, David J.

LOC79017 (MW 21.0 kDa, residues 1-188) was annotated as a hypothetical protein encoded by Homo sapiens chromosome 7 open reading frame 24. It was selected as a target by the Center for Eukaryotic Structural Genomics (CESG) because it did not share more than 30% sequence identity with any protein for which the three-dimensional structure is known. The biological function of the protein has not been established yet. Parts of LOC79017 were identified as members of uncharacterized Pfam families (residues 1-95 as PB006073 and residues 104-180 as PB031696). BLAST searches revealed homologues of LOC79017 in many eukaryotes, but none of them have been functionally characterized. Here, we report the crystal structure of H. sapiens protein LOC79017 (UniGene code Hs.530024, UniProt code O75223, CESG target number go.35223).
Non-Homologous End Joining and Homology Directed DNA Repair Frequency of Double-Stranded Breaks Introduced by Genome Editing Reagents.

PubMed

Zaboikin, Michail; Zaboikina, Tatiana; Freter, Carl; Srinivasakumar, Narasimhachar

2017-01-01

Genome editing using transcription-activator like effector nucleases or RNA guided nucleases allows one to precisely engineer desired changes within a given target sequence. The genome editing reagents introduce double stranded breaks (DSBs) at the target site which can then undergo DNA repair by non-homologous end joining (NHEJ) or homology directed recombination (HDR) when a template DNA molecule is available. NHEJ repair results in indel mutations at the target site. As PCR amplified products from mutant target regions are likely to exhibit different melting profiles than PCR products amplified from wild type target region, we designed a high resolution melting analysis (HRMA) for rapid identification of efficient genome editing reagents. We also designed TaqMan assays using probes situated across the cut site to discriminate wild type from mutant sequences present after genome editing. The experiments revealed that the sensitivity of the assays to detect NHEJ-mediated DNA repair could be enhanced by selection of transfected cells to reduce the contribution of unmodified genomic DNA from untransfected cells to the DNA melting profile. The presence of donor template DNA lacking the target sequence at the time of genome editing further enhanced the sensitivity of the assays for detection of mutant DNA molecules by excluding the wild-type sequences modified by HDR. A second TaqMan probe that bound to an adjacent site, outside of the primary target cut site, was used to directly determine the contribution of HDR to DNA repair in the presence of the donor template sequence. The TaqMan qPCR assay, designed to measure the contribution of NHEJ and HDR in DNA repair, corroborated the results from HRMA. The data indicated that genome editing reagents can produce DSBs at high efficiency in HEK293T cells but a significant proportion of these are likely masked by reversion to wild type as a result of HDR. 
Supplying a donor plasmid to provide a template for HDR (that eliminates a PCR amplifiable target) revealed these cryptic DSBs and facilitated the determination of the true efficacy of genome editing reagents. The results indicated that in HEK293T cells, approximately 40% of the DSBs introduced by genome editing were available for participation in HDR.
Thermodynamically optimal whole-genome tiling microarray design and validation.

PubMed

Cho, Hyejin; Chou, Hui-Hsien

2016-06-13

Microarrays are an efficient tool for interrogating the whole transcriptome of a species. Microarrays can be designed according to annotated gene sets, but the resulting microarrays cannot be used to identify novel transcripts, and this design method is not applicable to unannotated species. Alternatively, a whole-genome tiling microarray can be designed using only genomic sequences without gene annotations, and it can be used to detect novel RNA transcripts as well as known genes. The difficulty with tiling microarray design lies in the tradeoff between probe specificity and coverage of the genome. Sequence comparison methods based on BLAST or similar software are commonly employed in microarray design, but they cannot precisely determine the subtle thermodynamic competition between probe targets and partially matched probe nontargets during hybridizations. Using the whole-genome thermodynamic analysis software PICKY to design tiling microarrays, we can achieve the maximum whole-genome coverage allowable under the thermodynamic constraints of each target genome. The resulting tiling microarrays are thermodynamically optimal in the sense that all selected probes share the same melting temperature separation range between their targets and closest nontargets, and no additional probes can be added without violating the specificity of the microarray to the target genome. This new design method was used to create two whole-genome tiling microarrays for Escherichia coli MG1655 and Agrobacterium tumefaciens C58, and the experimental results validated the design.
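The specificity criterion described in this abstract (a minimum melting temperature separation between a probe's target and its closest nontarget) can be illustrated with a small filter. This is only a sketch under stated assumptions, not PICKY's algorithm: the melting temperatures are assumed to be precomputed by some thermodynamic model, and the names `select_probes`, `target_tm`, `nontarget_tms` and `min_separation` are hypothetical.

```python
def select_probes(candidates, min_separation):
    """Keep candidate probes whose target Tm exceeds the Tm of the
    closest (highest-Tm) nontarget by at least min_separation degrees.

    candidates: list of dicts with hypothetical keys
        'name'          - probe identifier
        'target_tm'     - melting temperature against the intended target
        'nontarget_tms' - melting temperatures against partial matches
    """
    selected = []
    for probe in candidates:
        # A probe with no nontarget hits is treated as maximally specific.
        closest_nontarget = max(probe["nontarget_tms"], default=float("-inf"))
        if probe["target_tm"] - closest_nontarget >= min_separation:
            selected.append(probe)
    return selected


candidates = [
    {"name": "p1", "target_tm": 75.0, "nontarget_tms": [50.0, 58.0]},
    {"name": "p2", "target_tm": 74.0, "nontarget_tms": [70.0]},
    {"name": "p3", "target_tm": 73.0, "nontarget_tms": []},
]
kept = [p["name"] for p in select_probes(candidates, min_separation=15.0)]
```

With a 15-degree separation, `p2` is rejected because its closest nontarget melts only 4 degrees below the target.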
Mechanism of foreign DNA selection in a bacterial adaptive immune system

PubMed Central

Sashital, Dipali G.; Wiedenheft, Blake; Doudna, Jennifer A.

2012-01-01

Summary In bacterial and archaeal CRISPR immune pathways, DNA sequences from invading bacteriophage or plasmids are integrated into CRISPR loci within the host genome, conferring immunity against subsequent infections. The ribonucleoprotein complex Cascade utilizes RNAs generated from these loci to target complementary “non-self” DNA sequences for destruction, while avoiding binding to “self” sequences within the CRISPR locus. Here we show that CasA, the largest protein subunit of Cascade, is required for non-self target recognition and binding. Combining a 2.3 Å crystal structure of CasA with cryo-EM structures of Cascade, we have identified a loop that is required for viral defense. This loop contacts a conserved 3-base pair motif that is required for non-self target selection. Our data suggest a model in which the CasA loop scans DNA for this short motif prior to target destabilization and binding, maximizing the efficiency of DNA surveillance by Cascade. PMID:22521690
Genomic diversity and population structure of three autochthonous Greek sheep breeds assessed with genome-wide DNA arrays.

PubMed

Michailidou, S; Tsangaris, G; Fthenakis, G C; Tzora, A; Skoufos, I; Karkabounas, S C; Banos, G; Argiriou, A; Arsenos, G

2018-06-01

In the present study, genome-wide genotyping was applied to characterize the genetic diversity and population structure of three autochthonous Greek breeds: Boutsko, Karagouniko and Chios. Dairy sheep are among the most significant livestock species in Greece, numbering approximately 9 million animals, which are characterized by large phenotypic variation and reared under various farming systems. A total of 96 animals were genotyped with Illumina's OvineSNP50K microarray beadchip, to study the population structure of the breeds and develop a specialized panel of single-nucleotide polymorphisms (SNPs), which could distinguish one breed from the others. Quality control on the dataset resulted in 46,125 SNPs, which were used to evaluate the genetic structure of the breeds. Population structure was assessed through principal component analysis (PCA) and admixture analysis, whereas inbreeding was estimated based on runs of homozygosity (ROHs) coefficients, genomic relationship matrix inbreeding coefficients (F_GRM) and patterns of linkage disequilibrium (LD). Associations between SNPs and breeds were analyzed with different inheritance models, to identify SNPs that distinguish among the breeds. Results showed high levels of genetic heterogeneity in the three breeds. Genetic distances among breeds were modest, despite their different ancestries. Chios and Karagouniko breeds were more genetically related to each other compared to Boutsko. Analysis revealed 3802 candidate SNPs that can be used to identify two-breed crosses and purebred animals. The present study provides, for the first time, data on the genetic background of three Greek indigenous dairy sheep breeds as well as a specialized marker panel that can be applied for traceability purposes as well as targeted genetic improvement schemes and conservation programs.
High-throughput analysis of T-DNA location and structure using sequence capture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA–genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
High-throughput analysis of T-DNA location and structure using sequence capture

DOE PAGES

Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...

2015-10-07

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA–genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
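The idea of finding read pairs that span a genome/T-DNA junction can be sketched as follows. This is an illustrative sketch, not the authors' tool: it assumes the reads have already been aligned to a combined reference, and the input format (tuples of pair identifier, reference name and position), the function name `junction_pairs` and the reference name `"T-DNA"` are all hypothetical.

```python
from collections import defaultdict


def junction_pairs(alignments, tdna_name="T-DNA"):
    """Return read pairs with one mate on the T-DNA and one on the genome.

    alignments: iterable of (pair_id, reference_name, position) tuples,
    one tuple per aligned mate (hypothetical input format).
    Returns a list of (pair_id, [(genome_reference, position), ...]).
    """
    mates_by_pair = defaultdict(list)
    for pair_id, ref, pos in alignments:
        mates_by_pair[pair_id].append((ref, pos))

    junctions = []
    for pair_id, mates in mates_by_pair.items():
        refs = {ref for ref, _ in mates}
        # A junction-spanning pair touches the T-DNA and something else.
        if tdna_name in refs and len(refs) > 1:
            genome_hits = [(ref, pos) for ref, pos in mates if ref != tdna_name]
            junctions.append((pair_id, genome_hits))
    return junctions


alignments = [
    ("p1", "Chr1", 100), ("p1", "T-DNA", 20),    # spans a junction
    ("p2", "Chr1", 500), ("p2", "Chr1", 700),    # genome only
    ("p3", "T-DNA", 5), ("p3", "T-DNA", 200),    # T-DNA only
]
found = junction_pairs(alignments)
```

Clustering the genome-side positions of the reported pairs would then give candidate insertion sites; that step is left out here.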
Peroxisome Proliferator-Activated Receptor Subtype- and Cell-Type-Specific Activation of Genomic Target Genes upon Adenoviral Transgene Delivery

PubMed Central

Nielsen, Ronni; Grøntved, Lars; Stunnenberg, Hendrik G.; Mandrup, Susanne

2006-01-01

Investigations of the molecular events involved in activation of genomic target genes by peroxisome proliferator-activated receptors (PPARs) have been hampered by the inability to establish a clean on/off state of the receptor in living cells. Here we show that the combination of adenoviral delivery and chromatin immunoprecipitation (ChIP) is ideal for dissecting these mechanisms. Adenoviral delivery of PPARs leads to a rapid and synchronous expression of the PPAR subtypes, establishment of transcriptional active complexes at genomic loci, and immediate activation of even silent target genes. We demonstrate that PPARγ2 possesses considerable ligand-dependent as well as independent transactivation potential and that agonists increase the occupancy of PPARγ2/retinoid X receptor at PPAR response elements. Intriguingly, by direct comparison of the PPARs (α, γ, and β/δ), we show that the subtypes have very different abilities to gain access to target sites and that in general the genomic occupancy correlates with the ability to activate the corresponding target gene. In addition, the specificity and potency of activation by PPAR subtypes are highly dependent on the cell type. Thus, PPAR subtype-specific activation of genomic target genes involves an intricate interplay between the properties of the subtype- and cell-type-specific settings at the individual target loci. PMID:16847324
An integrated CRISPR Bombyx mori genome editing system with improved efficiency and expanded target sites.

PubMed

Ma, Sanyuan; Liu, Yue; Liu, Yuanyuan; Chang, Jiasong; Zhang, Tong; Wang, Xiaogang; Shi, Run; Lu, Wei; Xia, Xiaojuan; Zhao, Ping; Xia, Qingyou

2017-04-01

Genome editing has enabled unprecedented new opportunities for targeted genomic engineering of a wide variety of organisms, ranging from microbes, plants and animals to even human embryos. The serial establishment and rapid application of genome editing tools have significantly accelerated Bombyx mori (B. mori) research during the past years. However, the only CRISPR system in B. mori was the commonly used SpCas9, which only recognizes target sites containing an NGG PAM sequence. In the present study, we first improved the efficiency of our previously established SpCas9 system by 3.5-fold. The improved efficiency was also observed at several loci in both BmNs cells and B. mori embryos. Then, to expand the target sites, we showed that two newly discovered CRISPR systems, SaCas9 and AsCpf1, could also induce highly efficient site-specific genome editing in BmNs cells, and constructed an integrated CRISPR system. Genome-wide analysis of targetable sites was further conducted and showed that the integrated system covers 69,144,399 sites in the B. mori genome, and one site could be found in every 6.5 bp. The efficiency and resolution of this CRISPR platform will probably accelerate both fundamental research and applied studies in B. mori, and perhaps other insects. Copyright © 2017 Elsevier Ltd. All rights reserved.
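A genome-wide count of targetable sites of the kind reported here can be approximated by scanning both strands for PAM-adjacent spacers. The sketch below is not the authors' pipeline. Only the NGG PAM for SpCas9 is stated in the abstract; the spacer lengths, the 5' versus 3' PAM placement, and any other PAM pattern passed in are assumptions the caller must supply, and the helper names are hypothetical.

```python
import re

# IUPAC codes needed for common PAM patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "V": "[ACG]"}


def revcomp(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(genome, pam, spacer_len, pam_side):
    """Count PAM-adjacent target sites on both strands.

    pam_side="3prime": spacer followed by PAM (Cas9-style layout).
    pam_side="5prime": PAM followed by spacer (Cpf1-style layout).
    Overlapping sites are counted via a lookahead.
    """
    pam_rx = "".join(IUPAC[base] for base in pam)
    spacer_rx = "[ACGT]{%d}" % spacer_len
    if pam_side == "3prime":
        pattern = "(?=%s%s)" % (spacer_rx, pam_rx)
    else:
        pattern = "(?=%s%s)" % (pam_rx, spacer_rx)
    rx = re.compile(pattern)
    return sum(len(rx.findall(strand)) for strand in (genome, revcomp(genome)))


toy = "A" * 20 + "TGG"
n_cas9 = count_sites(toy, "NGG", 20, "3prime")
```

Dividing the genome length by the summed counts for all nucleases gives a site density comparable in spirit to the "one site in every 6.5 bp" figure, although sites shared between nucleases would need to be deduplicated first.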
Unbiased Combinatorial Genomic Approaches to Identify Alternative Therapeutic Targets within the TSC Signaling Network

DTIC Science & Technology

2015-09-01

assessed the specificity of mutation in Drosophila S2R+ cells. We generated a quantitative mutation reporter vector in which an sgRNA target sequence ...phosphatases (563 genes) in the Drosophila genome (Figure 4). 65 samples that displayed synthetic lethality (15 genes) or synthetic increases in viability...targeting all kinases and phosphatases (563 genes) in the Drosophila genome . . Identified three hits (mRNA-Cap, Pitslre and CycT) that scored as
Dramatic Enhancement of Genome Editing by CRISPR/Cas9 Through Improved Guide RNA Design

PubMed Central

Farboud, Behnom; Meyer, Barbara J.

2015-01-01

Success with genome editing by the RNA-programmed nuclease Cas9 has been limited by the inability to predict effective guide RNAs and DNA target sites. Not all guide RNAs have been successful, and even those that were, varied widely in their efficacy. Here we describe and validate a strategy for Caenorhabditis elegans that reliably achieved a high frequency of genome editing for all targets tested in vivo. The key innovation was to design guide RNAs with a GG motif at the 3′ end of their target-specific sequences. All guides designed using this simple principle induced a high frequency of targeted mutagenesis via nonhomologous end joining (NHEJ) and a high frequency of precise DNA integration from exogenous DNA templates via homology-directed repair (HDR). Related guide RNAs having the GG motif shifted by only three nucleotides showed severely reduced or no genome editing. We also combined the 3′ GG guide improvement with a co-CRISPR/co-conversion approach. For this co-conversion scheme, animals were only screened for genome editing at designated targets if they exhibited a dominant phenotype caused by Cas9-dependent editing of an unrelated target. Combining the two strategies further enhanced the ease of mutant recovery, thereby providing a powerful means to obtain desired genetic changes in an otherwise unaltered genome. PMID:25695951
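The design rule in this abstract (target-specific sequences ending in GG, immediately upstream of the PAM) lends itself to a simple sequence scan. The following is a sketch, not the authors' software: it assumes a 20-nt target-specific sequence and an NGG PAM, and the function names are hypothetical.

```python
import re


def revcomp(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_gg_guides(seq, spacer_len=20):
    """Find target sequences that end in GG and are followed by an NGG PAM.

    Returns (strand, start, spacer, pam) tuples; start is the 0-based
    offset on the scanned strand (for "-" it refers to the reverse
    complement of seq).
    """
    # Spacer = (spacer_len - 2) arbitrary bases + GG, then the NGG PAM.
    rx = re.compile(r"(?=([ACGT]{%d}GG)([ACGT]GG))" % (spacer_len - 2))
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for m in rx.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


with_gg = "ACGTACGTACGTACGTAC" + "GG" + "TGG"   # spacer ends in GG
without_gg = "A" * 20 + "TGG"                   # valid PAM, no 3' GG
hits = find_gg_guides(with_gg)
```

Only the first sequence yields a guide; the second has a usable PAM but fails the 3' GG criterion.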
Translating human genetics into mouse: the impact of ultra-rapid in vivo genome editing.

PubMed

Aida, Tomomi; Imahashi, Risa; Tanaka, Kohichi

2014-01-01

Gene-targeted mutant animals, such as knockout or knockin mice, have dramatically improved our understanding of the functions of genes in vivo and the genetic diversity that characterizes health and disease. However, the generation of targeted mice relies on gene targeting in embryonic stem (ES) cells, which is a time-consuming, laborious, and expensive process. The recent groundbreaking development of several genome editing technologies has enabled the targeted alteration of almost any sequence in any cell or organism. These technologies have now been applied to mouse zygotes (in vivo genome editing), thereby providing new avenues for simple, convenient, and ultra-rapid production of knockout or knockin mice without the need for ES cells. Here, we review recent achievements in the production of gene-targeted mice by in vivo genome editing. © 2013 The Authors Development, Growth & Differentiation © 2013 Japanese Society of Developmental Biologists.
Systematic Identification of Combinatorial Drivers and Targets in Cancer Cell Lines

PubMed Central

Tabchy, Adel; Eltonsy, Nevine; Housman, David E.; Mills, Gordon B.

2013-01-01

There is an urgent need to elicit and validate highly efficacious targets for combinatorial intervention from large scale ongoing molecular characterization efforts of tumors. We established an in silico bioinformatic platform in concert with a high throughput screening platform evaluating 37 novel targeted agents in 669 extensively characterized cancer cell lines reflecting the genomic and tissue-type diversity of human cancers, to systematically identify combinatorial biomarkers of response and co-actionable targets in cancer. Genomic biomarkers discovered in a 141 cell line training set were validated in an independent 359 cell line test set. We identified co-occurring and mutually exclusive genomic events that represent potential drivers and combinatorial targets in cancer. We demonstrate multiple cooperating genomic events that predict sensitivity to drug intervention independent of tumor lineage. The coupling of scalable in silico and biologic high throughput cancer cell line platforms for the identification of co-events in cancer delivers rational combinatorial targets for synthetic lethal approaches with a high potential to pre-empt the emergence of resistance. PMID:23577104
Systematic identification of combinatorial drivers and targets in cancer cell lines.

PubMed

Tabchy, Adel; Eltonsy, Nevine; Housman, David E; Mills, Gordon B

2013-01-01

There is an urgent need to elicit and validate highly efficacious targets for combinatorial intervention from large scale ongoing molecular characterization efforts of tumors. We established an in silico bioinformatic platform in concert with a high throughput screening platform evaluating 37 novel targeted agents in 669 extensively characterized cancer cell lines reflecting the genomic and tissue-type diversity of human cancers, to systematically identify combinatorial biomarkers of response and co-actionable targets in cancer. Genomic biomarkers discovered in a 141 cell line training set were validated in an independent 359 cell line test set. We identified co-occurring and mutually exclusive genomic events that represent potential drivers and combinatorial targets in cancer. We demonstrate multiple cooperating genomic events that predict sensitivity to drug intervention independent of tumor lineage. The coupling of scalable in silico and biologic high throughput cancer cell line platforms for the identification of co-events in cancer delivers rational combinatorial targets for synthetic lethal approaches with a high potential to pre-empt the emergence of resistance.
CRISPR/Cas9-Based Multiplex Genome Editing in Monocot and Dicot Plants.

PubMed

Ma, Xingliang; Liu, Yao-Guang

2016-07-01

The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-mediated genome targeting system has been applied to a variety of organisms, including plants. Compared to other genome-targeting technologies such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), the CRISPR/Cas9 system is easier to use and has much higher editing efficiency. In addition, multiple "single guide RNAs" (sgRNAs) with different target sequences can be designed to direct the Cas9 protein to multiple genomic sites for simultaneous multiplex editing. Here, we present a procedure for highly efficient multiplex genome targeting in monocot and dicot plants using a versatile and robust CRISPR/Cas9 vector system, emphasizing the construction of binary constructs with multiple sgRNA expression cassettes in one round of cloning using Golden Gate ligation. We also describe the genotyping of targeted mutations in transgenic plants by direct Sanger sequencing followed by decoding of superimposed sequencing chromatograms containing biallelic or heterozygous mutations using the Web-based tool DSDecode. © 2016 by John Wiley & Sons, Inc.
Genome editing technologies to fight infectious diseases.

PubMed

Trevisan, Marta; Palù, Giorgio; Barzon, Luisa

2017-11-01

Genome editing by programmable nucleases represents a promising tool that could be exploited to develop new therapeutic strategies to fight infectious diseases. These nucleases, such as zinc-finger nucleases, transcription activator-like effector nucleases, clustered regularly interspaced short palindromic repeat (CRISPR)-CRISPR-associated protein 9 (Cas9) and homing endonucleases, are molecular scissors that can be targeted at predetermined loci in order to modify the genome sequence of an organism. Areas covered: By perturbing genomic DNA at predetermined loci, programmable nucleases can be used as antiviral and antimicrobial treatments. This approach includes targeting of essential viral genes or viral sequences able, once mutated, to inhibit viral replication; repurposing of the CRISPR-Cas9 system for lethal self-targeting of bacteria; targeting antibiotic-resistance and virulence genes in bacteria, fungi, and parasites; and engineering arthropod vectors to prevent vector-borne infections. Expert commentary: While progress has been made in demonstrating the feasibility of using genome editing as an antimicrobial strategy, there are still many hurdles to overcome, such as the risk of off-target mutations, the emergence of escape mutants, and the inefficiency of delivery methods, before translating results from preclinical studies into clinical applications.

Development of Real Time PCR Using Novel Genomic Target for Detection of Multiple Salmonella Serovars from Milk and Chickens

USDA-ARS's Scientific Manuscript database

Background: A highly sensitive and specific novel genomic and plasmid target-based PCR platform was developed to detect multiple Salmonella serovars (S. Heidelberg, S. Dublin, S. Hadar, S. Kentucky and S. Enteritidis). Through extensive genome mining of protein databases of these serovars and compar...
Hit and go CAS9 delivered through a lentiviral based self-limiting circuit.

PubMed

Petris, Gianluca; Casini, Antonio; Montagna, Claudia; Lorenzin, Francesca; Prandi, Davide; Romanel, Alessandro; Zasso, Jacopo; Conti, Luciano; Demichelis, Francesca; Cereseto, Anna

2017-05-22

In vivo application of the CRISPR-Cas9 technology is still limited by unwanted Cas9 genomic cleavages. Long-term expression of Cas9 increases the number of genomic loci non-specifically cleaved by the nuclease. Here we develop a Self-Limiting Cas9 circuit for Enhanced Safety and specificity (SLiCES) which consists of an expression unit for Streptococcus pyogenes Cas9 (SpCas9), a self-targeting sgRNA and a second sgRNA targeting a chosen genomic locus. The self-limiting circuit results in increased genome editing specificity by controlling Cas9 levels. For its in vivo utilization, we next integrate SLiCES into a lentiviral delivery system (lentiSLiCES) via circuit inhibition to achieve viral particle production. Upon delivery into target cells, the lentiSLiCES circuit switches on to edit the intended genomic locus while simultaneously stepping up its own neutralization through SpCas9 inactivation. By preserving target cells from residual nuclease activity, our hit and go system increases safety margins for genome editing.
Mitochondria and Mitochondrial ROS in Cancer: Novel Targets for Anticancer Therapy.

PubMed

Yang, Yuhui; Karakhanova, Svetlana; Hartwig, Werner; D'Haese, Jan G; Philippov, Pavel P; Werner, Jens; Bazhin, Alexandr V

2016-12-01

Mitochondria are indispensable for energy metabolism, apoptosis regulation, and cell signaling. Mitochondria in malignant cells differ structurally and functionally from those in normal cells and participate actively in metabolic reprogramming. Mitochondria in cancer cells are characterized by reactive oxygen species (ROS) overproduction, which promotes cancer development by inducing genomic instability, modifying gene expression, and participating in signaling pathways. Mitochondrial and nuclear DNA mutations caused by oxidative damage that impair the oxidative phosphorylation process will result in further mitochondrial ROS production, completing the "vicious cycle" between mitochondria, ROS, genomic instability, and cancer development. The multiple essential roles of mitochondria have been utilized for designing novel mitochondria-targeted anticancer agents. Selective drug delivery to mitochondria helps to increase specificity and reduce toxicity of these agents. In order to reduce mitochondrial ROS production, mitochondria-targeted antioxidants can specifically accumulate in mitochondria by attaching to a lipophilic penetrating cation and protect mitochondria from oxidative damage. Consistent with the oncogenic role of ROS, mitochondria-targeted antioxidants are found to be effective in cancer prevention and anticancer therapy. A better understanding of the role played by mitochondria in cancer development will help to reveal more therapeutic targets, and will help to increase the activity and selectivity of mitochondria-targeted anticancer drugs. In this review we summarize the impact of mitochondria on cancer and the possibilities for targeting mitochondria in anticancer therapies. J. Cell. Physiol. 231: 2570-2581, 2016. © 2016 Wiley Periodicals, Inc.
Pharmacological inhibition of feline immunodeficiency virus (FIV).

PubMed

Mohammadi, Hakimeh; Bienzle, Dorothee

2012-05-01

Feline immunodeficiency virus (FIV) is a member of the retroviridae family of viruses and causes an acquired immunodeficiency syndrome (AIDS) in domestic and non-domestic cats worldwide. Genome organization of FIV and clinical characteristics of the disease caused by the virus are similar to those of human immunodeficiency virus (HIV). Both viruses infect T lymphocytes, monocytes and macrophages, and their replication cycle in infected cells is analogous. Due to marked similarity in genomic organization, virus structure, virus replication and disease pathogenesis of FIV and HIV, infection of cats with FIV is a useful tool to study and develop novel drugs and vaccines for HIV. Anti-retroviral drugs studied extensively in HIV infection have targeted different steps of the virus replication cycle: (1) inhibition of virus entry into susceptible cells at the level of attachment to host cell surface receptors and co-receptors; (2) inhibition of fusion of the virus membrane with the cell membrane; (3) blockade of reverse transcription of viral genomic RNA; (4) interruption of nuclear translocation and viral DNA integration into host genomes; (5) prevention of viral transcript processing and nuclear export; and (6) inhibition of virion assembly and maturation. Despite much success of anti-retroviral therapy slowing disease progression in people, similar therapy has not been thoroughly investigated in cats. In this article we review current pharmacological approaches and novel targets for anti-lentiviral therapy, and critically assess potentially suitable applications against FIV infection in cats.
Careful with That Axe, Gene, Genome Perturbation after a PEG-Mediated Protoplast Transformation in Fusarium verticillioides.

PubMed

Scala, Valeria; Grottoli, Alessandro; Aiese Cigliano, Riccardo; Anzar, Irantzu; Beccaccioli, Marzia; Fanelli, Corrado; Dall'Asta, Chiara; Battilani, Paola; Reverberi, Massimo; Sanseverino, Walter

2017-05-31

Fusarium verticillioides causes ear rot disease in maize and its contamination with fumonisins, mycotoxins harmful for humans and livestock. Lipids, and their oxidized forms, may drive the fate of this disease. In a previous study, we have explored the role of oxylipins in this interaction by deleting by standard transformation procedures a linoleate diol synthase-coding gene, lds1, in F. verticillioides. A profound phenotypic diversity in the mutants generated has prompted us to investigate more deeply the whole genome of two lds1-deleted strains. Bioinformatics analyses pinpoint significant differences in the genome sequences emerged between the wild type and the lds1-mutants further than those trivially attributable to the deletion of the lds1 locus, such as single nucleotide polymorphisms, small deletion/insertion polymorphisms and structural variations. Results suggest that the effect of a (theoretically) punctual transformation event might have enhanced the natural mechanisms of genomic variability and that transformation practices, commonly used in the reverse genetics of fungi, may potentially be responsible for unexpected, stochastic and henceforth off-target rearrangements throughout the genome.
Careful with That Axe, Gene, Genome Perturbation after a PEG-Mediated Protoplast Transformation in Fusarium verticillioides

PubMed Central

Scala, Valeria; Grottoli, Alessandro; Aiese Cigliano, Riccardo; Anzar, Irantzu; Beccaccioli, Marzia; Fanelli, Corrado; Dall’Asta, Chiara; Battilani, Paola; Reverberi, Massimo; Sanseverino, Walter

2017-01-01

Fusarium verticillioides causes ear rot disease in maize and its contamination with fumonisins, mycotoxins harmful for humans and livestock. Lipids, and their oxidized forms, may drive the fate of this disease. In a previous study, we have explored the role of oxylipins in this interaction by deleting by standard transformation procedures a linoleate diol synthase-coding gene, lds1, in F. verticillioides. A profound phenotypic diversity in the mutants generated has prompted us to investigate more deeply the whole genome of two lds1-deleted strains. Bioinformatics analyses pinpoint significant differences in the genome sequences emerged between the wild type and the lds1-mutants further than those trivially attributable to the deletion of the lds1 locus, such as single nucleotide polymorphisms, small deletion/insertion polymorphisms and structural variations. Results suggest that the effect of a (theoretically) punctual transformation event might have enhanced the natural mechanisms of genomic variability and that transformation practices, commonly used in the reverse genetics of fungi, may potentially be responsible for unexpected, stochastic and henceforth off-target rearrangements throughout the genome. PMID:28561789
Genome-wide specificity of DNA binding, gene regulation, and chromatin remodeling by TALE- and CRISPR/Cas9-based transcriptional activators.

PubMed

Polstein, Lauren R; Perez-Pinera, Pablo; Kocak, D Dewran; Vockley, Christopher M; Bledsoe, Peggy; Song, Lingyun; Safi, Alexias; Crawford, Gregory E; Reddy, Timothy E; Gersbach, Charles A

2015-08-01

Genome engineering technologies based on the CRISPR/Cas9 and TALE systems are enabling new approaches in science and biotechnology. However, the specificity of these tools in complex genomes and the role of chromatin structure in determining DNA binding are not well understood. We analyzed the genome-wide effects of TALE- and CRISPR-based transcriptional activators in human cells using ChIP-seq to assess DNA-binding specificity and RNA-seq to measure the specificity of perturbing the transcriptome. Additionally, DNase-seq was used to assess genome-wide chromatin remodeling that occurs as a result of their action. Our results show that these transcription factors are highly specific in both DNA binding and gene regulation and are able to open targeted regions of closed chromatin independent of gene activation. Collectively, these results underscore the potential for these technologies to make precise changes to gene expression for gene and cell therapies or fundamental studies of gene function. © 2015 Polstein et al.; Published by Cold Spring Harbor Laboratory Press.
Data characterizing the chloroplast genomes of extinct and endangered Hawaiian endemic mints (Lamiaceae) and their close relatives.

PubMed

Welch, Andreanna J; Collins, Katherine; Ratan, Aakrosh; Drautz-Moses, Daniela I; Schuster, Stephan C; Lindqvist, Charlotte

2016-06-01

These data are presented in support of a plastid phylogenomic analysis of the recent radiation of the Hawaiian endemic mints (Lamiaceae), and their close relatives in the genus Stachys, "The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae)" [1]. Here we describe the chloroplast genome sequences for 12 mint taxa. Data presented include summaries of gene content and length for these taxa, structural comparison of the mint chloroplast genomes with published sequences from other species in the order Lamiales, and comparisons of variability among three Hawaiian taxa vs. three outgroup taxa. Finally, we provide a list of 108 primer pairs targeting the most variable regions within this group and designed specifically for amplification of DNA extracted from degraded herbarium material.
A bioinformatics potpourri.

PubMed

Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

2018-01-19

The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.
The genome editing toolbox: a spectrum of approaches for targeted modification.

PubMed

Cheng, Joseph K; Alper, Hal S

2014-12-01

The increase in quality, quantity, and complexity of recombinant products heavily drives the need to predictably engineer model and complex (mammalian) cell systems. However, until recently, limited tools offered the ability to precisely manipulate their genomes, thus impeding the full potential of rational cell line development processes. Targeted genome editing can combine the advances in synthetic and systems biology with current cellular hosts to further push productivity and expand the product repertoire. This review highlights recent advances in targeted genome editing techniques, discussing some of their capabilities and limitations and their potential to aid advances in pharmaceutical biotechnology. Copyright © 2014 Elsevier Ltd. All rights reserved.
Efficient, footprint-free human iPSC genome editing by consolidation of Cas9/CRISPR and piggyBac technologies.

PubMed

Wang, Gang; Yang, Luhan; Grishin, Dennis; Rios, Xavier; Ye, Lillian Y; Hu, Yong; Li, Kai; Zhang, Donghui; Church, George M; Pu, William T

2017-01-01

Genome editing of human induced pluripotent stem cells (hiPSCs) offers unprecedented opportunities for in vitro disease modeling and personalized cell replacement therapy. The introduction of Cas9-directed genome editing has expanded adoption of this approach. However, marker-free genome editing using standard protocols remains inefficient, yielding desired targeted alleles at a rate of ∼1-5%. We developed a protocol based on a doxycycline-inducible Cas9 transgene carried on a piggyBac transposon to enable robust and highly efficient Cas9-directed genome editing, so that a parental line can be expeditiously engineered to harbor many separate mutations. Treatment with doxycycline and transfection with guide RNA (gRNA), donor DNA and piggyBac transposase resulted in efficient, targeted genome editing and concurrent scarless transgene excision. Using this approach, in 7 weeks it is possible to efficiently obtain genome-edited clones with minimal off-target mutagenesis and with indel mutation frequencies of 40-50% and homology-directed repair (HDR) frequencies of 10-20%.
Combining functional genomics and chemical biology to identify targets of bioactive compounds.

PubMed

Ho, Cheuk Hei; Piotrowski, Jeff; Dixon, Scott J; Baryshnikova, Anastasia; Costanzo, Michael; Boone, Charles

2011-02-01

Genome sequencing projects have revealed thousands of suspected genes, challenging researchers to develop efficient large-scale functional analysis methodologies. Determining the function of a gene product generally requires a means to alter its function. Genetically tractable model organisms have been widely exploited for the isolation and characterization of activating and inactivating mutations in genes encoding proteins of interest. Chemical genetics represents a complementary approach involving the use of small molecules capable of either inactivating or activating their targets. Saccharomyces cerevisiae has been an important test bed for the development and application of chemical genomic assays aimed at identifying targets and modes of action of known and uncharacterized compounds. Here we review yeast chemical genomic assay strategies for drug target identification. Copyright © 2010 Elsevier Ltd. All rights reserved.
Microfluidic droplet enrichment for targeted sequencing

PubMed Central

Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.

2015-01-01

Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629
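The abstract above reports an average 31.6-fold enrichment but does not say how the figure was computed. As a minimal sketch, assuming one common definition (the observed on-target read fraction divided by the fraction expected if reads were spread uniformly over the genome), the calculation could look like this in Python. The function name and all input numbers are hypothetical and are not taken from the MESA study.

```python
def fold_enrichment(on_target_reads, total_reads, target_bp, genome_bp):
    """Observed on-target read fraction divided by the fraction expected by chance.

    Assumed definition for illustration only; the study may have used another.
    """
    if min(on_target_reads, total_reads, target_bp, genome_bp) <= 0:
        raise ValueError("all inputs must be positive")
    observed_fraction = on_target_reads / total_reads
    expected_fraction = target_bp / genome_bp
    return observed_fraction / expected_fraction


if __name__ == "__main__":
    # Hypothetical run: a 250 kb target in a 3.1 Gb genome.
    value = fold_enrichment(
        on_target_reads=2_500,
        total_reads=1_000_000,
        target_bp=250_000,
        genome_bp=3_100_000_000,
    )
    print(f"fold enrichment: {value:.1f}")
```

A ratio of this kind is independent of sequencing depth, which makes it convenient for comparing runs of different sizes.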
Catalog of genetic progression of human cancers: breast cancer.

PubMed

Desmedt, Christine; Yates, Lucy; Kulka, Janina

2016-03-01

With the rapid development of next-generation sequencing, deeper insights are being gained into the molecular evolution that underlies the development and clinical progression of breast cancer. It is apparent that during evolution, breast cancers acquire thousands of mutations including single base pair substitutions, insertions, deletions, copy number aberrations, and structural rearrangements. As a consequence, at the whole genome level, no two cancers are identical and few cancers even share the same complement of "driver" mutations. Indeed, two samples from the same cancer may also exhibit extensive differences due to constant remodeling of the genome over time. In this review, we summarize recent studies that extend our understanding of the genomic basis of cancer progression. Key biological insights include the following: subclonal diversification begins early in cancer evolution, being detectable even in in situ lesions; geographical stratification of subclonal structure is frequent in primary tumors and can include therapeutically targetable alterations; multiple distant metastases typically arise from a common metastatic ancestor following a "metastatic cascade" model; systemic therapy can unmask preexisting resistant subclones or influence further treatment sensitivity and disease progression. We conclude the review by describing novel approaches such as the analysis of circulating DNA and patient-derived xenografts that promise to further our understanding of the genomic changes occurring during cancer evolution and guide treatment decision making.
Detecting Role Errors in the Gene Hierarchy of the NCI Thesaurus

PubMed Central

Min, Hua; Cohen, Barry; Halper, Michael; Oren, Marc; Perl, Yehoshua

2008-01-01

Gene terminologies are playing an increasingly important role in the ever-growing field of genomic research. While errors in large, complex terminologies are inevitable, gene terminologies are even more susceptible to them due to the rapid growth of genomic knowledge and the nature of its discovery. It is therefore very important to establish quality-assurance protocols for such genomic-knowledge repositories. Different kinds of terminologies oftentimes require auditing methodologies adapted to their particular structures. In light of this, an auditing methodology tailored to the characteristics of the NCI Thesaurus’s (NCIT’s) Gene hierarchy is presented. The Gene hierarchy is of particular interest to the NCIT’s designers due to the primary role of genomics in current cancer research. This multiphase methodology focuses on detecting role-errors, such as missing roles or roles with incorrect or incomplete target structures, occurring within that hierarchy. The methodology is based on two kinds of abstraction networks, called taxonomies, that highlight the role distribution among concepts within the IS-A (subsumption) hierarchy. These abstract views tend to highlight portions of the hierarchy having a higher concentration of errors. The errors found during an application of the methodology are reported. Hypotheses pertaining to the efficacy of our methodology are investigated. PMID:19221606
Common structural and epigenetic changes in the genome of castration-resistant prostate cancer.

PubMed

Friedlander, Terence W; Roy, Ritu; Tomlins, Scott A; Ngo, Vy T; Kobayashi, Yasuko; Azameera, Aruna; Rubin, Mark A; Pienta, Kenneth J; Chinnaiyan, Arul; Ittmann, Michael M; Ryan, Charles J; Paris, Pamela L

2012-02-01

Progression of primary prostate cancer to castration-resistant prostate cancer (CRPC) is associated with numerous genetic and epigenetic alterations that are thought to promote survival at metastatic sites. In this study, we investigated gene copy number and CpG methylation status in CRPC to gain insight into specific pathophysiologic pathways that are active in this advanced form of prostate cancer. Our analysis defined and validated 495 genes exhibiting significant differences in CRPC in gene copy number, including gains in androgen receptor (AR) and losses of PTEN and retinoblastoma 1 (RB1). Significant copy number differences existed between tumors with or without AR gene amplification, including a common loss of AR repressors in AR-unamplified tumors. Simultaneous gene methylation and allelic deletion occurred frequently in RB1 and HSD17B2, the latter of which is involved in testosterone metabolism. Lastly, genomic DNA from most CRPC was hypermethylated compared with benign prostate tissue. Our findings establish a comprehensive methylation signature that couples epigenomic and structural analyses, thereby offering insights into the genomic alterations in CRPC that are associated with a circumvention of hormonal therapy. Genes identified in this integrated genomic study point to new drug targets in CRPC, an incurable disease state which remains the chief therapeutic challenge. ©2012 AACR.
Pharmacogenomics and its potential impact on drug and formulation development.

PubMed

Regnstrom, Karin; Burgess, Diane J

2005-01-01

Recent advances in genomic research have provided the basis for new insights into the importance of genetic and genomic markers during the different stages of drug development. A new field of research, pharmacogenomics, which studies the relationship between drug effects and the genome, has emerged. Structural pharmacogenomics maps the complete DNA sequences of whole genomes (genotypes) including individual variations, and functional pharmacogenomics assesses the expression levels of thousands of genes in one single experiment. Together, these two areas of pharmacogenomics have generated massive databases, which have become a challenge for the research field of informatics and have fostered a new branch of research, bioinformatics. If skillfully used, the databases generated by pharmacogenomics together with data mining on the Web promise to improve the drug development process in a variety of areas: identification of drug targets, evaluation of toxicity, classification of diseases, evaluation of formulations, assessment of drug response and treatment, post-marketing applications, and development of personalized medicines.
Disruption of Specific RNA-RNA Interactions in a Double-Stranded RNA Virus Inhibits Genome Packaging and Virus Infectivity

PubMed Central

Fajardo, Teodoro; Sung, Po-Yu; Roy, Polly

2015-01-01

Bluetongue virus (BTV) causes hemorrhagic disease in economically important livestock. The BTV genome is organized into ten discrete double-stranded RNA molecules (S1-S10) which have been suggested to follow a sequential packaging pathway from smallest to largest segment during virus capsid assembly. To substantiate and extend these studies, we have investigated the RNA sorting and packaging mechanisms with a new experimental approach using inhibitory oligonucleotides. Putative packaging signals present in the 3’ untranslated regions (UTRs) of BTV segments were targeted by a number of nuclease-resistant oligoribonucleotides (ORNs) and their effects on virus replication in cell culture were assessed. ORNs complementary to the 3’ UTR of BTV RNAs significantly inhibited virus replication without affecting protein synthesis. The same ORNs were found to inhibit complex formation when added to a novel RNA-RNA interaction assay which measured the formation of supramolecular complexes between and among different RNA segments. ORNs targeting the 3’ UTR of BTV segment 10, the smallest RNA segment, were shown to be the most potent, and deletions or substitution mutations of the targeted sequences diminished the RNA complexes and abolished the recovery of viable viruses using reverse genetics. A cell-free capsid assembly/RNA packaging assay also confirmed that the inhibitory ORNs could interfere with RNA packaging, and further substitution mutations within the putative RNA packaging sequence have identified the recognition sequence concerned. Exchange of 3’ UTRs between segments further demonstrated that RNA recognition was segment specific, most likely acting as part of the secondary structure of the entire genomic segment. Our data confirm that genome packaging in this segmented dsRNA virus occurs via the formation of supramolecular complexes formed by the interaction of specific sequences located in the 3’ UTRs. Additionally, the inhibition of packaging in trans with inhibitory ORNs suggests that this interaction is a bona fide target for the design of compounds with antiviral activity. PMID:26646790
Disruption of Specific RNA-RNA Interactions in a Double-Stranded RNA Virus Inhibits Genome Packaging and Virus Infectivity.

PubMed

Fajardo, Teodoro; Sung, Po-Yu; Roy, Polly

2015-12-01

Bluetongue virus (BTV) causes hemorrhagic disease in economically important livestock. The BTV genome is organized into ten discrete double-stranded RNA molecules (S1-S10) which have been suggested to follow a sequential packaging pathway from smallest to largest segment during virus capsid assembly. To substantiate and extend these studies, we have investigated the RNA sorting and packaging mechanisms with a new experimental approach using inhibitory oligonucleotides. Putative packaging signals present in the 3' untranslated regions (UTRs) of BTV segments were targeted by a number of nuclease-resistant oligoribonucleotides (ORNs) and their effects on virus replication in cell culture were assessed. ORNs complementary to the 3' UTR of BTV RNAs significantly inhibited virus replication without affecting protein synthesis. The same ORNs were found to inhibit complex formation when added to a novel RNA-RNA interaction assay which measured the formation of supramolecular complexes between and among different RNA segments. ORNs targeting the 3' UTR of BTV segment 10, the smallest RNA segment, were shown to be the most potent, and deletions or substitution mutations of the targeted sequences diminished the RNA complexes and abolished the recovery of viable viruses using reverse genetics. A cell-free capsid assembly/RNA packaging assay also confirmed that the inhibitory ORNs could interfere with RNA packaging, and further substitution mutations within the putative RNA packaging sequence have identified the recognition sequence concerned. Exchange of 3' UTRs between segments further demonstrated that RNA recognition was segment specific, most likely acting as part of the secondary structure of the entire genomic segment. Our data confirm that genome packaging in this segmented dsRNA virus occurs via the formation of supramolecular complexes formed by the interaction of specific sequences located in the 3' UTRs. Additionally, the inhibition of packaging in trans with inhibitory ORNs suggests that this interaction is a bona fide target for the design of compounds with antiviral activity.
Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

PubMed

Standley, Daron M; Toh, Hiroyuki; Nakamura, Haruki

2008-09-01

A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities ≥ 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized. © 2008 Wiley-Liss, Inc.
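The abstract above summarizes discrimination performance as an area of 0.87 under the receiver-operating characteristic (ROC) curve. As a minimal sketch of what that number measures, the Python below computes the area through its rank interpretation: the probability that a randomly chosen same-family pair scores higher than a randomly chosen different-family pair, with ties counted as one half. The scores are hypothetical and this is not the authors' scoring function or benchmark.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical alignment scores for illustration only.
    same_family = [0.9, 0.8, 0.4]
    different_family = [0.7, 0.3, 0.2]
    print(f"AUC = {roc_auc(same_family, different_family):.3f}")
```

The double loop is quadratic in the number of pairs; it is chosen here for readability, and a sort-based rank computation would be preferable for a large benchmark set.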

The role of protein structural analysis in the next generation sequencing era.

PubMed

Yue, Wyatt W; Froese, D Sean; Brennan, Paul E

2014-01-01

Proteins are macromolecules that serve a cell's myriad processes and functions in all living organisms via dynamic interactions with other proteins, small molecules and cellular components. Genetic variations in the protein-encoding regions of the human genome account for >85% of all known Mendelian diseases, and play an influential role in shaping complex polygenic diseases. Proteins also serve as the predominant target class for the design of small molecule drugs to modulate their activity. Knowledge of the shape and form of proteins, by means of their three-dimensional structures, is therefore instrumental to understanding their roles in disease and their potential for drug development. In this chapter we outline, with the wide readership of non-structural biologists in mind, the various experimental and computational methods available for protein structure determination. We summarize how the wealth of structure information, contributed to a large extent by the technological advances in structure determination to date, serves as a useful tool to decipher the molecular basis of genetic variations for disease characterization and diagnosis, particularly in the emerging era of genomic medicine, and becomes an integral component in the modern-day approach towards rational drug development.
Baculovirus-based genome editing in primary cells.

PubMed

Mansouri, Maysam; Ehsaei, Zahra; Taylor, Verdon; Berger, Philipp

2017-03-01

Genome editing in eukaryotes has become easier in recent years with the development of nucleases that induce double-strand breaks in DNA at user-defined sites. CRISPR/Cas9-based genome editing is currently one of the most powerful strategies. In the simplest case, a nuclease (e.g. Cas9) and a target-defining guide RNA (gRNA) are transferred into a target cell. Non-homologous end joining (NHEJ) repair of the DNA break following Cas9 cleavage can lead to inactivation of the target gene. Specific repair or insertion of DNA by homology-directed repair (HDR) requires the simultaneous delivery of a repair template. Recombinant lentivirus or adenovirus genomes have enough capacity for a nuclease coding sequence and the gRNA but are usually too small to also carry large targeting constructs. We recently showed that a baculovirus-based multigene expression system (MultiPrime) can be used for genome editing in primary cells since it possesses the necessary capacity to carry the nuclease and gRNA expression constructs and the HDR targeting sequences. Here we present new Acceptor plasmids for MultiPrime that allow simplified cloning of baculoviruses for genome editing and we show their functionality in primary cells with limited life span and induced pluripotent stem cells (iPS). Copyright © 2017 Elsevier Inc. All rights reserved.
Characterization of the catalytic center of the Ebola virus L polymerase.

PubMed

Schmidt, Marie Luisa; Hoenen, Thomas

2017-10-01

Ebola virus (EBOV) causes a severe hemorrhagic fever in humans and non-human primates. While no licensed therapeutics are available, recently there has been tremendous progress in developing antivirals. Targeting the ribonucleoprotein complex (RNP) proteins, which facilitate genome replication and transcription, and particularly the polymerase L, is a promising antiviral approach since these processes are essential for the virus life cycle. However, little is known so far about the structure and function of L, and in particular the catalytic center of the RNA-dependent RNA polymerase (RdRp) of L, which is one of the most promising molecular targets, has never been experimentally characterized. Using multiple sequence alignments with other negative-sense single-stranded RNA viruses, we identified the putative catalytic center of the EBOV RdRp. An L protein with mutations in this center was then generated and characterized using various life cycle modelling systems. These systems are based on minigenomes, i.e. miniature versions of the viral genome in which the viral genes are replaced by a reporter gene. When such minigenomes are coexpressed with RNP proteins in mammalian cells, the RNP proteins recognize them as authentic templates for replication and transcription, resulting in reporter activity reflecting these processes. Replication-competent minigenome systems indicated that our L catalytic domain mutant was impaired in genome replication and/or transcription, and by using replication-deficient minigenome systems, as well as a novel RT-qPCR-based genome replication assay, we showed that it indeed no longer supported either of these processes. However, it still showed similar expression to wild-type L, and retained its ability to be incorporated into inclusion bodies, which are the sites of EBOV genome replication. We have experimentally defined the catalytic center of the EBOV RdRp, and thus a promising antiviral target regulating an essential aspect of the EBOV life cycle.
Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease.

PubMed

Ellingford, Jamie M; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G; Sergouniotis, Panagiotis I; O'Sullivan, James; Lamb, Janine A; Perveen, Rahat; Hall, Georgina; Newman, William G; Bishop, Paul N; Roberts, Stephen A; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C; Nemeth, Andrea H; Black, Graeme C M

2016-05-01

Purpose: To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Design: Case series. Participants: A total of 562 patients diagnosed with IRD. Methods: We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Main Outcome Measures: Diagnostic yield of genomic testing. Results: Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15-45) uplift in diagnostic yield. Conclusions: We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
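The abstract above reports sensitivity, specificity and a gain in diagnostic yield. As a minimal sketch of those quantities in Python: the first two functions use the standard definitions, and the last lines reproduce only the raw proportion implied by "11 of the 33 individuals". The 29% figure in the abstract is a weighted estimate whose weighting scheme is not given there, so it is not reproduced here. The confusion-matrix counts in the example are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of known variants that the assay detected."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of reference-matching sites that the assay left uncalled."""
    return true_negatives / (true_negatives + false_positives)


# Counts stated in the abstract: 11 new diagnoses among 33 previously unsolved cases.
NEWLY_DIAGNOSED = 11
PREVIOUSLY_UNSOLVED = 33
raw_yield = NEWLY_DIAGNOSED / PREVIOUSLY_UNSOLVED

if __name__ == "__main__":
    # Hypothetical control-sample counts, for illustration only.
    print(f"sensitivity: {sensitivity(990, 10):.3f}")
    print(f"specificity: {specificity(9_990, 10):.3f}")
    print(f"raw yield among unsolved cases: {raw_yield:.1%}")
```

The raw proportion applies only to the subset referred for WGS, which is why it differs from the population-weighted uplift quoted in the abstract.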
Clinical Actionability of Comprehensive Genomic Profiling for Management of Rare or Refractory Cancers

PubMed Central

Hirshfield, Kim M.; Tolkunov, Denis; Zhong, Hua; Ali, Siraj M.; Stein, Mark N.; Murphy, Susan; Vig, Hetal; Vazquez, Alexei; Glod, John; Moss, Rebecca A.; Belyi, Vladimir; Chan, Chang S.; Chen, Suzie; Goodell, Lauri; Foran, David; Yelensky, Roman; Palma, Norma A.; Sun, James X.; Miller, Vincent A.; Stephens, Philip J.; Ross, Jeffrey S.; Kaufman, Howard; Poplin, Elizabeth; Mehnert, Janice; Tan, Antoinette R.; Bertino, Joseph R.; Aisner, Joseph; DiPaola, Robert S.

2016-01-01

Background. The frequency with which targeted tumor sequencing results will lead to implemented change in care is unclear. Prospective assessment of the feasibility and limitations of using genomic sequencing is critically important. Methods. A prospective clinical study was conducted on 100 patients with diverse-histology, rare, or poor-prognosis cancers to evaluate the clinical actionability of a Clinical Laboratory Improvement Amendments (CLIA)-certified, comprehensive genomic profiling assay (FoundationOne), using formalin-fixed, paraffin-embedded tumors. The primary objectives were to assess utility, feasibility, and limitations of genomic sequencing for genomically guided therapy or other clinical purpose in the setting of a multidisciplinary molecular tumor board. Results. Of the tumors from the 92 patients with sufficient tissue, 88 (96%) had at least one genomic alteration (average 3.6, range 0–10). Commonly altered pathways included p53 (46%), RAS/RAF/MAPK (rat sarcoma; rapidly accelerated fibrosarcoma; mitogen-activated protein kinase) (45%), receptor tyrosine kinases/ligand (44%), PI3K/AKT/mTOR (phosphatidylinositol-4,5-bisphosphate 3-kinase; protein kinase B; mammalian target of rapamycin) (35%), transcription factors/regulators (31%), and cell cycle regulators (30%). Many low frequency but potentially actionable alterations were identified in diverse histologies. Use of comprehensive profiling led to implementable clinical action in 35% of tumors with genomic alterations, including genomically guided therapy, diagnostic modification, and trigger for germline genetic testing. Conclusion. Use of targeted next-generation sequencing in the setting of an institutional molecular tumor board led to implementable clinical action in more than one third of patients with rare and poor-prognosis cancers. Major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access. 
Early and serial sequencing in the clinical course and expanded access to genomically guided early-phase clinical trials and targeted agents may increase actionability. Implications for Practice: Identification of key factors that facilitate use of genomic tumor testing results and implementation of genomically guided therapy may lead to enhanced benefit for patients with rare or difficult to treat cancers. Clinical use of a targeted next-generation sequencing assay in the setting of an institutional molecular tumor board led to implementable clinical action in over one third of patients with rare and poor prognosis cancers. The major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access both on trial and off label. Approaches to increase actionability include early and serial sequencing in the clinical course and expanded access to genomically guided early phase clinical trials and targeted agents. PMID:27566247
Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

PubMed Central

2018-01-01

The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the genomes of 60 diverse F. graminearum isolates. We also assembled the first pan-genome for F. graminearum to clarify population-level differences in gene content potentially contributing to pathogen diversity. Bayesian and phylogenomic analyses revealed genetic structure associated with isolates that produce the novel NX-2 mycotoxin, suggesting a North American population that has remained genetically distinct from other endemic and introduced cereal-infecting populations. Genome scans uncovered distinct signatures of selection within populations, focused in high diversity, frequently recombining regions. These patterns suggested selection for genomic divergence at the trichothecene toxin gene cluster and thirteen additional regions containing genes potentially involved in pathogen specialization. Gene content differences further distinguished populations, in that 121 genes showed population-specific patterns of conservation. Genes that differentiated populations had predicted functions related to pathogenesis, secondary metabolism and antagonistic interactions, though a subset had unique roles in temperature and light sensitivity. Our results indicated that F. graminearum populations are distinguished by dozens of genes with signatures of selection and an array of dispensable accessory genes, suggesting that FHB pathogen populations may be equipped with different traits to exploit the agroecosystem. 
These findings provide insights into the evolutionary processes and genomic features contributing to population divergence in plant pathogens, and highlight candidate genes for future functional studies of pathogen specialization across evolutionarily and ecologically diverse fungi. PMID:29584736
Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

2009-06-18

Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer; amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling; and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, cancer in women younger than 30 years is uncommon. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance. 
These studies also reveal the molecular differences between cancer and normal that may be exploited to therapeutic benefit or that provide targets for molecular assays that may enable early cancer detection, and predict individual disease progression or response to treatment. This chapter reviews current and future directions in genome analysis and summarizes studies that provide insights into breast cancer pathophysiology or that suggest strategies to improve breast cancer management.
Genome-wide survey of DNA-binding proteins in Arabidopsis thaliana: analysis of distribution and functions.

PubMed

Malhotra, Sony; Sowdhamini, Ramanathan

2013-08-01

The interaction of proteins with their respective DNA targets is known to control many high-fidelity cellular processes. Performing a comprehensive survey of the sequenced genomes for DNA-binding proteins (DBPs) will help in understanding their distribution and the associated functions in a particular genome. The availability of the fully sequenced genome of Arabidopsis thaliana enables a review of the distribution of DBPs in this model plant genome. We used profiles of both structure- and sequence-based DNA-binding families, derived from the PDB and PFam databases, to perform the survey. This resulted in 4471 proteins, identified as DNA-binding in the Arabidopsis genome, which are distributed across 300 different PFam families. Apart from several plant-specific DNA-binding families, certain RING fingers and leucine zippers also had high representation. Our search protocol helped to assign DNA-binding property to several proteins that were previously marked as unknown, putative or hypothetical in function. The distribution of Arabidopsis genes having a role in plant DNA repair was studied in particular and noted for functional mapping. The functions observed to be overrepresented in the plant genome include DNA-3-methyladenine glycosylase activity, alkylbase DNA N-glycosylase activity and DNA-(apurinic or apyrimidinic site) lyase activity, suggesting their role in specialized functions such as gene regulation and DNA repair.
Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

PubMed

Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

2016-05-01

Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina MiSeq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.
Engineering Molecular Immunity Against Plant Viruses.

PubMed

Zaidi, Syed Shan-E-Ali; Tashkandi, Manal; Mahfouz, Magdy M

2017-01-01

Genomic engineering has been used to precisely alter eukaryotic genomes at the single-base level for targeted gene editing, replacement, fusion, and mutagenesis, and plant viruses such as Tobacco rattle virus have been developed into efficient vectors for delivering genome-engineering reagents. In addition to altering the host genome, these methods can target pathogens to engineer molecular immunity. Indeed, recent studies have shown that clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) systems that target the genomes of DNA viruses can interfere with viral activity and limit viral symptoms in planta, demonstrating the utility of this system for engineering molecular immunity in plants. CRISPR/Cas9 can efficiently target single and multiple viral infections and confer plant immunity. Here, we discuss the use of site-specific nucleases to engineer molecular immunity against DNA and RNA viruses in plants. We also explore how to address the potential challenges encountered when producing plants with engineered resistance to single and mixed viral infections. © 2017 Elsevier Inc. All rights reserved.
Quantitative Tracking of Combinatorially Engineered Populations with Multiplexed Binary Assemblies.

PubMed

Zeitoun, Ramsey I; Pines, Gur; Grau, Willliam C; Gill, Ryan T

2017-04-21

Advances in synthetic biology and genomics have enabled full-scale genome engineering efforts on laboratory time scales. However, the absence of sufficient approaches for mapping engineered genomes at system-wide scales onto performance has limited the adoption of more sophisticated algorithms for engineering complex biological systems. Here we report on the development and application of a robust approach to quantitatively map combinatorially engineered populations at scales up to several dozen target sites. This approach works by assembling genome engineered sites with cell-specific barcodes into a format compatible with high-throughput sequencing technologies. This approach, called barcoded-TRACE (bTRACE) was applied to assess E. coli populations engineered by recursive multiplex recombineering across both 6-target sites and 31-target sites. The 31-target library was then tracked throughout growth selections in the presence and absence of isopentenol (a potential next-generation biofuel). We also use the resolution of bTRACE to compare the influence of technical and biological noise on genome engineering efforts.
Off-target Effects in CRISPR/Cas9-mediated Genome Engineering

PubMed Central

Zhang, Xiao-Hui; Tee, Louis Y; Wang, Xiao-Gang; Huang, Qun-Shan; Yang, Shi-Hua

2015-01-01

CRISPR/Cas9 is a versatile genome-editing technology that is widely used for studying the functionality of genetic elements, creating genetically modified organisms, and conducting preclinical research on genetic disorders. However, the high frequency of off-target activity (≥50%), that is, RGEN (RNA-guided endonuclease)-induced mutations at sites other than the intended on-target site, is one major concern, especially for therapeutic and clinical applications. Here, we review the basic mechanisms underlying off-target cutting in the CRISPR/Cas9 system, methods for detecting off-target mutations, and strategies for minimizing off-target cleavage. Improving the specificity of the CRISPR/Cas9 system will provide solid genotype–phenotype correlations, and thus enable faithful interpretation of genome-editing data, which will certainly facilitate the basic and clinical application of this technology. PMID:26575098
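As a rough illustration of how candidate off-target sites are often screened in silico, the sketch below scans a sequence for protospacers that lie next to an NGG PAM and differ from the guide at only a few positions. It is not the method of the review above: the function names, the forward-strand-only scan, and the default mismatch threshold are all assumptions made for this example, and real tools also consider the reverse strand, bulges, and position-dependent mismatch tolerance.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(genome: str, guide: str, max_mismatches: int = 3):
    """Yield (position, protospacer, mismatches) for forward-strand sites.

    A site is reported when it is immediately followed by an NGG PAM and
    differs from the guide at no more than max_mismatches positions.
    Hypothetical helper for illustration only.
    """
    genome, guide = genome.upper(), guide.upper()
    n = len(guide)
    # The last usable start leaves room for the protospacer plus a 3-nt PAM.
    for i in range(len(genome) - n - 2):
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":  # NGG: any base followed by two guanines
            continue
        site = genome[i:i + n]
        mismatches = hamming(site, guide)
        if mismatches <= max_mismatches:
            yield i, site, mismatches


if __name__ == "__main__":
    guide = "GACCTTCAGTACATCGCTAA"      # made-up 20-nt guide
    off_site = "GACATTCAGTACATCTCTAA"   # same guide with two substitutions
    genome = "TT" + guide + "TGG" + "ATAT" + off_site + "CGG" + "TT"
    for pos, site, mm in candidate_sites(genome, guide):
        print(pos, site, mm)
```

A site with zero mismatches is the intended target; any other hit is only a candidate and would still need experimental confirmation with one of the detection methods the review discusses.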
Polymer modeling of the E. coli genome reveals the involvement of locus positioning and macrodomain structuring for the control of chromosome conformation and segregation

PubMed Central

Junier, Ivan; Boccard, Frédéric; Espéli, Olivier

2014-01-01

The mechanisms that control chromosome conformation and segregation in bacteria have not yet been elucidated. In Escherichia coli, the mere presence of an active process remains an open question. Here, we investigate the conformation and segregation pattern of the E. coli genome by performing numerical simulations on a polymer model of the chromosome. We analyze the roles of the intrinsic structuring of chromosomes and the forced localization of specific loci, which are observed in vivo. Specifically, we examine the segregation pattern of a chromosome that is divided into four structured macrodomains (MDs) and two non-structured regions. We find that strong osmotic-like organizational forces, which stem from the differential condensation levels of the chromosome regions, dictate the cellular disposition of the chromosome. Strikingly, the comparison of our in silico results with fluorescent imaging of the chromosome choreography in vivo reveals that in the presence of MDs the targeting of the origin and terminus regions to specific positions is sufficient to generate a segregation pattern that is indistinguishable from experimentally observed patterns. PMID:24194594
NMR in structural genomics to increase structural coverage of the protein universe: Delivered by Prof. Kurt Wüthrich on 7 July 2013 at the 38th FEBS Congress in St. Petersburg, Russia.

PubMed

Serrano, Pedro; Dutta, Samit K; Proudfoot, Andrew; Mohanty, Biswaranjan; Susac, Lukas; Martin, Bryan; Geralt, Michael; Jaroszewski, Lukasz; Godzik, Adam; Elsliger, Marc; Wilson, Ian A; Wüthrich, Kurt

2016-11-01

For more than a decade, the Joint Center for Structural Genomics (JCSG; www.jcsg.org) worked toward increased three-dimensional structure coverage of the protein universe. This coordinated quest was one of the main goals of the four high-throughput (HT) structure determination centers of the Protein Structure Initiative (PSI; www.nigms.nih.gov/Research/specificareas/PSI). To achieve the goals of the PSI, the JCSG made use of the complementarity of structure determination by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy to increase and diversify the range of targets entering the HT structure determination pipeline. The overall strategy, for both techniques, was to determine atomic resolution structures for representatives of large protein families, as defined by the Pfam database, which had no structural coverage and could make significant contributions to biological and biomedical research. Furthermore, the experimental structures could be leveraged by homology modeling to further expand the structural coverage of the protein universe and increase biological insights. Here, we describe what could be achieved by this structural genomics approach, using as an illustration the contributions from 20 NMR structure determinations out of a total of 98 JCSG NMR structures, which were selected because they are the first three-dimensional structure representations of the respective Pfam protein families. The information from this small sample is representative of the overall results from crystal and NMR structure determination in the JCSG. There are five new folds, which were classified as domains of unknown function (DUFs); three of the proteins could be functionally annotated based on three-dimensional structure similarity with previously characterized proteins; and 12 proteins showed only limited similarity with previous deposits in the Protein Data Bank (PDB) and were classified as DUFs. © 2016 Federation of European Biochemical Societies.
The discovery of zinc fingers and their development for practical applications in gene regulation and genome manipulation.

PubMed

Klug, Aaron

2010-02-01

A long-standing goal of molecular biologists has been to construct DNA-binding proteins for the control of gene expression. The classical Cys2His2 (C2H2) zinc finger design is ideally suited for such purposes. Discriminating between closely related DNA sequences both in vitro and in vivo, this naturally occurring design was adopted for engineering zinc finger proteins (ZFPs) to target genes specifically. Zinc fingers were discovered in 1985, arising from the interpretation of our biochemical studies on the interaction of the Xenopus protein transcription factor IIIA (TFIIIA) with 5S RNA. Subsequent structural studies revealed its three-dimensional structure and its interaction with DNA. Each finger constitutes a self-contained domain stabilized by a zinc (Zn) ion ligated to a pair of cysteines and a pair of histidines and also by an inner structural hydrophobic core. This discovery showed not only a new protein fold but also a novel principle of DNA recognition. Whereas other DNA-binding proteins generally make use of the 2-fold symmetry of the double helix, functioning as homo- or heterodimers, zinc fingers can be linked linearly in tandem to recognize nucleic acid sequences of varying lengths. This modular design offers a large number of combinatorial possibilities for the specific recognition of DNA (or RNA). It is therefore not surprising that the zinc finger is found widespread in nature, including 3% of the genes of the human genome. The zinc finger design can be used to construct DNA-binding proteins for specific intervention in gene expression. By fusing selected zinc finger peptides to repression or activation domains, genes can be selectively switched off or on by targeting the peptide to the desired gene target. It was also suggested that by combining an appropriate zinc finger peptide with other effector or functional domains, e.g. from nucleases or integrases to form chimaeric proteins, genomes could be modified or manipulated. 
The first example of the power of the method was published in 1994 when a three-finger protein was constructed to block the expression of a human oncogene transformed into a mouse cell line. The same paper also described how a reporter gene was activated by targeting an inserted 9-base pair (bp) sequence, which acts as the promoter. Thus, by fusing zinc finger peptides to repression or activation domains, genes can be selectively switched off or on. It was also suggested that, by combining zinc fingers with other effector or functional domains, e.g. from nucleases or integrases, to form chimaeric proteins, genomes could be manipulated or modified. Several applications of such engineered ZFPs are described here, including some of therapeutic importance, and also their adaptation for breeding improved crop plants.
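The modular arithmetic behind the abstract above can be made concrete with a short sketch. It assumes that each finger reads one 3-bp subsite, which matches the three-finger protein and 9-bp target mentioned in the abstract but ignores real-world complications such as context effects between neighbouring fingers. All names and the example target sequence are hypothetical.

```python
BP_PER_FINGER = 3  # assumption: one zinc finger reads one 3-bp subsite


def fingers_needed(target: str) -> int:
    """Number of tandem fingers needed to cover a target of this length."""
    if len(target) % BP_PER_FINGER:
        raise ValueError("target length must be a multiple of 3 bp")
    return len(target) // BP_PER_FINGER


def subsites(target: str) -> list[str]:
    """Split a target into consecutive 3-bp subsites, one per finger."""
    return [target[i:i + BP_PER_FINGER]
            for i in range(0, len(target), BP_PER_FINGER)]


def distinct_targets(n_fingers: int) -> int:
    """Count the DNA sequences an n-finger array could in principle address."""
    return 4 ** (BP_PER_FINGER * n_fingers)


if __name__ == "__main__":
    target = "GCGTGGGCG"  # example 9-bp site, chosen only for illustration
    print(fingers_needed(target), subsites(target), distinct_targets(3))
```

Under this assumption, adding one finger extends the recognized site by 3 bp and multiplies the number of addressable sequences by 64, which is the sense in which the tandem design offers many combinatorial possibilities.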
The Flavivirus Protease As a Target for Drug Discovery

PubMed Central

Brecher, Matthew; Zhang, Jing; Li, Hongmin

2014-01-01

Many flaviviruses are significant human pathogens causing considerable disease burdens, including encephalitis and hemorrhagic fever, in the regions in which they are endemic. A paucity of treatments for flaviviral infections has driven interest in drug development targeting proteins essential to flavivirus replication, such as the viral protease. During viral replication, the flavivirus genome is translated as a single polyprotein precursor, which must be cleaved into individual proteins by a complex of the viral protease, NS3, and its cofactor, NS2B. Because this cleavage is an obligate step of the viral life-cycle, the flavivirus protease is an attractive target for antiviral drug development. In this review, we will survey recent drug development studies targeting the NS3 active site, as well as studies targeting an NS2B/NS3 interaction site determined from flavivirus protease crystal structures. PMID:24242363
The flavivirus protease as a target for drug discovery.

PubMed

Brecher, Matthew; Zhang, Jing; Li, Hongmin

2013-12-01

Many flaviviruses are significant human pathogens causing considerable disease burdens, including encephalitis and hemorrhagic fever, in the regions in which they are endemic. A paucity of treatments for flaviviral infections has driven interest in drug development targeting proteins essential to flavivirus replication, such as the viral protease. During viral replication, the flavivirus genome is translated as a single polyprotein precursor, which must be cleaved into individual proteins by a complex of the viral protease, NS3, and its cofactor, NS2B. Because this cleavage is an obligate step of the viral life-cycle, the flavivirus protease is an attractive target for antiviral drug development. In this review, we will survey recent drug development studies targeting the NS3 active site, as well as studies targeting an NS2B/NS3 interaction site determined from flavivirus protease crystal structures.
Next-generation sequencing strategies enable routine detection of balanced chromosome rearrangements for clinical diagnostics and genetic research.

PubMed

Talkowski, Michael E; Ernst, Carl; Heilbut, Adrian; Chiang, Colby; Hanscom, Carrie; Lindgren, Amelia; Kirby, Andrew; Liu, Shangtao; Muddukrishna, Bhavana; Ohsumi, Toshiro K; Shen, Yiping; Borowsky, Mark; Daly, Mark J; Morton, Cynthia C; Gusella, James F

2011-04-08

The contribution of balanced chromosomal rearrangements to complex disorders remains unclear because they are not detected routinely by genome-wide microarrays and clinical localization is imprecise. Failure to consider these events bypasses a potentially powerful complement to single nucleotide polymorphism and copy-number association approaches to complex disorders, where much of the heritability remains unexplained. To capitalize on this genetic resource, we have applied optimized sequencing and analysis strategies to test whether these potentially high-impact variants can be mapped at reasonable cost and throughput. By using a whole-genome multiplexing strategy, rearrangement breakpoints could be delineated at a fraction of the cost of standard sequencing. For rearrangements already mapped regionally by karyotyping and fluorescence in situ hybridization, a targeted approach enabled capture and sequencing of multiple breakpoints simultaneously. Importantly, this strategy permitted capture and unique alignment of up to 97% of repeat-masked sequences in the targeted regions. Genome-wide analyses estimate that only 3.7% of bases should be routinely omitted from genomic DNA capture experiments. Illustrating the power of these approaches, the rearrangement breakpoints were rapidly defined to base pair resolution and revealed unexpected sequence complexity, such as co-occurrence of inversion and translocation as an underlying feature of karyotypically balanced alterations. These findings have implications ranging from genome annotation to de novo assemblies and could enable sequencing screens for structural variations at a cost comparable to that of microarrays in standard clinical practice. Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Genome-wide identification and analysis of the SBP-box family genes in apple (Malus × domestica Borkh.).

PubMed

Li, Jun; Hou, Hongmin; Li, Xiaoqin; Xiang, Jiang; Yin, Xiangjing; Gao, Hua; Zheng, Yi; Bassett, Carole L; Wang, Xiping

2013-09-01

SQUAMOSA promoter binding protein (SBP)-box genes encode a family of plant-specific transcription factors and play many crucial roles in plant development. In this study, 27 SBP-box gene family members were identified in the apple (Malus × domestica Borkh.) genome, 15 of which were suggested to be putative targets of MdmiR156. Plant SBPs were classified into eight groups according to the phylogenetic analysis of SBP-domain proteins. Gene structure, gene chromosomal location and synteny analyses of MdSBP genes within the apple genome demonstrated that tandem and segmental duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of the SBP-box gene family in apple. Additionally, synteny analysis between apple and Arabidopsis indicated that several paired homologs of MdSBP and AtSPL genes were located in syntenic genomic regions. Tissue-specific expression analysis of MdSBP genes in apple demonstrated their diversified spatiotemporal expression patterns. Most MdmiR156-targeted MdSBP genes, which had relatively high transcript levels in stems, leaves, apical buds and some floral organs, exhibited a more differential expression pattern than most MdmiR156-nontargeted MdSBP genes. Finally, expression analysis of MdSBP genes in leaves upon various plant hormone treatments showed that many MdSBP genes were responsive to different plant hormones, indicating that MdSBP genes may be involved in responses to hormone signaling during stress or in apple development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Urban landscape genomics identifies fine-scale gene flow patterns in an avian invasive.

PubMed

Low, G W; Chattopadhyay, B; Garg, K M; Irestedt, M; Ericson, Pgp; Yap, G; Tang, Q; Wu, S; Rheindt, F E

2018-01-01

Invasive species exert a serious impact on native fauna and flora and have been the target of many eradication and management efforts worldwide. However, a lack of data on population structure and history, exacerbated by the recency of many species introductions, limits the efficiency with which such species can be kept at bay. In this study we generated a novel genome of high assembly quality and genotyped 4735 genome-wide single nucleotide polymorphic (SNP) markers from 78 individuals of an invasive population of the Javan Myna Acridotheres javanicus across the island of Singapore. We inferred limited population subdivision at a micro-geographic level, a genetic patch size (~13-14 km) indicative of a pronounced dispersal ability, and barely an increase in effective population size since introduction despite an increase of four to five orders of magnitude in actual population size, suggesting that low population-genetic diversity following a bottleneck has not impeded establishment success. Landscape genomic analyses identified urban features, such as low-rise neighborhoods, that constitute pronounced barriers to gene flow. Based on our data, we consider an approach targeting the complete eradication of Javan Mynas across Singapore to be unfeasible. Instead, a mixed approach of localized mitigation measures taking into account urban geographic features and planning policy may be the most promising avenue to reducing the adverse impacts of this urban pest. Our study demonstrates how genomic methods can directly inform the management and control of invasive species, even in geographically limited datasets with high gene flow rates.

VaProS: a database-integration approach for protein/genome information retrieval.

PubMed

Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

2016-12-01

Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve the expert knowledge stored in the databases and build new hypotheses about their research target. This web application, named VaProS, emphasizes the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search, exemplified by a search for the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/ .
The use of genomic coancestry matrices in the optimisation of contributions to maintain genetic diversity at specific regions of the genome.

PubMed

Gómez-Romano, Fernando; Villanueva, Beatriz; Fernández, Jesús; Woolliams, John A; Pong-Wong, Ricardo

2016-01-13

Optimal contribution methods have proved to be very efficient for controlling the rates at which coancestry and inbreeding increase and therefore, for maintaining genetic diversity. These methods have usually relied on pedigree information for estimating genetic relationships between animals. However, with the large amount of genomic information now available such as high-density single nucleotide polymorphism (SNP) chips that contain thousands of SNPs, it becomes possible to calculate more accurate estimates of relationships and to target specific regions in the genome where there is a particular interest in maximising genetic diversity. The objective of this study was to investigate the effectiveness of using genomic coancestry matrices for: (1) minimising the loss of genetic variability at specific genomic regions while restricting the overall loss in the rest of the genome; or (2) maximising the overall genetic diversity while restricting the loss of diversity at specific genomic regions. Our study shows that the use of genomic coancestry was very successful at minimising the loss of diversity and outperformed the use of pedigree-based coancestry (genetic diversity even increased in some scenarios). The results also show that genomic information allows a targeted optimisation to maintain diversity at specific genomic regions, whether they are linked or not. The level of variability maintained increased when the targeted regions were closely linked. However, such targeted management leads to an important loss of diversity in the rest of the genome and, thus, it is necessary to take further actions to constrain this loss. Optimal contribution methods also proved to be effective at restricting the loss of diversity in the rest of the genome, although the resulting rate of coancestry was higher than the constraint imposed. 
The use of genomic matrices when optimising contributions permits the control of genetic diversity and inbreeding at specific regions of the genome through the minimisation of partial genomic coancestry matrices. The formula used to predict coancestry in the next generation produces biased results and therefore it is necessary to refine the theory of genetic contributions when genomic matrices are used to optimise contributions.
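A minimal sketch of the kind of constrained optimisation described above follows. It assumes the usual quadratic form for expected group coancestry, c'Θc, where c holds each candidate's proportional contribution (summing to 1) and Θ is a coancestry matrix. The coarse grid search, the function names, and the toy matrices are assumptions for illustration; they are not the optimisation procedure or data used in the study.

```python
from itertools import product


def group_coancestry(c, theta):
    """Quadratic form c' * theta * c for contributions c and coancestry theta."""
    n = len(c)
    return sum(c[i] * theta[i][j] * c[j] for i in range(n) for j in range(n))


def best_contributions(theta_target, theta_rest, max_rest, levels=10):
    """Grid-search contributions that minimise coancestry at a target region.

    Contributions are multiples of 1/levels that sum to 1. Candidates whose
    coancestry over the rest of the genome exceeds max_rest are discarded.
    Returns (coancestry_at_target, contributions), or None if no candidate
    satisfies the constraint. Hypothetical helper for illustration only.
    """
    n = len(theta_target)
    best = None
    for counts in product(range(levels + 1), repeat=n):
        if sum(counts) != levels:
            continue
        c = [k / levels for k in counts]
        if group_coancestry(c, theta_rest) > max_rest + 1e-12:
            continue
        value = group_coancestry(c, theta_target)
        if best is None or value < best[0] - 1e-12:
            best = (value, c)
    return best


if __name__ == "__main__":
    # Toy matrices: candidates 1 and 2 are closely related at the target region.
    theta_target = [[0.5, 0.4, 0.0], [0.4, 0.5, 0.0], [0.0, 0.0, 0.5]]
    theta_rest = [[0.5, 0.1, 0.1], [0.1, 0.5, 0.1], [0.1, 0.1, 0.5]]
    print(best_contributions(theta_target, theta_rest, max_rest=0.3))
```

Swapping the roles of the two matrices gives the second scenario in the abstract: maximising overall diversity while restricting the loss at specific regions. An exhaustive grid is only workable for a handful of candidates, so a practical analysis would use a proper constrained optimiser.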
CRISPR/Cas9-mediated gene targeting in Arabidopsis using sequential transformation.

PubMed

Miki, Daisuke; Zhang, Wenxin; Zeng, Wenjie; Feng, Zhengyan; Zhu, Jian-Kang

2018-05-17

Homologous recombination-based gene targeting is a powerful tool for precise genome modification and has been widely used in organisms ranging from yeast to higher organisms such as Drosophila and mouse. However, gene targeting in higher plants, including the most widely used model plant Arabidopsis thaliana, remains challenging. Here we report a sequential transformation method for gene targeting in Arabidopsis. We find that parental lines expressing the bacterial endonuclease Cas9 from the egg cell- and early embryo-specific DD45 gene promoter can improve the frequency of single-guide RNA-targeted gene knock-ins and sequence replacements via homologous recombination at several endogenous sites in the Arabidopsis genome. These heritable gene targeting events can be identified by regular PCR. Our approach enables routine and fine manipulation of the Arabidopsis genome.
Herpes simplex virus VP16, but not ICP0, is required to reduce histone occupancy and enhance histone acetylation on viral genomes in U2OS osteosarcoma cells.

PubMed

Hancock, Meaghan H; Cliffe, Anna R; Knipe, David M; Smiley, James R

2010-02-01

The herpes simplex virus (HSV) genome rapidly becomes associated with histones after injection into the host cell nucleus. The viral proteins ICP0 and VP16 are required for efficient viral gene expression and have been implicated in reducing the levels of underacetylated histones on the viral genome, raising the possibility that high levels of underacetylated histones inhibit viral gene expression. The U2OS osteosarcoma cell line is permissive for replication of ICP0 and VP16 mutants and appears to lack an innate antiviral repression mechanism present in other cell types. We therefore used chromatin immunoprecipitation to determine whether U2OS cells are competent to load histones onto HSV DNA and, if so, whether ICP0 and/or VP16 are required to reduce histone occupancy and enhance acetylation in this cell type. High levels of underacetylated histone H3 accumulated at several locations on the viral genome in the absence of VP16 activation function; in contrast, an ICP0 mutant displayed markedly reduced histone levels and enhanced acetylation, similar to wild-type HSV. These results demonstrate that U2OS cells are competent to load underacetylated histones onto HSV DNA and uncover an unexpected role for VP16 in modulating chromatin structure at viral early and late loci. One interpretation of these findings is that ICP0 and VP16 affect viral chromatin structure through separate pathways, and the pathway targeted by ICP0 is defective in U2OS cells. We also show that HSV infection results in decreased histone levels on some actively transcribed genes within the cellular genome, demonstrating that viral infection alters cellular chromatin structure.
Herpes Simplex Virus VP16, but Not ICP0, Is Required To Reduce Histone Occupancy and Enhance Histone Acetylation on Viral Genomes in U2OS Osteosarcoma Cells

PubMed Central

Hancock, Meaghan H.; Cliffe, Anna R.; Knipe, David M.; Smiley, James R.

2010-01-01

The herpes simplex virus (HSV) genome rapidly becomes associated with histones after injection into the host cell nucleus. The viral proteins ICP0 and VP16 are required for efficient viral gene expression and have been implicated in reducing the levels of underacetylated histones on the viral genome, raising the possibility that high levels of underacetylated histones inhibit viral gene expression. The U2OS osteosarcoma cell line is permissive for replication of ICP0 and VP16 mutants and appears to lack an innate antiviral repression mechanism present in other cell types. We therefore used chromatin immunoprecipitation to determine whether U2OS cells are competent to load histones onto HSV DNA and, if so, whether ICP0 and/or VP16 are required to reduce histone occupancy and enhance acetylation in this cell type. High levels of underacetylated histone H3 accumulated at several locations on the viral genome in the absence of VP16 activation function; in contrast, an ICP0 mutant displayed markedly reduced histone levels and enhanced acetylation, similar to wild-type HSV. These results demonstrate that U2OS cells are competent to load underacetylated histones onto HSV DNA and uncover an unexpected role for VP16 in modulating chromatin structure at viral early and late loci. One interpretation of these findings is that ICP0 and VP16 affect viral chromatin structure through separate pathways, and the pathway targeted by ICP0 is defective in U2OS cells. We also show that HSV infection results in decreased histone levels on some actively transcribed genes within the cellular genome, demonstrating that viral infection alters cellular chromatin structure. PMID:19939931
Exome-wide DNA capture and next generation sequencing in domestic and wild species.

PubMed

Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon

2011-07-05

Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.
MIPS PlantsDB: a database framework for comparative plant genome research.

PubMed

Nussbaumer, Thomas; Martis, Mihaela M; Roessner, Stephan K; Pfeifer, Matthias; Bader, Kai C; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel

2013-01-01

The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB-plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834-D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building up on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB.
MIPS PlantsDB: a database framework for comparative plant genome research

PubMed Central

Nussbaumer, Thomas; Martis, Mihaela M.; Roessner, Stephan K.; Pfeifer, Matthias; Bader, Kai C.; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel

2013-01-01

The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB–plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834–D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building up on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB. PMID:23203886
Progress of targeted genome modification approaches in higher plants.

PubMed

Cardi, Teodoro; Neal Stewart, C

2016-07-01

Transgene integration in plants is based on illegitimate recombination between non-homologous sequences. The low control of integration site and number of (trans/cis)gene copies might have negative consequences on the expression of transferred genes and their insertion within endogenous coding sequences. The first experiments conducted to use precise homologous recombination for gene integration commenced soon after the first demonstration that transgenic plants could be produced. Modern transgene targeting categories used in plant biology are: (a) homologous recombination-dependent gene targeting; (b) recombinase-mediated site-specific gene integration; (c) oligonucleotide-directed mutagenesis; (d) nuclease-mediated site-specific genome modifications. New tools enable precise gene replacement or stacking with exogenous sequences and targeted mutagenesis of endogenous sequences. The possibility to engineer chimeric designer nucleases, which are able to target virtually any genomic site, and use them for inducing double-strand breaks in host DNA creates new opportunities for both applied plant breeding and functional genomics. CRISPR is the most recent technology available for precise genome editing. Its rapid adoption in biological research is based on its inherent simplicity and efficacy. Its utilization, however, depends on available sequence information, especially for genome-wide analysis. We will review the approaches used for genome modification, specifically those for affecting gene integration and modification in higher plants. For each approach, the advantages and limitations will be noted. We also will speculate on how their actual commercial development and implementation in plant breeding will be affected by governmental regulations.
Target Capture during Mos1 Transposition

PubMed Central

Pflieger, Aude; Jaillet, Jerôme; Petit, Agnès; Augé-Gouillou, Corinne; Renault, Sylvaine

2014-01-01

DNA transposition contributes to genomic plasticity. Target capture is a key step in the transposition process, because it contributes to the selection of new insertion sites. Little or nothing is known about how eukaryotic mariner DNA transposons trigger this step. In the case of Mos1, biochemistry and crystallography have deciphered several inverted terminal repeat-transposase complexes that are intermediates during transposition. However, the target capture complex is still unknown. Here, we show that the preintegration complex (i.e., the excised transposon) is the only complex able to capture a target DNA. Mos1 transposase does not support target commitment, which has been proposed to explain Mos1 random genomic integrations within host genomes. We demonstrate that the TA dinucleotide used as the target is crucial both to target recognition and in the chemistry of the strand transfer reaction. Bent DNA molecules are better targets for the capture when the target DNA is nicked two nucleotides apart from the TA. They improve strand transfer when the target DNA contains a mismatch near the TA dinucleotide. PMID:24269942
Target capture during Mos1 transposition.

PubMed

Pflieger, Aude; Jaillet, Jerôme; Petit, Agnès; Augé-Gouillou, Corinne; Renault, Sylvaine

2014-01-03

DNA transposition contributes to genomic plasticity. Target capture is a key step in the transposition process, because it contributes to the selection of new insertion sites. Little or nothing is known about how eukaryotic mariner DNA transposons trigger this step. In the case of Mos1, biochemistry and crystallography have deciphered several inverted terminal repeat-transposase complexes that are intermediates during transposition. However, the target capture complex is still unknown. Here, we show that the preintegration complex (i.e., the excised transposon) is the only complex able to capture a target DNA. Mos1 transposase does not support target commitment, which has been proposed to explain Mos1 random genomic integrations within host genomes. We demonstrate that the TA dinucleotide used as the target is crucial both to target recognition and in the chemistry of the strand transfer reaction. Bent DNA molecules are better targets for the capture when the target DNA is nicked two nucleotides apart from the TA. They improve strand transfer when the target DNA contains a mismatch near the TA dinucleotide.
Potential in vivo roles of nucleic acid triple-helices

PubMed Central

Buske, Fabian A

2011-01-01

The ability of double-stranded DNA to form a triple-helical structure by hydrogen bonding with a third strand is well established, but the biological functions of these structures remain largely unknown. There is considerable albeit circumstantial evidence for the existence of nucleic triplexes in vivo and their potential participation in a variety of biological processes including chromatin organization, DNA repair, transcriptional regulation and RNA processing has been investigated in a number of studies to date. There is also a range of possible mechanisms to regulate triplex formation through differential expression of triplex-forming RNAs, alteration of chromatin accessibility, sequence unwinding and nucleotide modifications. With the advent of next generation sequencing technology combined with targeted approaches to isolate triplexes, it is now possible to survey triplex formation with respect to their genomic context, abundance and dynamical changes during differentiation and development, which may open up new vistas in understanding genome biology and gene regulation. PMID:21525785
Draft versus finished sequence data for DNA and protein diagnostic signature development

PubMed Central

Gardner, Shea N.; Lam, Marisa W.; Smith, Jason R.; Torres, Clinton L.; Slezak, Tom R.

2005-01-01

Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors or NNs) to sequence. We use SAP to assess whether draft data are sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10^-3 to 10^-5 (∼8× coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ∼1% (3× to 6× coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures. PMID:16243783
MODBASE, a database of annotated comparative protein structure models

PubMed Central

Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej

2002-01-01

MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309
Daclatasvir Prevents Hepatitis C Virus Infectivity by Blocking Transfer of the Viral Genome to Assembly Sites.

PubMed

Boson, Bertrand; Denolly, Solène; Turlure, Fanny; Chamot, Christophe; Dreux, Marlène; Cosset, François-Loïc

2017-03-01

Daclatasvir is a direct-acting antiviral agent and potent inhibitor of NS5A, which is involved in replication of the hepatitis C virus (HCV) genome, presumably via membranous web shaping, and assembly of new virions, likely via transfer of the HCV RNA genome to viral particle assembly sites. Daclatasvir inhibits the formation of new membranous web structures and, ultimately, of replication complex vesicles, but also inhibits an early assembly step. We investigated the relationship between daclatasvir-induced clustering of HCV proteins, intracellular localization of viral RNAs, and inhibition of viral particle assembly. Cell-culture-derived HCV particles were produced from Huh7.5 hepatocarcinoma cells in presence of daclatasvir for short time periods. Infectivity and production of physical particles were quantified and producer cells were subjected to subcellular fractionation. Intracellular colocalization between core, E2, NS5A, NS4B proteins, and viral RNAs was quantitatively analyzed by confocal microscopy and by structured illumination microscopy. Short exposure of HCV-infected cells to daclatasvir reduced viral assembly and induced clustering of structural proteins with non-structural HCV proteins, including core, E2, NS4B, and NS5A. These clustered structures appeared to be inactive assembly platforms, likely owing to loss of functional connection with replication complexes. Daclatasvir greatly reduced delivery of viral genomes to these core clusters without altering HCV RNA colocalization with NS5A. In contrast, daclatasvir neither induced clustered structures nor inhibited HCV assembly in cells infected with a daclatasvir-resistant mutant (NS5A-Y93H), indicating that daclatasvir targets a mutual, specific function of NS5A inhibiting both processes. In addition to inhibiting replication complex biogenesis, daclatasvir prevents viral assembly by blocking transfer of the viral genome to assembly sites. 
This leads to clustering of HCV proteins because viral particles and replication complex vesicles cannot form or egress. This dual mode of action of daclatasvir could explain its efficacy in blocking HCV replication in cultured cells and in treatment of patients with HCV infection.
Crowd Sourcing a New Paradigm for Interactome Driven Drug Target Identification in Mycobacterium tuberculosis

PubMed Central

Rohira, Harsha; Bhat, Ashwini G.; Passi, Anurag; Mukherjee, Keya; Choudhary, Kumari Sonal; Kumar, Vikas; Arora, Anshula; Munusamy, Prabhakaran; Subramanian, Ahalyaa; Venkatachalam, Aparna; S, Gayathri; Raj, Sweety; Chitra, Vijaya; Verma, Kaveri; Zaheer, Salman; J, Balaganesh; Gurusamy, Malarvizhi; Razeeth, Mohammed; Raja, Ilamathi; Thandapani, Madhumohan; Mevada, Vishal; Soni, Raviraj; Rana, Shruti; Ramanna, Girish Muthagadhalli; Raghavan, Swetha; Subramanya, Sunil N.; Kholia, Trupti; Patel, Rajesh; Bhavnani, Varsha; Chiranjeevi, Lakavath; Sengupta, Soumi; Singh, Pankaj Kumar; Atray, Naresh; Gandhi, Swati; Avasthi, Tiruvayipati Suma; Nisthar, Shefin; Anurag, Meenakshi; Sharma, Pratibha; Hasija, Yasha; Dash, Debasis; Sharma, Arun; Scaria, Vinod; Thomas, Zakir; Chandra, Nagasuma; Brahmachari, Samir K.; Bhardwaj, Anshu

2012-01-01

A decade since the availability of Mycobacterium tuberculosis (Mtb) genome sequence, no promising drug has seen the light of day. This not only indicates the challenges in discovering new drugs but also suggests a gap in our current understanding of Mtb biology. We attempt to bridge this gap by carrying out extensive re-annotation and constructing a systems level protein interaction map of Mtb with an objective of finding novel drug target candidates. Towards this, we synergized crowd sourcing and social networking methods through an initiative ‘Connect to Decode’ (C2D) to generate the first and largest manually curated interactome of Mtb termed ‘interactome pathway’ (IPW), encompassing a total of 1434 proteins connected through 2575 functional relationships. Interactions leading to gene regulation, signal transduction, metabolism, structural complex formation have been catalogued. In the process, we have functionally annotated 87% of the Mtb genome in the context of gene products. We further combine IPW with STRING based network to report central proteins, which may be assessed as potential drug targets for development of drugs with least possible side effects. The fact that five of the 17 predicted drug targets are already experimentally validated either genetically or biochemically lends credence to our unique approach. PMID:22808064
Structural insight into SUMO chain recognition and manipulation by the ubiquitin ligase RNF4

PubMed Central

Xu, Yingqi; Plechanovová, Anna; Simpson, Peter; Marchant, Jan; Leidecker, Orsolya; Kraatz, Sebastian; Hay, Ronald T.; Matthews, Steve J.

2014-01-01

The small ubiquitin-like modifier (SUMO) can form polymeric chains that are important signals in cellular processes such as meiosis, genome maintenance and stress response. The SUMO-targeted ubiquitin ligase RNF4 engages with SUMO chains on linked substrates and catalyses their ubiquitination, which targets substrates for proteasomal degradation. Here we use a segmental labelling approach combined with solution nuclear magnetic resonance (NMR) spectroscopy and biochemical characterization to reveal how RNF4 manipulates the conformation of the SUMO chain, thereby facilitating optimal delivery of the distal SUMO domain for ubiquitin transfer. PMID:24969970
Generation of knock-in primary human T cells using Cas9 ribonucleoproteins

DOE PAGES

Schumann, Kathrin; Lin, Steven; Boyer, Eric; ...

2015-07-27

T-cell genome engineering holds great promise for cell-based therapies for cancer, HIV, primary immune deficiencies, and autoimmune diseases, but genetic manipulation of human T cells has been challenging. Improved tools are needed to efficiently “knock out” genes and “knock in” targeted genome modifications to modulate T-cell function and correct disease-associated mutations. CRISPR/Cas9 technology is facilitating genome engineering in many cell types, but in human T cells its efficiency has been limited and it has not yet proven useful for targeted nucleotide replacements. Here we report efficient genome engineering in human CD4+ T cells using Cas9:single-guide RNA ribonucleoproteins (Cas9 RNPs). Cas9 RNPs allowed ablation of CXCR4, a coreceptor for HIV entry. Cas9 RNP electroporation caused up to ~40% of cells to lose high-level cell-surface expression of CXCR4, and edited cells could be enriched by sorting based on low CXCR4 expression. Importantly, Cas9 RNPs paired with homology-directed repair template oligonucleotides generated a high frequency of targeted genome modifications in primary T cells. Targeted nucleotide replacement was achieved in CXCR4 and PD-1 (PDCD1), a regulator of T-cell exhaustion that is a validated target for tumor immunotherapy. Deep sequencing of a target site confirmed that Cas9 RNPs generated knock-in genome modifications with up to ~20% efficiency, which accounted for up to approximately one-third of total editing events. These results establish Cas9 RNP technology for diverse experimental and therapeutic genome engineering applications in primary human T cells.
Generation of knock-in primary human T cells using Cas9 ribonucleoproteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schumann, Kathrin; Lin, Steven; Boyer, Eric

T-cell genome engineering holds great promise for cell-based therapies for cancer, HIV, primary immune deficiencies, and autoimmune diseases, but genetic manipulation of human T cells has been challenging. Improved tools are needed to efficiently “knock out” genes and “knock in” targeted genome modifications to modulate T-cell function and correct disease-associated mutations. CRISPR/Cas9 technology is facilitating genome engineering in many cell types, but in human T cells its efficiency has been limited and it has not yet proven useful for targeted nucleotide replacements. Here we report efficient genome engineering in human CD4+ T cells using Cas9:single-guide RNA ribonucleoproteins (Cas9 RNPs). Cas9 RNPs allowed ablation of CXCR4, a coreceptor for HIV entry. Cas9 RNP electroporation caused up to ~40% of cells to lose high-level cell-surface expression of CXCR4, and edited cells could be enriched by sorting based on low CXCR4 expression. Importantly, Cas9 RNPs paired with homology-directed repair template oligonucleotides generated a high frequency of targeted genome modifications in primary T cells. Targeted nucleotide replacement was achieved in CXCR4 and PD-1 (PDCD1), a regulator of T-cell exhaustion that is a validated target for tumor immunotherapy. Deep sequencing of a target site confirmed that Cas9 RNPs generated knock-in genome modifications with up to ~20% efficiency, which accounted for up to approximately one-third of total editing events. These results establish Cas9 RNP technology for diverse experimental and therapeutic genome engineering applications in primary human T cells.
Genomic Copy Number Dictates a Gene-Independent Cell Response to CRISPR/Cas9 Targeting | Office of Cancer Genomics

Cancer.gov

The CRISPR/Cas9 system enables genome editing and somatic cell genetic screens in mammalian cells. We performed genome-scale loss-of-function screens in 33 cancer cell lines to identify genes essential for proliferation/survival and found a strong correlation between increased gene copy number and decreased cell viability after genome editing. Within regions of copy-number gain, CRISPR/Cas9 targeting of both expressed and unexpressed genes, as well as intergenic loci, led to significantly decreased cell proliferation through induction of a G2 cell-cycle arrest.

Genomes by design

PubMed Central

Haimovich, Adrian D.; Muir, Paul; Isaacs, Farren J.

2016-01-01

Next-generation DNA sequencing has revealed the complete genome sequences of numerous organisms, establishing a fundamental and growing understanding of genetic variation and phenotypic diversity. Engineering at the gene, network and whole-genome scale aims to introduce targeted genetic changes both to explore emergent phenotypes and to introduce new functionalities. Expansion of these approaches into massively parallel platforms establishes the ability to generate targeted genome modifications, elucidating causal links between genotype and phenotype, as well as the ability to design and reprogramme organisms. In this Review, we explore techniques and applications in genome engineering, outlining key advances and defining challenges. PMID:26260262
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

PubMed Central

Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

2016-01-01

Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
The Neisseria meningitidis CRISPR-Cas9 System Enables Specific Genome Editing in Mammalian Cells.

PubMed

Lee, Ciaran M; Cradick, Thomas J; Bao, Gang

2016-03-01

The clustered regularly-interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) system from Streptococcus pyogenes (Spy) has been successfully adapted for RNA-guided genome editing in a wide range of organisms. However, numerous reports have indicated that Spy CRISPR-Cas9 systems may have significant off-target cleavage of genomic DNA sequences differing from the intended on-target site. Here, we report the performance of the Neisseria meningitidis (Nme) CRISPR-Cas9 system that requires a longer protospacer-adjacent motif for site-specific cleavage, and present a comparison between the Spy and Nme CRISPR-Cas9 systems targeting the same protospacer sequence. The results with the native crRNA and tracrRNA as well as a chimeric single guide RNA for the Nme CRISPR-Cas9 system were also compared. Our results suggest that, compared with the Spy system, the Nme CRISPR-Cas9 system has similar or lower on-target cleavage activity but a reduced overall off-target effect on a genomic level when sites containing three or fewer mismatches are considered. Thus, the Nme CRISPR-Cas9 system may represent a safer alternative for precision genome engineering applications.
The Neisseria meningitidis CRISPR-Cas9 System Enables Specific Genome Editing in Mammalian Cells

PubMed Central

Lee, Ciaran M; Cradick, Thomas J; Bao, Gang

2016-01-01

The clustered regularly-interspaced short palindromic repeats (CRISPR)—CRISPR-associated (Cas) system from Streptococcus pyogenes (Spy) has been successfully adapted for RNA-guided genome editing in a wide range of organisms. However, numerous reports have indicated that Spy CRISPR-Cas9 systems may have significant off-target cleavage of genomic DNA sequences differing from the intended on-target site. Here, we report the performance of the Neisseria meningitidis (Nme) CRISPR-Cas9 system that requires a longer protospacer-adjacent motif for site-specific cleavage, and present a comparison between the Spy and Nme CRISPR-Cas9 systems targeting the same protospacer sequence. The results with the native crRNA and tracrRNA as well as a chimeric single guide RNA for the Nme CRISPR-Cas9 system were also compared. Our results suggest that, compared with the Spy system, the Nme CRISPR-Cas9 system has similar or lower on-target cleavage activity but a reduced overall off-target effect on a genomic level when sites containing three or fewer mismatches are considered. Thus, the Nme CRISPR-Cas9 system may represent a safer alternative for precision genome engineering applications. PMID:26782639
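The abstract's criterion of "sites containing three or fewer mismatches" can be illustrated with a minimal sketch. The function names, the threshold parameter, and the toy sequences below are assumptions made for illustration; they are not the authors' off-target analysis, which is not described in the abstract, and the sketch ignores PAM requirements and bulges.

```python
def count_mismatches(guide, site):
    """Count positions where two equal-length DNA strings differ (Hamming distance)."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def candidate_off_targets(guide, sites, max_mismatches=3):
    """Return sites that differ from the guide by 1..max_mismatches bases.

    Hypothetical helper: a perfect match (0 mismatches) is treated as the
    on-target site and is excluded from the result.
    """
    return [s for s in sites if 0 < count_mismatches(guide, s) <= max_mismatches]


if __name__ == "__main__":
    # Toy 10-nt sequences, shortened for readability (real protospacers are longer).
    guide = "ACGTACGTAC"
    sites = ["ACGTACGTAC", "ACGTACGTAA", "TTTTACGTAC", "TTTTTTTTTT"]
    print(candidate_off_targets(guide, sites))
```

A simple position-by-position comparison is used here because the abstract defines the off-target set only by mismatch count.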
Development of CRISPR/Cas9 mediated virus resistance in agriculturally important crops.

PubMed

Khatodia, Surender; Bhatotia, Kirti; Tuteja, Narendra

2017-05-04

The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated nuclease 9 (Cas9) system of targeted genome editing has already revolutionized plant science research. It is an RNA-guided, programmable endonuclease-based system composed of two components: the Cas9 nuclease and an engineered guide RNA that can target any DNA sequence of the form N20-NGG for novel genome editing applications. The CRISPR/Cas9 technology of targeted genome editing has recently been applied to impart virus resistance in plants. The robustness, wide adaptability, and easy engineering of this system have proved its potential as an antiviral tool for plants. Novel DNA-free genome editing using the preassembled Cas9/gRNA ribonucleoprotein complex has been proposed as a future route to virus resistance in any plant species. In this review we also discuss reports of CRISPR/Cas9-mediated virus resistance against geminiviruses by targeting the viral genome, and a transgene-free strategy against RNA viruses by targeting host plant factors. In conclusion, CRISPR/Cas9 technology will provide a more durable and broad-spectrum viral resistance in agriculturally important crops, which will eventually lead to public acceptance and commercialization in the near future.
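The "N20-NGG" target form mentioned in this abstract (a 20-nucleotide protospacer followed by an NGG protospacer-adjacent motif) can be sketched as a simple sequence scan. The function name and the example sequence are hypothetical; the sketch searches only the strand it is given and does no specificity or efficiency scoring.

```python
import re


def find_n20_ngg_sites(seq):
    """List candidate target sites of the form N20-NGG on the given strand.

    Hypothetical helper: returns (start, protospacer, pam) tuples with
    0-based start positions. Only A/C/G/T characters are matched.
    """
    seq = seq.upper()
    sites = []
    # A zero-width lookahead lets overlapping sites be reported as well.
    for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq):
        sites.append((m.start(), m.group(1), m.group(2)))
    return sites


if __name__ == "__main__":
    # Illustrative sequence only: 20 nt followed by the PAM "AGG".
    example = "ACGT" * 5 + "AGG" + "CC"
    for start, protospacer, pam in find_n20_ngg_sites(example):
        print(start, protospacer, pam)
```

To cover both strands, the same function could be applied to the reverse complement of the sequence.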
A new age in functional genomics using CRISPR/Cas9 in arrayed library screening.

PubMed

Agrotis, Alexander; Ketteler, Robin

2015-01-01

CRISPR technology has rapidly changed the face of biological research, such that precise genome editing has now become routine for many labs within several years of its initial development. What makes CRISPR/Cas9 so revolutionary is the ability to target a protein (Cas9) to an exact genomic locus, through designing a specific short complementary nucleotide sequence, that together with a common scaffold sequence, constitute the guide RNA bridging the protein and the DNA. Wild-type Cas9 cleaves both DNA strands at its target sequence, but this protein can also be modified to exert many other functions. For instance, by attaching an activation domain to catalytically inactive Cas9 and targeting a promoter region, it is possible to stimulate the expression of a specific endogenous gene. In principle, any genomic region can be targeted, and recent efforts have successfully generated pooled guide RNA libraries for coding and regulatory regions of human, mouse and Drosophila genomes with high coverage, thus facilitating functional phenotypic screening. In this review, we will highlight recent developments in the area of CRISPR-based functional genomics and discuss potential future directions, with a special focus on mammalian cell systems and arrayed library screening.
Structure-based functional annotation of putative conserved proteins having lyase activity from Haemophilus influenzae.

PubMed

Shahbaaz, Mohd; Ahmad, Faizan; Imtaiyaz Hassan, Md

2015-06-01

Haemophilus influenzae is a small pleomorphic Gram-negative bacterium which causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins, for which no experimental information is available, are termed hypothetical proteins (HPs). Predicting their functions would help to understand more precisely the biochemical pathways and mechanism of pathogenesis of H. influenzae. During the extensive analysis of the H. influenzae genome, we found eight HPs showing lyase activity. Subsequently, we modeled and analyzed the three-dimensional structures of all these HPs to determine their functions more precisely. We found that these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair, and are considered key players in various biological processes. These enzymes are essential for the survival and pathogenesis of H. influenzae and may therefore be considered potential targets for structure-based rational drug design. Our structure-function relationship analysis will be useful to search for and design potential lead molecules based on the structure of these lyases, for drug design and discovery.
CRISPR/Cas9 for genome editing: progress, implications and challenges.

PubMed

Zhang, Feng; Wen, Yan; Guo, Xiong

2014-09-15

The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system provides a robust and multiplexable genome editing tool, enabling researchers to precisely manipulate specific genomic elements and facilitating the elucidation of target gene function in biology and diseases. CRISPR/Cas9 comprises a nonspecific Cas9 nuclease and a set of programmable sequence-specific CRISPR RNAs (crRNAs), which can guide Cas9 to cleave DNA and generate double-strand breaks at target sites. The subsequent cellular DNA repair process leads to desired insertions, deletions or substitutions at target sites. The specificity of CRISPR/Cas9-mediated DNA cleavage requires target sequences matching the crRNA and a protospacer adjacent motif located downstream of the target sequences. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the clinical therapeutic potential of CRISPR/Cas9 in the future.
The molecular epidemiological study of bovine leukemia virus infection in Myanmar cattle.

PubMed

Polat, Meripet; Moe, Hla Hla; Shimogiri, Takeshi; Moe, Kyaw Kyaw; Takeshima, Shin-Nosuke; Aida, Yoko

2017-02-01

Bovine leukemia virus (BLV) is the etiological agent of enzootic bovine leukosis, which is the most common neoplastic disease of cattle. BLV infects cattle worldwide and affects both health status and productivity. However, no studies have examined the distribution of BLV in Myanmar, and the genetic characteristics of Myanmar BLV strains are unknown. Therefore, the aim of this study was to detect BLV infection in Myanmar and examine genetic variability. Blood samples were obtained from 66 cattle from different farms in four townships of the Nay Pyi Taw Union Territory of central Myanmar. BLV provirus was detected by nested PCR and real-time PCR targeting BLV long terminal repeats. Results were confirmed by nested PCR targeting the BLV env-gp51 gene and real-time PCR targeting the BLV tax gene. Out of 66 samples, six (9.1 %) were positive for BLV provirus. A phylogenetic tree, constructed using five distinct partial and complete env-gp51 sequences from BLV strains isolated from three different townships, indicated that Myanmar strains were genotype-10. A phylogenetic tree constructed from whole genome sequences obtained by sequencing cloned, overlapping PCR products from two Myanmar strains confirmed the existence of genotype-10 in Myanmar. Comparative analysis of complete genome sequences identified genotype-10-specific amino acid substitutions in both structural and non-structural genes, thereby distinguishing genotype-10 strains from other known genotypes. This study provides information regarding BLV infection levels in Myanmar and confirms that genotype-10 is circulating in Myanmar.
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

PubMed Central

Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

2009-01-01

Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, to characterize salmon Kozak motifs and polyadenylation signal variation, and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS).
This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
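The idea of looking for 7-mers that recur across 3'UTRs can be sketched as a simple presence count. This is an illustrative sketch only and is not the authors' procedure: the function name, the `min_fraction` threshold and the exact-match criterion are assumptions made for the example.

```python
from collections import Counter

def shared_kmers(utrs, k=7, min_fraction=0.5):
    """Return k-mers present in at least min_fraction of the sequences."""
    presence = Counter()
    for utr in utrs:
        utr = utr.upper()
        # A set is used so each sequence contributes at most once per k-mer.
        kmers = {utr[i:i + k] for i in range(len(utr) - k + 1)}
        presence.update(kmers)
    threshold = min_fraction * len(utrs)
    return {kmer: n for kmer, n in presence.items() if n >= threshold}
```

The returned dictionary maps each retained 7-mer to the number of sequences containing it; such candidates could then be compared against known miRNA target sequences.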
Diverse Class 2 CRISPR-Cas Effector Proteins for Genome Engineering Applications.

PubMed

Pyzocha, Neena K; Chen, Sidi

2018-02-16

CRISPR-Cas genome editing technologies have revolutionized modern molecular biology by making targeted DNA edits simple and scalable. These technologies are developed by domesticating naturally occurring microbial adaptive immune systems that display wide diversity of functionality for targeted nucleic acid cleavage. Several CRISPR-Cas single effector enzymes have been characterized and engineered for use in mammalian cells. The unique properties of the single effector enzymes can make a critical difference in experimental use or targeting specificity. This review describes known single effector enzymes and discusses their use in genome engineering applications.
Applications of CRISPR genome editing technology in drug target identification and validation.

PubMed

Lu, Quinn; Livi, George P; Modha, Sundip; Yusa, Kosuke; Macarrón, Ricardo; Dow, David J

2017-06-01

The analysis of pharmaceutical industry data indicates that the major reason for drug candidates failing in late stage clinical development is lack of efficacy, with a high proportion of these due to erroneous hypotheses about target to disease linkage. More than ever, there is a requirement to better understand potential new drug targets and their role in disease biology in order to reduce attrition in drug development. Genome editing technology enables precise modification of individual protein coding genes, as well as noncoding regulatory sequences, enabling the elucidation of functional effects in human disease relevant cellular systems. Areas covered: This article outlines applications of CRISPR genome editing technology in target identification and target validation studies. Expert opinion: Applications of CRISPR technology in target validation studies are in evidence and gaining momentum. Whilst technical challenges remain, we are on the cusp of CRISPR being applied in complex cell systems such as iPS derived differentiated cells and stem cell derived organoids. In the meantime, our experience to date suggests that precise genome editing of putative targets in primary cell systems is possible, offering more human disease relevant systems than conventional cell lines.
A bend, flip and trap mechanism for transposon integration

PubMed Central

Morris, Elizabeth R; Grey, Heather; McKenzie, Grant; Jones, Anita C; Richardson, Julia M

2016-01-01

Cut-and-paste DNA transposons of the mariner/Tc1 family are useful tools for genome engineering and are inserted specifically at TA target sites. A crystal structure of the mariner transposase Mos1 (derived from Drosophila mauritiana), in complex with transposon ends covalently joined to target DNA, portrays the transposition machinery after DNA integration. It reveals severe distortion of target DNA and flipping of the target adenines into extra-helical positions. Fluorescence experiments confirm dynamic base flipping in solution. Transposase residues W159, R186, F187 and K190 stabilise the target DNA distortions and are required for efficient transposon integration and transposition in vitro. Transposase recognises the flipped target adenines via base-specific interactions with backbone atoms, offering a molecular basis for TA target sequence selection. Our results will provide a template for re-designing mariner/Tc1 transposases with modified target specificities. DOI: http://dx.doi.org/10.7554/eLife.15537.001 PMID:27223327
A protocol for generating a high-quality genome-scale metabolic reconstruction.

PubMed

Thiele, Ines; Palsson, Bernhard Ø

2010-01-01

Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have been developed over the last 10 years. These reconstructions represent structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates a myriad of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process.
A protocol for generating a high-quality genome-scale metabolic reconstruction

PubMed Central

Thiele, Ines; Palsson, Bernhard Ø.

2011-01-01

Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have developed over the past 10 years. These reconstructions represent structured knowledge-bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates myriad computational biological studies including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics, and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge-bases. Here, we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction as well as common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process. PMID:20057383
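As an illustration of converting a reconstruction into a mathematical format, the sketch below builds a stoichiometric matrix from a reaction list. It assumes the format in question is the stoichiometric matrix commonly used in constraint-based modeling; the toy network and all names are hypothetical and are not taken from the protocol.

```python
def stoichiometric_matrix(reactions):
    """Build S (metabolites x reactions) from {reaction: {metabolite: coefficient}}."""
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    names = list(reactions)
    matrix = [[reactions[r].get(m, 0) for r in names] for m in metabolites]
    return metabolites, names, matrix

def net_production(matrix, fluxes):
    """Compute S . v, the net production rate of each metabolite."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in matrix]

# Hypothetical three-reaction toy network: uptake of A, conversion A -> B, export of B.
toy = {
    "uptake": {"A": 1},
    "convert": {"A": -1, "B": 1},
    "export": {"B": -1},
}
metabolites, names, S = stoichiometric_matrix(toy)
```

A flux vector with equal flux through all three reactions gives zero net production for both metabolites, which is the steady-state condition S . v = 0.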
Murine Hyperglycemic Vasculopathy and Cardiomyopathy: Whole-Genome Gene Expression Analysis Predicts Cellular Targets and Regulatory Networks Influenced by Mannose Binding Lectin

PubMed Central

Zou, Chenhui; La Bonte, Laura R.; Pavlov, Vasile I.; Stahl, Gregory L.

2012-01-01

Hyperglycemia, in the absence of type 1 or 2 diabetes, is an independent risk factor for cardiovascular disease. We have previously demonstrated a central role for mannose binding lectin (MBL)-mediated cardiac dysfunction in acute hyperglycemic mice. In this study, we applied whole-genome microarray data analysis to investigate MBL’s role in systematic gene expression changes. The data predict possible intracellular events taking place in multiple cellular compartments such as enhanced insulin signaling pathway sensitivity, promoted mitochondrial respiratory function, improved cellular energy expenditure and protein quality control, improved cytoskeleton structure, and facilitated intracellular trafficking, all of which may contribute to the organismal health of MBL null mice against acute hyperglycemia. Our data show a tight association between gene expression profile and tissue function which might be a very useful tool in predicting cellular targets and regulatory networks connected with in vivo observations, providing clues for further mechanistic studies. PMID:22375142
High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.

PubMed

Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory

2017-12-01

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.
1.45 Å resolution structure of SRPN18 from the malaria vector Anopheles gambiae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meekins, David A.; Zhang, Xin; Battaile, Kevin P.

Serine protease inhibitors (serpins) in insects function within development, wound healing and immunity. The genome of the African malaria vector, Anopheles gambiae, encodes 23 distinct serpin proteins, several of which are implicated in disease-relevant physiological responses. A. gambiae serpin 18 (SRPN18) was previously categorized as non-inhibitory based on the sequence of its reactive-center loop (RCL), a region responsible for targeting and initiating protease inhibition. The crystal structure of A. gambiae SRPN18 was determined to a resolution of 1.45 Å, including nearly the entire RCL in one of the two molecules in the asymmetric unit. The structure reveals that the SRPN18 RCL is extremely short and constricted, a feature associated with noncanonical inhibitors or non-inhibitory serpin superfamily members. Furthermore, the SRPN18 RCL does not contain a suitable protease target site and contains a large number of prolines. The SRPN18 structure therefore reveals a unique RCL architecture among the highly conserved serpin fold.
Crystal Structure of the Human, FIC-Domain Containing Protein HYPE and Implications for Its Functions

PubMed Central

Bunney, Tom D.; Cole, Ambrose R.; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W.; Katan, Matilda

2014-01-01

Summary Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition. PMID:25435325
Crystal structure of the human, FIC-domain containing protein HYPE and implications for its functions.

PubMed

Bunney, Tom D; Cole, Ambrose R; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W; Katan, Matilda

2014-12-02

Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition.

1.45 Å resolution structure of SRPN18 from the malaria vector Anopheles gambiae

PubMed Central

Meekins, David A.; Zhang, Xin; Battaile, Kevin P.; Lovell, Scott; Michel, Kristin

2016-01-01

Serine protease inhibitors (serpins) in insects function within development, wound healing and immunity. The genome of the African malaria vector, Anopheles gambiae, encodes 23 distinct serpin proteins, several of which are implicated in disease-relevant physiological responses. A. gambiae serpin 18 (SRPN18) was previously categorized as non-inhibitory based on the sequence of its reactive-center loop (RCL), a region responsible for targeting and initiating protease inhibition. The crystal structure of A. gambiae SRPN18 was determined to a resolution of 1.45 Å, including nearly the entire RCL in one of the two molecules in the asymmetric unit. The structure reveals that the SRPN18 RCL is extremely short and constricted, a feature associated with noncanonical inhibitors or non-inhibitory serpin superfamily members. Furthermore, the SRPN18 RCL does not contain a suitable protease target site and contains a large number of prolines. The SRPN18 structure therefore reveals a unique RCL architecture among the highly conserved serpin fold. PMID:27917832
[Advances in CRISPR-Cas-mediated genome editing system in plants].

PubMed

Wang, Chun; Wang, Kejian

2017-10-25

Targeted genome editing technology is an important tool to study the function of genes and to modify organisms at the genetic level. Recently, the CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) system has emerged as an efficient tool for specific genome editing in animals and plants. The CRISPR-Cas system uses a CRISPR-associated endonuclease and a guide RNA to generate double-strand breaks at the target DNA site, subsequently leading to genetic modifications. It has received widespread attention because it manipulates genomes simply, easily and with high specificity. This review summarizes recent advances in diverse applications of the CRISPR-Cas toolkit in plant research and crop breeding, including expanding the range of genome editing, precise editing of a target base, and efficient DNA-free genome editing technology. This review also discusses potential challenges and future application prospects, and provides a useful reference for researchers who are interested in this field.
RNA-dependent DNA endonuclease Cas9 of the CRISPR system: Holy Grail of genome editing?

PubMed

Gasiunas, Giedrius; Siksnys, Virginijus

2013-11-01

Tailor-made nucleases for precise genome modification, such as zinc finger or TALE nucleases, currently represent the state-of-the-art for genome editing. These nucleases combine a programmable protein module which guides the enzyme to the target site with a nuclease domain which cuts DNA at the addressed site. Reprogramming of these nucleases to cut genomes at specific locations requires major protein engineering efforts. RNA-guided DNA endonuclease Cas9 of the type II (clustered regularly interspaced short palindromic repeat) CRISPR-Cas system uses CRISPR RNA (crRNA) as a guide to locate the DNA target and the Cas9 protein to cut DNA. Easy programmability of the Cas9 endonuclease using customizable RNAs brings unprecedented flexibility and versatility for targeted genome modification. We highlight the potential of the Cas9 RNA-guided DNA endonuclease as a novel tool for genome surgery, and discuss possible constraints and future prospects.
Host and viral RNA-binding proteins involved in membrane targeting, replication and intercellular movement of plant RNA virus genomes

PubMed Central

Hyodo, Kiwamu; Kaido, Masanori; Okuno, Tetsuro

2014-01-01

Many plant viruses have positive-strand RNA [(+)RNA] as their genome. Therefore, it is not surprising that RNA-binding proteins (RBPs) play important roles during (+)RNA virus infection in host plants. Increasing evidence demonstrates that viral and host RBPs play critical roles in multiple steps of the viral life cycle, including translation and replication of viral genomic RNAs, and their intra- and intercellular movement. Although studies focusing on the RNA-binding activities of viral and host proteins, and their associations with membrane targeting and intercellular movement of viral genomes, have been limited to a few viruses, these studies have provided important insights into the molecular mechanisms underlying the replication and movement of viral genomic RNAs. In this review, we briefly overview the currently defined roles of viral and host RBPs whose RNA-binding activity has been confirmed experimentally in association with their membrane targeting and the intercellular movement of plant RNA virus genomes. PMID:25071804
Insertion and deletion polymorphisms of the ancient AluS family in the human genome.

PubMed

Kryatova, Maria S; Steranka, Jared P; Burns, Kathleen H; Payer, Lindsay M

2017-01-01

Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. 
Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.
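The percentages quoted in this abstract can be checked against the stated counts with a few lines of arithmetic. The category labels below are shorthand invented for this sketch; the counts and the total of 48 are taken from the abstract.

```python
def category_shares(counts, total):
    """Percentage of the total in each category, rounded to whole percent."""
    return {name: round(100 * n / total) for name, n in counts.items()}

# Counts as reported in the abstract; the keys are illustrative labels only.
counts = {
    "deletion": 37,
    "non_classical_insertion": 2,
    "internal_priming": 1,
    "tprt_insertion": 7,
}
shares = category_shares(counts, 48)
unassigned = 48 - sum(counts.values())
print(shares, unassigned)
```

The rounded shares reproduce the reported 77%, 4%, 2% and 15%. The four categories sum to 47 of the 48 confirmed elements, so one element is not assigned to a category in the abstract text.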
Evolution of coding and non-coding genes in HOX clusters of a marsupial.

PubMed

Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B

2012-06-18

The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.
Evolution of coding and non-coding genes in HOX clusters of a marsupial

PubMed Central

2012-01-01

Background The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672
Integrated genomic analysis identifies the mitotic checkpoint kinase WEE1 as a novel therapeutic target in medulloblastoma

PubMed Central

2014-01-01

Background Medulloblastoma is the most common type of malignant brain tumor that afflicts children. Although recent advances in chemotherapy and radiation have improved outcomes, high-risk patients do poorly with significant morbidity. Methods To identify new molecular targets, we performed an integrated genomic analysis using structural and functional methods. Gene expression profiling in 16 medulloblastoma patient samples and subsequent gene set enrichment analysis indicated that cell cycle-related kinases were associated with disease development. In addition a kinome-wide small interfering RNA (siRNA) screen was performed to identify kinases that, when inhibited, could prevent cell proliferation. The two genome-scale analyses were combined to identify key vulnerabilities in medulloblastoma. The inhibition of one of the identified targets was further investigated using RNAi and a small molecule inhibitor. Results Combining the two analyses revealed that mitosis-related kinases were critical determinants of medulloblastoma cell proliferation. RNA interference (RNAi)-mediated knockdown of WEE1 kinase and other mitotic kinases was sufficient to reduce medulloblastoma cell proliferation. These data prompted us to examine the effects of inhibiting WEE1 by RNAi and by a small molecule inhibitor of WEE1, MK-1775, in medulloblastoma cell lines. MK-1775 inhibited the growth of medulloblastoma cell lines, induced apoptosis and increased DNA damage at nanomolar concentrations. Further, MK-1775 was synergistic with cisplatin in reducing medulloblastoma cell proliferation and resulted in an associated increase in cell death. In vivo MK-1775 suppressed medulloblastoma tumor growth as a single agent. Conclusions Taken together, these findings highlight mitotic kinases and, in particular, WEE1 as a rational therapeutic target for medulloblastoma. PMID:24661910
Incidence of genome structure, DNA asymmetry, and cell physiology on T-DNA integration in chromosomes of the phytopathogenic fungus Leptosphaeria maculans.

PubMed

Bourras, Salim; Meyer, Michel; Grandaubert, Jonathan; Lapalu, Nicolas; Fudal, Isabelle; Linglin, Juliette; Ollivier, Benedicte; Blaise, Françoise; Balesdent, Marie-Hélène; Rouxel, Thierry

2012-08-01

The ever-increasing generation of sequence data is accompanied by unsatisfactory functional annotation, and complex genomes, such as those of plants and filamentous fungi, show a large number of genes with no predicted or known function. For functional annotation of unknown or hypothetical genes, the production of collections of mutants using Agrobacterium tumefaciens-mediated transformation (ATMT) associated with genotyping and phenotyping has gained wide acceptance. ATMT is also widely used to identify pathogenicity determinants in pathogenic fungi. A systematic analysis of T-DNA borders was performed in an ATMT-mutagenized collection of the phytopathogenic fungus Leptosphaeria maculans to evaluate the features of T-DNA integration in its particular transposable element-rich compartmentalized genome. A total of 318 T-DNA tags were recovered and analyzed for biases in chromosome and genic compartments, existence of CG/AT skews at the insertion site, and occurrence of microhomologies between the T-DNA left border (LB) and the target sequence. Functional annotation of targeted genes was done using the Gene Ontology annotation. The T-DNA integration mainly targeted gene-rich, transcriptionally active regions, and it favored biological processes consistent with the physiological status of a germinating spore. T-DNA integration was strongly biased toward regulatory regions, and mainly promoters. Consistent with the T-DNA intranuclear-targeting model, the density of T-DNA insertion correlated with CG skew near the transcription initiation site. The existence of microhomologies between promoter sequences and the T-DNA LB flanking sequence was also consistent with T-DNA integration to host DNA mediated by homologous recombination based on the microhomology-mediated end-joining pathway.
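The skew analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common convention, (G - C) / (G + C), computed in sliding windows; the abstract does not state the exact definition, sign convention, or window size the authors used, so the function names and parameters here are hypothetical.

```python
def gc_skew(seq):
    """Return (G - C) / (G + C) for a DNA string, or 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_skew(seq, window, step):
    """Compute the skew in sliding windows; returns (window_start, skew) pairs."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence, not data from the study.
print(windowed_skew("GGCC", window=2, step=2))
```

Plotting such window values around annotated transcription start sites is one way a correlation between insertion density and skew could be inspected.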
SSMART: Sequence-structure motif identification for RNA-binding proteins.

PubMed

Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

2018-06-11

RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.
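To make the idea of a consensus string over a degenerate alphabet concrete, the following sketch scans an RNA sequence for a motif written in the standard IUPAC nucleotide codes. It covers only the sequence half of the representation: the structure-aware extension of the alphabet is specific to SSMART and is not reproduced here, and `find_motif` is a hypothetical helper, not part of that tool.

```python
# Standard IUPAC nucleotide ambiguity codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC", "B": "CGU", "D": "AGU",
    "H": "ACU", "V": "ACG", "N": "ACGU",
}


def find_motif(rna, motif):
    """Return every 0-based start position where the IUPAC motif matches."""
    rna = rna.upper().replace("T", "U")
    motif = motif.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - len(motif) + 1):
        # Each motif symbol names the set of bases allowed at that position.
        if all(rna[i + j] in IUPAC[symbol] for j, symbol in enumerate(motif)):
            hits.append(i)
    return hits


print(find_motif("AUGCUG", "NUG"))
```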
[Efficient genome editing in human pluripotent stem cells through CRISPR/Cas9].

PubMed

Liu, Gai-gai; Li, Shuang; Wei, Yu-da; Zhang, Yong-xian; Ding, Qiu-rong

2015-11-01

The RNA-guided CRISPR (clustered regularly interspaced short palindromic repeat)-associated Cas9 nuclease has offered a new platform for genome editing with high efficiency. Here, we report the use of CRISPR/Cas9 technology to target a specific genomic region in human pluripotent stem cells. We show that CRISPR/Cas9 can be used to disrupt a gene by introducing frameshift mutations into its coding region; to knock in specific sequences (e.g. a FLAG tag DNA sequence) at a targeted genomic locus via homology-directed repair; and to induce large genomic deletions through dual-guide multiplexing. Our results demonstrate the versatile application of CRISPR/Cas9 in stem cell genome editing, which can be widely utilized for functional studies of genes or genome loci in human pluripotent stem cells.
Direct observation of transcription activator-like effector (TALE) protein dynamics

NASA Astrophysics Data System (ADS)

Cuculis, Luke; Abil, Zhanar; Zhao, Huimin; Schroeder, Charles M.

2014-03-01

In this work, we describe a single molecule assay to probe the site-search dynamics of transcription activator-like effector (TALE) proteins along DNA. In modern genetics, the ability to selectively edit the human genome is an unprecedented development, driven by recent advances in targeted nuclease proteins. Specific gene editing can be accomplished using TALE proteins, which are programmable DNA-binding proteins that can be fused to a nuclease domain. In this way, TALENs are a leading technology that has shown great success in the genomic editing of pluripotent stem cells. A major hurdle facing clinical implementation, however, is the potential for deleterious off-target binding events. For these reasons, a molecular-level understanding of TALE binding and target sequence search on DNA is essential. To this end, we developed a single-molecule fluorescence imaging assay that provides a first-of-its-kind view of the 1-D diffusion of TALE proteins along stretched DNA. Taken together with co-crystal structures of DNA-bound TALEs, our results suggest a rotationally-coupled, major groove tracking model for diffusion. We further report diffusion constants for TALE proteins as a function of salt concentration, consistent with previously described models of 1-D protein diffusion.
Integrated genomic and molecular characterization of cervical cancer.

PubMed

2017-03-16

Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, one of the largest comprehensive genomic studies of cervical cancer to date. We observed notable APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered amplifications in immune targets CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2), and the BCAR4 long non-coding RNA, which has been associated with response to lapatinib. Integration of human papilloma virus (HPV) was observed in all HPV18-related samples and 76% of HPV16-related samples, and was associated with structural aberrations and increased target-gene expression. We identified a unique set of endometrial-like cervical cancers, comprised predominantly of HPV-negative tumours with relatively high frequencies of KRAS, ARID1A and PTEN mutations. Integrative clustering of 178 samples identified keratin-low squamous, keratin-high squamous and adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers.
Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome

USDA-ARS's Scientific Manuscript database

CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of human, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific t...
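As an illustration of what searching a genome for candidate target sites can involve, here is a minimal sketch. It assumes the widely used SpCas9 convention of a 20-nt protospacer immediately followed by an NGG PAM and scans the forward strand only; the truncated abstract above does not say which criteria the authors applied, and `find_cas9_sites` is a hypothetical name.

```python
import re


def find_cas9_sites(seq, protospacer_len=20):
    """Return (start, protospacer, PAM) for each forward-strand NGG site."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence, not taken from the Vitis vinifera genome.
for start, protospacer, pam in find_cas9_sites("ACGT" * 5 + "AGGC"):
    print(start, protospacer, pam)
```

A real pipeline would also scan the reverse complement and score each candidate for uniqueness across the genome; both steps are omitted here.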
Contribution of transposable elements and distal enhancers to evolution of human-specific features of interphase chromatin architecture in embryonic stem cells.

PubMed

Glinsky, Gennadi V

2018-03-01

Transposable elements have made major evolutionary impacts on creation of primate-specific and human-specific genomic regulatory loci and species-specific genomic regulatory networks (GRNs). Molecular and genetic definitions of human-specific changes to GRNs contributing to development of unique to human phenotypes remain a highly significant challenge. Genome-wide proximity placement analysis of diverse families of human-specific genomic regulatory loci (HSGRL) identified topologically associating domains (TADs) that are significantly enriched for HSGRL and designated rapidly evolving in human TADs. Here, the analysis of HSGRL, hESC-enriched enhancers, super-enhancers (SEs), and specific sub-TAD structures termed super-enhancer domains (SEDs) has been performed. In the hESC genome, 331 of 504 (66%) of SED-harboring TADs contain HSGRL and 68% of SEDs co-localize with HSGRL, suggesting that emergence of HSGRL may have rewired SED-associated GRNs within specific TADs by inserting novel and/or erasing existing non-coding regulatory sequences. Consequently, markedly distinct features of the principal regulatory structures of interphase chromatin evolved in the hESC genome compared to mouse: the SED quantity is 3-fold higher and the median SED size is significantly larger. Concomitantly, the overall TAD quantity is increased by 42% while the median TAD size is significantly decreased (p = 9.11E-37) in the hESC genome. Present analyses illustrate a putative global role for transposable elements and HSGRL in shaping the human-specific features of the interphase chromatin organization and functions, which are facilitated by accelerated creation of novel transcription factor binding sites and new enhancers driven by targeted placement of HSGRL at defined genomic coordinates. 
A trend toward the convergence of TAD and SED architectures of interphase chromatin in the hESC genome may reflect changes of 3D-folding patterns of linear chromatin fibers designed to enhance both regulatory complexity and functional precision of GRNs by creating predominantly a single gene (or a set of functionally linked genes) per regulatory domain structures. Collectively, present analyses reveal critical evolutionary contributions of transposable elements and distal enhancers to creation of thousands of primate- and human-specific elements of a chromatin folding code, which defines the 3D context of interphase chromatin both restricting and facilitating biological functions of GRNs.
Application of genome editing technologies to the study and treatment of hematological disease.

PubMed

Pellagatti, Andrea; Dolatshad, Hamid; Yip, Bon Ham; Valletta, Simona; Boultwood, Jacqueline

2016-01-01

Genome editing technologies have advanced significantly over the past few years, providing a fast and effective tool to precisely manipulate the genome at specific locations. The three commonly used genome editing technologies are Zinc Finger Nucleases (ZFNs), Transcription Activator-Like Effector Nucleases (TALENs), and the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated Cas9 (CRISPR/Cas9) system. ZFNs and TALENs consist of endonucleases fused to a DNA-binding domain, while the CRISPR/Cas9 system uses guide RNAs to target the bacterial Cas9 endonuclease to the desired genomic location. The double-strand breaks made by these endonucleases are repaired in the cells either by non-homologous end joining, resulting in the introduction of insertions/deletions, or, if a repair template is provided, by homology directed repair. The ZFNs, TALENs and CRISPR/Cas9 systems take advantage of these repair mechanisms for targeted genome modification and have been successfully used to manipulate the genome in human cells. These genome editing tools can be used to investigate gene function, to discover new therapeutic targets, and to develop disease models. Moreover, these genome editing technologies have great potential in gene therapy. Here, we review the latest advances in the application of genome editing technology to the study and treatment of hematological disorders. Copyright © 2015 Elsevier Ltd. All rights reserved.
OSLay: optimal syntenic layout of unfinished assemblies.

PubMed

Richter, Daniel C; Schuster, Stephan C; Huson, Daniel H

2007-07-01

The whole genome shotgun approach to genome sequencing results in a collection of contigs that must be ordered and oriented to facilitate efficient gap closure. We present a new tool, OSLay, that uses synteny between matching sequences in a target assembly and a reference assembly to lay out the contigs (or scaffolds) in the target assembly. The underlying algorithm is based on maximum weight matching. The tool provides an interactive visualization of the computed layout and the result can be imported into the assembly editing tool Consed to support the design of primer pairs for gap closure. To enhance efficiency in the gap closure phase of a genome project it is crucial to know which contigs are adjacent in the target genome. Related genome sequences can be used to lay out contigs in an assembly. OSLay is freely available from: http://www-ab.informatik.unituebingen.de/software/oslay.
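The maximum weight matching idea can be sketched on a toy graph whose nodes are contig ends and whose weighted edges stand for synteny support that two ends are adjacent. This is an illustrative exhaustive search for small inputs, not OSLay's implementation: the abstract does not describe how the graph or its weights are built, and the contig names and weights below are invented.

```python
def max_weight_matching(edges):
    """Exact maximum weight matching by exhaustive search.

    edges is a list of (u, v, weight) tuples. Runtime is exponential in the
    number of edges, so this is only suitable for very small graphs.
    """
    if not edges:
        return 0.0, []
    (u, v, w), rest = edges[0], edges[1:]
    # Option 1: leave the first edge out of the matching.
    skip_weight, skip_edges = max_weight_matching(rest)
    # Option 2: take it, then drop every remaining edge sharing an endpoint.
    compatible = [e for e in rest if e[0] not in (u, v) and e[1] not in (u, v)]
    take_weight, take_edges = max_weight_matching(compatible)
    if w + take_weight > skip_weight:
        return w + take_weight, [(u, v, w)] + take_edges
    return skip_weight, skip_edges


# Hypothetical contig ends ("A.R" = right end of contig A) and support weights.
candidate_adjacencies = [
    ("A.R", "B.L", 5.0),
    ("A.R", "C.L", 3.0),
    ("B.R", "C.L", 4.0),
]
total, chosen = max_weight_matching(candidate_adjacencies)
print(total, chosen)
```

Because each contig end can be matched at most once, the selected edges describe a consistent chain of contigs, here A followed by B followed by C.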
Spatiotemporal genomic architecture informs precision oncology in glioblastoma

PubMed Central

Lee, Jin-Ku; Wang, Jiguang; Sa, Jason K.; Ladewig, Erik; Lee, Hae-Ock; Lee, In-Hee; Kang, Hyun Ju; Rosenbloom, Daniel S.; Camara, Pablo G.; Liu, Zhaoqi; van Nieuwenhuizen, Patrick; Jung, Sang Won; Choi, Seung Won; Kim, Junhyung; Chen, Andrew; Kim, Kyu-Tae; Shin, Sang; Seo, Yun Jee; Oh, Jin-Mi; Shin, Yong Jae; Park, Chul-Kee; Kong, Doo-Sik; Seol, Ho Jun; Blumberg, Andrew; Lee, Jung-Il; Iavarone, Antonio; Park, Woong-Yang; Rabadan, Raul; Nam, Do-Hyun

2017-01-01

Precision medicine in cancer proposes that genomic characterization of tumors can inform personalized targeted therapies [1–5]. This proposition, however, is complicated by spatial and temporal heterogeneity [6–14]. Here we study genomic and expression profiles across 127 multi-sector or longitudinal specimens from 52 glioblastoma (GBM) patients. Using bulk and single-cell data, we find that samples from the same tumor mass share genomic and expression signatures, while geographically separated multifocal tumors and/or long-term recurrent tumors are seeded from different clones. Chemical screening of patient-derived glioma cells (PDCs) shows that therapeutic response is associated with genetic similarity, and multifocal tumors enriched with PIK3CA mutations have a heterogeneous drug response pattern. Importantly, we show that targeting truncal events is more efficacious in reducing tumor burden. In summary, this work demonstrates that evolutionary inference from integrated genomic analysis in multi-sector biopsies can inform targeted therapeutic interventions for GBM patients. PMID:28263318
A resource for characterizing genome-wide binding and putative target genes of transcription factors expressed during secondary growth and wood formation in Populus

Treesearch

Lijun Liu; Trevor Ramsay; Matthew S. Zinkgraf; David Sundell; Nathaniel Robert Street; Vladimir Filkov; Andrew Groover

2015-01-01

Identifying transcription factor target genes is essential for modeling the transcriptional networks underlying developmental processes. Here we report a chromatin immunoprecipitation sequencing (ChIP-seq) resource consisting of genome-wide binding regions and associated putative target genes for four Populus homeodomain transcription factors...
Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

DOE PAGES

Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa; ...

2014-09-01

Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

The CRISPR/Cas Genome-Editing Tool: Application in Improvement of Crops

PubMed Central

Khatodia, Surender; Bhatotia, Kirti; Passricha, Nishat; Khurana, S. M. P.; Tuteja, Narendra

2016-01-01

The Clustered Regularly Interspaced Short Palindromic Repeats-associated Cas9/sgRNA system is a novel targeted genome-editing technique derived from the bacterial immune system. It is an inexpensive, easy, user-friendly and rapidly adopted genome editing tool that is becoming a revolutionary paradigm. This technique enables precise genomic modifications in many different organisms and tissues. Cas9 protein is an RNA-guided endonuclease utilized for creating targeted double-stranded breaks, with only a short RNA sequence needed to confer recognition of the target in animals and plants. Development of genetically edited (GE) crops similar to those developed by conventional or mutation breeding makes this technique a promising and extremely versatile tool for providing sustainable, productive agriculture to better feed a rapidly growing population in a changing climate. The emerging areas of research for genome editing in plants include interrogating gene function, rewiring the regulatory signaling networks and building sgRNA libraries for high-throughput loss-of-function screening. In this review, we have described the broad applicability of Cas9 nuclease-mediated targeted plant genome editing for development of designer crops. The regulatory uncertainty and social acceptance of plant breeding by Cas9 genome editing have also been described. With this powerful and innovative technique, designer GE non-GM plants could further advance climate-resilient and sustainable agriculture in the future and maximize yield by combating abiotic and biotic stresses. PMID:27148329
Analyses of the probiotic property and stress resistance-related genes of Lactococcus lactis subsp. lactis NCDO 2118 through comparative genomics and in vitro assays

PubMed Central

Saraiva, Tessália D. L.; Silva, Wanderson M.; Pereira, Ulisses P.; Campos, Bruno C.; Benevides, Leandro J.; Rocha, Flávia S.; Figueiredo, Henrique C. P.; Azevedo, Vasco; Soares, Siomar C.

2017-01-01

Lactococcus lactis subsp. lactis NCDO 2118 was recently reported to alleviate colitis symptoms via its anti-inflammatory and immunomodulatory activities, which are exerted by exported proteins that are not produced by L. lactis subsp. lactis IL1403. Here, we used in vitro and in silico approaches to characterize the genomic structure, the safety aspects, and the immunomodulatory activity of this strain. Through comparative genomics, we identified genomic islands, phage regions, bile salt and acid stress resistance genes, bacteriocins, adhesion-related and antibiotic resistance genes, and genes encoding proteins that are putatively secreted, expressed in vitro and absent from IL1403. The high degree of similarity between all Lactococcus suggests that the Symbiotic Islands commonly shared by both NCDO 2118 and KF147 may be responsible for their close relationship and their adaptation to plants. The predicted bacteriocins may play an important role against the invasion of competing strains. The genes related to the acid and bile salt stresses may play important roles in gastrointestinal tract survival, whereas the adhesion proteins are important for persistence in the gut, culminating in the competitive exclusion of other bacteria. Finally, the five secreted and expressed proteins may be important targets for studies of new anti-inflammatory and immunomodulatory proteins. Altogether, the analyses performed here highlight the potential use of this strain as a target for the future development of probiotic foods. PMID:28384209
Pharmacological Inhibition of Feline Immunodeficiency Virus (FIV)

PubMed Central

Mohammadi, Hakimeh; Bienzle, Dorothee

2012-01-01

Feline immunodeficiency virus (FIV) is a member of the Retroviridae family of viruses and causes an acquired immunodeficiency syndrome (AIDS) in domestic and non-domestic cats worldwide. Genome organization of FIV and clinical characteristics of the disease caused by the virus are similar to those of human immunodeficiency virus (HIV). Both viruses infect T lymphocytes, monocytes and macrophages, and their replication cycle in infected cells is analogous. Due to marked similarity in genomic organization, virus structure, virus replication and disease pathogenesis of FIV and HIV, infection of cats with FIV is a useful tool to study and develop novel drugs and vaccines for HIV. Anti-retroviral drugs studied extensively in HIV infection have targeted different steps of the virus replication cycle: (1) inhibition of virus entry into susceptible cells at the level of attachment to host cell surface receptors and co-receptors; (2) inhibition of fusion of the virus membrane with the cell membrane; (3) blockade of reverse transcription of viral genomic RNA; (4) interruption of nuclear translocation and viral DNA integration into host genomes; (5) prevention of viral transcript processing and nuclear export; and (6) inhibition of virion assembly and maturation. Despite much success of anti-retroviral therapy in slowing disease progression in people, similar therapy has not been thoroughly investigated in cats. In this article we review current pharmacological approaches and novel targets for anti-lentiviral therapy, and critically assess potentially suitable applications against FIV infection in cats. PMID:22754645
Evaluating High-Throughput Ab Initio Gene Finders to Discover Proteins Encoded in Eukaryotic Pathogen Genomes Missed by Laboratory Techniques

PubMed Central

Goodswen, Stephen J.; Kennedy, Paul J.; Ellis, John T.

2012-01-01

Next generation sequencing technology is advancing genome sequencing at an unprecedented level. By unravelling the code within a pathogen’s genome, every possible protein (prior to post-translational modifications) can theoretically be discovered, irrespective of life cycle stages and environmental stimuli. Now more than ever there is a great need for high-throughput ab initio gene finding. Ab initio gene finders use statistical models to predict genes and their exon-intron structures from the genome sequence alone. This paper evaluates whether existing ab initio gene finders can effectively predict genes to deduce proteins that have presently missed capture by laboratory techniques. An aim here is to identify possible patterns of prediction inaccuracies for gene finders as a whole irrespective of the target pathogen. All currently available ab initio gene finders are considered in the evaluation but only four fulfil high-throughput capability: AUGUSTUS, GeneMark_hmm, GlimmerHMM, and SNAP. These gene finders require training data specific to a target pathogen and consequently the evaluation results are inextricably linked to the availability and quality of the data. The pathogen, Toxoplasma gondii, is used to illustrate the evaluation methods. The results support current opinion that predicted exons by ab initio gene finders are inaccurate in the absence of experimental evidence. However, the results reveal some patterns of inaccuracy that are common to all gene finders and these inaccuracies may provide a focus area for future gene finder developers. PMID:23226328
DNA-Free Genetically Edited Grapevine and Apple Protoplast Using CRISPR/Cas9 Ribonucleoproteins.

PubMed

Malnoy, Mickael; Viola, Roberto; Jung, Min-Hee; Koo, Ok-Jae; Kim, Seokjoong; Kim, Jin-Soo; Velasco, Riccardo; Nagamangala Kanchiswamy, Chidananda

2016-01-01

The combined availability of whole genome sequences and genome editing tools is set to revolutionize the field of fruit biotechnology by enabling the introduction of targeted genetic changes with unprecedented control and accuracy, both to explore emergent phenotypes and to introduce new functionalities. Although plasmid-mediated delivery of genome editing components to plant cells is very efficient, it also presents some drawbacks, such as possible random integration of plasmid sequences in the host genome. Additionally, it may well be intercepted by current process-based GMO regulations, complicating the path to commercialization of improved varieties. Here, we explore direct delivery of purified CRISPR/Cas9 ribonucleoproteins (RNPs) to protoplasts of the grape cultivar Chardonnay and the apple cultivar Golden Delicious for efficient targeted mutagenesis. We targeted MLO-7, a susceptibility gene, in order to increase resistance to powdery mildew in the grape cultivar, and DIPM-1, DIPM-2, and DIPM-4 in the apple to increase resistance to fire blight disease. Furthermore, efficient protoplast transformation and the molar ratio of Cas9 and sgRNAs were optimized for each grape and apple cultivar. The targeted mutagenesis insertion and deletion rate was analyzed using targeted deep sequencing. Our results demonstrate that direct delivery of CRISPR/Cas9 RNPs to the protoplast system enables targeted gene editing and paves the way to the generation of DNA-free genome edited grapevine and apple plants.
Nonviral Genome Editing Based on a Polymer-Derivatized CRISPR Nanocomplex for Targeting Bacterial Pathogens and Antibiotic Resistance.

PubMed

Kang, Yoo Kyung; Kwon, Kyu; Ryu, Jea Sung; Lee, Ha Neul; Park, Chankyu; Chung, Hyun Jung

2017-04-19

The overuse of antibiotics plays a major role in the emergence and spread of multidrug-resistant bacteria. A molecularly targeted, specific treatment method for bacterial pathogens can prevent this problem by reducing the selective pressure during microbial growth. Herein, we introduce a nonviral treatment strategy delivering genome editing material for targeting antibacterial resistance. We apply the CRISPR-Cas9 system, which has been recognized as an innovative tool for highly specific and efficient genome engineering in different organisms, as the delivery cargo. We utilize polymer-derivatized Cas9, by direct covalent modification of the protein with cationic polymer, for subsequent complexation with single-guide RNA targeting antibiotic resistance. We show that nanosized CRISPR complexes (= Cr-Nanocomplex) were successfully formed, while maintaining the functional activity of Cas9 endonuclease to induce double-strand DNA cleavage. We also demonstrate that the Cr-Nanocomplex designed to target mecA, the major gene involved in methicillin resistance, can be efficiently delivered into methicillin-resistant Staphylococcus aureus (MRSA), and allow the editing of the bacterial genome with much higher efficiency compared to using native Cas9 complexes or conventional lipid-based formulations. The present study shows for the first time that a covalently modified CRISPR system allows nonviral, therapeutic genome editing, and can be potentially applied as a target-specific antimicrobial.
Genome-wide Target Enrichment-aided Chip Design: a 66 K SNP Chip for Cashmere Goat.

PubMed

Qiao, Xian; Su, Rui; Wang, Yang; Wang, Ruijun; Yang, Ting; Li, Xiaokai; Chen, Wei; He, Shiyang; Jiang, Yu; Xu, Qiwu; Wan, Wenting; Zhang, Yaolei; Zhang, Wenguang; Chen, Jiang; Liu, Bin; Liu, Xin; Fan, Yixing; Chen, Duoyuan; Jiang, Huaizhi; Fang, Dongming; Liu, Zhihong; Wang, Xiaowen; Zhang, Yanjun; Mao, Danqing; Wang, Zhiying; Di, Ran; Zhao, Qianjun; Zhong, Tao; Yang, Huanming; Wang, Jian; Wang, Wen; Dong, Yang; Chen, Xiaoli; Xu, Xun; Li, Jinquan

2017-08-17

Compared with the commercially available single nucleotide polymorphism (SNP) chip based on the Bead Chip technology, the solution hybrid selection (SHS)-based target enrichment SNP chip is not only design-flexible, but also cost-effective for genotype sequencing. In this study, we propose to design an animal SNP chip using the SHS-based target enrichment strategy for the first time. As an update to the international collaboration on goat research, a 66 K SNP chip for cashmere goat was created from the whole-genome sequencing data of 73 individuals. Verification of this 66 K SNP chip with the whole-genome sequencing data of 436 cashmere goats showed that the SNP call rate was between 95.3% and 99.8%. The average sequencing depth for target SNPs was 40X. The capture regions were shown to be 200 bp that flank target SNPs. This chip was further tested in a genome-wide association analysis of cashmere fineness (fiber diameter). Several top hit loci were found marginally associated with signaling pathways involved in hair growth. These results demonstrate that the 66 K SNP chip is a useful tool in the genomic analyses of cashmere goats. The successful chip design shows that the SHS-based target enrichment strategy could be applied to SNP chip design in other species.
Scientific Approaches | Office of Cancer Clinical Proteomics Research

Cancer.gov

CPTAC employs two complementary scientific approaches, a "Targeting Genome to Proteome" (Targeting G2P) approach and a "Mapping Proteome to Genome" (Mapping P2G) approach, in order to address biological questions from data generated on a sample.
Connecting genomic alterations to cancer biology with proteomics: the NCI Clinical Proteomic Tumor Analysis Consortium.

PubMed

Ellis, Matthew J; Gillette, Michael; Carr, Steven A; Paulovich, Amanda G; Smith, Richard D; Rodland, Karin K; Townsend, R Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel C

2013-10-01

The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods. ©2013 AACR.
Efficient Genome Editing in Induced Pluripotent Stem Cells with Engineered Nucleases In Vitro.

PubMed

Termglinchan, Vittavat; Seeger, Timon; Chen, Caressa; Wu, Joseph C; Karakikes, Ioannis

2017-01-01

Precision genome engineering is rapidly advancing the application of induced pluripotent stem cell (iPSC) technology for in vitro modeling of cardiovascular diseases. Targeted genome editing using engineered nucleases is a powerful tool that allows reverse genetics, genome engineering, and targeted transgene integration experiments to be performed in a precise and predictable manner. However, nuclease-mediated homologous recombination is an inefficient process. Herein, we describe the development of an optimized method combining site-specific nucleases and the piggyBac transposon system for "seamless" genome editing in pluripotent stem cells with high efficiency and fidelity in vitro.
Bactericidal effects of low-intensity extremely high frequency electromagnetic field: an overview with phenomenon, mechanisms, targets and consequences.

PubMed

Torgomyan, Heghine; Trchounian, Armen

2013-02-01

Low-intensity electromagnetic field (EMF) of extremely high frequencies is a widespread environmental factor. This field is used in telecommunication systems, therapeutic practices and food protection. In particular, in the medical and food industries EMF is used for its bactericidal effects. The significant cellular targets for EMF effects at resonant frequencies in bacteria could be water (H(2)O), the cell membrane and the genome. Changes in H(2)O cluster structure and properties might lead to increased chemical activity or hydration of proteins and other cellular structures. These effects are likely to be specific and long-term. Moreover, the cell membrane, with its surface characteristics, substance transport and energy-converting processes, is also altered. The genome is affected as well, because conformational changes in DNA and the transition of bacterial prophages from the lysogenic to the lytic state have been detected. The consequences of EMF interaction with bacteria are changes in their sensitivity to different chemicals, including antibiotics. These effects are important for understanding the distinctive role of bacteria in the environment, since they lead to changed metabolic pathways in bacteria and altered antibiotic resistance. This EMF may also affect cell-to-cell interactions in bacterial populations, since bacteria might interact with each other through EMF of the sub-extremely high frequency range.
Inhibition Mechanism of an Anti-CRISPR Suppressor AcrIIA4 Targeting SpyCas9.

PubMed

Yang, Hui; Patel, Dinshaw J

2017-07-06

Prokaryotic CRISPR-Cas adaptive immune systems utilize sequence-specific RNA-guided endonucleases to defend against infection by viruses, bacteriophages, and mobile elements, while these foreign genetic elements evolve diverse anti-CRISPR proteins to overcome the CRISPR-Cas-mediated defense of the host. Recently, AcrIIA2 and AcrIIA4, encoded by Listeria monocytogenes prophages, were shown to block the endonuclease activity of type II-A Streptococcus pyogenes Cas9 (SpyCas9). We now report the crystal structure of AcrIIA4 in complex with single-guide RNA-bound SpyCas9, thereby establishing that AcrIIA4 preferentially targets critical residues essential for PAM duplex recognition, as well as blocks target DNA access to key catalytic residues lining the RuvC pocket. These structural insights, validated by biochemical assays on key mutants, demonstrate that AcrIIA4 competitively occupies both PAM-interacting and non-target DNA strand cleavage catalytic pockets. Our studies provide insights into anti-CRISPR-mediated suppression mechanisms for inactivating SpyCas9, thereby broadening the applicability of CRISPR-Cas regulatory tools for genome editing. Published by Elsevier Inc.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

PubMed

Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya; Guigó, Roderic; Gingeras, Thomas R; Margulies, Elliott H; Weng, Zhiping; Snyder, Michael; Dermitzakis, Emmanouil T; Thurman, Robert E; Kuehn, Michael S; Taylor, Christopher M; Neph, Shane; Koch, Christoph M; Asthana, Saurabh; Malhotra, Ankit; Adzhubei, Ivan; Greenbaum, Jason A; Andrews, Robert M; Flicek, Paul; Boyle, Patrick J; Cao, Hua; Carter, Nigel P; Clelland, Gayle K; Davis, Sean; Day, Nathan; Dhami, Pawandeep; Dillon, Shane C; Dorschner, Michael O; Fiegler, Heike; Giresi, Paul G; Goldy, Jeff; Hawrylycz, Michael; Haydock, Andrew; Humbert, Richard; James, Keith D; Johnson, Brett E; Johnson, Ericka M; Frum, Tristan T; Rosenzweig, Elizabeth R; Karnani, Neerja; Lee, Kirsten; Lefebvre, Gregory C; Navas, Patrick A; Neri, Fidencio; Parker, Stephen C J; Sabo, Peter J; Sandstrom, Richard; Shafer, Anthony; Vetrie, David; Weaver, Molly; Wilcox, Sarah; Yu, Man; Collins, Francis S; Dekker, Job; Lieb, Jason D; Tullius, Thomas D; Crawford, Gregory E; Sunyaev, Shamil; Noble, William S; Dunham, Ian; Denoeud, France; Reymond, Alexandre; Kapranov, Philipp; Rozowsky, Joel; Zheng, Deyou; Castelo, Robert; Frankish, Adam; Harrow, Jennifer; Ghosh, Srinka; Sandelin, Albin; Hofacker, Ivo L; Baertsch, Robert; Keefe, Damian; Dike, Sujit; Cheng, Jill; Hirsch, Heather A; Sekinger, Edward A; Lagarde, Julien; Abril, Josep F; Shahab, Atif; Flamm, Christoph; Fried, Claudia; Hackermüller, Jörg; Hertel, Jana; Lindemeyer, Manja; Missal, Kristin; Tanzer, Andrea; Washietl, Stefan; Korbel, Jan; Emanuelsson, Olof; Pedersen, Jakob S; Holroyd, Nancy; Taylor, Ruth; Swarbreck, David; Matthews, Nicholas; Dickson, Mark C; Thomas, Daryl J; Weirauch, Matthew T; Gilbert, James; Drenkow, Jorg; Bell, Ian; Zhao, XiaoDong; Srinivasan, K G; Sung, Wing-Kin; Ooi, Hong Sain; Chiu, Kuo Ping; Foissac, Sylvain; Alioto, Tyler; Brent, Michael; Pachter, Lior; Tress, Michael L; Valencia, Alfonso; Choo, Siew Woh; Choo, Chiou Yu; Ucla, Catherine; Manzano, Caroline; 
Wyss, Carine; Cheung, Evelyn; Clark, Taane G; Brown, James B; Ganesh, Madhavan; Patel, Sandeep; Tammana, Hari; Chrast, Jacqueline; Henrichsen, Charlotte N; Kai, Chikatoshi; Kawai, Jun; Nagalakshmi, Ugrappa; Wu, Jiaqian; Lian, Zheng; Lian, Jin; Newburger, Peter; Zhang, Xueqing; Bickel, Peter; Mattick, John S; Carninci, Piero; Hayashizaki, Yoshihide; Weissman, Sherman; Hubbard, Tim; Myers, Richard M; Rogers, Jane; Stadler, Peter F; Lowe, Todd M; Wei, Chia-Lin; Ruan, Yijun; Struhl, Kevin; Gerstein, Mark; Antonarakis, Stylianos E; Fu, Yutao; Green, Eric D; Karaöz, Ulaş; Siepel, Adam; Taylor, James; Liefer, Laura A; Wetterstrand, Kris A; Good, Peter J; Feingold, Elise A; Guyer, Mark S; Cooper, Gregory M; Asimenos, George; Dewey, Colin N; Hou, Minmei; Nikolaev, Sergey; Montoya-Burgos, Juan I; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Huang, Haiyan; Zhang, Nancy R; Holmes, Ian; Mullikin, James C; Ureta-Vidal, Abel; Paten, Benedict; Seringhaus, Michael; Church, Deanna; Rosenbloom, Kate; Kent, W James; Stone, Eric A; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross C; Haussler, David; Miller, Webb; Sidow, Arend; Trinklein, Nathan D; Zhang, Zhengdong D; Barrera, Leah; Stuart, Rhona; King, David C; Ameur, Adam; Enroth, Stefan; Bieda, Mark C; Kim, Jonghwan; Bhinge, Akshay A; Jiang, Nan; Liu, Jun; Yao, Fei; Vega, Vinsensius B; Lee, Charlie W H; Ng, Patrick; Shahab, Atif; Yang, Annie; Moqtaderi, Zarmik; Zhu, Zhou; Xu, Xiaoqin; Squazzo, Sharon; Oberley, Matthew J; Inman, David; Singer, Michael A; Richmond, Todd A; Munn, Kyle J; Rada-Iglesias, Alvaro; Wallerman, Ola; Komorowski, Jan; Fowler, Joanna C; Couttet, Phillippe; Bruce, Alexander W; Dovey, Oliver M; Ellis, Peter D; Langford, Cordelia F; Nix, David A; Euskirchen, Ghia; Hartman, Stephen; Urban, Alexander E; Kraus, Peter; Van Calcar, Sara; Heintzman, Nate; Kim, Tae Hoon; Wang, Kun; Qu, Chunxu; Hon, Gary; Luna, Rosa; Glass, Christopher K; Rosenfeld, M Geoff; Aldred, Shelley Force; Cooper, Sara J; Halees, 
Anason; Lin, Jane M; Shulha, Hennady P; Zhang, Xiaoling; Xu, Mousheng; Haidar, Jaafar N S; Yu, Yong; Ruan, Yijun; Iyer, Vishwanath R; Green, Roland D; Wadelius, Claes; Farnham, Peggy J; Ren, Bing; Harte, Rachel A; Hinrichs, Angie S; Trumbower, Heather; Clawson, Hiram; Hillman-Jackson, Jennifer; Zweig, Ann S; Smith, Kayla; Thakkapallayil, Archana; Barber, Galt; Kuhn, Robert M; Karolchik, Donna; Armengol, Lluis; Bird, Christine P; de Bakker, Paul I W; Kern, Andrew D; Lopez-Bigas, Nuria; Martin, Joel D; Stranger, Barbara E; Woodroffe, Abigail; Davydov, Eugene; Dimas, Antigone; Eyras, Eduardo; Hallgrímsdóttir, Ingileif B; Huppert, Julian; Zody, Michael C; Abecasis, Gonçalo R; Estivill, Xavier; Bouffard, Gerard G; Guan, Xiaobin; Hansen, Nancy F; Idol, Jacquelyn R; Maduro, Valerie V B; Maskeri, Baishali; McDowell, Jennifer C; Park, Morgan; Thomas, Pamela J; Young, Alice C; Blakesley, Robert W; Muzny, Donna M; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Jiang, Huaiyang; Weinstock, George M; Gibbs, Richard A; Graves, Tina; Fulton, Robert; Mardis, Elaine R; Wilson, Richard K; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B; Chang, Jean L; Lindblad-Toh, Kerstin; Lander, Eric S; Koriabine, Maxim; Nefedov, Mikhail; Osoegawa, Kazutoyo; Yoshinaga, Yuko; Zhu, Baoli; de Jong, Pieter J

2007-06-14

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Genome-wide identification and characterization of WRKY gene family in Salix suchowensis.

PubMed

Bi, Changwei; Xu, Yiqing; Ye, Qiaolin; Yin, Tongming; Ye, Ning

2016-01-01

WRKY proteins are zinc finger transcription factors that were first identified in plants. They can specifically interact with the W-box, which is found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there was no large-scale study of WRKY genes in willow. The whole-genome sequencing of Salix suchowensis gives us the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them from SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Due to their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I-III), with five subgroups (IIa-IIe) in group II. With multiple sequence alignment and manual search, we found three variations of the WRKYGQK heptapeptide (WRKYGRK, WKKYGQK and WRKYGKK) and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon-intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help complete the functional and annotation information for WRKY gene families in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants.
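The heptapeptide search described above can be sketched in Python as a simple scan of a protein sequence for the canonical WRKYGQK motif and the three reported variants. This is an illustrative sketch only, not the alignment-based procedure used in the study. The function name `find_wrky_motifs` and the example sequence are hypothetical.

```python
import re
from typing import List, Tuple

CANONICAL = "WRKYGQK"
# The three heptapeptide variants reported in the abstract.
VARIANTS = ("WRKYGRK", "WKKYGQK", "WRKYGKK")
MOTIF_RE = re.compile("|".join((CANONICAL,) + VARIANTS))


def find_wrky_motifs(protein: str) -> List[Tuple[int, str, str]]:
    """Return (0-based offset, motif, kind) for each heptapeptide found."""
    hits = []
    for match in MOTIF_RE.finditer(protein.upper()):
        kind = "canonical" if match.group() == CANONICAL else "variant"
        hits.append((match.start(), match.group(), kind))
    return hits


# Hypothetical toy sequence with one canonical and one variant motif.
print(find_wrky_motifs("MASWRKYGQKAAAWRKYGKKTT"))
```

An exact-match scan like this only flags the listed heptapeptides; finding previously unknown variants would still need an alignment or a looser pattern.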
Genome-wide identification and characterization of WRKY gene family in Salix suchowensis

PubMed Central

Ye, Qiaolin; Yin, Tongming

2016-01-01

WRKY proteins are zinc finger transcription factors that were first identified in plants. They can specifically interact with the W-box, which is found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there was no large-scale study of WRKY genes in willow. The whole-genome sequencing of Salix suchowensis gives us the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them from SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Due to their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I–III), with five subgroups (IIa–IIe) in group II. With multiple sequence alignment and manual search, we found three variations of the WRKYGQK heptapeptide (WRKYGRK, WKKYGQK and WRKYGKK) and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon–intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help complete the functional and annotation information for WRKY gene families in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants. PMID:27651997
PRED-CLASS: cascading neural networks for generalized protein classification and genome-wide applications.

PubMed

Pasquier, C; Promponas, V J; Hamodrakas, S J

2001-08-15

A cascading system of hierarchical, artificial neural networks (named PRED-CLASS) is presented for the generalized classification of proteins into four distinct classes (transmembrane, fibrous, globular, and mixed) from information solely encoded in their amino acid sequences. The architecture of the individual component networks is kept very simple, reducing the number of free parameters (network synaptic weights) for faster training, improved generalization, and the avoidance of data overfitting. Capturing information from as few as 50 protein sequences spread among the four target classes (6 transmembrane, 10 fibrous, 13 globular, and 17 mixed), PRED-CLASS was able to obtain 371 correct predictions out of a set of 387 proteins (success rate approximately 96%) unambiguously assigned into one of the target classes. The application of PRED-CLASS to several test sets and complete proteomes of several organisms demonstrates that such a method could serve as a valuable tool in the annotation of genomic open reading frames with no functional assignment or as a preliminary step in fold recognition and ab initio structure prediction methods. Detailed results obtained for various data sets and completed genomes, along with a web server running the PRED-CLASS algorithm, can be accessed over the World Wide Web at http://o2.biol.uoa.gr/PRED-CLASS.
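The cascading idea can be sketched in Python as a chain of stages, each of which either assigns a class or passes the sequence on to the next stage. This is only a structural sketch: the stage order, the fallback to "mixed", and the toy sequence heuristics (`looks_transmembrane`, `looks_fibrous`, `looks_globular`) are assumptions standing in for the trained neural networks, which the abstract does not specify.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], bool]]


def classify_cascade(seq: str, stages: List[Stage],
                     fallback: str = "mixed") -> str:
    """Return the label of the first stage that accepts the sequence."""
    for label, accepts in stages:
        if accepts(seq):
            return label
    return fallback


def looks_transmembrane(seq: str) -> bool:
    # Toy stand-in: a run of at least 15 hydrophobic residues.
    run = 0
    for aa in seq.upper():
        run = run + 1 if aa in "AILMFVW" else 0
        if run >= 15:
            return True
    return False


def looks_fibrous(seq: str) -> bool:
    # Toy stand-in: glycine-rich, as in collagen-like repeats.
    s = seq.upper()
    return len(s) > 0 and s.count("G") / len(s) >= 0.3


def looks_globular(seq: str) -> bool:
    # Toy stand-in: anything of at least 50 residues not caught earlier.
    return len(seq) >= 50


# Assumed stage order; the real cascade may be arranged differently.
STAGES: List[Stage] = [
    ("transmembrane", looks_transmembrane),
    ("fibrous", looks_fibrous),
    ("globular", looks_globular),
]

# Success rate reported in the abstract: 371 correct out of 387.
success_rate = round(371 / 387 * 100, 1)
```

Keeping each stage a simple yes/no decision mirrors the stated design goal of small component networks with few free parameters. The computed `success_rate` of 95.9 matches the "approximately 96%" quoted above.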
Intracellular generation of single-strand template increases the knock-in efficiency by combining CRISPR/Cas9 with AAV.

PubMed

Xiao, Qing; Min, Taishan; Ma, Shuangping; Hu, Lingna; Chen, Hongyan; Lu, Daru

2018-04-18

Targeted integration of transgenes facilitates functional genomic research and holds promise for gene therapy. The established microhomology-mediated end-joining (MMEJ)-based strategy yields precise gene knock-in with an easily constructed donor, yet its limited efficiency remains to be improved. Here, we show that a single-strand DNA (ssDNA) donor increases knock-in efficiency, and we establish a method to achieve the intracellular linearization of a long ssDNA donor. We found that the CRISPR/Cas9 system is responsible for breaking the palindromic double-strand DNA (dsDNA) structure in the inverted terminal repeat (ITR) region of recombinant adeno-associated virus (AAV), leading to the inhibition of viral second-strand DNA synthesis. By combining Cas9 plasmids targeting the genome and the ITR with AAV donor delivery, precise knock-in of a gene cassette was achieved, with 13-14% of the donor insertion events being mediated by MMEJ in HEK 293T cells. This study describes a novel method to integrate large single-strand transgene cassettes into genomes, increasing knock-in efficiency by 13.6-19.5-fold relative to the conventional AAV-mediated method. It also provides a comprehensive solution to the challenges of complicated production and difficult delivery of large exogenous fragments.
Fine organization of genomic regions tagged to the 5S rDNA locus of the bread wheat 5B chromosome.

PubMed

Sergeeva, Ekaterina M; Shcherban, Andrey B; Adonina, Irina G; Nesterov, Michail A; Beletsky, Alexey V; Rakitin, Andrey L; Mardanov, Andrey V; Ravin, Nikolai V; Salina, Elena A

2017-11-14

The multigene family encoding the 5S rRNA, one of the most important structural and functional parts of the large ribosomal subunit, is an obligate component of all eukaryotic genomes. 5S rDNA has long been a favored target for cytological and phylogenetic studies due to the inherent peculiarities of its structural organization, such as the tandem arrays of repetitive units and their high interspecific divergence. The complex polyploid nature of the genome of bread wheat, Triticum aestivum, and the technically difficult task of sequencing clusters of tandem repeats mean that the detailed organization of extended genomic regions containing 5S rRNA genes remains unclear. This is despite the recent progress made in wheat genomic sequencing. Using pyrosequencing of BAC clones, in this work we studied the organization of two distinct 5S rDNA-tagged regions of the 5BS chromosome of bread wheat. Three BAC clones containing 5S rDNA were identified in the 5BS chromosome-specific BAC library of Triticum aestivum. Using the results of pyrosequencing and assembly, we obtained six 5S rDNA-containing contigs with a total length of 140,417 bp, and two sets (pools) of individual 5S rDNA sequences belonging to separate but closely located genomic regions on the 5BS chromosome. Both regions are characterized by the presence of approximately 70-80 copies of 5S rDNA; however, they are completely different in their structural organization. The first region contained highly diverged short-type 5S rDNA units that were disrupted by multiple insertions of transposable elements. The second region contained the more conserved long-type 5S rDNA, organized as a single tandem array. FISH using probes specific to both 5S rDNA unit types showed differences in the distribution and intensity of signals on the chromosomes of polyploid wheat species and their diploid progenitors.
A detailed structural organization of two closely located 5S rDNA-tagged genomic regions on the 5BS chromosome of bread wheat has been established. These two regions differ in the organization of both 5S rDNA and the neighboring sequences comprised of transposable elements, implying different modes of evolution for these regions.
CID-miRNA: A web server for prediction of novel miRNA precursors in human genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tyagi, Sonika; Vaz, Candida; Gupta, Vipin

2008-08-08

microRNAs (miRNA) are a class of non-protein coding functional RNAs that are thought to regulate expression of target genes by direct interaction with mRNAs. miRNAs have been identified through both experimental and computational methods in a variety of eukaryotic organisms. Though these approaches have been partially successful, there is a need to develop more tools for detection of these RNAs as they are also thought to be present in abundance in many genomes. In this report we describe a tool and a web server, named CID-miRNA, for identification of miRNA precursors in a given DNA sequence, utilising secondary structure-based filtering systems and an algorithm based on stochastic context free grammar trained on human miRNAs. CID-miRNA analyses a given sequence using a web interface, for presence of putative miRNA precursors and the generated output lists all the potential regions that can form miRNA-like structures. It can also scan large genomic sequences for the presence of potential miRNA precursors in its stand-alone form. The web server can be accessed at (http://mirna.jnu.ac.in/cidmirna/)
Effect of gene polymorphisms on periodontal diseases

PubMed Central

Tarannum, Fouzia; Faizuddin, Mohamed

2012-01-01

Periodontal diseases are inflammatory diseases of the supporting structures of the tooth. They result in the destruction of the supporting structures, and most of the destructive processes involved are host derived. The processes leading to destruction and regeneration of the destroyed tissues are of great interest to both researchers and clinicians. The selective susceptibility of subjects to periodontitis has remained an enigma, and a wide variety of risk factors have been implicated in the manifestation and progression of periodontitis. Genetic factors are a new addition to the list of risk factors for periodontal diseases. With the availability of the human genome sequence and knowledge of the complement of genes, it should be possible to identify the metabolic pathways involved in periodontal destruction and regeneration. Most forms of periodontitis represent a life-long account of interactions between the genome, behaviour, and environment. The current practical utility of genetic knowledge in periodontitis is limited. The information contained within the human genome can potentially lead to a better understanding of the control mechanisms modulating the production of inflammatory mediators as well as provide potential therapeutic targets for periodontal disease. Allelic variants at multiple gene loci probably influence periodontitis susceptibility. PMID:22754216

Design and Validation of CRISPR/Cas9 Systems for Targeted Gene Modification in Induced Pluripotent Stem Cells.

PubMed

Lee, Ciaran M; Zhu, Haibao; Davis, Timothy H; Deshmukh, Harshahardhan; Bao, Gang

2017-01-01

The CRISPR/Cas9 system is a powerful tool for precision genome editing. The ability to accurately modify genomic DNA in situ with single nucleotide precision opens up new possibilities for not only basic research but also biotechnology applications and clinical translation. In this chapter, we outline the procedures for design, screening, and validation of CRISPR/Cas9 systems for targeted modification of coding sequences in the human genome and how to perform genome editing in induced pluripotent stem cells with high efficiency and specificity.
[The application of genome editing in identification of plant gene function and crop breeding].

PubMed

Zhou, Xiang-chun; Xing, Yong-zhong

2016-03-01

Plant genomes can be modified via current biotechnology with high specificity and excellent efficiency. Zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system are the key engineered nucleases used in genome editing. Genome editing techniques enable targeted mutagenesis, gene knock-out, and gene insertion or replacement at target sites during the endogenous DNA repair process, including non-homologous end joining (NHEJ) and homologous recombination (HR), triggered by the induction of a DNA double-strand break (DSB). Genome editing has been successfully applied in the genome modification of diverse plant species, such as Arabidopsis thaliana, Oryza sativa, and Nicotiana tabacum. In this review, we summarize the application of genome editing in the identification of plant gene function and in crop breeding. We also discuss aspects of genome editing that need improvement for precise genetic improvement of crops in future studies.
[Genome editing of industrial microorganism].

PubMed

Zhu, Linjiang; Li, Qi

2015-03-01

Genome editing is defined as highly effective and precise modification of the cellular genome on a large scale. In recent years, such genome-editing methods have been rapidly developed in the field of industrial strain improvement. These rapidly evolving methods thoroughly change the old mode of inefficient genetic modification, which is "one modification, one selection marker, and one target site". Highly effective modification modes in genome editing have been developed, including simultaneous modification of multiple genes; highly effective insertion, replacement, and deletion of target genes at the genome scale; and cut-and-paste of large DNA fragments. These new tools for microbial genome editing will certainly be applied widely, increase the efficiency of industrial strain improvement, and promote the transformation of the traditional fermentation industry and the rapid development of novel industrial biotechnology such as the production of biofuels and biomaterials. The technological principles of these genome-editing methods and their applications are summarized in this review, which can benefit the engineering and construction of industrial microorganisms.
Computer-based prediction of mitochondria-targeting peptides.

PubMed

Martelli, Pier Luigi; Savojardo, Castrense; Fariselli, Piero; Tasco, Gianluca; Casadio, Rita

2015-01-01

Computational methods are invaluable when protein sequences, directly derived from genomic data, need functional and structural annotation. Subcellular localization is a feature necessary for understanding a protein's role and the compartment where the mature protein is active, and it is very difficult to characterize experimentally. Mitochondrial proteins synthesized on cytosolic ribosomes carry specific patterns in the precursor sequence from which it is possible to recognize a peptide targeting the protein to its final destination. Here we discuss to what extent it is feasible to develop computational methods for detecting mitochondrial targeting peptides in the precursor sequences and benchmark our and other methods on the human mitochondrial proteins endowed with experimentally characterized targeting peptides. Furthermore, we illustrate our newly implemented web server and its usage on the whole human proteome in order to infer mitochondrial targeting peptides, their cleavage sites, and whether or not the targeting peptide regions contain recurrent arginine-rich motifs. In this way, we add some 2,800 further human proteins to the 124 already experimentally annotated with a mitochondrial targeting peptide.
Genomics of coloration in natural animal populations.

PubMed

San-Jose, Luis M; Roulin, Alexandre

2017-07-05

Animal coloration has traditionally been the target of genetic and evolutionary studies. However, until very recently, the study of the genetic basis of animal coloration has been mainly restricted to model species, whereas research on non-model species has been either neglected or mainly based on candidate approaches, and thereby limited by the knowledge obtained in model species. Recent high-throughput sequencing technologies allow us to overcome previous limitations, and open new avenues to study the genetic basis of animal coloration in a broader number of species and colour traits, and to address the general relevance of different genetic structures and their implications for the evolution of colour. In this review, we highlight aspects where genome-wide studies could be of major utility to fill in the gaps in our understanding of the biology and evolution of animal coloration. The new genomic approaches have been promptly adopted to study animal coloration although substantial work is still needed to consider a larger range of species and colour traits, such as those exhibiting continuous variation or based on reflective structures. We argue that a robust advancement in the study of animal coloration will also require large efforts to validate the functional role of the genes and variants discovered using genome-wide tools. This article is part of the themed issue 'Animal coloration: production, perception, function and application'. © 2017 The Author(s).
Bioinformatics and variability in drug response: a protein structural perspective

PubMed Central

Lahti, Jennifer L.; Tang, Grace W.; Capriotti, Emidio; Liu, Tianyun; Altman, Russ B.

2012-01-01

Marketed drugs frequently perform worse in clinical practice than in the clinical trials on which their approval is based. Many therapeutic compounds are ineffective for a large subpopulation of patients to whom they are prescribed; worse, a significant fraction of patients experience adverse effects more severe than anticipated. The unacceptable risk–benefit profile for many drugs mandates a paradigm shift towards personalized medicine. However, prior to adoption of patient-specific approaches, it is useful to understand the molecular details underlying variable drug response among diverse patient populations. Over the past decade, progress in structural genomics led to an explosion of available three-dimensional structures of drug target proteins while efforts in pharmacogenetics offered insights into polymorphisms correlated with differential therapeutic outcomes. Together these advances provide the opportunity to examine how altered protein structures arising from genetic differences affect protein–drug interactions and, ultimately, drug response. In this review, we first summarize structural characteristics of protein targets and common mechanisms of drug interactions. Next, we describe the impact of coding mutations on protein structures and drug response. Finally, we highlight tools for analysing protein structures and protein–drug interactions and discuss their application for understanding altered drug responses associated with protein structural variants. PMID:22552919
The three-dimensional genome organization of Drosophila melanogaster through data integration.

PubMed

Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

2017-07-31

Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.
Target Site Recognition by a Diversity-Generating Retroelement

PubMed Central

Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.

2011-01-01

Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
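The adenine-specific diversification described in this abstract can be illustrated with a toy simulation. This is a minimal sketch rather than the authors' method: the function name `mutagenic_homing`, the example template sequence, and the uniform choice among the four nucleotides are all assumptions made for illustration.

```python
import random

NUCLEOTIDES = "ACGT"

def mutagenic_homing(template_repeat, rng):
    """Copy a template repeat (TR) into a variable repeat (VR),
    replacing each adenine with a randomly chosen nucleotide.
    Non-adenine positions are copied unchanged."""
    return "".join(
        rng.choice(NUCLEOTIDES) if base == "A" else base
        for base in template_repeat
    )

rng = random.Random(0)   # fixed seed so the sketch is reproducible
tr = "GCAACGTTAGCA"      # hypothetical template repeat
vr = mutagenic_homing(tr, rng)
print(tr)
print(vr)
```

Only positions that hold an adenine in the template can differ between the two printed sequences; every other position is copied as-is.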
Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

PubMed Central

Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

2015-01-01

The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This infers the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. 
These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically regulating important complex quantitative agronomic traits in chickpea. The numerous informative genome-wide SNPs, natural allelic diversity-led domestication pattern, and LD-based information generated in our study have multidimensional applicability with respect to chickpea genomics-assisted breeding. PMID:25873920
Not all predicted CRISPR-Cas systems are equal: isolated cas genes and classes of CRISPR like elements.

PubMed

Zhang, Quan; Ye, Yuzhen

2017-02-06

The CRISPR-Cas systems in prokaryotes are RNA-guided immune systems that target and deactivate foreign nucleic acids. A typical CRISPR-Cas system consists of a CRISPR array of repeat and spacer units, and a locus of cas genes. The CRISPR and the cas locus are often located next to each other in the genomes. However, there is no quantitative estimate of the co-location. In addition, ad-hoc studies have shown that some non-CRISPR genomic elements contain repeat-spacer-like structures and are mistaken for CRISPRs. Using available genome sequences, we observed that a significant number of genomes have isolated cas loci and/or CRISPRs. We found that 11%, 22% and 28% of the type I, II and III cas loci are isolated (without CRISPRs in the same genomes at all or with CRISPRs distant in the genomes), respectively. We identified a large number of genomic elements that superficially resemble CRISPRs but don't contain diverse spacers and have no companion cas genes. We called these elements false-CRISPRs and further classified them into groups, including tandem repeats and Staphylococcus aureus repeat (STAR)-like elements. This is the first systematic study to collect and characterize false-CRISPR elements. We demonstrated that false-CRISPRs could be used to reduce the false annotation of CRISPRs, therefore showing them to be useful for improving the annotation of CRISPR-Cas systems.
In the hunt for genomic markers of metabolic resistance to pyrethroids in the mosquito Aedes aegypti: An integrated next-generation sequencing approach.

PubMed

Faucon, Frederic; Gaude, Thierry; Dusfour, Isabelle; Navratil, Vincent; Corbel, Vincent; Juntarajumnong, Waraporn; Girod, Romain; Poupardin, Rodolphe; Boyer, Frederic; Reynaud, Stephane; David, Jean-Philippe

2017-04-01

The capacity of Aedes mosquitoes to resist chemical insecticides threatens the control of major arbovirus diseases worldwide. Until alternative control tools are widely deployed, monitoring insecticide resistance levels and identifying resistance mechanisms in field mosquito populations is crucial for implementing appropriate management strategies. Metabolic resistance to pyrethroids is common in Aedes aegypti but the monitoring of the dynamics of resistant alleles is impeded by the lack of robust genomic markers. In an attempt to identify the genomic bases of metabolic resistance to deltamethrin, multiple resistant and susceptible populations originating from various continents were compared using both RNA-seq and a targeted DNA-seq approach focused on the upstream regions of detoxification genes. Multiple detoxification enzymes were overtranscribed in resistant populations, frequently associated with an increase in their gene copy number. Targeted sequencing identified potential promoter variations associated with their overtranscription. Non-synonymous variations affecting detoxification enzymes were also identified in resistant populations. This study not only confirmed the role of gene copy number variations as a frequent cause of the overexpression of detoxification enzymes associated with insecticide resistance in Aedes aegypti but also identified novel genomic resistance markers potentially associated with their cis-regulation and modifications of their protein structure conformation. As for gene transcription data, polymorphism patterns were frequently conserved within regions but differed among continents confirming the selection of different resistance factors worldwide. Overall, this study paves the way for the identification of a comprehensive set of genomic markers for monitoring the spatio-temporal dynamics of the variety of insecticide resistance mechanisms in Aedes aegypti.
Genome and transcriptome sequencing in prospective metastatic triple-negative breast cancer uncovers therapeutic vulnerabilities.

PubMed

Craig, David W; O'Shaughnessy, Joyce A; Kiefer, Jeffrey A; Aldrich, Jessica; Sinari, Shripad; Moses, Tracy M; Wong, Shukmei; Dinh, Jennifer; Christoforides, Alexis; Blum, Joanne L; Aitelli, Cristi L; Osborne, Cynthia R; Izatt, Tyler; Kurdoglu, Ahmet; Baker, Angela; Koeman, Julie; Barbacioru, Catalin; Sakarya, Onur; De La Vega, Francisco M; Siddiqui, Asim; Hoang, Linh; Billings, Paul R; Salhia, Bodour; Tolcher, Anthony W; Trent, Jeffrey M; Mousses, Spyro; Von Hoff, Daniel; Carpten, John D

2013-01-01

Triple-negative breast cancer (TNBC) is characterized by the absence of expression of estrogen receptor, progesterone receptor, and HER-2. Thirty percent of patients recur after first-line treatment, and metastatic TNBC (mTNBC) has a poor prognosis with median survival of one year. Here, we present initial analyses of whole genome and transcriptome sequencing data from 14 prospective mTNBC. We have cataloged the collection of somatic genomic alterations in these advanced tumors, particularly those that may inform targeted therapies. Genes mutated in multiple tumors included TP53, LRP1B, HERC1, CDH5, RB1, and NF1. Notable genes involved in focal structural events were CTNNA1, PTEN, FBXW7, BRCA2, WT1, FGFR1, KRAS, HRAS, ARAF, BRAF, and PGCP. Homozygous deletion of CTNNA1 was detected in 2 of 6 African Americans. RNA sequencing revealed consistent overexpression of the FOXM1 gene when tumor gene expression was compared with nonmalignant breast samples. Using an outlier analysis of gene expression comparing one cancer with all the others, we detected expression patterns unique to each patient's tumor. Integrative DNA/RNA analysis provided evidence for deregulation of mutated genes, including the monoallelic expression of TP53 mutations. Finally, molecular alterations in several cancers supported targeted therapeutic intervention on clinical trials with known inhibitors, particularly for alterations in the RAS/RAF/MEK/ERK and PI3K/AKT/mTOR pathways. In conclusion, whole genome and transcriptome profiling of mTNBC have provided insights into somatic events occurring in this difficult to treat cancer. These genomic data have guided patients to investigational treatment trials and provide hypotheses for future trials in this irremediable cancer.
Discovery of Anthelmintic Drug Targets and Drugs Using Chokepoints in Nematode Metabolic Pathways

PubMed Central

Taylor, Christina M.; Wang, Qi; Rosa, Bruce A.; Huang, Stanley Ching-Cheng; Powell, Kerrie; Schedl, Tim; Pearce, Edward J.; Abubucker, Sahar; Mitreva, Makedonka

2013-01-01

Parasitic roundworm infections plague more than 2 billion people (1/3 of humanity) and cause drastic losses in crops and livestock. New anthelmintic drugs are urgently needed as new drug resistance and environmental concerns arise. A “chokepoint reaction” is defined as a reaction that either consumes a unique substrate or produces a unique product. A chokepoint analysis provides a systematic method of identifying novel potential drug targets. Chokepoint enzymes were identified in the genomes of 10 nematode species, and the intersection and union of all chokepoint enzymes were found. By studying and experimentally testing available compounds known to target proteins orthologous to nematode chokepoint proteins in public databases, this study uncovers features of chokepoints that make them successful drug targets. Chemogenomic screening was performed on drug-like compounds from public drug databases to find existing compounds that target homologs of nematode chokepoints. The compounds were prioritized based on chemical properties frequently found in successful drugs and were experimentally tested using Caenorhabditis elegans. Several drugs that are already known anthelmintic drugs and novel candidate targets were identified. Seven of the compounds were tested in Caenorhabditis elegans and three yielded a detrimental phenotype. One of these three drug-like compounds, Perhexiline, also yielded a deleterious effect in Haemonchus contortus and Onchocerca lienalis, two nematodes with divergent forms of parasitism. Perhexiline, known to affect the fatty acid oxidation pathway in mammals, caused a reduction in oxygen consumption rates in C. elegans and genome-wide gene expression profiles provided an additional confirmation of its mode of action. Computational modeling of Perhexiline and its target provided structural insights regarding its binding mode and specificity. 
Our lists of prioritized drug targets and drug-like compounds have potential to expedite the discovery of new anthelmintic drugs with broad-spectrum efficacy. PMID:23935495
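The chokepoint definition in this abstract (a reaction that either consumes a unique substrate or produces a unique product) can be sketched on a toy network. This is an illustrative sketch under stated assumptions, not the pipeline used in the study: the function name `find_chokepoints`, the dictionary layout, and the three-reaction network are hypothetical.

```python
from collections import Counter

def find_chokepoints(reactions):
    """Return the names of reactions that consume a metabolite no other
    reaction consumes, or produce a metabolite no other reaction produces.

    `reactions` maps a reaction name to a (substrates, products) pair."""
    # Count how many reactions consume / produce each metabolite.
    consumed = Counter(m for subs, _ in reactions.values() for m in set(subs))
    produced = Counter(m for _, prods in reactions.values() for m in set(prods))
    chokepoints = set()
    for name, (subs, prods) in reactions.items():
        unique_substrate = any(consumed[m] == 1 for m in subs)
        unique_product = any(produced[m] == 1 for m in prods)
        if unique_substrate or unique_product:
            chokepoints.add(name)
    return chokepoints

# Hypothetical toy network: reaction -> (substrates, products)
toy = {
    "R1": (["A"], ["B"]),
    "R2": (["A"], ["B"]),
    "R3": (["B"], ["C"]),
}
print(sorted(find_chokepoints(toy)))
```

In the toy network, R1 and R2 are redundant with each other, so neither is a chokepoint, while R3 is the only reaction that consumes B and the only one that produces C.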
The CRISPR-Cas9 technology: Closer to the ultimate toolkit for targeted genome editing.

PubMed

Quétier, Francis

2016-01-01

The first period of plant genome editing was based on Agrobacterium; chemical mutagenesis by EMS (ethyl methanesulfonate) and ionizing radiations; each of these technologies led to randomly distributed genome modifications. The second period is associated with the discoveries of homing and meganuclease enzymes during the 80s and 90s, which were then engineered to provide efficient tools for targeted editing. From 2006 to 2012, a few crop plants were successfully and precisely modified using zinc-finger nucleases. A third wave of improvement in genome editing, which led to a dramatic decrease in off-target events, was achieved in 2009-2011 with the TALEN technology. The latest revolution surfaced in 2013 with the CRISPR-Cas9 system, whose high efficiency and technical ease of use are impressive; scientists can use in-house kits or commercially available kits; the only two requirements are to carefully choose the location of the DNA double strand breaks to be induced and then to order an oligonucleotide. While this close-to-ultimate toolkit for targeted editing of genomes represents dramatic scientific progress which allows the development of more complex useful agronomic traits through synthetic biology, the social acceptance of genome editing remains regularly questioned by anti-GMO citizens and organizations. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Efficient sequence-specific isolation of DNA fragments and chromatin by in vitro enChIP technology using recombinant CRISPR ribonucleoproteins.

PubMed

Fujita, Toshitsugu; Yuno, Miyuki; Fujii, Hodaka

2016-04-01

The clustered regularly interspaced short palindromic repeats (CRISPR) system is widely used for various biological applications, including genome editing. We developed engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) using CRISPR to isolate target genomic regions from cells for their biochemical characterization. In this study, we developed 'in vitro enChIP' using recombinant CRISPR ribonucleoproteins (RNPs) to isolate target genomic regions. In vitro enChIP has the great advantage over conventional enChIP of not requiring expression of CRISPR complexes in cells. We first showed that in vitro enChIP using recombinant CRISPR RNPs can be used to isolate target DNA from mixtures of purified DNA in a sequence-specific manner. In addition, we showed that this technology can be used to efficiently isolate target genomic regions, while retaining their intracellular molecular interactions, with negligible contamination from irrelevant genomic regions. Thus, in vitro enChIP technology is of potential use for sequence-specific isolation of DNA, as well as for identification of molecules interacting with genomic regions of interest in vivo in combination with downstream analysis. © 2016 The Authors. Genes to Cells published by Molecular Biology Society of Japan and John Wiley & Sons Australia, Ltd.
Investigation of potential targets of Porphyromonas CRISPRs among the genomes of Porphyromonas species

PubMed Central

Shibasaki, Masaki; Maruyama, Fumito; Sekizaki, Tsutomu; Nakagawa, Ichiro

2017-01-01

The oral bacterial species Porphyromonas gingivalis, a periodontal pathogen, has plastic genomes that may be driven by homologous recombination with exogenous deoxyribonucleic acid (DNA) that is incorporated by natural transformation and conjugation. However, bacteriophages and plasmids, both of which are main resources of exogenous DNA, do not exist in the known P. gingivalis genomes. This could be associated with an adaptive immunity system conferred by clustered regularly interspaced short palindromic repeat (CRISPR) and CRISPR-associated (cas) genes in P. gingivalis as well as innate immune systems such as a restriction-modification system. In a previous study, few immune targets were predicted for P. gingivalis CRISPR/Cas. In this paper, we analyzed 51 P. gingivalis genomes, which were newly sequenced, and publicly available genomes of 13 P. gingivalis and 46 other Porphyromonas species. We detected 6 CRISPR/Cas types (classified by sequence similarity of repeat) in P. gingivalis and 12 other types in the remaining species. The Porphyromonas CRISPR spacers with potential targets in the genus Porphyromonas were approximately 23 times more abundant than those with potential targets in other genus taxa (1,720/6,896 spacers vs. 74/6,896 spacers). Porphyromonas CRISPR/Cas may be involved in genome plasticity by exhibiting selective interference against intra- and interspecies nucleic acids. PMID:28837670
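The "approximately 23 times" figure in this abstract follows directly from the two spacer counts it reports. The short check below uses only those numbers; the variable names are illustrative.

```python
total_spacers = 6896   # all Porphyromonas CRISPR spacers analyzed
within_genus = 1720    # spacers with potential targets in Porphyromonas
other_genera = 74      # spacers with potential targets in other genera

# The shared denominator cancels, so the fold difference is the ratio of counts.
fold = within_genus / other_genera
print(f"within genus:  {within_genus / total_spacers:.1%}")
print(f"other genera:  {other_genera / total_spacers:.1%}")
print(f"fold difference: {fold:.1f}")
```

Roughly a quarter of all spacers have a potential target within the genus, against about one percent with a target elsewhere.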
Cre/lox-Recombinase-Mediated Cassette Exchange for Reversible Site-Specific Genomic Targeting of the Disease Vector, Aedes aegypti.

PubMed

Häcker, Irina; Harrell II, Robert A; Eichner, Gerrit; Pilitt, Kristina L; O'Brochta, David A; Handler, Alfred M; Schetelig, Marc F

2017-03-07

Site-specific genome modification (SSM) is an important tool for mosquito functional genomics and comparative gene expression studies, which contribute to a better understanding of mosquito biology and are thus a key to finding new strategies to eliminate vector-borne diseases. Moreover, it allows for the creation of advanced transgenic strains for vector control programs. SSM circumvents the drawbacks of transposon-mediated transgenesis, where random transgene integration into the host genome results in insertional mutagenesis and variable position effects. We applied the Cre/lox recombinase-mediated cassette exchange (RMCE) system to Aedes aegypti, the vector of dengue, chikungunya, and Zika viruses. In this context we created four target site lines for RMCE and evaluated their fitness costs. Cre-RMCE is functional in a two-step mechanism and with good efficiency in Ae. aegypti. The advantages of Cre-RMCE over existing site-specific modification systems for Ae. aegypti, phiC31-RMCE and CRISPR, originate in the preservation of the recombination sites, which 1) allows successive modifications and rapid expansion or adaptation of existing systems by repeated targeting of the same site; and 2) provides reversibility, thus allowing the excision of undesired sequences. Thereby, Cre-RMCE complements existing genomic modification tools, adding flexibility and versatility to vector genome targeting.
The Replication Focus Targeting Sequence (RFTS) Domain Is a DNA-competitive Inhibitor of Dnmt1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Syeda, Farisa; Fagan, Rebecca L.; Wean, Matthew

Dnmt1 (DNA methyltransferase 1) is the principal enzyme responsible for maintenance of cytosine methylation at CpG dinucleotides in the mammalian genome. The N-terminal replication focus targeting sequence (RFTS) domain of Dnmt1 has been implicated in subcellular localization, protein association, and catalytic function. However, progress in understanding its function has been limited by the lack of assays for and a structure of this domain. Here, we show that the naked DNA- and polynucleosome-binding activities of Dnmt1 are inhibited by the RFTS domain, which functions by virtue of binding the catalytic domain to the exclusion of DNA. Kinetic analysis with a fluorogenic DNA substrate established the RFTS domain as a 600-fold inhibitor of Dnmt1 enzymatic activity. The crystal structure of the RFTS domain reveals a novel fold and supports a mechanism in which an RFTS-targeted Dnmt1-binding protein, such as Uhrf1, may activate Dnmt1 for DNA binding.
Harnessing CRISPR-Cas systems for bacterial genome editing.

PubMed

Selle, Kurt; Barrangou, Rodolphe

2015-04-01

Manipulation of genomic sequences facilitates the identification and characterization of key genetic determinants in the investigation of biological processes. Genome editing via clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) constitutes a next-generation method for programmable and high-throughput functional genomics. CRISPR-Cas systems are readily reprogrammed to induce sequence-specific DNA breaks at target loci, resulting in fixed mutations via host-dependent DNA repair mechanisms. Although bacterial genome editing is a relatively unexplored and underrepresented application of CRISPR-Cas systems, recent studies provide valuable insights for the widespread future implementation of this technology. This review summarizes recent progress in bacterial genome editing and identifies fundamental genetic and phenotypic outcomes of CRISPR targeting in bacteria, in the context of tool development, genome homeostasis, and DNA repair. Copyright © 2015 Elsevier Ltd. All rights reserved.
Recent Advances in Genome Editing Using CRISPR/Cas9.

PubMed

Ding, Yuduan; Li, Hong; Chen, Ling-Ling; Xie, Kabin

2016-01-01

The CRISPR (clustered regularly interspaced short palindromic repeat)-Cas9 (CRISPR-associated nuclease 9) system is a versatile tool for genome engineering that uses a guide RNA (gRNA) to target Cas9 to a specific sequence. This simple RNA-guided genome-editing technology has become a revolutionary tool in biology and has many innovative applications in different fields. In this review, we briefly introduce the Cas9-mediated genome-editing method, summarize the recent advances in CRISPR/Cas9 technology, and discuss their implications for plant research. To date, targeted gene knockout using the Cas9/gRNA system has been established in many plant species, and the targeting efficiency and capacity of Cas9 has been improved by optimizing its expression and that of its gRNA. The CRISPR/Cas9 system can also be used for sequence-specific mutagenesis/integration and transcriptional control of target genes. We also discuss off-target effects and the constraint that the protospacer-adjacent motif (PAM) puts on CRISPR/Cas9 genome engineering. To address these problems, a number of bioinformatic tools are available to help design specific gRNAs, and new Cas9 variants and orthologs with high fidelity and alternative PAM specificities have been engineered. Owing to these recent efforts, the CRISPR/Cas9 system is becoming a revolutionary and flexible tool for genome engineering. Adoption of the CRISPR/Cas9 technology in plant research would enable the investigation of plant biology at an unprecedented depth and create innovative applications in precise crop breeding.
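The PAM constraint mentioned in this abstract can be made concrete with a small scan for candidate target sites. This is a minimal sketch, assuming the NGG PAM and 20-nt protospacer of the commonly used Streptococcus pyogenes Cas9 (the abstract does not name a specific enzyme). The function name `find_spcas9_sites` and the example sequence are hypothetical; the scan covers only the strand as given and does no off-target scoring, which dedicated gRNA design tools handle.

```python
import re

def find_spcas9_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for every window of
    `protospacer_len` bases on the given strand that is immediately
    followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

example = "TTGACCTGAAGCTTACGGATCCATGGCAAGG"  # hypothetical sequence
for start, protospacer, pam in find_spcas9_sites(example):
    print(start, protospacer, pam)
```

A sequence with no NGG in the right position yields an empty list, which is the practical meaning of the PAM "constraint": some loci simply offer no usable site for a given Cas9 variant.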

In trans paired nicking triggers seamless genome editing without double-stranded DNA cutting.

PubMed

Chen, Xiaoyu; Janssen, Josephine M; Liu, Jin; Maggio, Ignazio; 't Jong, Anke E J; Mikkers, Harald M M; Gonçalves, Manuel A F V

2017-09-22

Precise genome editing involves homologous recombination between donor DNA and chromosomal sequences subjected to double-stranded DNA breaks made by programmable nucleases. Ideally, genome editing should be efficient, specific, and accurate. However, besides constituting potential translocation-initiating lesions, double-stranded DNA breaks (targeted or otherwise) are mostly repaired through unpredictable and mutagenic non-homologous recombination processes. Here, we report that the coordinated formation of paired single-stranded DNA breaks, or nicks, at donor plasmids and chromosomal target sites by RNA-guided nucleases based on CRISPR-Cas9 components, triggers seamless homology-directed gene targeting of large genetic payloads in human cells, including pluripotent stem cells. Importantly, in addition to significantly reducing the mutagenicity of the genome modification procedure, this in trans paired nicking strategy achieves multiplexed, single-step, gene targeting, and yields higher frequencies of accurately edited cells when compared to the standard double-stranded DNA break-dependent approach. CRISPR-Cas9-based gene editing involves double-strand breaks at target sequences, which are often repaired by mutagenic non-homologous end-joining. Here the authors use Cas9 nickases to generate coordinated single-strand breaks in donor and target DNA for precise homology-directed gene editing.
Computational approach for elucidating interactions of cross-species miRNAs and their targets in Flaviviruses.

PubMed

Shinde, Santosh P; Banerjee, Amit Kumar; Arora, Neelima; Murty, U S N; Sripathi, Venkateswara Rao; Pal-Bhadra, Manika; Bhadra, Utpal

2015-03-01

Combating viral diseases has been a challenging task since time immemorial. Available molecular approaches are limited and not very effective for this daunting task. MicroRNA-based therapies have shown promise in recent times. MicroRNAs are tiny non-coding RNAs that regulate translational repression of target mRNA in a highly specific manner. In this study, we have determined the target regions for human and viral microRNAs in the conserved genomic regions of selected viruses of the Flaviviridae family using miRanda and performed a comparative target selectivity analysis among them. Specific target regions were determined and they were compared extensively among themselves by exploring their position to determine the vicinity. Based on the multiplicity and cooperativity analysis, interaction maps were developed manually to represent the interactions between top-ranking miRNAs and genomes of the viruses considered in this study. A self-organizing map (SOM) was used to cluster the best-ranked microRNAs based on the vital physicochemical properties. This study will provide deep insight into the interrelation of the viral and human microRNA interactions with the selected Flaviviridae genomes and will help to identify cross-species microRNA targets on the viral genome.
Discovery of the leinamycin family of natural products by mining actinobacterial genomes

PubMed Central

Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen

2017-01-01

Nature’s ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF–SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF–SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature’s rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature’s biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity. PMID:29229819
Discovery of the leinamycin family of natural products by mining actinobacterial genomes.

PubMed

Pan, Guohui; Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Yang, Dong; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen; Shen, Ben

2017-12-26

Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.
Mining of potential drug targets through the identification of essential and analogous enzymes in the genomes of pathogens of Glycine max, Zea mays and Solanum lycopersicum.

PubMed

Silva, Rangeline Azevedo da; Pereira, Leandro de Mattos; Silveira, Melise Chaves; Jardim, Rodrigo; Miranda, Antonio Basilio de

2018-01-01

Pesticides are one of the most widely used pest and disease control measures in plant crops, and their indiscriminate use poses a direct risk to the health of populations and the environment around the world. As a result, there is a great need for the development of new, less toxic molecules to be employed against plant pathogens. In this work, we employed an in silico approach to study the genes coding for enzymes in the genomes of three commercially important plants, soybean (Glycine max), tomato (Solanum lycopersicum) and corn (Zea mays), as well as 15 plant pathogens (4 bacteria and 11 fungi), focusing on revealing a set of essential and non-homologous isofunctional enzymes (NISEs) that could be prioritized as drug targets. By combining sequence and structural data, we obtained an initial set of 568 cases of analogy, of which 97 were validated and further refined, revealing a subset of 29 essential enzymatic activities with a total of 119 different structural forms, most belonging to central metabolic routes, including carbohydrate metabolism and amino acid metabolism, among others. Further, another subset of 26 enzymatic activities possesses a tertiary structure specific to the pathogen, not present in plants, humans and Apis mellifera, which may be of importance for the development of specific enzymatic inhibitors against plant diseases that are less harmful to humans and the environment.
A programmable method for massively parallel targeted sequencing

PubMed Central

Hopmans, Erik S.; Natsoulis, Georges; Bell, John M.; Grimes, Susan M.; Sieh, Weiva; Ji, Hanlee P.

2014-01-01

We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy. PMID:24782526
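The uniformity figure quoted in this abstract can be read as the fraction of targets whose coverage falls within a fixed log-distance of the median. The sketch below computes that fraction. The function name, the example coverages, and the reading of "1 log distance" as a tenfold window on either side of the median are assumptions made for illustration, not the authors' stated definition.

```python
import math
from statistics import median


def fraction_within_log_distance(coverages, max_log10_distance=1.0):
    """Fraction of targets whose coverage lies within `max_log10_distance`
    (in log10 units) of the median coverage.

    Targets with zero coverage are counted as outside the window.
    """
    if not coverages:
        raise ValueError("coverages must not be empty")
    med = median(coverages)
    if med <= 0:
        raise ValueError("median coverage must be positive")
    within = sum(
        1
        for c in coverages
        if c > 0 and abs(math.log10(c / med)) <= max_log10_distance
    )
    return within / len(coverages)


# Hypothetical per-target mean coverages (not data from the study).
example_coverages = [120, 95, 150, 8, 1300, 110, 0, 101, 99, 105]
print(fraction_within_log_distance(example_coverages))
```

Using the median rather than the mean keeps a few very deep or very shallow targets from shifting the reference point.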
Automated crystallographic system for high-throughput protein structure determination.

PubMed

Brunzelle, Joseph S; Shafaee, Padram; Yang, Xiaojing; Weigand, Steve; Ren, Zhong; Anderson, Wayne F

2003-07-01

High-throughput structural genomic efforts require software that is highly automated, distributive and requires minimal user intervention to determine protein structures. Preliminary experiments were set up to test whether automated scripts could utilize a minimum set of input parameters and produce a set of initial protein coordinates. From this starting point, a highly distributive system was developed that could determine macromolecular structures at a high throughput rate, warehouse and harvest the associated data. The system uses a web interface to obtain input data and display results. It utilizes a relational database to store the initial data needed to start the structure-determination process as well as generated data. A distributive program interface administers the crystallographic programs which determine protein structures. Using a test set of 19 protein targets, 79% were determined automatically.
Comparative genomics of Eucalyptus and Corymbia reveals low rates of genome structural rearrangement.

PubMed

Butler, J B; Vaillancourt, R E; Potts, B M; Lee, D J; King, G J; Baten, A; Shepherd, M; Freeman, J S

2017-05-22

Previous studies suggest genome structure is largely conserved between Eucalyptus species. However, it is unknown if this conservation extends to more divergent eucalypt taxa. We performed comparative genomics between the eucalypt genera Eucalyptus and Corymbia. Our results will facilitate transfer of genomic information between these important taxa and provide further insights into the rate of structural change in tree genomes. We constructed three high density linkage maps for two Corymbia species (Corymbia citriodora subsp. variegata and Corymbia torelliana) which were used to compare genome structure between both species and Eucalyptus grandis. Genome structure was highly conserved between the Corymbia species. However, the comparison of Corymbia and E. grandis suggests large (from 1-13 MB) intra-chromosomal rearrangements have occurred on seven of the 11 chromosomes. Most rearrangements were supported through comparisons of the three independent Corymbia maps to the E. grandis genome sequence, and to other independently constructed Eucalyptus linkage maps. These are the first large scale chromosomal rearrangements discovered between eucalypts. Nonetheless, in the general context of plants, the genomic structure of the two genera was remarkably conserved; adding to a growing body of evidence that conservation of genome structure is common amongst woody angiosperms.
Genomic instability in human cancer: Molecular insights and opportunities for therapeutic attack and prevention through diet and nutrition

PubMed Central

Ferguson, Lynnette R.; Chen, Helen; Collins, Andrew R.; Connell, Marisa; Damia, Giovanna; Dasgupta, Santanu; Malhotra, Meenakshi; Meeker, Alan K.; Amedei, Amedeo; Amin, Amr; Ashraf, S. Salman; Aquilano, Katia; Azmi, Asfar S.; Bhakta, Dipita; Bilsland, Alan; Boosani, Chandra S.; Chen, Sophie; Ciriolo, Maria Rosa; Fujii, Hiromasa; Guha, Gunjan; Halicka, Dorota; Helferich, William G.; Keith, W. Nicol; Mohammed, Sulma I.; Niccolai, Elena; Yang, Xujuan; Honoki, Kanya; Parslow, Virginia R.; Prakash, Satya; Rezazadeh, Sarallah; Shackelford, Rodney E.; Sidransky, David; Tran, Phuoc T.; Yang, Eddy S.; Maxwell, Christopher A.

2015-01-01

Genomic instability can initiate cancer, augment progression, and influence the overall prognosis of the affected patient. Genomic instability arises from many different pathways, such as telomere damage, centrosome amplification, epigenetic modifications, and DNA damage from endogenous and exogenous sources, and can be perpetuating, or limiting, through the induction of mutations or aneuploidy, both enabling and catastrophic. Many cancer treatments induce DNA damage to impair cell division on a global scale but it is accepted that personalized treatments, those that are tailored to the particular patient and type of cancer, must also be developed. In this review, we detail the mechanisms from which genomic instability arises and can lead to cancer, as well as treatments and measures that prevent genomic instability or take advantage of the cellular defects caused by genomic instability. In particular, we identify and discuss five priority targets against genomic instability: (1) prevention of DNA damage; (2) enhancement of DNA repair; (3) targeting deficient DNA repair; (4) impairing centrosome clustering; and, (5) inhibition of telomerase activity. Moreover, we highlight vitamin D and B, selenium, carotenoids, PARP inhibitors, resveratrol, and isothiocyanates as priority approaches against genomic instability. The prioritized target sites and approaches were cross validated to identify potential synergistic effects on a number of important areas of cancer biology. PMID:25869442
Application of industrial scale genomics to discovery of therapeutic targets in heart failure.

PubMed

Mehraban, F; Tomlinson, J E

2001-12-01

In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.
WGE: a CRISPR database for genome engineering.

PubMed

Hodgkins, Alex; Farne, Anna; Perera, Sajith; Grego, Tiago; Parry-Smith, David J; Skarnes, William C; Iyer, Vivek

2015-09-15

The rapid development of CRISPR-Cas9 mediated genome editing techniques has given rise to a number of online and stand-alone tools to find and score CRISPR sites for whole genomes. Here we describe the Wellcome Trust Sanger Institute Genome Editing database (WGE), which uses novel methods to compute, visualize and select optimal CRISPR sites in a genome browser environment. The WGE database currently stores single and paired CRISPR sites and pre-calculated off-target information for CRISPRs located in the mouse and human exomes. Scoring and display of off-target sites is simple and intuitive, and filters can be applied to identify high-quality CRISPR sites rapidly. WGE also provides a tool for the design and display of gene targeting vectors in the same genome browser, along with gene models, protein translation and variation tracks. WGE is open, extensible and can be set up to compute and present CRISPR sites for any genome. The WGE database is freely available at www.sanger.ac.uk/htgt/wge. Contact: vvi@sanger.ac.uk or skarnes@sanger.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
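To make "finding CRISPR sites" concrete, the following minimal sketch scans both strands of a DNA string for 20-nt protospacers followed by an NGG motif, the PAM commonly associated with Streptococcus pyogenes Cas9. It is an illustration only: the abstract does not describe WGE's algorithms, the function names and demo sequence are hypothetical, and no off-target scoring is attempted.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_sites(seq, protospacer_len=20):
    """Return sorted (start, strand, protospacer, pam) tuples for every
    protospacer followed by an NGG PAM on either strand.

    `start` is the 0-based forward-strand index of the leftmost base
    covered by protospacer + PAM.
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    span = protospacer_len + 3
    sites = []
    for m in pattern.finditer(seq):
        sites.append((m.start(), "+", m.group(1), m.group(2)))
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert a reverse-strand offset into forward-strand coordinates.
        start = len(seq) - (m.start() + span)
        sites.append((start, "-", m.group(1), m.group(2)))
    return sorted(sites)


# Hypothetical demo sequence (not from any genome in the source).
demo = "TTGACCTGAAGCTAGCTAGGATCCGATCGATCGGTACCAGGTT"
for site in find_cas9_sites(demo):
    print(site)
```

A genome-scale resource would precompute such sites and store them with off-target information. This sketch stops at enumeration.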
Investigating the Biosynthesis of Natural Products from Marine Proteobacteria: A Survey of Molecules and Strategies

PubMed Central

Timmermans, Marshall L.; Paudel, Yagya P.; Ross, Avena C.

2017-01-01

The phylum proteobacteria contains a wide array of Gram-negative marine bacteria. With recent advances in genomic sequencing, genome analysis, and analytical chemistry techniques, a whole host of information is being revealed about the primary and secondary metabolism of marine proteobacteria. This has led to the discovery of a growing number of medically relevant natural products, including novel leads for the treatment of multidrug-resistant Staphylococcus aureus (MRSA) and cancer. Of equal interest, marine proteobacteria produce natural products whose structure and biosynthetic mechanisms differ from those of their terrestrial and actinobacterial counterparts. Notable features of secondary metabolites produced by marine proteobacteria include halogenation, sulfur-containing heterocycles, non-ribosomal peptides, and polyketides with unusual biosynthetic logic. As advances are made in the technology associated with functional genomics, such as computational sequence analysis, targeted DNA manipulation, and heterologous expression, it has become easier to probe the mechanisms for natural product biosynthesis. This review will focus on genomics driven approaches to understanding the biosynthetic mechanisms for natural products produced by marine proteobacteria. PMID:28762997
Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM)

PubMed Central

Skinnider, Michael A.; Dejong, Chris A.; Rees, Philip N.; Johnston, Chad W.; Li, Haoxin; Webster, Andrew L. H.; Wyatt, Morgan A.; Magarvey, Nathan A.

2015-01-01

Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. PMID:26442528
Social networks to biological networks: systems biology of Mycobacterium tuberculosis.

PubMed

Vashisht, Rohit; Bhardwaj, Anshu; Osdd Consortium; Brahmachari, Samir K

2013-07-01

Contextualizing relevant information to construct a network that represents a given biological process presents a fundamental challenge in the network science of biology. The quality of network for the organism of interest is critically dependent on the extent of functional annotation of its genome. Mostly the automated annotation pipelines do not account for unstructured information present in volumes of literature and hence large fraction of genome remains poorly annotated. However, if used, this information could substantially enhance the functional annotation of a genome, aiding the development of a more comprehensive network. Mining unstructured information buried in volumes of literature often requires manual intervention to a great extent and thus becomes a bottleneck for most of the automated pipelines. In this review, we discuss the potential of scientific social networking as a solution for systematic manual mining of data. Focusing on Mycobacterium tuberculosis, as a case study, we discuss our open innovative approach for the functional annotation of its genome. Furthermore, we highlight the strength of such collated structured data in the context of drug target prediction based on systems level analysis of pathogen.
Identification and profiling of novel microRNAs in the Brassica rapa genome based on small RNA deep sequencing

PubMed Central

2012-01-01

Background: MicroRNAs (miRNAs) are one of the functional non-coding small RNAs involved in the epigenetic control of the plant genome. Although plants contain both evolutionary conserved miRNAs and species-specific miRNAs within their genomes, computational methods often only identify evolutionary conserved miRNAs. The recent sequencing of the Brassica rapa genome enables us to identify miRNAs and their putative target genes. In this study, we sought to provide a more comprehensive prediction of B. rapa miRNAs based on high throughput small RNA deep sequencing. Results: We sequenced small RNAs from five types of tissue: seedlings, roots, petioles, leaves, and flowers. By analyzing 2.75 million unique reads that mapped to the B. rapa genome, we identified 216 novel and 196 conserved miRNAs that were predicted to target approximately 20% of the genome’s protein coding genes. Quantitative analysis of miRNAs from the five types of tissue revealed that novel miRNAs were expressed in diverse tissues but their expression levels were lower than those of the conserved miRNAs. Comparative analysis of the miRNAs between the B. rapa and Arabidopsis thaliana genomes demonstrated that redundant copies of conserved miRNAs in the B. rapa genome may have been deleted after whole genome triplication. Novel miRNA members seemed to have spontaneously arisen from the B. rapa and A. thaliana genomes, suggesting the species-specific expansion of miRNAs. We have made this data publicly available in a miRNA database of B. rapa called BraMRs. The database allows the user to retrieve miRNA sequences, their expression profiles, and a description of their target genes from the five tissue types investigated here. Conclusions: This is the first report to identify novel miRNAs from Brassica crops using genome-wide high throughput techniques. The combination of computational methods and small RNA deep sequencing provides robust predictions of miRNAs in the genome. The finding of numerous novel miRNAs, many with few target genes and low expression levels, suggests the rapid evolution of miRNA genes. The development of a miRNA database, BraMRs, enables us to integrate miRNA identification, target prediction, and functional annotation of target genes. BraMRs will represent a valuable public resource with which to study the epigenetic control of B. rapa and other closely related Brassica species. The database is available at the following link: http://bramrs.rna.kr [1]. PMID:23163954
A comprehensive overview of computational resources to aid in precision genome editing with engineered nucleases.

PubMed

Periwal, Vinita

2017-07-01

Genome editing with engineered nucleases (zinc finger nucleases, TAL effector nucleases, and clustered regularly interspaced short palindromic repeats/CRISPR-associated nucleases) has recently been shown to have great promise in a variety of therapeutic and biotechnological applications. However, their exploitation in genetic analysis and clinical settings largely depends on their specificity for the intended genomic target. Large and complex genomes often contain highly homologous/repetitive sequences, which limits the specificity of genome editing tools and could result in off-target activity. Over the past few years, various computational approaches have been developed to assist the design process and predict/reduce the off-target activity of these nucleases. These tools could be efficiently used to guide the design of constructs for engineered nucleases and evaluate results after genome editing. This review provides a comprehensive overview of various databases, tools, web servers and resources for genome editing and compares their features and functionalities. It also describes tools that have been developed to analyse post-genome editing results, and discusses important design parameters that could be considered while designing these nucleases. This review is intended to be a quick reference guide for experimentalists as well as computational biologists working in the field of genome editing with engineered nucleases. © The Author 2016. Published by Oxford University Press. All rights reserved.
A new comprehensive method for detection of livestock-related pathogenic viruses using a target enrichment system.

PubMed

Oba, Mami; Tsuchiaka, Shinobu; Omatsu, Tsutomu; Katayama, Yukie; Otomaru, Konosuke; Hirata, Teppei; Aoki, Hiroshi; Murata, Yoshiteru; Makino, Shinji; Nagai, Makoto; Mizutani, Tetsuya

2018-01-08

We tested the usefulness of the SureSelect target enrichment system, a comprehensive viral nucleic acid detection method, for rapid identification of viral pathogens in fecal samples of cattle, pigs and goats. This system enriches nucleic acids of target viruses in clinical/field samples by using a library of biotinylated RNAs with sequences complementary to the target viruses. The enriched nucleic acids are amplified by PCR and subjected to next-generation sequencing to identify the target viruses. In many samples, the SureSelect target enrichment method increased the efficiency of detection of the viruses listed in the biotinylated RNA library. Furthermore, this method enabled us to determine the nearly full-length genome sequence of porcine parainfluenza virus 1 and greatly increased Breadth, a value indicating the ratio of the mapping consensus length to the reference genome length, in pig samples. Our data showed the usefulness of the SureSelect target enrichment system for comprehensive analysis of genomic information of various viruses in field samples. Copyright © 2017 Elsevier Inc. All rights reserved.
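The Breadth value mentioned in this abstract can be illustrated as the fraction of reference positions covered by at least one mapped region. The sketch below assumes mapped regions are given as 0-based, half-open (start, end) intervals. The function name and interval representation are illustrative choices, not taken from the study.

```python
def breadth_of_coverage(intervals, reference_length):
    """Fraction of the reference covered by at least one interval.

    `intervals` holds 0-based, half-open (start, end) pairs; overlapping
    intervals are merged so shared positions are counted once.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip to the reference so out-of-range coordinates are ignored.
        start = max(start, 0)
        end = min(end, reference_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / reference_length


# Hypothetical mapped regions on a 1000-base reference.
print(breadth_of_coverage([(0, 100), (50, 150), (300, 400)], 1000))
```

Merging before summing matters because enriched libraries tend to produce many overlapping reads over the same region.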
Genome-wide analyses of LINE–LINE-mediated nonallelic homologous recombination

PubMed Central

Startek, Michał; Szafranski, Przemyslaw; Gambin, Tomasz; Campbell, Ian M.; Hixson, Patricia; Shaw, Chad A.; Stankiewicz, Paweł; Gambin, Anna

2015-01-01

Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed the genome-wide distribution of long interspersed element (LINE) retrotransposons and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE–LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of LINE–LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE–LINE rearrangements. Our data indicate that LINE–LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability. PMID:25613453
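The selection criteria stated in this abstract (elements longer than 4 kb, in direct orientation, less than 10 Mbp apart, more than 95% identical) can be sketched as a simple filter over annotated elements. Everything below is a hypothetical illustration: the `Line` record, the function name, measuring distance as the gap between the two elements, and supplying sequence identity through an optional caller-provided function are assumptions, not the authors' pipeline.

```python
from collections import namedtuple

# Hypothetical annotation record: 0-based, half-open coordinates.
Line = namedtuple("Line", "chrom start end strand")


def candidate_nahr_pairs(elements, min_len=4000, max_gap=10_000_000,
                         identity=None, min_identity=0.95):
    """Return pairs of same-chromosome, same-strand elements that are
    longer than `min_len` and separated by less than `max_gap` bases.

    If `identity` is given, it is called as identity(a, b) and the pair
    is kept only when the result exceeds `min_identity`.
    """
    groups = {}
    for e in elements:
        if e.end - e.start > min_len:
            groups.setdefault((e.chrom, e.strand), []).append(e)
    pairs = []
    for group in groups.values():
        group.sort(key=lambda e: e.start)
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                gap = b.start - a.end
                if gap >= max_gap:
                    break  # later elements start even further away
                if gap < 0:
                    continue  # skip overlapping annotations
                if identity is not None and identity(a, b) <= min_identity:
                    continue
                pairs.append((a, b))
    return pairs


# Hypothetical annotations (not data from the study).
annotations = [
    Line("chr1", 0, 5000, "+"),
    Line("chr1", 100_000, 106_000, "+"),
    Line("chr1", 200_000, 203_000, "+"),        # too short
    Line("chr1", 300_000, 305_000, "-"),        # opposite orientation
    Line("chr1", 20_000_000, 20_005_000, "+"),  # too far away
    Line("chr2", 0, 5000, "+"),                 # different chromosome
]
print(candidate_nahr_pairs(annotations))
```

Sorting each group by start position allows the inner loop to stop early once the gap reaches the limit, which keeps the pairwise scan manageable on dense annotations.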
Nuclease-mediated genome editing: At the front-line of functional genomics technology.

PubMed

Sakuma, Tetsushi; Woltjen, Knut

2014-01-01

Genome editing with engineered endonucleases is rapidly becoming a staple method in developmental biology studies. Engineered nucleases permit random or designed genomic modification at precise loci through the stimulation of endogenous double-strand break repair. Homology-directed repair following targeted DNA damage is mediated by co-introduction of a custom repair template, allowing the derivation of knock-out and knock-in alleles in animal models previously refractory to classic gene targeting procedures. Currently there are three main types of customizable site-specific nucleases delineated by the source mechanism of DNA binding that guides nuclease activity to a genomic target: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). Among these genome engineering tools, characteristics such as the ease of design and construction, mechanism of inducing DNA damage, and DNA sequence specificity all differ, making their application complementary. By understanding the advantages and disadvantages of each method, one may make the best choice for their particular purpose. © 2014 The Authors Development, Growth & Differentiation © 2014 Japanese Society of Developmental Biologists.
